Introduction to Python and NumPy I

Last modified by Xwiki VePa on 2024/02/07 07:37

This tutorial is continued in part two: Introduction to Python and NumPy II

Overview


In this laboratory exercise you will be introduced to the Python computer language and some helpful Python libraries for math and plotting. As mentioned in lecture, I've chosen to use Python instead of MATLAB for this course, mainly because Python is free and open source, and will work just as well as MATLAB in many instances. Python is an easy-to-learn, powerful programming language that is widely used and very flexible. It is an interpreted language, meaning you'll be able to enter commands at a prompt and the computer will execute them when you press the Enter key. This makes testing code and learning efficient and easy.

Python can be run from the command line on Linux- or Mac-based computers, but we will be using the Enthought Canopy software instead. Canopy comes pre-configured with many different scientific and mathematical libraries, as well as tools to easily make plots and edit scripts. As a student, you can make an account on the Enthought website and download a free copy of the academic version of Canopy. It is pre-installed on all of the machines in D211, but you may find it helpful to have a personal copy for use outside of class. It is easy to install and should work without much effort. Below are instructions for launching Canopy on the computers in room D211, followed by a series of tutorial tasks to get you familiar with Python and the numerical libraries included in NumPy.

Launching Canopy


To get started, you will need to launch and configure the Enthought Canopy software package.

  1. Start by opening a Terminal window by clicking on the Dash Home icon at the top left corner of the screen and entering "terminal" into the search box. Click on the Terminal icon.

    Dash_home.png

    After Terminal opens, you might want to right-click on the icon and select 'Lock to Launcher'.

  2. You can launch Canopy by entering the text /usr/local/Canopy/canopy into the Terminal window.

    Launching_canopy.png

    We can make an alias to make it easier to launch Canopy in the future by typing

    echo "alias Canopy=/usr/local/Canopy/canopy" >> ~/.bashrc

    in a Terminal window. The alias takes effect once you open a new Terminal window (or run source ~/.bashrc in the current one), after which Canopy can be launched by simply typing Canopy.

  3. If this is the first time you have run Canopy, you will be prompted to configure your Canopy environment. When the prompt appears, click Continue.

    Configuring_Canopy.png

    Config_progress.png
  4. Configuration can take up to 15 minutes. After the configuration completes, select No when asked whether Canopy should be your default Python environment, then click Start using Canopy.

    Making_Canopy_default_Python.png
  5. The main Canopy window should now appear, as shown below. Click on the Editor button to get started.

    Canopy_window_w_Editor_selected.png
  6. You should now see the Canopy editor window. The window has three panels. Along the left side is a file browser. The upper panel on the right side is a text editor panel, which will allow you to create, modify and save Python source code. The lower panel is an interactive Python environment (IPython), where you can enter commands that will be processed by the Python interpreter and the results will be displayed. At this point, you're ready to get started learning a bit of Python! In the future, you can launch Canopy by typing /usr/local/Canopy/canopy in a Terminal window.

    Canopy_Editor.png

Getting started in Python


With the Canopy editor window open, we can now start tinkering. To get started, we'll look at an example of how to quickly make a plot, then review what has actually happened with the Python command.

>>> plot([1,2,3,4,5],'o-')

I will be using the generic command prompt text >>> to refer to the Python command prompt. In Canopy, the prompt has the format In [n]:, where n is the number of the command in the list of commands that have been run in the current Canopy session.

If you copy and paste the command above (excluding the prompt text '>>>'), you should see the following plot when you press Enter.

demo_plot.png

So, what happened? Well, a few things. First, we've used the plot function to produce a plot of the numbers in the list [1,2,3,4,5] as a function of their index value, or position in the list. The plot function can be used to produce 2D data plots in Canopy. The values enclosed in the brackets [ ] comprise a Python list. Python lists can contain any kind of data stored in a common structure, in this case a list of whole numbers, or integers. The index value refers to the position in the list, starting from 0. For example, the value 3 in the list is located at index value 2, whereas list value 5 is at index value 4. So what about the 'o-' after the comma? In this case, we've told the plot command to use circle symbols (o) and a solid line (-) to plot the data. If this isn't 100% clear at this point, don't worry.
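The zero-based index values described above can be checked directly. The short sketch below uses a hypothetical variable name, values, for the list from the plot example.

```python
# A Python list of integers, the same values used in the plot example
values = [1, 2, 3, 4, 5]

# Indexing starts at 0, so index 2 holds the third value
third = values[2]
print(third)        # 3

# Index 4 holds the fifth (last) value
last = values[4]
print(last)         # 5

# len() gives the number of items in the list
print(len(values))  # 5
```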

The plot command used above is actually part of the Matplotlib Python library. Matplotlib is automatically loaded in Enthought Canopy, making it easy to quickly generate plots.

Getting help

There are several ways to get help in Python. Personally, I typically go to Google and enter the command or issue for which I'm seeking help. Python is widely used, and thus a Google search can often be the most efficient way to find information.

Within Python itself, there are also some very convenient ways to get help. The most obvious is the help function. To get started, you can type

>>> help()

This will bring up a general help browser, in which you can look up functions, keywords or other Python topics. To get help on specific functions, you can enter the function name with the help function. For example, to get help on the plot function, you can type

>>> help(plot)

Many Python functions also have documentation strings that can be used for help. To output the doc string for the plot function, you could type

>>> print(plot.__doc__)

In case it is unclear above, the doc string associated with a function is given by adding a period, two underscores, the text 'doc' and two more underscores (.__doc__).
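Doc strings are not limited to library functions. As a minimal sketch, the hypothetical function double below shows where the text returned by .__doc__ comes from.

```python
def double(value):
    """Return the given value multiplied by two."""
    return value * 2

# The text in triple quotes is stored as the function's doc string
print(double.__doc__)   # Return the given value multiplied by two.
print(double(4))        # 8
```

The help function reads the same text, so help(double) would display this doc string as well.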

Python variables

As we have seen in lectures, variables in Python are quite flexible and can be used to reference many different kinds of data. Let's say we wanted to add a few numbers together.

>>> 2 + 2                                        # Add two numbers
4                                                # Their sum is reported
>>> _ + 5                                        # Add 5 to the previous sum
9                                                # 9, as expected

As we can see above, Python makes a decent calculator. One important feature to notice is that in interactive Python sessions, the most recently returned value is stored in the Python variable _ (an underscore). You can use this shortcut, for example to add to the previous sum, as shown. Also note that I've included some comments at the ends of the lines to describe what is going on. Recall that comments start with the # character and the text that follows does not get executed by Python. Now let's calculate a sum and assign it to a variable.

>>> x = 2 + 2                                    # Store the value of 2 + 2 as the variable x
>>> x
4
>>> x = x + 2.5                                  # Add 2.5 to the previous x value
>>> x
6.5
>>> y = 2**2 + math.log(math.pi) * math.sin(x)   # A slightly more complicated example
>>> y
4.246254279407689
>>> theta = math.acos(-1)                        # Note trigonometry is done in radians
>>> theta
3.141592653589793

In these cases, we've stumbled upon something new, the Python math library. Many common math functions (sine, cosine, logarithms, etc.) are contained in the math library, as well as certain math constants like pi. Functions in the math library can be accessed by including math. before the name of the function. Thus, to get the sine value of variable x you type math.sin(x).

You can find the functions that are contained within a Python library in Canopy by typing the library name followed by a period and then hitting the Tab key. I suggest you give this a shot for the math library.
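The examples above rely on the math library already being available at the Canopy prompt. In a plain Python session or in a script, it has to be imported first. The sketch below repeats the earlier calculation with an explicit import, and uses the built-in dir function as an alternative to Tab completion; the variable name names is only for illustration.

```python
import math   # Needed explicitly in a script or a plain Python session

x = 6.5
y = 2**2 + math.log(math.pi) * math.sin(x)   # math.log is the natural logarithm
theta = math.acos(-1)                        # Equal to pi, in radians

print(y)
print(theta)
print(math.degrees(theta))                   # Convert radians to degrees

# dir() lists the names in a library; skip the internal ones starting with _
names = [name for name in dir(math) if not name.startswith('_')]
print('sin' in names)                        # True
```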

Data types

Among the many convenient features of Python is dynamic data typing. To understand this, we first need to see what data types are by way of some examples.

>>> var1 = "Text"
>>> var2 = 1
>>> var3 = 4.94
>>> var4 = True

As you can see above, we've assigned different types of data to four different variables. The first variable, var1, contains text, while var2 and var3 contain numbers, and var4 is a boolean variable (True or False). These differences are important because some types of data cannot interact with one another. For example, you would have no problem if you tried to calculate var2/var3, but Python would give you an error message if you tried to divide var1 by var3. Python doesn't know how to divide text by a number. Feel free to try this if you don't believe me.
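If you would like to see the error without it stopping your session, the sketch below catches it. The variable names result and error are only for illustration.

```python
var1 = "Text"
var2 = 1
var3 = 4.94

# Dividing an integer by a float works without complaint
print(var2 / var3)

# Dividing text by a number raises a TypeError
try:
    result = var1 / var3
except TypeError as error:
    result = None
    print("TypeError:", error)
```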

You can use the type function to check the type of data stored in a given variable. Consider the examples from above.

>>> type(var1)
str
>>> type(var2)
int
>>> type(var3)
float
>>> type(var4)
bool

As you can see, Python knows the data type of each variable. str refers to a character string for var1, int tells us var2 is an integer (or whole number), the float type for var3 indicates it is a floating point (or decimal) number and bool for var4 tells us it is a boolean variable. In each case, these types are dynamically assigned by Python when the variable is defined, hence dynamic data typing. Most of the time you won't have any data type problems, but the two examples below illustrate features you should keep in mind.

>>> var5 = 2
>>> var6 = var2 / var5
>>> var6
0
>>> type(var6)
int
>>> var7 = var3 / var5
>>> var7
2.47
>>> type(var7)
float

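What we can see here is that dividing an integer by another integer returns a value that is also an integer. In this case, since the integer 1 cannot be divided by 2 as a whole number, the resulting value is 0 for var6. Note that this is the behavior of Python 2, which produced the output shown here; in Python 3, the / operator always returns a floating point number (1 / 2 gives 0.5) and whole-number division is written with the // operator. For var7, we see that dividing a floating point number by an integer returns a floating point number. The same thing would be true if you divide an integer by a floating point number. As you can see, the data type matters.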
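When you want to be explicit about which kind of division you get, the following sketch gives the same results in Python 2 and Python 3. The names whole and decimal are only for illustration.

```python
var2 = 1
var5 = 2

# Floor division (//) discards the fractional part
whole = var2 // var5
print(whole)            # 0

# Converting one operand to a float first gives the decimal result
decimal = float(var2) / var5
print(decimal)          # 0.5
```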

Matrix math with NumPy

NumPy is a scientific computing package for Python that provides MATLAB-like arrays and many other numerical features. Like Matplotlib, it is included in Canopy and automatically available at the prompt. See for yourself:

>>> array = np.array([1,2,3,4,5])

So, what's going on here? We've defined a new variable array and the contents appear to be the same list we had used in the first example plot [1,2,3,4,5]. That is partly true. Here, we are using the NumPy array type to create an array with the values [1,2,3,4,5]. The NumPy functions are available in the NumPy library (np), and creating an array is done using np.array. You might be wondering how a Python list and a NumPy array differ, and why you would want to use one over the other. The example below shows some differences.

>>> a = [1,2,3,4,5]
>>> a
[1, 2, 3, 4, 5]
>>> type(a)
list
>>> b = np.array([1,2,3,4,5])
>>> b
array([1, 2, 3, 4, 5])
>>> type(b)
numpy.ndarray

First, we can see that the list a has the same values in it as the array b, but they have different types; a is a list, b is an array. Lists can contain a mixture of values of any of the four data types mentioned in the previous section (str, int, float, bool), but all of the values in an array must be of the same type, typically int or float. This is the main difference between lists and arrays. In general, operations on large arrays are much faster than the equivalent operations on lists.
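A quick way to see the difference in practice is to apply the same operator to both types. This is a small sketch; the names joined, summed and mixed are only for illustration, and NumPy is imported explicitly so it also runs outside Canopy.

```python
import numpy as np

a = [1, 2, 3, 4, 5]             # Python list
b = np.array([1, 2, 3, 4, 5])   # NumPy array

# The + operator joins two lists end to end
joined = a + a
print(joined)        # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

# The + operator adds two arrays value by value
summed = b + b
print(summed)        # [ 2  4  6  8 10]

# Mixing an int and a float in an array converts every value to float
mixed = np.array([1, 2.5])
print(mixed.dtype)   # float64
```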

Now that we know a bit about arrays, let's look at some ways we can fill arrays with values and interact with them. We'll start by defining an array x that ranges from 0 to 3*π, calculating and plotting values that vary as a function of x. This can be done in two ways.

>>> x = np.arange(0, 3*math.pi, 0.1)     # x from 0 to 3*π by constant increments of 0.1
>>> x = np.linspace(0, 3*math.pi, 100)   # x from 0 to 3*π in 100 equal increments
>>> beta = 0.5
>>> y = beta*np.sin(x)
>>> plot(x,y)

sin_curve_0_to_3pi.png

Hopefully your plot looks similar. What you can see in this example is we've defined an array x with either a constant increment between values in the array or a fixed number of values across the range. y is a function of the sine of x and our plot confirms this.
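The difference between the two ways of filling an array is easier to see with a short range of whole numbers. In this sketch, the names x_step and x_count are only for illustration.

```python
import numpy as np

# np.arange uses a fixed step and stops before the end value
x_step = np.arange(0, 5, 1)
print(x_step)            # [0 1 2 3 4]

# np.linspace uses a fixed number of points and includes the end value
x_count = np.linspace(0, 5, 6)
print(x_count)           # [0. 1. 2. 3. 4. 5.]
```

Use np.arange when the spacing between values matters most, and np.linspace when you need a given number of points that includes both ends of the range.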

Be aware that you must use the NumPy sine function np.sin, rather than math.sin, when calculating with array values.
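The sketch below shows what happens with each sine function when it is given an array. The flag math_sin_failed is a hypothetical name used only to record the outcome.

```python
import math
import numpy as np

x = np.array([0.0, math.pi / 2, math.pi])

# np.sin works on every value in the array at once
y = np.sin(x)
print(y)

# math.sin expects a single number and fails for an array of several values
try:
    math.sin(x)
    math_sin_failed = False
except TypeError:
    math_sin_failed = True
print(math_sin_failed)   # True
```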

More fun with matrices

We've already seen how to make 1-dimensional matrices with NumPy, but we can make 2-D matrices as well. There are two different ways to do this.

>>> x = np.array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])

or

>>> x = np.array([1, 4, 7, 2, 5, 8, 3, 6, 9])
>>> x = x.reshape((3, 3))

In both cases, a 3x3 matrix x is created.

>>> x
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

In the first case, the values on each row were enclosed in brackets while a second set of brackets enclosed the entire array. In the second case, we created a 1-D array of 9 values the same way we had previously and used the reshape function to convert it to a 3x3 array. If we'd like to extract a specific value from our array x, we can do that by referencing a specific location in the matrix. The format for array indexing is x[row, column], with both index values starting from 0. Thus,

>>> y = x[2,0]    # Row index 2, column index 0
>>> y
3

We can extract values from an entire column using the colon (:) character.

>>> y = x[:,1]
>>> y
array([4, 5, 6])

Notice that the middle column has been extracted.
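The same colon notation works for rows and for ranges of rows and columns. The following sketch rebuilds the 3x3 array so it can run on its own; first_row and block are names chosen only for illustration.

```python
import numpy as np

x = np.array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])

# A colon in the column position selects every column of one row
first_row = x[0, :]
print(first_row)        # [1 4 7]

# Ranges select a block; the stop index is not included
block = x[0:2, 1:3]
print(block)            # [[4 7]
                        #  [5 8]]

# shape gives the number of rows and columns
print(x.shape)          # (3, 3)
```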

Let's check out a bit more matrix math. We'll start with a new array

>>> a = np.array([1, 2, 3, 4, 5])
>>> a
array([1, 2, 3, 4, 5])
>>> a = a + 1                       # Add 1 to the array values
>>> a
array([2, 3, 4, 5, 6])              # Each value increases by 1

Let's create another array, b, and add it to a.

>>> b = np.array([5, 4, 3, 2, 1])
>>> a + b
array([7, 7, 7, 7, 7])

As you can see, the result is simply the sum of the values in a and b, added for each array index location.

Multiplication of values in arrays can be done in two ways: either by multiplying the values at corresponding index locations in the two arrays (element-wise multiplication), or by proper matrix multiplication (dot product).

>>> a * b
array([10, 12, 12, 10,  6])
>>> np.dot(a,b)
50

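Matrix multiplication in NumPy works the opposite way from MATLAB: in NumPy, * multiplies element by element and np.dot performs matrix multiplication, whereas in MATLAB * performs matrix multiplication and .* multiplies element by element. If you have used MATLAB in the past, be careful.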
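The two kinds of multiplication give clearly different results for 2-D arrays. The small matrices m and n below are made-up values for illustration only.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])

# Element-wise product: each value is multiplied by the value at the same location
elementwise = m * n
print(elementwise)      # [[ 5 12]
                        #  [21 32]]

# Matrix product: rows of m combined with columns of n
matrix_product = np.dot(m, n)
print(matrix_product)   # [[19 22]
                        #  [43 50]]
```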

Plotting in more detail

As we have seen above, 2-D plots can be created using the Matplotlib function plot. We can plot two arrays by typing plot(x,y). The plot function can take multiple arguments, allowing you to plot a 1-D array as a function of its index value by typing plot(a), or a 2-D array using separate line colors for each column by typing plot(x). You can also include optional arguments that allow you to format the plot. As an example, consider the following

>>> plot(a,b,'k--')

If you have the same values still stored in arrays a and b from earlier, you should now see a plot of a black dashed line with a slope of -1. In this case the 'k--' indicates a black line (k) should be drawn with dashes (--). Some common color choices and line styles/symbols are shown below. Many more are listed in the documentation for plot, which you can access using help(plot).

   ==========  ========
   character   color
   ==========  ========
   'b'         blue
   'g'         green
   'r'         red
   'k'         black
   ==========  ========


   ================    ===============================
   character           description
   ================    ===============================
   '-'                 solid line style
   '--'                dashed line style
   ':'                 dotted line style
   '.'                 point marker
   'o'                 circle marker
   's'                 square marker
   'x'                 x marker
   ================    ===============================

In addition to formatting the plots, we can also use the plot command to plot multiple lines at the same time.

>>> y1 = np.random.rand(10)          # np.random.rand(n) generates an array of n random numbers
>>> y2 = np.linspace(1,0,10)
>>> x = np.arange(10)                # np.arange(n) creates an array of n numbers in increasing order from 0
>>> plot(x, y1, 'ks', x, y2, 'r-')
>>> title('Test of plot function')   # Add a title
>>> xlabel('X Axis')                 # Label the x axis
>>> ylabel('Y Axis')                 # Label the y axis

two_plots_at_once.png

Another useful plot function is errorbar. This example will produce a plot of the random values y1 with an errorbar of ±0.1.

>>> errorbar(x, y1, np.ones(10)*0.1)   # np.ones(n) generates a 1-D array of length n filled with values of 1
>>> title('Test of errorbar')
>>> xlabel('X axis'); ylabel('Y axis')

errorbar.png

Adding text labels to plots

As you have likely seen, titles and axis labels can be easily added to the plots generated by Matplotlib. You can also add text at a given location on a plot using the text function.

>>> text(1,0.5,'Here is some text')

This will add the text 'Here is some text' at position x=1, y=0.5 on the plot.

Changing the axes

The plotting commands make their best guess for the axis ranges that will produce a good looking plot, but often you may want to change the axis ranges manually. The axis function will provide you with the current axis ranges for a given plot and a means for changing them.

>>> axis()
(0.0, 9.0, -0.20000000000000001, 1.2000000000000002)
>>> axis([2, 7, -0.2, 1.0])
[2, 7, -0.2, 1.0]

When you change the axis ranges, you enter the ranges as [xmin, xmax, ymin, ymax]. Note that this will allow you to flip the orientation of the axes, if desired. For example, [2, 7, 1.0, -0.2] would reverse the y-axis in the example above.
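Outside of Canopy, the same axis function is available through the matplotlib.pyplot module. The sketch below assumes Matplotlib is installed and uses the non-interactive Agg backend so that no plot window is needed; the names limits and flipped are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")             # Draw without opening a window (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
y = np.linspace(1, 0, 10)
plt.plot(x, y, 'r-')

# Set the ranges as [xmin, xmax, ymin, ymax]
plt.axis([2, 7, -0.2, 1.0])
limits = plt.axis()               # With no arguments, axis returns the current ranges
print(limits)

# Swapping ymin and ymax flips the y-axis
plt.axis([2, 7, 1.0, -0.2])
flipped = plt.gca().yaxis_inverted()
print(flipped)                    # True
```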

Creating script .py files

While you're welcome to type commands into the Python prompt to perform your calculations, it is often helpful to save a list of commands to a single file that can be read in Python to execute a series of commands. These files are known as .py files, and essentially, they contain a list of exactly what should be done to produce desired output. In Canopy, it may be easiest to start by simply copying commands you've typed into a new .py file. You can create a new .py file either by clicking on the Create a new file button in the Canopy editor, or going up to the menubar and selecting File > New > Python file. Below is an example of a simple .py file that you can copy and paste into a new document in the Canopy editor. Save the file as colorsines.py and click on the sideways green triangle at the top of the Canopy editor to test. You should see colorful sine curves.

#!/usr/bin/python
# -*- coding: utf-8 -*-
# colorsines.py
#
# This program produces colorful sine curves
#
# dwhipp - 11.3.14

# Import the NumPy and Matplotlib functions
from pylab import *
# Import the math library explicitly so math.pi is always available
import math

def main():
    # Define x from -2π to 2π in 100 steps
    x = np.linspace(-2*math.pi, 2*math.pi, 100)
    # Define y1 as the sine of x
    y1 = np.sin(x)
    # Define y2, y3, y4, y5 as y1 multiplied by 0.8, 0.6, 0.4, 0.2
    y2 = y1 * 0.8
    y3 = y1 * 0.6
    y4 = y1 * 0.4
    y5 = y1 * 0.2
    # Plot y1, y2, y3, y4 and y5 as a function of x in different colors
    plot(x, y1, 'k-', x, y2, 'r-', x, y3, 'b-', x, y4, 'g-', x, y5, 'm-')
    # Add a plot title and axis labels
    title('My script plot')
    xlabel('x axis')
    ylabel('y axis')
    # Display the plot
    show()

main()

Here we see a few new things. First, Canopy normally sets up NumPy and Matplotlib automatically, but in a script we need to import those functions ourselves with the command from pylab import *. Second, we've placed the array definitions and plotting commands in a defined function called main. We simply define a new function within the code, then call the function at the end to execute the commands. When you write your first Python scripts, it is a good idea to follow this design. Lastly, in order to display a plot in a script you must use the show() function. Any formatting of the plot must be listed prior to the show() function.

When you define a function in Python, all of the commands that are part of the function must be indented to the same level. Typically, this indentation is 4 spaces.
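As a minimal sketch of the indentation rule, the hypothetical function scale below multiplies every value in a list by a factor. Notice how the indentation shows which lines belong to the function and which belong to the loop inside it.

```python
def scale(values, factor):
    # Every line of the function body is indented by 4 spaces
    scaled = []
    for value in values:
        # Lines inside the loop are indented one level further
        scaled.append(value * factor)
    return scaled

# Back at the left margin, so this line is outside the function
result = scale([1, 2, 3], 0.5)
print(result)   # [0.5, 1.0, 1.5]
```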

Exercises


The exercises below are intended to help you become more familiar with the use of Python and NumPy. For each of the exercises, you are asked to submit a diary of commands that were run to generate the desired output, some code and/or a plot, and a few paragraphs summarizing the exercise. You can produce a diary of commands you have entered using the history command in Python. If you type history, you will see all of the commands you have entered during this Python session. Select the relevant commands for the exercise, copy them and paste them into a new text file using the Canopy editor.

Exercise 1 - Fun with matrices

Create three 3x3 matrices a, b and c with the following values:

[Equation not rendered in the source page: the values of matrices a, b and c are missing.]

Please provide your diary (command history) and solutions to the following problems.

  1. Add the values in matrix a to matrix b
  2. Subtract matrix a from matrix c
  3. Multiply the values in matrix b by 4
  4. Multiply the values in matrix b with matrix c. I want the product of the values in each location of the matrix, as shown in the example below for a 2x2 matrix.
    Example:

    [Equation not rendered in the source page: the 2x2 example is missing.]

For this exercise, provide a command history for the problems above.

Exercise 2 - Plotting everything but the kitchen sinc

For this exercise you need to produce a Python script .py file that plots the sinc function from 0 to 8π. The sinc function is defined as sin(π*x)/(π*x). This function is the continuous inverse Fourier transform of the rectangular pulse of width 2π and height 1. You can create a new .py file by clicking on File > New > Python file in the menubar. Your script should

  1. Create a 1-D array x that goes from 0 to 8π by increments of 0.1 and a variable y that is the sinc of x.
  2. Plot the values of y as a function of x with grid lines in the background of the plot (see help(grid) for guidance), label the axes appropriately and give the plot a title.
  3. Contain comments on each line of the code describing what the code does.

For this exercise, please submit a printout of the Python script file you've written and a printout of the plot the .py file creates.

Exercise 3 - "Decoding" other people's Python code

Typically, you don't start writing Python code from scratch when you want to do calculations and/or real science. Most of the time someone else has already written code that is similar to what you want to do, and your goal is to modify the code to suit your needs. Step one is to figure out what the code is supposed to do before you start tinkering. For this exercise, download the example Python script Mohr_circle_D-P_2D.py and save a copy on the Desktop or in another directory. If you navigate to the file in the Canopy editor using the panel on the left, you can double-click on the file and then click the sideways green triangle at the top of the Canopy window to run the code. Please answer the following questions:

  1. What is the purpose of this code, what does it do, and does it appear to work properly?
  2. What are the commands you recognize in the code?
  3. What are the commands you do not recognize?
  4. Are the comments sufficient for "decoding" this code as a new user? Why or why not?

For this exercise, please submit a printout of your typed responses to the questions above.