The objectives of this lecture are to:

- Introduce the fundamental package for scientific computing with Python --
`NumPy`

. - Creating
`NumPy`

n-dimensional arrays. - Binary and unary operations on arrays.

*Much of the lecture is based on this tutorial: http://wiki.scipy.org/Tentative_NumPy_Tutorial*

The NumPy library provides the necessary functionality to use Python as a powerful scientific computing language. From NumPy's website, it provides,

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, discrete Fourier transforms, and random number capabilities

In this lecture we will learn about items 1-2 and how to apply them to manipulation of numeric n-dimensional arrays. Note that these are *not* matrices, but instead a more general type of data that contains many scalar values which may be "addressed", as in a matrix.

We may form n-dimensional arrays, which have type `ndarray`

, for any type of numeric data (`int`

, `float`

, `complex`

). By default, NumPy will assume that the datatype is `float`

unless otherwise specified.

Let's create a 2-dimensional array by importing `NumPy`

and using its equivalent function to Python's built-in `range()`

function, `arange()`

,

In [1]:

```
# import all of NumPy's objects into the interpeter-level namespace
from numpy import *
# create an array of float values from 0->14
a = arange(15)
print(a)
# "reshape" the array to be 2-dimensional, 3 rows and 5 columns, the number of elements are conserved!
a = a.reshape(3, 5)
print(a)
print(a[0,2]) # "index" into the array to display a single item
```

Does the method in which `NumPy`

displays the array make sense? It looks like a list of lists where the first index (row) indexes into the top-level list and the second indexes into the sub-list. It should be clear how this could be generalized into higher-dimensional arrays.

Before we go into more detail, let's explore the attributes of the `ndarray`

object,

In [2]:

```
type(a)
```

Out[2]:

In [3]:

```
# the array "shape" is a tuple which specifies the size of each dimension
a.shape
```

Out[3]:

In [4]:

```
# the array "ndim" is an integer which specifies the dimensionality of the array
a.ndim
```

Out[4]:

In [5]:

```
# the array "dtype" is an data-type object which specifies type of data which the array stores
a.dtype
```

Out[5]:

In [6]:

```
# the array "size" is an integer which specifies the total number of items in the array
a.size
```

Out[6]:

`ndarray`

.

`NumPy`

n-dimensional arrays¶There are several approaches to creating an `ndarray`

, the most basic of which is using the `array()`

function,

In [7]:

```
# a 1x4 array
a = array([1, 2, 3, 4])
print(a)
```

As you can see, we are using the same list syntax as we saw in the previous example. A single list of values will create a one-dimensional array of those values.

To create higher-dimensional arrays, we can nest lists,

In [8]:

```
# a 3x4 array
a = array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(a)
```

`array()`

function takes as input nest *sequences* to create arrays of the same dimension of the degree of nesting. In fact, we do not need to use lists, we can use tuples as well

In [9]:

```
b = array( [ (1.5,2,3), (4,5,6) ] )
b
```

Out[9]:

Note what happens when I do not `print()`

the array, the object is still displayed in an understandable format. In this case the interpreter converted all integers into floats,

The type of the `array`

can also be explicitly specified at creation time:

In [10]:

```
c = array( [ [1,2], [3,4] ], dtype=complex )
c
```

Out[10]:

The function `zeros()`

creates an array full of zeros, the function `ones()`

creates an array full of ones, and the function `empty()`

creates an array whose initial content is random and depends on the previous state of the memory that was allocated. Why might you want to use `empty()`

instead of `ones()`

or `zeros()`

?

By default, the `dtype`

of the created array is `float64`

.

In [11]:

```
# note that zeros() and ones() input argument is a tuple which specifies the shape of the output
# array
# create a 3x4 array of zeros using the default datatype (float)
zeros( (3,4) )
```

Out[11]:

In [ ]:

```
# create a 2x3x4 array of ones using the int16 datatype (16-bit integer)
ones( (3,3,4), dtype=int16 )
```

In [12]:

```
# create a 2x3 array that is uninitialized
empty( (2,3) )
```

Out[12]:

*sequences* of numbers, NumPy provides a function analogous to `range()`

that returns arrays instead of lists,

In [13]:

```
# create an array of the sequence of numbers starting with 10 and ending with 30
# (but not including) incremented by 5.
arange( 10, 30, 5 )
```

Out[13]:

In [14]:

```
# what happens when one or more of the arguments is a float?
arange( 10, 30, 5.)
```

Out[14]:

In [ ]:

```
# thus a float may be used as any argument!
arange( 0, 2, 0.3 )
```

`arange()`

is used with floating point arguments, it is not always possible to predict the length of the resulting array due to floating point precision. For this reason, NumPy provides `linspace()`

and `logspace()`

that instead require the number of elements instead of the increment,

In [ ]:

```
# Unlike arange(), linspace() outputs arrays with datatype float by default which is more
# natural for scientific computing
linspace( 0, 2, 10)
```

In [ ]:

```
# linspace() is very useful for plotting functions, as we will see in the last lecture
x = linspace( 0, 2*pi, 10 )
sin(x)
```

In [ ]:

```
# logspace() outputs a logarithmic scale, by default the base is 10, but any value can be
# used, this functionality will be very useful later in your coursework!
# the first two arguments are the power to which the base is raised, not the actual value
# in the case, the output will be a logarithmic scale sequence starting at 10**0 and ending
# (but not including) 10**3 with 10 points.
logspace(0, 3, 10)
```

Printing small arrays is a relatively simple task, even if they are high dimensional. By default, NumPy decomposes arrays of dimension more than 2 into a two-dimensional representation.

In [ ]:

```
# a 2D array example, arange only outputs 1D arrays so we must reshape()
b = arange(12)
b = b.reshape(4, 3)
print(b)
```

In [ ]:

```
# a 3D array example using a more compact way to immediately reshape
# the 1D array returned by arange()
c = arange(24).reshape(2,3,4)
print(c)
```

In [ ]:

```
# a large 1D array
print(arange(10000))
```

In [ ]:

```
# a large 2D array
print(arange(10000).reshape(100,100))
```

Now that we have a basic understanding of creating NumPy arrays, we can now start using arithmetic operators on them. NumPy applies arithmetic operators on arrays *elementwise* and (typically) creates a new array initialized with the resulting values.

In [ ]:

```
a = array( [20,30,40,50] )
b = arange( 4 )
print(a, b)
# this operation will be successful even though a and b have different shapes
c = a - b
c
```

In [ ]:

```
# raising an array to a power results in a new array with each element raised to a power
a**2
```

In [ ]:

```
# passing an array to the sin() function returns a new array of the values of the sin() applied
# to each element of a
sin(a)
```

In [ ]:

```
# here is an interesting case, now the comparison is applied to each element of `a` and a new
# array is created with the results of these comparisons!
a < 35
```

In [ ]:

```
A = array( [[1,1],
[0,1]] )
B = array( [[2,-1],
[3,4]] )
print(A + B)
print(A - B)
print(A * B)
print(A / B)
```

In [ ]:

```
# A and B have different shapes, so NumPy tries to find sequences of sub-arrays of the smaller
# array in the larger array.
A = array( [[1, 1, 1],
[1, 0, 1]] )
B = array( [2, 0, 1] )
# `A` is the larger array, thus NumPy determines if there are sub-arrays in `A` with the same
# shape as `B` and then applies the subtraction to these subarrays!
# The array resulting from these types of arithmetic operations has the shape of the *larger*
# array
print(A - B)
print(B - A)
```

Broadcasting is not always successful,

In [ ]:

```
A = array( [[1, 1, 1],
[1, 0, 1]] )
B = array( [2, 0] )
A - B
```

Can you formulate the rule that NumPy uses to determine if arithmetic operations broadcast or not? Use the error from the previous evaluation to help your thought process. In the successful evaluation the shapes of the arrays were,

`A.shape = (2,3)`

`B.shape = (3,)`

and for the unsuccessful evaluation,

`A.shape = (2,3)`

`B.shape = (2,)`

What is the difference between the two sets of arrays? In the first case, NumPy evaluates the arithmetic operation elementwise where the element is a 1D array of size 3! Thus NumPy "sees" two elements of size B in array A.

In the second case, NumPy cannot find sub-arrays of A of the same shape as B. Thus you might have been able to deduct the general rule that NumPy uses for broadcasting arithmetic operations:

**1)** Compare the dimensions of A and B,

`A.dim = a`

`B.dim = b`

**2)** Assuming a > b, compare the last b values of the shapes of A and B, if they are equal the arithmetic operation broadcasts, else it does not.

Let's try a few examples,

In [ ]:

```
# An arithmetic operation should broadcast for arrays of shape (a,b,c) and (b,c)
A = arange(12).reshape(2,3,2)
B = arange(6).reshape(3,2)
print(A)
print("----------")
print(B)
```

In [ ]:

```
print(A + B)
print("-"*8) # a more compact way of printing a divider!
print(A * B)
```

Many unary operations, such as computing the sum of all the elements in the array, are implemented as methods of the `ndarray`

class,

In [ ]:

```
A = arange(6).reshape((2,3))
A
```

In [ ]:

```
A.sum()
```

In [ ]:

```
A.min()
```

In [ ]:

```
A.max()
```

`axis`

parameter you may apply the operation over each a sub-array element resulting from treating the values over each element in the axis specified,

In [ ]:

```
B = arange(12).reshape(3,4)
B
```

In [ ]:

```
# sum over the first axis (3, 4)->(4,)
B.sum(axis=0)
```

In [ ]:

```
# sum over the second axis (3, 4)->(3,)
B.sum(axis=1)
```

In [ ]:

```
B = arange(3)
print(B)
exp(B)
```

In [ ]:

```
sqrt(B)
```