What do you get with Print?

Instant access to your digital copy whilst your Print order is Shipped

Paperback book shipped to your preferred address

Redeem a companion digital copy on all Print orders

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

Learning NumPy Array

Chapter 1. Getting Started with NumPy

Let's get started. We will install NumPy and related software on different operating systems and have a look at some simple code that uses NumPy. As mentioned in the Preface, SciPy is closely related to NumPy, so you will see the name SciPy appearing throughout the chapter. At the end of this chapter, you will find pointers on how to find additional information online if you get stuck or are uncertain about the best way to solve problems.

In this chapter, we shall learn the following skills:

Installing Python, SciPy, Matplotlib, IPython, and NumPy on Windows, Linux, and Macintosh
Writing simple NumPy code
Adding arrays
Making use of online resources and help

Installing NumPy, Matplotlib, SciPy, and IPython on Windows

Installing NumPy on Windows is a necessary but, fortunately, straightforward task that we will cover in detail. You only need to download an installer, and a wizard will guide you through the installation steps. It is recommended that Matplotlib, SciPy, and IPython be installed. However, this is not required to enjoy this book. The actions we will take are as follows:

Download a NumPy installer for Windows from the SourceForge website at http://sourceforge.net/projects/numpy/files/.
Choose the appropriate version. In this example, we chose numpy-1.8.0-win32-superpack-python2.7.exe.
Open the EXE installer by double-clicking on it.
Now, we can see a description of NumPy and its features, as shown in the previous screenshot. Click on the Next button.
If you have Python installed, it should automatically be detected. If it is not detected, maybe your path settings are wrong. At the end of this chapter, resources are listed in case you have problems installing NumPy.
In this example, Python 2.7 was found. Click on the Next button if Python is found, otherwise, click on the Cancel button and install Python (NumPy cannot be installed without Python). Click on the Next button. This is the point of no return. Well, kind of, but it is best to make sure that you are installing to the proper directory and so on and so forth. Now the real installation starts. This may take a while.
Install SciPy and Matplotlib with the Enthought distribution at http://www.enthought.com/products/epd.php.
Note
The situation around installers is rapidly evolving. Other alternatives exist in various stages of maturity (see http://www.scipy.org/install.html). It might be necessary to put the msvcp71.dll file in your C:\Windows\system32 directory. You can get it at http://www.dll-files.com/dllindex/dll-files.shtml?msvcp71. A Windows IPython installer is available on the IPython website (see http://ipython.scipy.org/Wiki/IpythonOnWindows).

Installing NumPy, Matplotlib, SciPy, and IPython on Linux

Installing NumPy and related recommended software on Linux depends on the distribution you have. We will discuss how you would install NumPy from the command line although you could probably use graphical installers; it depends on your distribution (distro). The commands to install Matplotlib, SciPy, and IPython are the same—only the package names are different. Installing Matplotlib, SciPy, and IPython is recommended, but optional.

Most Linux distributions have NumPy packages. We will go through the necessary steps for some of the popular Linux distros:

Run the following instructions from the command line for installing NumPy on Red Hat:
```
yum install python-numpy
```
To install NumPy on Mandriva, run the following command-line instruction:
```
urpmi python-numpy
```
To install NumPy on Gentoo, run the following command-line instruction:
```
sudo emerge numpy
```
To install NumPy on Debian or Ubuntu, we need to type the following:
```
sudo apt-get install python-numpy
```

The following table gives an overview of the Linux distributions and corresponding package names for NumPy, SciPy, Matplotlib, and IPython:

Linux distribution	NumPy	SciPy	Matplotlib	IPython
Arch Linux	`python-numpy`	`python-scipy`	`python-matplotlib`	`ipython`
Debian	`python-numpy`	`python-scipy`	`python-matplotlib`	`ipython`
Fedora	`numpy`	`python-scipy`	`python-matplotlib`	`ipython`
Gentoo	`dev-python/numpy`	`scipy`	`matplotlib`	`ipython`
OpenSUSE	`python-numpy, python-numpy-devel`	`python-scipy`	`python-matplotlib`	`ipython`
Slackware	`numpy`	`scipy`	`matplotlib`	`ipython`

Installing NumPy, Matplotlib, and SciPy on Mac OS X

You can install NumPy, Matplotlib, and SciPy on the Mac with a graphical installer or from the command line with a port manager, such as MacPorts or Fink, depending on your preference.

Note

We can get a NumPy installer from the SourceForge website at http://sourceforge.net/projects/numpy/files/. Similar files exist for Matplotlib and SciPy. Just change numpy in the previous URL to scipy or matplotlib. IPython didn't have a GUI installer at the time of writing. Download the appropriate DMG file as shown in the following screenshot; usually the latest one is the best. Another alternative is the SciPy Superpack (https://github.com/fonnesbeck/ScipySuperpack). Whichever option you choose, it is important that updates to the system Python library do not break the software you have already installed; the way to ensure this is to avoid building against the Python library provided by Apple.

We will install NumPy with a GUI installer using the following steps:

Open the DMG file as shown in the following screenshot (in this example, numpy-1.8.0-py2.7-python.org-macosx10.6.dmg):
Double-click on the icon of the opened box, that is, the one whose label ends with .mpkg. We will be presented with the welcome screen of the installer.
Click on the Continue button to go to the Read Me screen, where we will be presented with a short description of NumPy, as shown in the following screenshot:
Click on the Continue button to go to the License screen.
Read the license, click on the Continue button, and then on the Accept button, when prompted to accept the license. Continue through the next screens and click on the Finish button at the end.

Alternatively, we can install NumPy, SciPy, Matplotlib, and IPython through the MacPorts route or with Fink. The following installation steps install all these packages. We only need NumPy for the tutorials in this book, so please omit the packages you are not interested in.

To install with MacPorts, type the following command:

```
sudo port install py-numpy py-scipy py-matplotlib py-ipython
```

Fink also has packages for NumPy: scipy-core-py24, scipy-core-py25, and scipy-core-py26. The SciPy packages are: scipy-py24, scipy-py25, and scipy-py26. We can install NumPy and the other recommended packages that we will be using in this book for Python 2.6 with the following command:
```
fink install scipy-core-py26 scipy-py26 matplotlib-py26
```
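
Whichever installation route you took, a quick way to confirm the result is to try importing each package and printing its version. The following is a minimal sketch rather than part of the original instructions: the helper name installed_version is hypothetical, and it assumes that each package exposes a __version__ attribute, as NumPy, SciPy, Matplotlib, and IPython do.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Fall back to "unknown" for modules that do not define __version__.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("numpy", "scipy", "matplotlib", "IPython"):
        version = installed_version(name)
        print(name, version if version is not None else "not installed")
```

Only NumPy is required for the rest of this chapter, so a "not installed" line for the other packages is not a problem.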

NumPy arrays

After going through the installation of NumPy, it's time to have a look at NumPy arrays. NumPy arrays are more efficient than Python lists when it comes to numerical operations. NumPy arrays are in fact specialized objects with extensive optimizations. NumPy code requires fewer explicit loops than the equivalent Python code. This is based on vectorization.

If we go back to high school mathematics, then we should remember the concepts of scalars and vectors. The number 2 for instance is a scalar. When we add 2 and 2, we are performing scalar addition. We can form a vector out of a group of scalars. In Python programming terms, we will then have a one-dimensional array. This concept can of course be extended to higher dimensions. Performing an operation on two arrays such as addition can be reduced to a group of scalar operations. In straight Python, we will do that with loops going through each element in the first array and adding it to the corresponding element in the second array. However, this is more verbose than the way it is done in mathematics. In mathematics, we treat the addition of two vectors as a single operation. That's the way NumPy arrays do it too and there are certain optimizations using low-level C routines, which make these basic operations more efficient. We will cover NumPy arrays in more detail in the next chapter.
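
To make the contrast concrete, here is a small sketch of the two styles described above. The variable names and the sample values are chosen only for illustration.

```python
import numpy as np

first = [1, 2, 3]
second = [10, 20, 30]

# Plain Python: one scalar addition per pair of elements.
loop_sum = []
for x, y in zip(first, second):
    loop_sum.append(x + y)

# NumPy: the whole addition is written as a single array operation.
array_sum = np.array(first) + np.array(second)

print(loop_sum)   # [11, 22, 33]
print(array_sum)  # [11 22 33]
```

Note that applying + directly to two Python lists concatenates them instead of adding them element by element, which is why the plain Python version needs the loop.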

Adding arrays

Imagine that we want to add two vectors called a and b. A vector is used here in the mathematical sense, which means a one-dimensional array. We will learn in Chapter 4, Simple Predictive Analytics with NumPy, about specialized NumPy arrays that represent matrices. The vector a holds the squares of the integers 0 to n - 1. For instance, if n is equal to 3, then a contains 0, 1, and 4. The vector b holds the cubes of the integers 0 to n - 1, so if n is equal to 3, then b contains 0, 1, and 8. How would you do that using plain Python? After we come up with a solution, we will compare it with the NumPy equivalent.

The following function solves the vector addition problem using pure Python without NumPy:

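```
def pythonsum(n):
    # list() is needed on Python 3, where range() is not a mutable list.
    a = list(range(n))
    b = list(range(n))
    c = []

    # Fill a with squares and b with cubes, adding them element by element.
    for i in range(len(a)):
        a[i] = i ** 2
        b[i] = i ** 3
        c.append(a[i] + b[i])

    return c
```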

The following is a function that achieves the same with NumPy:

```
import numpy

def numpysum(n):
    # Each line operates on a whole array at once; no explicit loop is needed.
    a = numpy.arange(n) ** 2
    b = numpy.arange(n) ** 3
    c = a + b
    return c
```

Notice that numpysum() does not need a for loop. Also, we used the arange function from NumPy, which creates a NumPy array for us with the integers 0 to n - 1. The arange function lives in the numpy module, which must be imported first; that is why it is prefixed with numpy.
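
As a quick check of what arange produces, the following sketch reuses the n = 3 example from the start of this section. The printed values in the comments follow directly from the definitions of a and b.

```python
import numpy

n = 3
a = numpy.arange(n) ** 2   # squares of 0, 1, 2
b = numpy.arange(n) ** 3   # cubes of 0, 1, 2

print(numpy.arange(n))  # [0 1 2]
print(a)                # [0 1 4]
print(b)                # [0 1 8]
print(a + b)            # [ 0  2 12]
```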

Now comes the fun part. Remember that it is mentioned in the Preface that NumPy is faster when it comes to array operations. How much faster is NumPy, though? The following program will show us by measuring the elapsed time, in microseconds, for the numpysum and pythonsum functions. It also prints the last two elements of the vector sum. Let's check that we get the same answers when using Python and NumPy:

```
#!/usr/bin/env python
"""
This program demonstrates vector addition the Python way.
Run from the command line as follows

  python vectorsum.py n

where n is an integer that specifies the size of the vectors.

The first vector to be added contains the squares of 0 up to, but not including, n.
The second vector contains the cubes of 0 up to, but not including, n.
The program prints the last 2 elements of the sum and the elapsed time.
"""
# Makes print() behave the same way under Python 2 and Python 3.
from __future__ import print_function

import sys
from datetime import datetime

import numpy as np


def numpysum(n):
    # Vectorized: each expression operates on the whole array at once.
    a = np.arange(n) ** 2
    b = np.arange(n) ** 3
    c = a + b

    return c


def pythonsum(n):
    # list() is needed on Python 3, where range() is not a mutable list.
    a = list(range(n))
    b = list(range(n))
    c = []

    for i in range(len(a)):
        a[i] = i ** 2
        b[i] = i ** 3
        c.append(a[i] + b[i])

    return c


size = int(sys.argv[1])

start = datetime.now()
c = pythonsum(size)
delta = datetime.now() - start
print("The last 2 elements of the sum", c[-2:])
print("PythonSum elapsed time in microseconds", delta.microseconds)

start = datetime.now()
c = numpysum(size)
delta = datetime.now() - start
print("The last 2 elements of the sum", c[-2:])
print("NumPySum elapsed time in microseconds", delta.microseconds)
```

The output of the program for 1000, 2000, and 4000 vector elements is as follows:

```
$ python vectorsum.py 1000
The last 2 elements of the sum [995007996, 998001000]
PythonSum elapsed time in microseconds 707
The last 2 elements of the sum [995007996 998001000]
NumPySum elapsed time in microseconds 171
$ python vectorsum.py 2000
The last 2 elements of the sum [7980015996, 7992002000]
PythonSum elapsed time in microseconds 1420
The last 2 elements of the sum [7980015996 7992002000]
NumPySum elapsed time in microseconds 168
$ python vectorsum.py 4000
The last 2 elements of the sum [63920031996, 63968004000]
PythonSum elapsed time in microseconds 2829
The last 2 elements of the sum [63920031996 63968004000]
NumPySum elapsed time in microseconds 274
```
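
One caveat about this measurement: delta.microseconds holds only the microseconds component of the elapsed time, so it under-reports any run that takes longer than one second, and a single run is noisy. A possible alternative, sketched here under the assumption that we are timing the same two functions, is the standard timeit module with the best of several repetitions. The function bodies are restated in condensed form so that the sketch is self-contained; the helper name best_time and the repetition counts are arbitrary choices made for illustration.

```python
import timeit

import numpy as np


def numpysum(n):
    return np.arange(n) ** 2 + np.arange(n) ** 3


def pythonsum(n):
    return [i ** 2 + i ** 3 for i in range(n)]


def best_time(func, n, repeat=5, number=100):
    """Return the best average time per call, in seconds."""
    # timeit.repeat returns one total time per repetition of `number` calls.
    timings = timeit.repeat(lambda: func(n), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    for size in (1000, 2000, 4000):
        print(size, best_time(pythonsum, size), best_time(numpysum, size))
```

Taking the minimum rather than the mean reduces the influence of other processes that happen to run during the measurement.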

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Clearly, NumPy is much faster than the equivalent normal Python code: in these runs, it is roughly 4 times faster for 1000 elements and roughly 10 times faster for 4000 elements. One thing is certain: we get the same results whether we are using NumPy or not. However, the result that is printed differs in representation. Notice that the result from the numpysum function does not have any commas. How come? Obviously we are not dealing with a Python list, but with a NumPy array. It was mentioned in the Preface that NumPy arrays are specialized data structures for numerical data. We will learn more about NumPy arrays in Chapter 2, NumPy Basics.
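
The difference between the two results can also be checked in code. The following sketch, which uses a small n and variable names chosen only for illustration, shows that the values agree even though the containers differ.

```python
import numpy as np

n = 5
list_result = [i ** 2 + i ** 3 for i in range(n)]
array_result = np.arange(n) ** 2 + np.arange(n) ** 3

print(type(list_result).__name__)   # list
print(type(array_result).__name__)  # ndarray

# Same values, different container: convert the array to a list to compare.
print(array_result.tolist() == list_result)  # True
```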

What you will learn

Install NumPy and discover its arrays and features

Perform data analysis and complex array operations with NumPy

Analyze time series and perform signal processing

Understand NumPy modules and explore the scientific Python ecosystem

Improve the performance of calculations with clean and efficient NumPy code

Analyze large data sets using statistical functions and execute complex linear algebra and mathematical computations

Edward Grefenstette Aug 07, 2014

The following review was produced after being sent a free copy of the book by Packt Publishing. I have endeavoured to be objective and state my view of the book, unaltered by this.

"Learning NumPy Array" by Ivan Idris is an excellent book which covers a range of use cases for the Python NumPy library across different aspects of scientific computing. It gives a short introduction to NumPy classes and core functions, discusses data analysis with NumPy data structures, and then using the pandas library (built on top of NumPy, amongst other libraries), provides a succinct overview of signal processing techniques with NumPy, followed by a short tutorial of profiling and debugging NumPy code with iPython, UnitTest, and Nose. The book ends with a helpful chapter on other tools scientific computing programmers may wish to consider alongside NumPy, most importantly cython and sklearn.

While the book is entitled "Learning NumPy Array", it is not exactly aimed at a beginner audience. The reader is assumed to have a decent knowledge of linear algebra, statistics, and some cursory experience with similar statistics or matrix libraries/languages such as Matlab or R. While it does introduce NumPy basics, the reader should expect to have access to the NumPy documentation while reading the book to get the most out of it. Chapters 3 through 5 are quite dense, and readers without experience with the relevant areas (statistical analysis for chapters 3-4 and signal processing for chapter 5) will need to spend time and perhaps read third-party sources to get the most out of the material. All this is to say, this is a fairly technical book, but the examples are many, are well spelled-out, and coherently explained. Readers with a bit of experience in the domains it covers, or readers who are willing to put in the extra effort to read around the topics, will get a lot out of this book, and be able to implement solutions to similar problems quickly.

In addition to the core material covered in chapters 3-5, I particularly enjoyed the fairly clear tutorials for profiling, debugging and testing in chapter six. Some of this material could easily have been excluded from the book in favour of expanding the explanations and giving some background in chapters 3-5, but I found the explanations to be well written and helpful, so the inclusion in this book is appreciated.

Furthermore, the final chapter is a nice starting point for newcomers to the python scientific computing world, as it presents some helpful pointers (and use examples) to other resources available. Even as a researcher who has used these tools before once or twice, at least, this chapter was a good read and reminded me of how easily these libraries play with each other.

Overall, this is a good read. It will take some small effort for most readers to get through the more technical sections of this book, but there is a lot to get out of reading it. While the same material can be found in several tutorials scattered around the web, it is good to see that someone took the effort to distil all the material into one volume, provide excellent example code, and enough explanation to feed intuitions as to how to best apply the knowledge presented in it.

Amazon Verified review

alan1955 Sep 17, 2014

This is an excellent book for learning NumPy. It is for someone who has experience with python and numerical coding. What I liked was chapter two where the book covers the basics of NumPy to get you quickly up and running. All in one chapter. So if you are already trying to do some basic things with NumPy, it will save you a lot of time. The rest of the book is filled with examples. The examples illustrate things you can do with NumPy, and are arranged topically. The topics cover basic data analysis, predictive analysis, smoothing, moving averages, and sifting. The other plus is that it introduces Scipy, the scientific python programming package. If you are not familiar with it, you will get to see what it can do in a few of the examples. It is also a short book. Being short it assumes you have a working knowledge of Python and some numerical experience. Too many computer books are so long, you can never get through them. Because of this you can quickly get up to speed. Also the code for examples can be downloaded, which is very helpful.

Roberto Avilés Jul 30, 2014

This book is a 7 chapters, +140 page hands-on introduction to the power of Python’s Library, NumPy.

In Chapter 1, we learn to install Python, SciPy, Matplotlib, IPython and NumPy on Windows, Linux and Macintosh machines and start writing NumPy code.

Chapter 2 reviews the basics on NumPy: Data Types, Array Types, Type Conversions, Creating, Indexing, Slicing and Manipulating array Shapes. Advantages of NumPy arrays: we know items in the array are of the same type (example, dtype!) plus, NumPy arrays can perform vectorized operations on the whole array: better than using lists; NumPy uses an optimized C API for those operations which make them especially fast. We learn how to transform a multidimensional array into a one dimensional array, how to stack, split, convert, copy and view them by playing with images, doing tricks with Sudoku and audio arrays.

In Chapter 3 we are ready to learn Basic Data Analysis by working on a genuine (and quite completely) data set by looking for evidence of planetary heating.

Chapter 4 is about Predictive Analysis and the use of the ‘pandas’ library. Pandas have plotting subroutines and in this chapter data of previous chapter is re-examined and extended to correlating weather and stocks!

Now, in Chapter 5 we focus on ‘Signal Processing Techniques’ and analyze time series. The example data set will be sunspot data which we sift and plot to show the extremes of Sun activity. Tools as ‘moving averages’ and smoothing functions are introduced and we are ready to do a forecasting using an ARMA (autoregressive moving average) model. Then we learn how to design and use a filter and the “cointegration”, a better metric to define the relatedness of two time series.

In Chapter 6 the book moves into Profiling, Debugging and Testing. NumPy adds the numpy.testing package (and its utility functions) to help NumPy code the unit testing. Later we met Nose, a Python framework that eases unit testing by organizing it.

Chapter 7 relates to the Scientific Python Ecosystem. Scipy is built on NumPy and adds functionality as numerical integration, interpolation, optimization, statistics, clustering with scikit-learn, the detection of corners (all with examples), the use of Cython with NumPy and compares NumPy to Blaze (a collection of libraries being built towards the goal of generalizing NumPy ‘s data model and working on distributed data.)

This book is a complete hands-on guide on the use of NumPy, through worked examples and ideas, a must have for those interested in Data Analysis, Forecasting and Signal Processing Techniques.

Sujit Pal Jun 21, 2014

Numpy is a fast matrix library that is at the core of scientific computing toolkits in Python - whether you are doing statistics, machine learning or signal processing (to name just a few possibilities), chances are you are using Numpy under the covers. Because Numpy is optimized for speed and because you are often dealing with large datasets, knowing the right function to call can mean the difference between run times of a few seconds or a few hours. This book can help you write cleaner and faster Numpy code.

After a quick introduction to Numpy in Chapter 1, Chapter 2 takes the reader through a quick tour of Numpy functions, describing the major ones, including differently named functions which do the same thing, one updating in place and the other creating a copy. Chapter 3-5 on is all case studies, mostly about Time Series Analysis and Signal Processing, perhaps because they are good vehicles for demonstrating array handling techniques, but a nice side effect is that it gives the reader a quick introduction to these subjects as well. Chapter 6 describes Numpy's unit testing functionality, and also covers Python unit testing frameworks (unittest package as well as nose). It also describes how to profile and debug Python code from the IPython shell. Chapter 7 quickly covers various other players in the scientific Python ecosystem, and describes how to optimize Python code by rewriting them to Cython.

I found the book immensely informative and relatively easy to read. It helps to actually work through the code in a Python shell as you are reading the book, in some cases the author condenses multiple steps into a single one, obviously expecting the reader to follow along, so being able to break the steps into smaller ones can help in understanding. Using the shell also gives you access to Python's help system, so you can read about new functions as you encounter them. One minor nit - it would have been more convenient if the data used for the analysis could have been packaged with the code for the book (for people reading the book offline) - but perhaps there are copyright restrictions on such distribution.

DISCLAIMER: I didn't purchase this book, a PackT representative in my social network was offering free copies to review, and I asked for one because (a) I use Numpy and wanted to learn more and (b) as a Numpy user, I felt qualified to review the book objectively.

Daniel R. Vallejo Sep 28, 2014

great

Learning NumPy Array: Supercharge your scientific Python computations by understanding how to use the NumPy library effectively
