If your main goal in using SciPy is to do data exploration and analysis or scientific computations, Jupyter provides an ideal interactive environment. Using Jupyter, we can integrate computations, graphs, formatted text, and even more sophisticated media. Essentially, anything that can be inserted in a web page can be handled by Jupyter.
Running SciPy in Jupyter
Getting ready
This recipe assumes that you have a working installation of IPython and Jupyter. If you followed one of the recipes in this chapter to set up Anaconda or a standalone installation of the SciPy stack, you have all you need.
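As a quick sanity check before starting, the following sketch reports whether the packages used in this recipe can be found by the current Python interpreter. The package list and the helper name check_packages are assumptions made for illustration; they are not part of any installer. Adjust the list to match your setup (for example, a JupyterLab-only installation may not include the notebook package).

```python
import importlib.util

# Assumed list of packages, based on the imports used later in this recipe.
REQUIRED = ["numpy", "scipy", "matplotlib", "IPython", "notebook"]


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages(REQUIRED)
for name, found in status.items():
    print(f"{name:12s} {'found' if found else 'MISSING'}")
```

Any package reported as MISSING would need to be installed before continuing.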
How to do it...
The following steps demonstrate how to start Jupyter and create a new notebook:
- Open a terminal window in the directory where you want your notebook files to be stored.
- Start Jupyter by running the following command in the terminal window:
jupyter notebook
- After a few moments, the notebook will open in your web browser. The notebook starting page is known as the dashboard, and is shown in the following screenshot:
- To create a new notebook, click the New button at the top right and select Python 3 from the menu. The following screenshot shows a newly created notebook:
Code in a notebook is entered in an execution cell, which is surrounded by a green border when active. To get a taste of what working with the Jupyter notebook feels like, click on an execution cell in the notebook and enter the following code:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
With the mouse cursor still in the same execution cell, press Shift + Enter to run the cell. The first line of the preceding code is the magic command %matplotlib inline, which tells Jupyter to display plots in the notebook itself. The next two lines import NumPy and pyplot (the plotting interface provided by matplotlib).
After running the cell, the cursor automatically moves to the next cell. Enter the following code in this cell:
from scipy.stats import norm, binom
n, p = 100, 0.5
mean = n * p
sdev = np.sqrt(n * p * (1-p))
sample = np.array([binom.rvs(n, p) for _ in range(1000)])
xvalues = np.linspace(mean-3*sdev, mean+3*sdev, 200)
yvalues = norm.pdf(xvalues, loc=mean, scale=sdev)
hist = plt.hist(sample, density=True,
                color='red', lw=3, ls='dotted', alpha=0.5)
plt.plot(xvalues, yvalues, color='blue', lw=2)
plt.title('Coin toss simulation, $n={}$, $p={:5.2f}$'.format(n, p))
plt.xlabel('Number of heads')
plt.ylabel('Frequency')
None
This code simulates 100 tosses of a fair coin and counts the number of heads. The simulation is repeated 1,000 times and the resulting head counts are stored in the array sample. A histogram of the results is then plotted together with the normal density that has the same mean and standard deviation as the binomial distribution; by the central limit theorem, this normal curve is a good approximation when n is large. Pressing Shift + Enter to run the cell displays the histogram of the simulation with the theoretical normal approximation overlaid.
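To see the approximation numerically as well as graphically, the following sketch compares the sample mean and standard deviation with the theoretical values n*p and sqrt(n*p*(1-p)). It is an illustration under stated assumptions, not part of the recipe's own code: it draws the sample with NumPy's random generator instead of binom.rvs, and the seed 12345 is arbitrary, chosen only to make the run repeatable.

```python
import numpy as np

n, p = 100, 0.5
mean = n * p                      # theoretical mean of the number of heads
sdev = np.sqrt(n * p * (1 - p))   # theoretical standard deviation

# Assumption: a seeded NumPy generator stands in for binom.rvs here.
rng = np.random.default_rng(12345)
sample = rng.binomial(n, p, size=1000)   # 1,000 repetitions in a single call

# Fraction of simulated head counts within one standard deviation of the mean.
within_one_sd = np.mean(np.abs(sample - mean) <= sdev)

print('theoretical mean = {:.2f}, sample mean = {:.2f}'.format(mean, sample.mean()))
print('theoretical sdev = {:.2f}, sample sdev = {:.2f}'.format(sdev, sample.std(ddof=1)))
print('fraction within one sdev of the mean = {:.3f}'.format(within_one_sd))
```

The sample statistics should land close to the theoretical values, and the agreement generally improves as the number of repetitions grows.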