Overview of the basic functions of NumPy
In short, as the name suggests, NumPy is a Python module brimming with useful functions for dealing with numbers. The Num in the first part of the name NumPy stands for numbers, and Py stands for Python. There you have it. If you have numbers and you are in Python, you know what you need to import. That is correct; you need to import NumPy, simple as that. See the following screenshot:
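In text form, the import being described can be sketched as follows. This is a minimal illustration rather than a reproduction of the screenshot; the version printout is only added here to confirm that the alias works.

```python
import numpy as np  # np is the conventional alias for NumPy

# Everything in the module is now reachable through the alias.
print(np.__version__)
```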
As you can see, we have given the alias np to the module after importing it. You can actually assign any alias that you wish and your code would function; however, I suggest sticking with np. I have two compelling reasons for doing so:
- First, everyone else uses this alias, so if you share your code with others, they know what you are doing throughout your project.
- Second, a lot of the time, you end up using code written by others in your projects, so consistency will make your job easier. You will see that most of the famous modules also have a famous alias, for example, pd for Pandas, and plt for matplotlib.pyplot.
Good practice advice
NumPy can handle all types of mathematical and statistical calculations for a collection of numbers, such as mean, median, standard deviation (std), and variance (var). If you have something else in mind and are not sure whether NumPy has it, I suggest googling it before trying to write your own. If it involves numbers, chances are NumPy has it.
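As a rough illustration of the calculations just mentioned, the sketch below applies four of them to a small list. The list nums and its values are made up for this example.

```python
import numpy as np

nums = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data

print(np.mean(nums))    # arithmetic mean
print(np.median(nums))  # middle value of the sorted data
print(np.std(nums))     # standard deviation (population form by default)
print(np.var(nums))     # variance (population form by default)
```

Note that np.std() and np.var() divide by the number of items by default; passing ddof=1 gives the sample versions instead.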
The following screenshot shows the mean, for example, applied to a collection of numbers:
As shown in Figure 1.4, there are two ways to do this. The first one, portrayed in the top chunk, uses np.mean(). This function belongs to the NumPy module and can be accessed directly. The great aspect of this approach is that, most of the time, you do not need to change your data type before NumPy honors your request. You can input lists, Pandas series, or DataFrames. You can see in the top chunk that np.mean() easily calculated the mean of lst_nums, which is of the list type. The second way, as shown in the bottom chunk, is to first use np.array() to transform the list into a NumPy array and then use the .mean() function, which is available on any NumPy array. Before continuing with this chapter, take a moment and use the Python type() function to see the different types of lst_nums and ary_nums, as shown in the following screenshot:
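The two approaches and the type check can be sketched as follows. The names lst_nums and ary_nums come from the text, while the values in the list and the two result variables are assumptions made for this illustration.

```python
import numpy as np

lst_nums = [1, 2, 3, 4, 5]  # hypothetical values, not the screenshot's data

# First way: pass the list straight to np.mean().
mean_from_list = np.mean(lst_nums)

# Second way: convert to a NumPy array, then call its .mean() function.
ary_nums = np.array(lst_nums)
mean_from_array = ary_nums.mean()

print(mean_from_list, mean_from_array)
print(type(lst_nums))  # <class 'list'>
print(type(ary_nums))  # <class 'numpy.ndarray'>
```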
Next, we will learn about four NumPy functions: np.arange(), np.zeros(), np.ones(), and np.linspace().
The np.arange() function
This function, as shown in the following screenshot, produces a sequence of numbers with equal increments. You can see in the figure that by changing the inputs (the start, the end, and the increment), you can get the function to output many different sequences of numbers that are required for your analytic purposes:
Pay attention to the three chunks of code in the preceding figure to see the default behavior of np.arange() when only one or two inputs are passed.
- When only one input is passed, as in the first chunk of code, np.arange() assumes that you want a sequence of numbers starting at zero and stopping just before the input number, with increments of one.
- When two inputs are passed, as in the second chunk of code, the function assumes that you want a sequence of numbers starting at the first input and stopping just before the second input, with increments of one.
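A short sketch of these defaults follows. The specific numbers and variable names are chosen for illustration and are not taken from the figure.

```python
import numpy as np

one_input = np.arange(5)           # starts at 0, stops before 5, step 1
two_inputs = np.arange(2, 7)       # starts at 2, stops before 7, step 1
three_inputs = np.arange(0, 10, 2) # start, stop, and step all given

print(one_input)     # [0 1 2 3 4]
print(two_inputs)    # [2 3 4 5 6]
print(three_inputs)  # [0 2 4 6 8]
```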
The np.zeros() and np.ones() functions
np.ones() creates a NumPy array filled with ones, and np.zeros() does the same thing with zeros. Unlike np.arange(), which takes the input to calculate what needs to be included in the output array, np.zeros() and np.ones() take the input to structure the output array. For instance, the top chunk of the following screenshot specifies the request for an array with four rows and five columns filled with zeros. As you can see in the bottom chunk, if you only pass in one number, the output array will have only one dimension:
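The behavior described here can be sketched as follows. The variable names and the length of the one-dimensional array are assumptions made for this illustration.

```python
import numpy as np

zeros_2d = np.zeros([4, 5])  # four rows, five columns, all zeros
ones_1d = np.ones(3)         # a single number gives a one-dimensional array

print(zeros_2d)
print(zeros_2d.shape)  # (4, 5)
print(ones_1d)         # [1. 1. 1.]
```

Both functions produce floating-point values by default, which is why the ones are printed as 1. rather than 1.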
These two functions are excellent resources for creating a placeholder to keep the results of calculations in a loop. For instance, review the following example and observe how this function facilitated the coding.
Example – Using a placeholder to accommodate analytics
Given the grade data of 10 students, write code using NumPy that calculates and reports each student's grade average.
The data of the 10 students and the solution to this example are provided in the following screenshots. Please review and try this code before progressing:
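Since the screenshots are not reproduced here, the following is a sketch of one possible solution under stated assumptions. The student names, the three subjects, all grade values, and the variable names other than Names, i, and name are invented for this illustration and are not the original data.

```python
import numpy as np

# Hypothetical data: ten students with three grades each.
Names = ['Ava', 'Ben', 'Cara', 'Dev', 'Eli', 'Fay', 'Gus', 'Hana', 'Ian', 'Jo']
Math_grades = [80, 90, 70, 60, 85, 95, 75, 65, 88, 92]
Science_grades = [70, 85, 90, 75, 80, 85, 95, 70, 82, 78]
History_grades = [90, 95, 80, 75, 75, 90, 70, 75, 70, 100]

# Placeholder with one slot per student; the loop below fills it in.
Average_grades = np.zeros(len(Names))
print(Average_grades)

for i, name in enumerate(Names):
    Average_grades[i] = np.mean(
        [Math_grades[i], Science_grades[i], History_grades[i]]
    )
print(Average_grades)

# better-looking report
for i, name in enumerate(Names):
    print('Average for {}: {:.2f}'.format(name, Average_grades[i]))
```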
Now that you've had a chance to engage with this example, allow me to highlight a few matters about the provided solution presented in Figure 1.9:
- Notice how np.zeros() facilitated the solution by streamlining it significantly. After the loop has run, all of the average grades are calculated and saved already. Compare the printed values before and after the for loop.
- The enumerate() function in the for loop might look strange to you. What it does is help the code to have both an index (i) and the item (name) from the collection (Names).
- The .format() function is an invaluable method of any string variable. If there are any placeholders such as {} in the string, this function will replace them, in order, with the values passed to it.
- # better-looking report is a comment in the second chunk of the code. Comments are ignored when the code runs; their only purpose is to communicate something to whoever reads the source code.
The np.linspace() function
This function returns evenly spaced numbers over a specified interval. The function takes three inputs. The first two inputs specify the interval, and the third shows the number of elements that the output will have. For example, refer to the following screenshot:
In the first code block, 19 numbers are evenly spaced between 0 and 1, altogether creating an array with 21 numbers once the two endpoints are counted. The second gives another example. After trying out the two examples in the screenshot, try np.linspace(0,1,20) and, after investigating the results, think about why I chose 21 over 20 in my example.
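To experiment with this, a sketch along the following lines may help. It assumes that the first example in the screenshot is np.linspace(0,1,21), as the description suggests; the variable names are chosen for this illustration.

```python
import numpy as np

with_21 = np.linspace(0, 1, 21)  # both endpoints included, 21 values
with_20 = np.linspace(0, 1, 20)  # both endpoints included, 20 values

print(with_21)
print(with_20)

# The gap between neighboring values differs between the two calls.
print(with_21[1] - with_21[0])
print(with_20[1] - with_20[0])
```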
np.linspace() is a very handy function for situations where you need to try out different values to find the one that best fits your needs. The following example showcases a simple situation like that.
Example – np.linspace() to create solution candidates
We are interested in finding the value(s) that holds the following mathematical statement: .
Imagine that we don't know that the statement can be simplified easily to ascertain that either 2 or 3 will hold the statement:
So we would like to use NumPy to try out every whole number between -1000 and 1000 and find the answer.
The following screenshot shows Python code that provides a solution to this problem:
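The screenshot is not reproduced here, so what follows is only a sketch of how such a search might look. The statement itself is missing from the text; the code assumes it is x**2 - 5*x + 6 = 0, which is consistent with the answers 2 and 3 mentioned above but remains an assumption. The name Candidates comes from the text, and solutions is a name chosen for this illustration.

```python
import numpy as np

# Every whole number from -1000 to 1000: 2001 values spaced 1 apart.
Candidates = np.linspace(-1000, 1000, 2001)
#print(Candidates)

solutions = []
for candidate in Candidates:
    # Assumed statement (not given in the text): x**2 - 5*x + 6 == 0
    if candidate**2 - 5*candidate + 6 == 0:
        solutions.append(float(candidate))

print(solutions)
```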
Please review and try this code before moving on.
Now that you've had a chance to engage with this example, allow me to highlight a couple of things:
- Notice how smart use of np.linspace() leads to an array with all of the numbers that we were interested in trying out.
- Uncomment #print(Candidates) and review all of the numbers that were tried out to establish the desired answers.
This concludes our review of the NumPy module. Next, we will review another very useful Python module, Pandas.