The Art of Data-Driven Business

You're reading from The Art of Data-Driven Business: Transform your organization into a data-driven one with the power of Python machine learning

Product type Paperback
Published in Dec 2022
Publisher Packt
ISBN-13 9781804611036
Length 314 pages
Edition 1st Edition
Author (1):
Alan Bernardo Palacio

Using NumPy for statistics and algebra

NumPy is a Python library for working with arrays. It also provides functions for linear algebra, Fourier transforms, and matrix operations. NumPy supports large, multi-dimensional arrays and matrices, along with a wide range of mathematical functions that operate on them, which makes it well suited to scientific computing and machine learning. Its core data structure is the n-dimensional array, which is simple yet efficient. Learning NumPy is a natural first step for a Python data scientist because much of the data science toolkit is built on top of it.

The array is the fundamental building block of NumPy: a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the array's rank, and the shape of an array is a tuple of integers giving its size along each dimension:

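import numpy as np
# Create a one-dimensional array from a Python list
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))  # <class 'numpy.ndarray'>
print(arr.ndim)   # rank (number of dimensions): 1
print(arr.shape)  # size along each dimension: (5,)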
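
To make rank and shape concrete for more than one dimension, the short sketch below inspects a two-dimensional array. The name `grid` and its values are illustrative and do not come from the chapter.

```python
import numpy as np

# A 2D array with 2 rows and 3 columns (illustrative values)
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.ndim)   # number of dimensions (the rank): 2
print(grid.shape)  # size along each dimension: (2, 3)
print(grid.dtype)  # the single element type shared by every value
print(grid[1, 2])  # indexed by a tuple of nonnegative integers: 6
```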

A NumPy array is a container that holds a fixed number of elements, all of which must be of the same type. Many data structures use arrays to implement their algorithms. Just as you can slice a list, you can slice a NumPy array, but in more than one dimension. Unlike slicing a list, basic slicing of a NumPy array returns a view of the original array rather than a copy, so changes made through the slice also change the original.
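
The view behavior is easiest to see by writing through a slice. This is a minimal sketch with made-up values; the names `original`, `view`, and `independent` are only for illustration.

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

view = original[1:4]   # basic slicing returns a view, not a copy
view[0] = 99           # writing to the view changes the original too
print(original)        # [10 99 30 40 50]

independent = original[1:4].copy()  # an explicit copy owns its own data
independent[0] = -1
print(original)        # still [10 99 30 40 50]

print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False
```

When a slice must be modified without touching the source array, calling `.copy()` first is the usual way to do it.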

Slicing in Python means taking the elements from one index up to, but not including, another index. We select elements of an array with [start:end], where start is the index of the first element to include and end is the index at which to stop. We can also set the step with [start:end:step]:

print('select an element by index:', arr[0])
print('slice from index 1 up to (not including) 5:', arr[1:5])
print('from index 4 to the end of the array:', arr[4:])
print('from the start of the array up to index 4:', arr[:4])
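
The step form and slicing in more than one dimension can be sketched as follows. The arrays `values` and `table` are illustrative and separate from the chapter's `arr`.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])
print(values[::2])    # every second element: [1 3 5]
print(values[1:5:2])  # from index 1 up to (not including) 5, step 2: [2 4]
print(values[::-1])   # a negative step reverses the array: [5 4 3 2 1]

table = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
print(table[:2, 1:])  # first two rows, columns from index 1 onward
print(table[:, 0])    # every row, first column: [1 4 7]
```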

NumPy offers three kinds of indexing: field access, basic slicing, and advanced indexing. Basic slicing extends Python’s slicing concept to n dimensions. A Python slice object is created by passing start, stop, and step arguments to the built-in slice function. Slicing makes it possible to write clear, concise code. “Indexing” refers to an element of an iterable by its position within that iterable, and “slicing” retrieves a subset of elements based on their indices.
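
The three kinds of indexing can be compared side by side. This is a rough sketch: the array contents, the `people` structured array, and its field names are assumptions made for illustration.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# Basic slicing: an explicit slice object is equivalent to data[1:4]
window = slice(1, 4)
print(data[window])     # [20 30 40]

# Advanced indexing: select with a list of integers or a boolean mask
print(data[[0, 2, 4]])  # [10 30 50]
print(data[data > 25])  # [30 40 50]

# Field access: a structured array exposes its named fields
people = np.array([("Ana", 31), ("Luis", 45)],
                  dtype=[("name", "U10"), ("age", "i4")])
print(people["age"])    # [31 45]
```

Unlike basic slicing, advanced indexing returns a copy of the selected data rather than a view.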

To combine (concatenate) two arrays, we use the np.concatenate() function, which copies the elements of both arrays into a new result array:

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)
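
For arrays with more than one dimension, np.concatenate() also accepts an axis argument that selects the existing axis to join along. A small sketch, using illustrative arrays named `top` and `bottom`:

```python
import numpy as np

top = np.array([[1, 2], [3, 4]])
bottom = np.array([[5, 6], [7, 8]])

rows = np.concatenate((top, bottom), axis=0)  # append as extra rows
cols = np.concatenate((top, bottom), axis=1)  # append as extra columns

print(rows.shape)  # (4, 2)
print(cols.shape)  # (2, 4)
print(cols)
```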

Arrays can also be joined with the NumPy stack() function, which joins them along a new axis. The function receives a sequence of arrays and the axis along which to stack them. With axis=1, two 1D arrays are stacked along the second axis, so each input array becomes a column of the result:

arr = np.stack((arr1, arr2), axis=1)
print(arr)

The axis parameter specifies the position of the new axis in the result. With axis=0, the arrays are stacked on top of one another, so each input array becomes a row:

arr = np.stack((arr1, arr2), axis=0)
print(arr)
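
The difference between concatenating and stacking is clearest from the resulting shapes. The sketch below uses its own variable names (`first`, `second`) so that it does not overwrite the chapter's `arr`.

```python
import numpy as np

first = np.array([1, 2, 3])
second = np.array([4, 5, 6])

joined = np.concatenate((first, second))     # joins along the existing axis
as_rows = np.stack((first, second), axis=0)  # new leading axis: inputs become rows
as_cols = np.stack((first, second), axis=1)  # new trailing axis: inputs become columns

print(joined.shape)   # (6,)
print(as_rows.shape)  # (2, 3)
print(as_cols.shape)  # (3, 2)
print(as_cols)
```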

The NumPy mean() function computes the arithmetic mean along the specified axis. For the 2D array created above, axis=1 computes the average of each row:

print(np.mean(arr, axis=1))  # one mean per row

With axis=0, mean() computes the average of each column, whereas axis=1 computes the average of each row:

print(np.mean(arr, axis=0))  # one mean per column
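
The same axis convention applies to other NumPy statistics. As a brief sketch on an illustrative array named `scores`, np.median() and np.std() are shown alongside np.mean(); they are not covered in the text above and are included only for comparison.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(np.mean(scores))            # mean of all elements: 3.5
print(np.mean(scores, axis=0))    # one mean per column: [2.5 3.5 4.5]
print(np.mean(scores, axis=1))    # one mean per row: [2. 5.]
print(np.median(scores, axis=1))  # one median per row: [2. 5.]
print(np.std(scores, axis=0))     # population std per column: [1.5 1.5 1.5]
```

Omitting axis reduces over every element and returns a single number.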

In the next section, we will introduce pandas, a library for data analysis and manipulation. pandas is one of the most extensively used Python libraries in data science, much like NumPy. It offers high-performance, simple-to-use data analysis tools. In contrast to the multi-dimensional array objects provided by the NumPy library, pandas offers an in-memory 2D table object called a DataFrame.
