Numerical Computing with Python

Harness the power of Python to analyze and find hidden patterns in the data

Product type: Course
Published: Dec 2018
Publisher: Packt
ISBN-13: 9781789953633
Length: 682 pages
Edition: 1st Edition
Authors (5): Pratap Dangeti, Theodore Petrou, Allen Yu, Aldrin Yim, Claire Chung
Table of Contents (21 chapters)

Title Page
Contributors
About Packt
Preface
1. Journey from Statistics to Machine Learning
2. Tree-Based Machine Learning Models
3. K-Nearest Neighbors and Naive Bayes
4. Unsupervised Learning
5. Reinforcement Learning
6. Hello Plotting World!
7. Visualizing Online Data
8. Visualizing Multivariate Data
9. Adding Interactivity and Animating Plots
10. Selecting Subsets of Data
11. Boolean Indexing
12. Index Alignment
13. Grouping for Aggregation, Filtration, and Transformation
14. Restructuring Data into a Tidy Form
15. Combining Pandas Objects
Other Books You May Enjoy
Index

Preface

Data mining, or parsing the data to extract useful insights, is a niche skill that can transform your career as a data scientist. Python is a flexible programming language that is equipped with a strong suite of libraries and toolkits, and gives you the perfect platform to sift through your data and mine the insights you seek. This Learning Path is designed to familiarize you with the Python libraries and the underlying statistics that you need to get comfortable with data mining.

You will learn how to use Pandas, Python's popular library for analyzing different kinds of data, and leverage the power of Matplotlib to generate appealing and impressive visualizations for the insights you have derived. You will also explore different machine learning techniques and statistics that enable you to build powerful predictive models. By the end of this Learning Path, you will have the perfect foundation to take your data mining skills to the next level and set yourself on the path to becoming a sought-after data science professional.

This Learning Path includes content from the following Packt products:

  • Statistics for Machine Learning by Pratap Dangeti
  • Matplotlib 2.x By Example by Allen Yu, Claire Chung, Aldrin Yim
  • Pandas Cookbook by Theodore Petrou

Who this book is for

If you want to learn how to use the many libraries of Python to extract impactful information from your data and present it as engaging visuals, then this is the ideal Learning Path for you. Some basic knowledge of Python is enough to get started with this Learning Path.

What this book covers

Chapter 1, Journey from Statistics to Machine Learning, introduces you to all the necessary fundamentals and basic building blocks of both statistics and machine learning. All fundamentals are explained with the support of both Python and R code examples across the chapter.

Chapter 2, Tree-Based Machine Learning Models, focuses on the various tree-based machine learning models used by industry practitioners, including decision trees, bagging, random forest, AdaBoost, gradient boosting, and XGBoost with the HR attrition example in both languages.

Chapter 3, K-Nearest Neighbors and Naive Bayes, illustrates simple methods of machine learning. K-nearest neighbors is explained using breast cancer data. The Naive Bayes model is explained with a message classification example using various NLP preprocessing techniques.

Chapter 4, Unsupervised Learning, presents various techniques such as k-means clustering, principal component analysis, singular value decomposition, and deep-learning-based deep autoencoders. The chapter closes with an explanation of why deep autoencoders are much more powerful than conventional PCA techniques.

Chapter 5, Reinforcement Learning, covers techniques that learn the optimal path to reach a goal over episodic states, such as the Markov decision process, dynamic programming, Monte Carlo methods, and temporal difference learning. Finally, it presents some use cases that apply machine learning and reinforcement learning.

Chapter 6, Hello Plotting World!, covers the basic constituents of a Matplotlib figure, as well as the latest features of Matplotlib version 2.

Chapter 7, Visualizing Online Data, teaches you how to design intuitive infographics for effective storytelling through the use of real-world datasets.

Chapter 8, Visualizing Multivariate Data, gives you an overview of the plot types that are suitable for visualizing datasets with multiple features or dimensions.

Chapter 9, Adding Interactivity and Animating Plots, shows you that Matplotlib is not limited to creating static plots. You will learn how to create interactive charts and animations.

Chapter 10, Selecting Subsets of Data, covers the many varied and potentially confusing ways of selecting different subsets of data.

Chapter 11, Boolean Indexing, covers the process of querying your data to select subsets of it based on Boolean conditions.
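
As a rough illustration of the idea (the DataFrame, its column names, and the thresholds below are invented for this sketch and are not taken from the book), Boolean indexing in pandas builds a True/False mask from one or more conditions and passes it to the indexing operator:

```python
import pandas as pd

# Hypothetical data, invented for this illustration
movies = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D"],
        "year": [1999, 2005, 2012, 2016],
        "rating": [7.1, 8.3, 6.4, 8.9],
    }
)

# Each comparison yields a Boolean Series; combine them with
# & (and), | (or), and ~ (not), wrapping each condition in parentheses
mask = (movies["year"] > 2000) & (movies["rating"] >= 8.0)

# Keep only the rows where the mask is True
selected = movies[mask]
print(selected)
```

The parentheses around each condition matter, because `&` binds more tightly than the comparison operators in Python.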

Chapter 12, Index Alignment, targets the very important and often misunderstood index object. Misuse of the Index is responsible for lots of erroneous results, and these recipes show you how to use it correctly to deliver powerful results.

Chapter 13, Grouping for Aggregation, Filtration, and Transformation, covers the powerful grouping capabilities that are almost always necessary during a data analysis. You will build customized functions to apply to your groups.
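
The three operations named in the chapter title can be sketched as follows. This is only a minimal example under assumed data: the `flights` table and the `delay_range` helper are hypothetical and do not come from the book.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
flights = pd.DataFrame(
    {
        "airline": ["AA", "AA", "UA", "UA", "UA"],
        "delay": [10, 30, 5, 15, 10],
    }
)

def delay_range(s):
    """Customized aggregation: spread between the largest and smallest value."""
    return s.max() - s.min()

grouped = flights.groupby("airline")["delay"]

# Aggregation: one row per group, mixing a built-in and a custom function
summary = grouped.agg(["mean", delay_range])

# Filtration: keep the rows of groups whose mean delay exceeds 15
kept = flights.groupby("airline").filter(lambda g: g["delay"].mean() > 15)

# Transformation: same length as the input, here each delay minus its group mean
centered = flights["delay"] - grouped.transform("mean")

print(summary)
```

Aggregation returns one value per group, filtration returns a subset of the original rows, and transformation returns a result aligned row by row with the original data.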

Chapter 14, Restructuring Data into a Tidy Form, explains what tidy data is and why it’s so important, and then it shows you how to transform many different forms of messy datasets into tidy ones.
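
One common kind of messy data stores values of a variable as column names. A minimal sketch of tidying such a table with `DataFrame.melt` might look like this (the `wide` table and its column names are assumptions made up for the example, not data from the book):

```python
import pandas as pd

# Hypothetical "messy" table: the year values are stored as column names
wide = pd.DataFrame(
    {
        "state": ["Texas", "Utah"],
        "2016": [100, 40],
        "2017": [110, 45],
    }
)

# Tidy form: one row per (state, year) observation, one column per variable
tidy = wide.melt(id_vars="state", var_name="year", value_name="sales")
print(tidy)
```

In the tidy result, every variable (state, year, sales) has its own column and every observation has its own row.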

Chapter 15, Combining Pandas Objects, covers the many available methods to combine DataFrames and Series vertically or horizontally. We will also do some web-scraping to compare President Trump's and Obama's approval rating and connect to an SQL relational database.
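
As a small sketch of vertical and horizontal combination using `pd.concat` (one of several combining methods in pandas; the `q1`, `q2`, and `prices` objects are hypothetical and invented for this example):

```python
import pandas as pd

# Hypothetical data, invented for this illustration
q1 = pd.DataFrame({"product": ["pen", "ink"], "units": [3, 5]})
q2 = pd.DataFrame({"product": ["pad", "pen"], "units": [2, 4]})

# Vertical: stack the rows; ignore_index renumbers the result from 0
stacked = pd.concat([q1, q2], ignore_index=True)

# Horizontal: place objects side by side, aligned on their index labels
prices = pd.Series([1.5, 9.0], index=[0, 1], name="price")
side_by_side = pd.concat([q1, prices], axis=1)

print(stacked)
print(side_by_side)
```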

To get the most out of this book

This book assumes that you know the basics of Python and R and how to install the libraries. It does not assume that you are already equipped with the knowledge of advanced statistics and mathematics, like linear algebra and so on.

The following versions of software are used throughout this book, but the code should run fine with more recent versions as well:

  • Anaconda3 4.3.1 (Python and its relevant packages are included in Anaconda: Python 3.6.1, NumPy 1.12.1, Pandas 0.19.2, and scikit-learn 0.18.1)
  • R 3.4.0 and RStudio 1.0.143
  • Theano 0.9.0
  • Keras 2.0.2
  • A Windows 7+, macOS 10.10+, or Linux-based computer with 4 GB RAM or above is recommended.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packt.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Numerical-Computing-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The mode function was not implemented in the numpy package."

Any command-line input or output is written as follows:

>>> import numpy as np
>>> from scipy import stats
>>> data = np.array([4, 5, 1, 2, 7, 2, 6, 9, 3])
>>> # Calculate the mean
>>> dt_mean = np.mean(data)
>>> print("Mean :", round(dt_mean, 2))

New terms and important words are shown in bold.

Note

Warnings or important notes appear like this.

Tip

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at customercare@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.
