Applied Data Science with Python and Jupyter

Use powerful industry-standard tools to unlock new, actionable insights from your data

Product type Paperback
Published in Oct 2018
ISBN-13 9781789958171
Length 192 pages
Edition 1st Edition
Author Alex Galea

About the Book

Applied Data Science with Python and Jupyter teaches you the skills you need for entry-level data science. You'll learn about some of the most commonly used libraries that are part of the Anaconda distribution, and then explore machine learning models with real datasets to give you the skills and exposure you need for the real world. You'll finish up by learning how easy it can be to scrape and gather your own data from the open web so that you can apply your new skills in an actionable context.

About the Author

Alex Galea has been doing data analysis professionally since graduating with a master's in physics from the University of Guelph in Canada. He developed a keen interest in Python while researching quantum gases as part of his graduate studies. More recently, Alex has been doing web data analytics, where Python continues to play a large part in his work. He frequently blogs about work and personal projects, which are generally data-centric and usually involve Python and Jupyter Notebooks.

Objectives

  • Get up and running with the Jupyter ecosystem
  • Identify potential areas of investigation and perform exploratory data analysis
  • Plan a machine learning classification strategy and train classification models
  • Use validation curves and dimensionality reduction to tune and enhance your models
  • Scrape tabular data from web pages and transform it into Pandas DataFrames
  • Create interactive, web-friendly visualizations to clearly communicate your findings

Audience

Applied Data Science with Python and Jupyter is ideal for professionals with a variety of job descriptions across a large range of industries, given the rising popularity and accessibility of data science. You'll need some prior experience with Python; any prior work with libraries such as Pandas and Matplotlib will give you a useful head start.

Approach

Applied Data Science with Python and Jupyter covers every aspect of the standard data workflow process with a blend of theory, practical hands-on coding, and relatable illustrations. Each chapter is designed to build on what you learned in the previous one. The book contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.

Minimum Hardware Requirements

The minimum hardware requirements are as follows:

  • Processor: Intel i5 (or equivalent)
  • Memory: 8 GB RAM
  • Hard disk: 10 GB
  • An internet connection

Software Requirements

You'll also need the following software installed in advance:

  • Python 3.5+
  • Anaconda 4.3+
  • Python libraries included with Anaconda installation:
      • matplotlib 2.1.0+
      • ipython 6.1.0+
      • requests 2.18.4+
      • beautifulsoup4 4.6.0+
      • numpy 1.13.1+
      • pandas 0.20.3+
      • scikit-learn 0.19.0+
      • seaborn 0.8.0+
      • bokeh 0.12.10+
  • Python libraries that require manual installation:
      • mlxtend
      • version_information
      • ipython-sql
      • pdir2
      • graphviz
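
One way to confirm that an existing environment meets the minimum versions listed above is a short check like the following. This is a minimal sketch, not part of the book's code bundle: the helper names (version_tuple, meets_minimum, check_requirements) are made up for illustration, and it assumes Python 3.8 or later, where importlib.metadata is in the standard library. The version comparison only looks at the leading digits of each dotted part, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata  # standard library from Python 3.8 onward (assumption)

# Minimum versions copied from the requirements list above.
MINIMUM_VERSIONS = {
    "matplotlib": "2.1.0",
    "ipython": "6.1.0",
    "requests": "2.18.4",
    "beautifulsoup4": "4.6.0",
    "numpy": "1.13.1",
    "pandas": "0.20.3",
    "scikit-learn": "0.19.0",
    "seaborn": "0.8.0",
    "bokeh": "0.12.10",
}


def version_tuple(version):
    """Turn '0.12.10' into (0, 12, 10), using only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return {package: (installed version or None, True if the minimum is met)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


# Print one line per package so missing or outdated libraries are easy to spot.
for name, (installed, ok) in check_requirements().items():
    status = "ok" if ok else "missing or too old"
    print(f"{name}: installed={installed}, required>={MINIMUM_VERSIONS[name]} ({status})")
```

Running this in a notebook cell or from the command line lists each library with its installed version. Anything reported as missing or too old can then be installed or updated before starting the exercises.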

Installation and Setup

Before you start with this book, we'll install the Anaconda environment, which consists of Python and Jupyter Notebook.

Installing Anaconda

  1. Visit https://www.anaconda.com/download/ in your browser.
  2. Click on Windows, Mac, or Linux, depending on the OS you are working on.
  3. Next, click on the Download option. Make sure you download the latest version.
  4. Open the installer after download.
  5. Follow the steps in the installer and that's it! Your Anaconda distribution is ready.

Updating Jupyter and Installing Dependencies

  1. Search for Anaconda Prompt and open it.
  2. Type the following commands to update conda and Jupyter and to install the required packages:
    # Update conda
    conda update conda
    # Update Jupyter
    conda update jupyter
    # Install packages
    conda install numpy
    conda install pandas
    conda install statsmodels
    conda install matplotlib
    conda install seaborn
    # Install or upgrade scikit-learn
    pip install -U scikit-learn
  3. To open Jupyter Notebook from Anaconda Prompt, use the following command:
    jupyter notebook
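
If any of the commands above fails with a "command not found" style error, a quick check of whether the tools are on the PATH can help narrow things down. The following is a minimal sketch using only the Python standard library; the function name find_tools is hypothetical and not part of the book's code.

```python
import shutil


def find_tools(names=("conda", "jupyter", "pip")):
    """Map each command name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


# Report where each tool was found, if at all.
for name, path in find_tools().items():
    print(f"{name}: {path if path else 'not found on PATH'}")
```

On Windows, running this from the Anaconda Prompt rather than a plain command prompt is the likelier way to see all three tools, because the Anaconda Prompt sets up the PATH for the installed distribution.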

Additional Resources

The code bundle for this book is also hosted on GitHub at https://github.com/TrainingByPackt/Applied-Data-Science-with-Python-and-Jupyter.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions

Code words in text, database table names, folder names, filenames, file extensions, path names, dummy URLs, user input, and Twitter handles are shown as follows:

"The final figure is then saved as a high resolution PNG to the figures folder."

A block of code is set as follows:

y = df['MEDV'].copy()
del df['MEDV']
df = pd.concat((y, df), axis=1)

Any command-line input or output is written as follows:

jupyter notebook

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Click on New in the upper-right corner and select a kernel from the drop-down menu."
