You're reading from Applied Supervised Learning with Python: Use scikit-learn to build predictive models from real-world datasets and prepare yourself for the future of machine learning

Product type Paperback

Published in Apr 2019

ISBN-13 9781789954920

Length 404 pages

Edition 1st Edition

Languages

Python

Tools

Scikit-learn

Concepts

Machine Learning

Authors (2):

Benjamin Johnston

Ishita Mathur

Table of Contents (9) Chapters

Applied Supervised Learning with Python

Preface

1. Python Machine Learning Toolkit

2. Exploratory Data Analysis and Visualization

3. Regression Analysis

4. Classification

5. Ensemble Modeling

6. Model Evaluation

Appendix

Chapter 1: Python Machine Learning Toolkit

Activity 1: pandas Functions

Solution

Open a new Jupyter notebook.
Use pandas to load the Titanic dataset:
```
import pandas as pd

df = pd.read_csv('titanic.csv')
```
Use the head() function on the dataset as follows:
```
# Have a look at the first 5 samples of the data
df.head()
```
The output will be as follows:
Figure 1.65: First five rows
Use the describe() function as follows:
```
df.describe(include='all')
```
The output will be as follows:
Figure 1.66: Output of describe()
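As a rough illustration of what include='all' changes, the sketch below compares the default describe() with describe(include='all'). The toy DataFrame and its values are hypothetical stand-ins, not rows from titanic.csv.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data (values are made up)
toy = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Sex": ["male", "female", "female", "female"],
})

# Default: only numeric columns are summarized
numeric_summary = toy.describe()

# include='all': text columns are summarized too (count, unique, top, freq)
full_summary = toy.describe(include="all")

print(numeric_summary.columns.tolist())
print(full_summary.columns.tolist())
print(full_summary.loc[["unique", "top", "freq"], "Sex"])
```

Statistics that do not apply to a column, such as the mean of a text column, show up as NaN in the combined table.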
We don't need the Unnamed: 0 column. We can remove the column without using the del command, as follows:
```
df = df[df.columns[1:]] # Keep every column except the first one (Unnamed: 0)
df.head()
```
The output will be as follows:
Figure 1.67: First five rows after deleting the Unnamed: 0 column
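Selecting columns by position works here because Unnamed: 0 happens to be the first column. A minimal sketch of an alternative that names the column explicitly is shown below; the toy DataFrame is a hypothetical stand-in for the loaded dataset.

```python
import pandas as pd

# Hypothetical stand-in with the same leading index-like column
toy = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "Survived": [0, 1, 1],
    "Age": [22.0, 38.0, 26.0],
})

# Positional selection, as in the solution above
by_position = toy[toy.columns[1:]]

# Name-based removal: does not depend on the column order
by_name = toy.drop(columns=["Unnamed: 0"])

print(by_name.columns.tolist())
print(by_position.equals(by_name))
```

If Unnamed: 0 is simply the row index that was written out when the CSV was saved (an assumption about this file), passing index_col=0 to pd.read_csv() is another way to avoid the extra column.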

Compute the mean, standard deviation, minimum, and maximum values for the columns of the DataFrame without using describe:

```
df.mean()
```
The output will be as follows:
```
Fare        33.295479
Pclass       2.294882
Age         29.881138
Parch        0.385027
SibSp        0.498854
Survived     0.383838...
```
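The remaining statistics named in this step (standard deviation, minimum, and maximum) can be computed in a similar way. The following is only a sketch on a hypothetical toy DataFrame, not the book's own continuation of the solution. It selects the numeric columns first, because recent pandas versions raise an error when mean() is called on a DataFrame that still contains text columns unless numeric_only=True is passed.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data (values are made up)
toy = pd.DataFrame({
    "Fare": [7.25, 71.28, 8.05, 53.10],
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Sex": ["male", "female", "female", "female"],
})

# Keep only numeric columns so every statistic is well defined
numeric = toy.select_dtypes(include="number")

# One row per statistic, one column per numeric feature
summary = numeric.agg(["mean", "std", "min", "max"])
print(summary)
```

Calling numeric.std(), numeric.min(), and numeric.max() one at a time gives the same numbers as separate Series.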

Authors (2)

Benjamin Johnston

Benjamin Johnston is a senior data scientist for one of the world's leading data-driven MedTech companies and is involved in the development of innovative digital solutions throughout the entire product development pathway, from problem definition to solution research and development, through to final deployment. He is currently completing his Ph.D. in ML, specializing in image processing and deep convolutional neural networks. He has more than 10 years of experience in medical device design and development, working in a variety of technical roles, and holds a first-class honors bachelor's degree in both engineering and medical science from the University of Sydney, Australia.

Ishita Mathur

Ishita Mathur has worked as a data scientist for 2.5 years with product-based start-ups, taking business concerns in various domains and formulating them as technical problems that can be solved using data and machine learning. Her current work at GO-JEK involves the end-to-end development of machine learning projects, working as part of a product team on defining, prototyping, and implementing data science models within the product. She completed her master's degree in high-performance computing with data science at the University of Edinburgh, UK, and her bachelor's degree with honors in physics at St. Stephen's College, Delhi.
