Predicting who will survive on the Titanic with logistic regression
In this recipe, we introduce logistic regression, a basic classifier. We apply it to a Kaggle dataset where the goal is to predict survival on the Titanic based on real passenger data (see http://www.kaggle.com/c/titanic).
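To make the idea of a "basic classifier" concrete before turning to the real data, here is a minimal sketch of what logistic regression does. It is not the code used in this recipe, which relies on scikit-learn. It uses only the standard library, a made-up one-feature toy dataset, and hypothetical names (`sigmoid`, `predict`, `w`, `b`, `lr`) chosen for illustration. The model passes a linear score through the sigmoid function to get a probability, and the parameters are fitted here by plain gradient descent on the log-loss.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up toy data (NOT the Titanic dataset): one feature per sample,
# with label 1 when the feature is positive and 0 otherwise.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 1, 1, 1]

# Model: P(y = 1 | x) = sigmoid(w * x + b).
w, b = 0.0, 0.0
lr = 0.1  # learning rate, an arbitrary choice for this sketch

# Gradient descent on the average log-loss.
for _ in range(1000):
    errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = sum(errors) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b


def predict(x):
    """Predict class 1 when the estimated probability is at least 0.5."""
    return int(sigmoid(w * x + b) >= 0.5)
```

Calling `predict(2.0)` and `predict(-2.0)` on this toy model yields the two classes, and `sigmoid(w * x + b)` gives the estimated probability behind each decision.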
Note
Kaggle (http://www.kaggle.com/competitions) hosts machine learning competitions where anyone can download a dataset, train a model, and test the predictions on the website.
How to do it...
We import the standard packages:
>>> import numpy as np
    import pandas as pd
    import sklearn
    import sklearn.linear_model as lm
    import sklearn.model_selection as ms
    import matplotlib.pyplot as plt
    %matplotlib inline
We load the training and test datasets with pandas:
>>> train = pd.read_csv('https://github.com/ipython-books'
                        '/cookbook-2nd-data/blob/master/'
                        'titanic_train.csv?raw=true')
    test = pd.read_csv('https:/...