Loading the dataset
Create and open a new IPython notebook. The chapter's supplementary materials include the file extraterrestrials.csv. Copy it to the folder where you created your notebook. In the first cell of your notebook, execute the following magic command:
In []: %matplotlib inline
This makes every plot you create from now on render inline, right in the notebook.
The library we use for loading and manipulating datasets is pandas. Let's import it and load the .csv file:
In []: import pandas as pd
       df = pd.read_csv('extraterrestrials.csv', sep='\t', encoding='utf-8', index_col=0)
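To see what the sep and index_col arguments do without needing the file on disk, here is a minimal sketch that parses a small tab-separated string held in memory. The variable names raw and sample are hypothetical, the values are copied from the rows that df.head() prints later in this section, and the tab-separated layout with an unnamed first column is an assumption about how the file is stored.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the file. The tab-separated layout
# with an unnamed leading index column is an assumption, not a fact taken
# from the real extraterrestrials.csv.
raw = (
    "\tLength\tColor\tFluffy\tLabel\n"
    "0\t27.545139\tPink gold\tTrue\tRabbosaurus\n"
    "1\t12.147357\tPink gold\tFalse\tPlatyhog\n"
    "2\t23.454173\tLight black\tTrue\tRabbosaurus\n"
)

# sep="\t" splits fields on tab characters;
# index_col=0 uses the first column as the row index instead of as data.
sample = pd.read_csv(io.StringIO(raw), sep="\t", index_col=0)

# Each column gets its own inferred type (float, text, boolean).
print(sample.dtypes)
```

Because the first column became the index, sample has four data columns rather than five, and a row can be addressed by its label, for example sample.loc[1, 'Label'].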
The object df is a data frame: a table-like data structure designed for efficient manipulation of columns that hold different data types. To see what's inside, execute:
In []: df.head()
Out[]:
  | Length | Color | Fluffy | Label |
0 | 27.545139 | Pink gold | True | Rabbosaurus |
1 | 12.147357 | Pink gold | False | Platyhog |
2 | 23.454173 | Light black | True | Rabbosaurus |
3 | 29.956698 | Pink gold | True | Rabbosaurus |
4 | 34.884065 | Light black | True | Rabbosaurus |
This prints the first five rows of the...