Basics – summary, dimensions, and structure
After reading in the data, a few tasks help you get a feel for it:
- To check whether the data has been read in correctly
- To determine how the data looks, that is, its shape and size
- To summarize and visualize the data
- To get the column names and summary statistics of numerical variables
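The checks listed above can be sketched in a few lines of pandas. This is a minimal illustration, not the book's own code: the small DataFrame df and its columns (name, age, fare) are hypothetical stand-ins for a freshly loaded dataset, so that the snippet runs without any file on disk.

```python
import pandas as pd

# Hypothetical toy data standing in for a dataset that was just read in
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "age": [29.0, 41.0, None, 35.0],
    "fare": [7.25, 71.28, 8.05, 53.10],
})

first_rows = df.head(2)           # quick visual check that the data read in correctly
n_rows, n_cols = df.shape         # shape is a (rows, columns) tuple
column_names = list(df.columns)   # column names as a plain Python list
numeric_summary = df.describe()   # summary statistics for the numeric columns only

print(first_rows)
print(n_rows, n_cols)
print(column_names)
print(numeric_summary)
```

Note that describe() skips the non-numeric name column by default, and its count row excludes missing values, which makes it a quick way to spot columns with gaps.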
Let us go back to the Titanic dataset and import it again. The head() method is used to look at the first few rows of the data, as shown:
import pandas as pd
data = pd.read_csv('E:/Personal/Learning/Datasets/Book/titanic3.csv')
data.head()
The result will look similar to the following screenshot:
The head() method also accepts the number of rows to display. For example, head(10) will show the first 10 rows.
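As a small sketch of this behavior, the snippet below calls head() with and without an argument. The 20-row DataFrame df is a hypothetical example built in memory, not the Titanic data.

```python
import pandas as pd

# Hypothetical toy frame with 20 rows
df = pd.DataFrame({
    "id": range(1, 21),
    "value": [i * 0.5 for i in range(1, 21)],
})

default_preview = df.head()   # no argument: pandas returns the first 5 rows
ten_rows = df.head(10)        # explicit argument: the first 10 rows

print(len(default_preview))
print(len(ten_rows))
```

If the argument is larger than the number of rows in the DataFrame, head() simply returns all the rows.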
The next attribute of the dataset that concerns us is its dimension, that is the number of rows...