Chapter 2: Data Cleaning and Advanced Machine
Activity 2: Preparing to Train a Predictive Model for the Employee-Retention Problem
- Scroll to the Activity A section of the lesson-2-workbook.ipynb notebook file.
- Check the head of the table by running the following code:
%%bash
head ../data/hr-analytics/hr_data.csv
Judging by the output, convince yourself that the file is in standard CSV format. If it is, we should be able to load the data directly with pd.read_csv.
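If a bash shell is not available (for example, on some Windows setups), the same check can be sketched in pure Python. The helper name `peek` and the temporary stand-in file below are assumptions made for illustration, and the column names are invented; in the notebook you would pass the real file path instead.

```python
import itertools
import os
import tempfile


def peek(path, n=10):
    """Return the first n lines of a text file without reading the whole file."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in itertools.islice(f, n)]


# Hypothetical stand-in file so the sketch runs anywhere; its contents are made up.
tmp = tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8"
)
tmp.write("col_a,col_b\n1,x\n2,y\n3,z\n")
tmp.close()

first_lines = peek(tmp.name, n=2)
print(first_lines)

# Clean up the temporary file.
os.remove(tmp.name)
```

A comma-separated header row followed by rows with the same number of fields is a reasonable sign that pd.read_csv will work with its default settings.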
- Load the data with Pandas by running df = pd.read_csv('../data/hr-analytics/hr_data.csv'). Write it out yourself and use tab completion to help type the file path.
- Inspect the columns by printing
df.columns and make sure the data has loaded as expected by printing the DataFrame head and tail with df.head() and df.tail():
Figure 2.46: Output for inspecting head and tail of columns
We can see that it appears to have loaded correctly. Based on the tail index values, there are nearly 15,000 rows...
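The load-and-inspect steps above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual output: the in-memory CSV text and its column names (`employee_id`, `left`) are hypothetical stand-ins for hr_data.csv, used only so the example runs without the data file.

```python
import io

import pandas as pd

# Hypothetical stand-in for hr_data.csv; column names and values are invented.
csv_text = "employee_id,left\n101,no\n102,yes\n103,no\n104,no\n"

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Inspect the column names, then the first and last rows.
print(df.columns.tolist())
print(df.head(2))
print(df.tail(2))

# With the default integer index, the last index label is the row count
# minus one, which is why the tail index gives a quick size estimate.
last_label = df.index[-1]
n_rows = len(df)
print(last_label, n_rows)
```

Reading the row count off the tail index only works when the index is the default one created by pd.read_csv; len(df) or df.shape gives the count directly in every case.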