Introducing the Human Resource Analytics Dataset
Having learned about basic data cleaning concepts and seen them implemented with pandas and scikit-learn, we'll put what we've learned into practice on a diverse dataset that has real-world context. In the following chapters, we'll model this dataset with a variety of machine learning techniques, so let's take some time to get familiar with it now. Let's imagine the following situation:
Suppose you are hired to do freelance work for a company that wants to understand why its employees are leaving. They have compiled a set of data they think will be helpful in this respect. It includes details of employee satisfaction levels, evaluations, time spent at work, department, and salary.
The company shares their data with you by sending you a file called hr_data.csv and asks you what you think can be done to help stop employees from leaving.
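A reasonable first step with a file like this is to load it into pandas and see what each column contains. The following is a minimal sketch, assuming hr_data.csv sits in the current working directory; the helper names load_hr_data and summarize are hypothetical and introduced here only for illustration. No column names are assumed, since the summary is computed for whatever columns the file contains.

```python
from pathlib import Path

import pandas as pd


def load_hr_data(path="hr_data.csv"):
    """Read the HR CSV into a DataFrame (hypothetical helper, not from the text)."""
    return pd.read_csv(path)


def summarize(df):
    """Return one row per column: its dtype, missing-value count, and unique-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isnull().sum(),
        "n_unique": df.nunique(),
    })


# Assumption: the file is in the working directory; skip quietly if it is not.
if Path("hr_data.csv").exists():
    df = load_hr_data()
    print(df.shape)       # (number of rows, number of columns)
    print(summarize(df))  # per-column overview
```

A summary like this makes it easy to spot which columns have missing values and which are categorical before any cleaning is attempted.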
Our aim is to apply the concepts we've discussed thus...