Preparing a SQL Server query for human resources data
As in previous chapters, we must first prepare the data that we want to visualize in Python. This dataset focuses on the human resources department of the AdventureWorks company. Our query will return the total number of vacation hours available for each job title. The data for this query is in the Employee table of the AdventureWorks database, and the following SQL statement generates the results we need:
SELECT [JobTitle],
       SUM([VacationHours]) AS VacationHours
FROM [AdventureWorks2014].[HumanResources].[Employee]
GROUP BY [JobTitle]
ORDER BY [VacationHours] ASC;
The first ten rows of the result can be seen in the following screenshot:
The full dataset from this query result will be the foundation of our histogram as well as our normal distribution plot.
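To see what the GROUP BY and SUM in the preceding query do without connecting to SQL Server, the same aggregation can be sketched against an in-memory SQLite table using Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the rows in sample_rows are made-up toy values, not AdventureWorks data, the function name total_vacation_hours_by_title is hypothetical, and the alias TotalVacationHours is used instead of VacationHours so that the ORDER BY clause unambiguously refers to the summed column.

```python
import sqlite3

# Hypothetical toy rows (NOT real AdventureWorks data): (JobTitle, VacationHours)
sample_rows = [
    ("Design Engineer", 5),
    ("Design Engineer", 6),
    ("Tool Designer", 8),
    ("Chief Executive Officer", 99),
    ("Tool Designer", 9),
]


def total_vacation_hours_by_title(rows):
    """Return (JobTitle, total VacationHours) pairs, smallest total first."""
    conn = sqlite3.connect(":memory:")
    try:
        # A stand-in for the Employee table with only the two columns we need
        conn.execute(
            "CREATE TABLE Employee (JobTitle TEXT, VacationHours INTEGER)"
        )
        conn.executemany("INSERT INTO Employee VALUES (?, ?)", rows)
        # Same shape as the chapter's query: one row per job title,
        # with vacation hours summed and sorted in ascending order
        cursor = conn.execute(
            """
            SELECT JobTitle, SUM(VacationHours) AS TotalVacationHours
            FROM Employee
            GROUP BY JobTitle
            ORDER BY TotalVacationHours ASC
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


totals = total_vacation_hours_by_title(sample_rows)
for job_title, hours in totals:
    print(f"{job_title}: {hours}")
```

Each job title appears once in the output, with the hours of all employees sharing that title added together. This one-value-per-title shape is what the histogram and normal distribution plot will be built from.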