Creating training and testing datasets
We’ll start our classification experiment with the familiar – adding our dependencies and loading a CSV file into a DataFrame
:
#r "nuget:Microsoft.Data.Analysis,0.21.1" using Microsoft.Data.Analysis; DataFrame df = DataFrame.LoadCsv("PlayerRetirement.csv");
This chapter’s dataset contains rows indicating a player’s stats for an entire season and an IsLastSeason
column, indicating whether that player did not return for the following season.
A sampling of a few rows and columns is shown in Figure 7.2:
Figure 7.2 – A sample of some of the rows and columns in the dataset
The idea of this dataset is that players will play for multiple seasons, and their stats will change from season to season until they ultimately decide it is time to move on or their team does not renew their contract.
Each row also contains the age of the player that season and the way...