Running our first Kubeflow pipeline
In this section, we will run a custom pipeline that downloads a sample tabular dataset and uses it as training data to build our linear regression model. The steps and instructions to be executed by the pipeline are defined inside a YAML file. Once this YAML file has been uploaded, we will be able to run a Kubeflow pipeline that performs the following steps:
- Download dataset: Here, we will be downloading and working with a dataset that has only 20 records (plus the header row). We will also start with a clean version that has no missing or invalid values:
Figure 10.16 – A sample tabular dataset
In Figure 10.16, we can see that our dataset has three columns:
last_name – This is the last name of the manager.
management_experience_months – This is the total number of months a manager has been managing team members.
monthly_salary...
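To make the idea of a "clean version" more concrete, here is a minimal sketch of how we could check a CSV file with these three columns for missing or invalid values. This is not part of the pipeline definition. The validate_dataset function and the SAMPLE_CSV rows are hypothetical and exist only for illustration; the values are made up and are not the records shown in Figure 10.16.

```python
import csv
import io

# Column names taken from the dataset description above.
EXPECTED_COLUMNS = ["last_name", "management_experience_months", "monthly_salary"]

# Hypothetical stand-in rows for illustration only; these are NOT the
# values from the actual dataset.
SAMPLE_CSV = """last_name,management_experience_months,monthly_salary
Alpha,12,1000
Beta,24,1500
Gamma,36,2000
"""


def validate_dataset(csv_text, expected_rows=None):
    """Parse the CSV text and raise ValueError if it is not clean."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"Unexpected header: {reader.fieldnames}")

    rows = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        # DictReader stores surplus fields under the key None.
        if None in row:
            raise ValueError(f"Too many fields on line {line_number}")
        # Short rows are padded with None; empty cells are empty strings.
        if any(value is None or value.strip() == "" for value in row.values()):
            raise ValueError(f"Missing value on line {line_number}")
        # float() raises ValueError if a numeric column is not a number.
        rows.append({
            "last_name": row["last_name"],
            "management_experience_months": float(row["management_experience_months"]),
            "monthly_salary": float(row["monthly_salary"]),
        })

    if expected_rows is not None and len(rows) != expected_rows:
        raise ValueError(f"Expected {expected_rows} records, found {len(rows)}")
    return rows


rows = validate_dataset(SAMPLE_CSV, expected_rows=3)
print(len(rows), "records passed validation")
```

For the dataset described in this section, we would pass expected_rows=20, since the header row is not counted as a record.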