Using the Kubeflow Pipelines SDK to build ML workflows
In this section, we will build ML workflows using the Kubeflow Pipelines SDK. The SDK contains what we need to build the pipeline components that hold the custom code we want to run. With it, we can define Python functions that map to the components of a pipeline.
Here are some guidelines that we need to follow when building Python function-based components using the Kubeflow Pipelines SDK:
- The defined Python functions should be standalone and should not use any code or variables declared outside of the function definition. This means that import statements (for example, import pandas) should be placed inside the function, too. Here’s a quick example of how imports should be implemented:
  def process_data(...):
      import pandas as pd
      df_all_data = pd.read_csv(df_all_data_path...
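To see why the standalone rule matters, the following minimal sketch runs a function's source in an empty namespace. This only roughly simulates a component running in isolation from the file that defined it; it is not how the SDK itself works. The function name count_rows, the helper run_isolated, and the sample data are hypothetical, and the standard library csv module stands in for pandas so that the sketch runs without extra dependencies.

```python
# Source of a hypothetical component function, kept as a string so that it can
# be executed in an empty namespace (a rough stand-in for an isolated run).
STANDALONE_SOURCE = '''
def count_rows(csv_text: str) -> int:
    # Imports live inside the function body, so they travel with the function.
    import csv
    import io
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return sum(1 for _ in reader)
'''

# Same logic, but it relies on imports declared outside of the function.
NOT_STANDALONE_SOURCE = '''
def count_rows(csv_text: str) -> int:
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return sum(1 for _ in reader)
'''


def run_isolated(source: str, csv_text: str) -> int:
    """Define count_rows from source alone, then call it on csv_text."""
    namespace = {}  # nothing from this module is visible to the function
    exec(source, namespace)
    return namespace["count_rows"](csv_text)


sample = "a,b\n1,2\n3,4\n"

# The standalone version works because it brings its own imports.
standalone_result = run_isolated(STANDALONE_SOURCE, sample)

# The other version fails: csv and io were never imported in its namespace.
try:
    run_isolated(NOT_STANDALONE_SOURCE, sample)
    broken_error = None
except NameError as exc:
    broken_error = exc
```

The second version raises a NameError as soon as it touches csv, which is the kind of failure the guideline above is meant to prevent.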