Training and deployment
Training and deployment in KNIME Analytics Platform can range from simple to quite complex; in this chapter, we’ll look at some of the simpler options. However you plan to train your model, it’s good practice to partition your data first. To partition data in KNIME, we use the Partitioning node:
Figure 6.10 – The Partitioning node and configuration dialog
Figure 6.10 shows the Partitioning node and its configuration dialog. The node has one input port on the left and two output ports on the right. The input port receives our full dataset, and the two output ports deliver the two splits produced by our configuration choices. Note that the top output port holds the partition whose size is set in the configuration dialog, while the bottom output port holds the remaining rows.
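For readers who are more comfortable with code, the node’s one-input, two-output behavior can be sketched in plain Python. This is only an analogy under stated assumptions, not how KNIME implements the node: the partition function, the 0.7 fraction, the random shuffle, and the fixed seed are all illustrative choices.

```python
import random


def partition(rows, first_fraction=0.7, seed=42):
    """Split rows into two parts: one "input port", two "output ports".

    Hypothetical helper for illustration only; it is not part of KNIME.
    """
    if not 0.0 <= first_fraction <= 1.0:
        raise ValueError("first_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # random but reproducible order
    cut = round(len(shuffled) * first_fraction)
    # First result ("top port"): the partition whose size was configured.
    # Second result ("bottom port"): every remaining row.
    return shuffled[:cut], shuffled[cut:]


full_dataset = list(range(100))  # stand-in for the table at the input port
top, bottom = partition(full_dataset, first_fraction=0.7)
print(len(top), len(bottom))
```

With 100 input rows and a fraction of 0.7, the first result holds 70 rows and the second holds the other 30. Every input row lands in exactly one of the two results.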
There are two things to configure for this node: the size of the first partition and how its rows are selected. A 70/30 split, with 70% of the rows going to the training set, is common practice, although the right ratio varies by use case. For...