Monadic data transformation
The first step is to define a trait and a method that describe how the computation units of a workflow transform data. Data transformation is the foundation of any workflow for processing and classifying a dataset, training and validating a model, and displaying results.
There are two symbolic models for defining data processing or a data transformation:
- Explicit model: The developer creates a model explicitly from a set of configuration parameters. Most deterministic algorithms and unsupervised learning techniques use an explicit model.
- Implicit model: The developer provides a training set of labeled observations (observations with an expected outcome). A classifier extracts a model from the training set. Supervised learning techniques rely on a model implicitly generated from labeled data.
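The difference between the two models can be illustrated with a minimal sketch. The class names Scaler and ThresholdClassifier are hypothetical and chosen only for illustration; the threshold rule (midpoint between the two class means) is an assumption, not a technique prescribed by the text.

```scala
// Explicit model: the developer builds the model directly from
// configuration parameters (here, a single scaling factor).
final class Scaler(factor: Double) {
  def transform(x: Double): Double = x * factor
}

// Implicit model: the model (here, a threshold) is derived from a
// training set of labeled observations (value, label), with labels 0 or 1.
// Assumption: observations labeled 1 have the larger mean value.
final class ThresholdClassifier(training: Seq[(Double, Int)]) {
  require(
    training.exists(_._2 == 0) && training.exists(_._2 == 1),
    "The training set must contain both labels"
  )

  // Mean of the observed values that carry the given label
  private def mean(label: Int): Double = {
    val xs = training.collect { case (x, y) if y == label => x }
    xs.sum / xs.size
  }

  // The model extracted from the labeled data
  val threshold: Double = (mean(0) + mean(1)) / 2.0

  def transform(x: Double): Int = if (x > threshold) 1 else 0
}
```

In this sketch, new Scaler(2.0) is ready to use as soon as it is configured, whereas ThresholdClassifier has no usable model until it has seen labeled data.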
Error handling
The simplest form of data transformation is a morphism between two types U and V. The data transformation enforces a contract for validating...
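A morphism from U to V with a validation contract might be sketched as follows. This is one plausible reading only: it assumes that the contract checks the input and that failures are reported through scala.util.Try. The names Morphism, validated, and safeLog are hypothetical.

```scala
import scala.util.{Failure, Try}

object Morphism {
  // Hypothetical helper: lifts a plain function U => V into a function
  // that validates its input and returns either a value or an error.
  def validated[U, V](isValid: U => Boolean)(f: U => V): U => Try[V] =
    (u: U) =>
      if (isValid(u)) Try(f(u)) // Try also captures exceptions thrown by f
      else Failure(new IllegalArgumentException(s"Invalid input: $u"))

  // Example: the natural logarithm is defined only for positive inputs.
  val safeLog: Double => Try[Double] =
    validated[Double, Double](_ > 0.0)(math.log)
}
```

Because the result is a Try, transformations written this way can be chained with flatMap, and an invalid input short-circuits the rest of the chain.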