Augmentation categories
It helps to group tabular augmentation techniques into categories. The following concepts are new and particular to the DeltaPy library, which groups its augmentation functions into the following categories:
- Transforming techniques can be applied to both cross-sectional and time series data. In tabular augmentation, they modify existing rows or columns to create new, synthetic data that is representative of the original data. These methods can include the following:
- Scaling: Increasing or decreasing a column value to expand the diversity of values in a dataset
- Binning: Grouping a column's values into a small number of buckets (bins) to create new features
- Categorical encoding: Using a numerical representation of categorical data
- Smoothing: Compensating for unusually high or low values in a dataset
- Outlier detection and removal: Detecting and removing data points that lie far from the norm
- Correlation-based augmentation: Adding new features based on correlations between...
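To make several of the transforming techniques listed above concrete, here is a minimal sketch using plain pandas rather than DeltaPy's own functions. The toy table, the column names (`price`, `color`), and the specific choices (min-max scaling, three equal-width bins, integer category codes, a rolling median, and the 1.5 × IQR outlier rule) are assumptions made for illustration; they are one common way to implement each idea, not the library's method.

```python
import pandas as pd

# Hypothetical toy table; column names and values are illustrative only.
df = pd.DataFrame({
    "price": [10.0, 12.0, 11.0, 13.0, 95.0, 12.0],
    "color": ["red", "blue", "red", "green", "blue", "red"],
})

# Scaling: min-max rescale the column to the range [0, 1].
price_min, price_max = df["price"].min(), df["price"].max()
df["price_scaled"] = (df["price"] - price_min) / (price_max - price_min)

# Binning: group the continuous values into three equal-width buckets.
df["price_bin"] = pd.cut(df["price"], bins=3, labels=["low", "mid", "high"])

# Categorical encoding: map each category to an integer code
# (categories are ordered alphabetically: blue, green, red).
df["color_code"] = df["color"].astype("category").cat.codes

# Smoothing: a centered rolling median dampens unusually high or low values.
df["price_smooth"] = (
    df["price"].rolling(window=3, center=True, min_periods=1).median()
)

# Outlier detection and removal: keep rows within 1.5 * IQR of the quartiles.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df_clean = df[in_range]

print(df)
print(df_clean)
```

In this sketch, the value 95.0 lands in the `high` bin, is dampened by the rolling median, and is dropped from `df_clean` by the IQR rule, while the remaining rows are kept unchanged.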