Considering when to use deep learning and what for
Neural networks form the basis of deep learning in ML. A neural network contains many layers of densely interconnected neurons organized into input, hidden, and output layers. When the network makes a prediction, information flows through it in one direction, beginning with the input layer, which receives the raw data. During training, backpropagation calculates the gradient of the error with respect to the weights in each layer, and an optimizer uses those gradients to adjust the weights and improve learning.
Each neuron applies an activation function to the weighted sum of its inputs, and the output layer produces the prediction. Figure 1.19 shows a basic deep learning architecture:
Figure 1.19 – A basic deep learning architecture
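The flow from input layer to hidden layer to output layer can be illustrated with a minimal sketch in plain Python. The layer sizes, weights, biases, and the choice of ReLU and sigmoid activations below are made-up values for illustration only; they are not taken from Figure 1.19, and a real model would learn its weights through training rather than have them written by hand.

```python
import math

def relu(x):
    # Rectified linear unit: passes positive values, zeroes out negatives.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    # One fully connected layer: each neuron computes
    # activation(weighted sum of inputs + bias).
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(x, layers):
    # Pass the data through every layer in order (one direction only).
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], relu),
    ([[0.7, -0.5, 0.2]], [0.05], sigmoid),
]

prediction = forward([1.0, 2.0], layers)
print(prediction)  # a single value between 0 and 1
```

The sketch covers only the forward pass. Training would add a loss function and a backpropagation step that adjusts the weights and biases.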
Anomaly detection techniques generally fall into three categories:
- Supervised anomaly detection: Train an ML model with an imbalanced labeled dataset where each data instance is categorized into a normal or abnormal class. This approach is viable if ground truth or actual observation is available. The anomaly detection model determines the class of unseen data, assuming that unseen outliers follow the same distribution as the outliers in the training dataset. Limitations of supervised anomaly detection include the scarcity of anomaly samples and the difficulty of capturing a precise representation of the normal class.
- Semi-supervised anomaly detection: Train an ML model with a large amount of unlabeled data supplemented by a small set of labeled data that represents expected behavior. This approach assumes that outliers differ from the distribution of the training dataset. Because anomalies are rare and labeled examples of them are hard to collect, semi-supervised detection is often more applicable than supervised detection for finding outliers.
- Unsupervised anomaly detection: Train an ML model with an unlabeled dataset that contains normal and abnormal observations. This approach assumes that normal observations typically cluster in high-density regions, whereas abnormal observations are isolated in low-density regions. An anomaly detector looks for instances or potential outliers in the low-density regions.
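As a rough illustration of the unsupervised, density-based idea, the following sketch scores each point by its mean distance to its nearest neighbors: points in sparse regions get large scores. This is a simple distance-based stand-in chosen for brevity, not a deep learning method, and the function name `knn_scores`, the sample points, and the choice of `k=2` are assumptions made for this example.

```python
import math

def knn_scores(points, k=2):
    # Score each point by the mean distance to its k nearest neighbors.
    # A large score suggests the point sits in a low-density region.
    scores = []
    for i, p in enumerate(points):
        distances = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(sum(distances[:k]) / k)
    return scores

# Unlabeled toy data: four points close together and one far away.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 8.0)]

scores = knn_scores(points, k=2)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(points[most_anomalous], scores[most_anomalous])
```

No labels are used at any point. The isolated point stands out only because its neighbors are far away.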
Besides identifying the class of unseen data, anomaly detection algorithms can produce anomaly scores that quantify severity. These scores help businesses determine an acceptable impact threshold and manage risk tolerance levels.
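One way to turn anomaly scores into decisions is to derive a threshold from the share of observations the business is willing to investigate. The sketch below assumes a set of reference scores is already available; the helper names, the sample scores, and the 10% tolerance are hypothetical values chosen for illustration.

```python
def threshold_from_tolerance(reference_scores, tolerance):
    # Pick the score such that roughly `tolerance` (a fraction between
    # 0 and 1) of the reference scores are at or above it.
    ranked = sorted(reference_scores)
    n_flagged = max(1, round(tolerance * len(ranked)))
    return ranked[-n_flagged]

def flag_anomalies(scores, threshold):
    # Flag every score at or above the threshold.
    return [score >= threshold for score in scores]

# Hypothetical anomaly scores observed on historical data.
reference_scores = [0.10, 0.20, 0.15, 0.30, 0.25, 0.12, 0.18, 0.22, 0.28, 0.90]

# Assumed risk tolerance: investigate about 10% of observations.
threshold = threshold_from_tolerance(reference_scores, tolerance=0.10)

new_scores = [0.20, 0.95, 0.50]
flags = flag_anomalies(new_scores, threshold)
print(threshold, flags)
```

Raising the tolerance lowers the threshold and flags more observations, which trades more investigation effort for a lower chance of missing a severe anomaly.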
The significance of anomaly detection is evident across many mission-critical domains. When choosing between deep learning and traditional anomaly detection methods, consider business objectives, data size, and training time, and weigh them against trade-offs such as algorithmic scalability, model flexibility, explainability, and interpretability.
Identifying an ML problem to address a specific business problem is essential before embarking on an ML journey. Knowing the inputs, outputs, and success criteria, such as whether accuracy matters more than interpretability, is critical for choosing the appropriate algorithms and lineage tracking. Consider traditional methods such as k-nearest neighbors (KNN) and decision trees if interpretability is a higher priority for your business because of regulatory compliance or auditability needs. Explore deep learning methods if high accuracy takes priority over interpretability for your use case, keeping in mind that XAI for deep learning models continues to mature.
Deep learning methods are well suited to handling large datasets and complex problems. If your business aims to discover hidden patterns in large datasets, deep learning can extract and correlate relationships across many interdependent features. If your dataset is small, traditional methods might be a better starting point.
Training a deep learning anomaly detection model can be compute-intensive and time-consuming, depending on the number of parameters involved and the available infrastructure, such as graphics processing units (GPUs). More computing power is needed as deep learning models grow in size and complexity. Conversely, traditional anomaly detection methods can typically train and run faster on cheaper hardware, often within hours.
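To get a feel for how quickly the number of parameters grows, the following sketch counts the weights and biases of a fully connected network from its layer sizes. The two architectures compared are hypothetical examples, and the count covers only dense layers; other layer types have different formulas.

```python
def count_dense_parameters(layer_sizes):
    # Each dense layer has (inputs * outputs) weights plus one bias
    # per output neuron.
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical architectures for a dataset with 20 input features.
small_network = [20, 16, 1]
large_network = [20, 512, 512, 1]

small_count = count_dense_parameters(small_network)
large_count = count_dense_parameters(large_network)
print(small_count, large_count)  # prints: 353 273921
```

Widening and adding hidden layers increases the parameter count by several hundred times in this example, which is one reason larger models tend to need more compute and longer training runs.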
Traditional rule-based anomaly detection methods, which domain experts create manually, do not scale well to high-dimensional data and are difficult to maintain. For instance, it can be challenging to develop security rules for every possible malicious behavior and keep those rules up to date. In contrast, deep learning-based anomaly detection methods are more adaptive because they learn and extract features incrementally from data in a nested hierarchy through hidden layers.
This section covered the basics and general best practices of deep learning. In the following section, we will discuss known challenges of deep learning anomaly detection and future opportunities of XAI in this field.