As we saw in the previous chapters, several AI solutions are available for achieving particular cybersecurity goals, so it is important to learn how to evaluate the effectiveness of alternative solutions using appropriate analysis metrics. At the same time, it is important to prevent phenomena such as overfitting, which can compromise the reliability of predictions when moving from training data to test data.
In this chapter, we will learn about the following topics:
- Feature engineering best practices for dealing with raw data
- How to evaluate a detector's performance using the ROC curve
- How to appropriately split sample data into training and test sets
- How to manage an algorithm's overfitting and the bias–variance trade-off with cross-validation (see the sketch after this list)
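As a brief preview of the last three topics, the following minimal sketch shows a train/test split, a ROC evaluation of a detector's scores, and a cross-validation estimate. It assumes scikit-learn and a synthetic dataset; the library, classifier, and parameter choices here are illustrative, not prescribed by the chapter:

```python
# A minimal sketch (assumed libraries: scikit-learn, numpy via sklearn)
# illustrating train/test splitting, ROC evaluation, and cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

# Synthetic binary-classification data standing in for detector samples
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Split the sample data into training and test sets (an 80/20 split here)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ROC curve: rank test samples by the predicted probability of the
# positive class, then sweep the decision threshold
scores = clf.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, scores)
print("Test AUC:", roc_auc_score(y_test, scores))

# Cross-validation: average performance over 5 folds of the training set,
# a guard against overfitting to a single train/test split
cv_scores = cross_val_score(clf, X_train, y_train, cv=5, scoring="roc_auc")
print("CV AUC (mean):", cv_scores.mean())
```

Note that cross-validation is run only on the training portion, which keeps the test set untouched for a final, unbiased evaluation.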
Now, let's begin our discussion of why we need feature engineering by examining the very...