Building a static malware detector
In this section, we will see how to combine the recipes from prior sections to build a malware detector. The detector will take as input both features extracted from the PE header and features derived from N-grams.
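To make the idea of combining the two feature sets concrete, the following is a minimal sketch rather than the recipe's actual implementation. It assumes byte-level N-grams counted with the standard library, and the helper names (byte_ngram_counts, combined_features), the sample bytes, and the header values are all hypothetical placeholders chosen for illustration.

```python
from collections import Counter


def byte_ngram_counts(data: bytes, n: int = 2) -> Counter:
    """Count every run of n consecutive bytes in data."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def combined_features(data, header_features, selected_ngrams, n=2):
    """Append the counts of the selected N-grams to the PE header features."""
    counts = byte_ngram_counts(data, n)
    # A Counter returns 0 for N-grams that never occur in the file.
    ngram_features = [counts[gram] for gram in selected_ngrams]
    return list(header_features) + ngram_features


# Hypothetical inputs: a few bytes standing in for a file's contents, and
# two made-up numeric header features (for example, section and import counts).
sample = b"MZ\x90\x00MZ\x90\x00"
header = [3, 5]
vector = combined_features(sample, header, [b"MZ", b"\x90\x00", b"\xff\xff"])
print(vector)
```

Each file ends up as a single flat numeric vector, which is the shape a scikit-learn classifier expects for one row of its input.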
Getting ready
Preparation for this recipe consists of installing the scikit-learn, nltk, and pefile packages with pip. The command is as follows:
pip install scikit-learn nltk pefile
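As an optional sanity check after installation, a short script along the following lines can confirm that the three packages are importable. This is a sketch, not part of the recipe: the check_packages helper is a hypothetical name, and the only fact it relies on is that scikit-learn is imported under the module name sklearn.

```python
import importlib.util

# pip name -> import name (scikit-learn is imported as sklearn).
REQUIRED = {"scikit-learn": "sklearn", "nltk": "nltk", "pefile": "pefile"}


def check_packages(required=REQUIRED):
    """Map each pip package name to True if its module can be located."""
    return {
        pip_name: importlib.util.find_spec(module) is not None
        for pip_name, module in required.items()
    }


status = check_packages()
for pip_name, ok in status.items():
    print(f"{pip_name}: {'installed' if ok else 'missing'}")
```

Any package reported as missing can be installed with pip before continuing.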
In addition, benign and malicious files have been provided for you in the "PE Samples Dataset" folder in the root of the repository. Extract all archives named "Benign PE Samples*.7z" to a folder named "Benign PE Samples". Extract...