In the previous sections, we saw how to build a machine learning-based botnet detector. In this new project, we are going to deal with a different problem instead of defending against botnet malware. We are going to detect Twitter bots because they are also dangerous and can perform malicious actions. For the model, we are going to use the NYU Tandon Spring 2017 Machine Learning Competition: Twitter Bot classification dataset. You can download it from this link: https://www.kaggle.com/c/twitter-bot-classification/data. Import the required Python packages:
>>> import pandas as pd
>>> import numpy as np
>>> import seaborn
Let's load the data using pandas and highlight the bot and non-bot data:
>>> data = pd.read_csv('training_data_2_csv_UTF.csv')
>>> Bots = data[data.bot==1]
>> NonBots...