Complete the following exercises to practice the concepts covered in this chapter:
- Run the simulation for December 2018 into new log files without making the user base again. Be sure to run python simulate.py -h to review the command-line arguments. Set the seed to 27. This data will be used for the remaining exercises.
- Find the number of unique usernames, attempts, successes, failures, and the success/failure rates per IP address, using the data simulated in the first exercise.
- Create two subplots with failures versus attempts on the left, and failure rate versus distinct usernames on the right. Draw decision boundaries for the resulting plots. Be sure to color each data point by whether or not it is a hacker IP address.
- Build a rule-based criterion using percentage difference from the median that flags an IP address if the failures and attempts are both five times their...
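
As a starting point for the per-IP aggregation exercise, the following is a minimal sketch on a tiny hand-made log rather than the simulated data. The column names (source_ip, username, success) and the helper summarize_by_ip are assumptions made for illustration; adjust them to match the columns in the log files your simulation actually produces.

```python
import pandas as pd

# Hypothetical toy log; the column names here are assumptions,
# not necessarily those written by simulate.py.
log = pd.DataFrame({
    'source_ip': ['1.1.1.1', '1.1.1.1', '1.1.1.1', '2.2.2.2', '2.2.2.2'],
    'username': ['alice', 'bob', 'carol', 'dave', 'dave'],
    'success': [False, False, True, True, True],
})


def summarize_by_ip(log):
    """Aggregate login attempts per IP address (illustrative helper)."""
    grouped = log.groupby('source_ip')
    summary = pd.DataFrame({
        # number of distinct usernames tried from each IP
        'unique_usernames': grouped['username'].nunique(),
        # total attempts is the number of rows per IP
        'attempts': grouped['success'].size(),
        # summing booleans counts the True values
        'successes': grouped['success'].sum(),
    })
    summary['failures'] = summary['attempts'] - summary['successes']
    summary['success_rate'] = summary['successes'] / summary['attempts']
    summary['failure_rate'] = summary['failures'] / summary['attempts']
    return summary


summary = summarize_by_ip(log)
print(summary)
```

The resulting frame is indexed by IP address, so its failures, attempts, failure_rate, and unique_usernames columns can be passed directly to the plotting and rule-building exercises.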