Exercises
Complete the following exercises to practice the concepts covered in this chapter:
- Run the simulation for December 2018 into new log files without making the user base again. Be sure to run python3 simulate.py -h to review the command-line arguments. Set the seed to 27. This data will be used for the remaining exercises.
- Find the number of unique usernames, attempts, successes, and failures, as well as the success/failure rates per IP address, using the data simulated from exercise 1.
- Create two subplots with failures versus attempts on the left, and failure rate versus distinct usernames on the right. Draw decision boundaries for the resulting plots. Be sure to color each data point by whether or not it is a hacker IP address.
- Build rule-based criteria using the percentage difference from the median that flag an IP address if the failures and attempts are both five times their respective medians, or if the distinct usernames count is five times its...
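As a starting point for the per-IP aggregation exercise, the following is a minimal sketch on a toy log, not a solution against the simulated data. It assumes pandas is available and that the log has one row per login attempt; the column names source_ip, username, and success are hypothetical and may differ from what simulate.py actually writes.

```python
import pandas as pd

# Toy stand-in for the simulated log. The column names are assumptions,
# not the confirmed output format of simulate.py.
log = pd.DataFrame({
    'source_ip': ['1.1.1.1', '1.1.1.1', '1.1.1.1', '2.2.2.2', '2.2.2.2'],
    'username': ['alice', 'bob', 'alice', 'carol', 'carol'],
    'success': [False, False, True, True, True],
})

# One row per IP address: distinct usernames tried, total attempts,
# and the number of successful attempts.
per_ip = log.groupby('source_ip').agg(
    usernames=('username', 'nunique'),
    attempts=('username', 'size'),
    successes=('success', 'sum'),
)

# Failures and both rates follow from the counts above.
per_ip['failures'] = per_ip['attempts'] - per_ip['successes']
per_ip['success_rate'] = per_ip['successes'] / per_ip['attempts']
per_ip['failure_rate'] = per_ip['failures'] / per_ip['attempts']

print(per_ip)
```

With real data, the toy frame would be replaced by the log file read with pd.read_csv(), after checking the actual column names in the file.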