Implementing a k-Nearest Neighbors classifier
One simple way to classify an item is to look at only its neighboring data. The k-Nearest Neighbors algorithm looks at k items located closest to the item in question. The item is then classified as the most common classification of its k neighbors. This heuristic has been very promising for a wide variety of classification tasks.
In this recipe, we will implement the k-Nearest Neighbors algorithm using a k-d tree data structure, which is a binary tree with special properties that allow efficient representation of points in a k-dimensional space.
Imagine we have a web server for our hip new website. Every time someone requests a web page, our web server will fetch the file and present the page. However, bots can easily hammer a web server with thousands of requests, potentially causing a denial of service attack. In this recipe, we will classify whether a web request is being made by a human or a bot.
Getting ready
Install the KdTree
, CSV
, and...