As mentioned earlier, the application we will be creating is a web browser classification application. Using the knowledge garnered in the logistic classification chapter, we will be using the SdcaLogisticRegression algorithm to take the text content of a web page, featurize the text, and provide a confidence level of maliciousness. In addition, we will be integrating this technique into a Windows 10 UWP application that mimics a web browser—effectively on navigation to a page—running the model, and making a determination as to whether the page was malicious. If found to be malicious, we redirect to a warning page. While in a real-world scenario this might prove too slow to run on every page, the benefits of a highly secured web browser, depending on the environment requirements might far outweigh the slight overhead running our model incurs.
As with previous chapters, the completed project code, sample dataset, and project...