You're reading from Hands-On Machine Learning for Cybersecurity

Product type Book

Published in Dec 2018

Publisher Packt

ISBN-13 9781788992282

Pages 318 pages

Edition 1st Edition

Languages

Python

Concepts

Machine Learning

Authors (2):

Soma Halder

Sinan Ozdemir

Revisiting malicious URL detection with decision trees

We now revisit the problem of detecting malicious URLs, this time solving it with decision trees. We start by loading the data:

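 # In Python 3, urlparse lives in urllib.parse (the urlparse module is Python 2 only)
 from urllib.parse import urlparse
 import pandas as pd
 # Load the URL records and print the (rows, columns) shape
 urls = pd.read_json("../data/urls.json")
 print(urls.shape)
 # Prepend a scheme to every URL string
 urls['string'] = "http://" + urls['string']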

(5000, 3)
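
The excerpt does not say why `"http://"` is prepended to each string. One plausible reason, offered here as an assumption, is that `urlparse` only fills in the network location when the string carries a scheme. A minimal sketch using a hostname from the sample output:

```python
from urllib.parse import urlparse

# Without a scheme, urlparse treats the whole string as a path.
bare = urlparse("startbuyingstocks.com/")

# With the "http://" prefix, the hostname is recognised as the netloc.
prefixed = urlparse("http://" + "startbuyingstocks.com/")

print(repr(bare.netloc), repr(bare.path))
print(repr(prefixed.netloc), repr(prefixed.path))
```

With the prefix in place, the hostname and path can be read from separate fields of the parse result.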

Next, we print the first 10 rows of urls:

urls.head(10)

The output looks as follows:

   pred          string                                            truth
0  1.574204e-05  http://startbuyingstocks.com/                     0
1  1.840909e-05  http://qqcvk.com/                                 0
2  1.842080e-05  http://432parkavenue.com/                         0
3  7.954729e-07  http://gamefoliant.ru/                            0
4  3.239338e-06  http://orka.cn/                                   0
5  3.043137e-04  http://media2.mercola.com/                        0
6  4.107331e-37  http://ping.chartbeat.net/ping?h=sltrib.com&p...
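
A decision tree needs numeric or categorical inputs rather than raw URL strings. The excerpt stops before showing which features the authors use, so the following is only a sketch of one possible approach: the helper `lexical_features` and every feature name in it are hypothetical, chosen to illustrate how `urlparse` could turn a URL into tree-friendly values.

```python
from urllib.parse import urlparse

def lexical_features(url):
    """Hypothetical helper: derive simple lexical features from one URL."""
    parts = urlparse(url)
    host = parts.hostname or ""
    return {
        "host_length": len(host),                        # characters in the hostname
        "num_dots": host.count("."),                     # rough proxy for subdomain depth
        "num_digits": sum(ch.isdigit() for ch in host),  # digits in the hostname
        "tld": host.rsplit(".", 1)[-1],                  # text after the last dot
        "has_query": bool(parts.query),                  # True if a query string is present
    }

# URLs taken from the sample output above
for url in ("http://432parkavenue.com/", "http://media2.mercola.com/"):
    print(url, lexical_features(url))
```

Applied to the `string` column, a helper like this would give one feature row per URL, with the `truth` column as the label a classifier would learn to predict.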

Soma Halder

Soma Halder is the data science lead of the big data analytics group at Reliance Jio Infocomm Ltd, one of India's largest telecom companies. She specializes in analytics, big data, cybersecurity, and machine learning, and has approximately 10 years of machine learning experience, especially in cybersecurity. She earned her master's at the University of Alabama, Birmingham, with an emphasis on knowledge discovery, data mining, and computer forensics. She has worked for Visa, Salesforce, and AT&T, as well as for start-ups in both India and the US (E8 Security, Headway ai, and Norah ai). She has several conference publications to her name in the fields of cybersecurity, machine learning, and deep learning.

Sinan Ozdemir

Sinan is an active lecturer focusing on large language models and a former lecturer of data science at Johns Hopkins University. He is the author of multiple textbooks on data science and machine learning, including "Quick Start Guide to LLMs". Sinan is currently the founder of LoopGenius, which uses AI to help people and businesses boost their sales, and was previously the founder of Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities that has since been acquired. He holds a master's degree in pure mathematics from Johns Hopkins University and is based in San Francisco.
