What this book covers
Chapter 1, New Missions – New Tools, addresses the tools that we're going to use. It's imperative that agents use the latest and most sophisticated tools. We'll guide field agents through the procedures required to get Python 3.4. We'll install the Beautiful Soup package, which helps you analyze and extract data from HTML pages. We'll install the Twitter API so that we can extract data from the social network. We'll add PDFMiner3K so that we can dig data out of PDF files. We'll also add the Arduino IDE so that we can create customized gadgets based on the Arduino processor.
Chapter 2, Tracks, Trails, and Logs, looks at the analysis of bulk data. We'll focus on the kinds of logs produced by web servers as they have an interesting level of complexity and contain valuable information on who's providing intelligence data and who's gathering this data. We'll leverage Python's regular expression module, re, to parse log data files. We'll also look at ways in which we can process compressed files using the gzip module.
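As a rough preview of that approach, here is a minimal sketch that combines the two modules. It assumes the log uses the widely seen Common Log Format; the pattern, the function names parse_line and read_log, and the field names are illustrative assumptions, not the chapter's actual code.

```python
import gzip
import re

# Assumed layout: Common Log Format, for example
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /x HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def read_log(path):
    """Yield parsed entries from a gzip-compressed log file."""
    # Mode "rt" decompresses and decodes to text as we iterate.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as source:
        for line in source:
            fields = parse_line(line)
            if fields is not None:
                yield fields
```

Because read_log is a generator, a large compressed log can be processed one line at a time without decompressing the whole file to disk first.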
Chapter 3, Following the Social Network, discusses one of the social networks. A field agent should know who's communicating and what they're communicating about. A network such as Twitter will reveal social connections based on who's following whom. We can also extract meaningful content from a Twitter stream, including text and images.
Chapter 4, Dredging Up History, provides you with essential pointers on extracting useful data from PDF files. Many agents find that a PDF file is a kind of dead end because the data is inaccessible. There are tools that allow us to extract useful data from PDF files. As PDF is focused on high-quality printing and display, it can be challenging to extract data suitable for analysis. We'll show some techniques with the PDFMiner package that can yield useful intelligence. Our goal is to transform a complex file into a simple CSV file, much like the logs that we analyzed in Chapter 2, Tracks, Trails, and Logs.
Chapter 5, Data Collection Gadgets, expands the field agent's scope of operations to the Internet of Things (IoT). We'll look at ways to create simple Arduino sketches in order to read a typical device; in this case, an infrared distance sensor. We'll look at how to gather and analyze raw data in order to calibrate the instrument.