Practical Data Analysis

Practical Data Analysis: For small businesses, analyzing the information contained in their data using open source technology could be game-changing. All you need is some basic programming and mathematical skills to do just that.

eBook
€22.99 €32.99
Paperback
€41.99
Subscription
Free Trial
Renews at €18.99p/m

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
  • Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
  • 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
  • Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
  • Thousands of reference materials covering every tech concept you need to stay up to date.


Chapter 1. Getting Started

Data analysis is the process in which raw data is ordered and organized so that it can be used in methods that help to explain the past and predict the future. Data analysis is not about the numbers; it is about asking questions, developing explanations, and testing hypotheses. Data analysis is a multidisciplinary field that combines computer science, artificial intelligence and machine learning, statistics and mathematics, and the knowledge domain, as shown in the following figure:

Computer science


Computer science creates the tools for data analysis. The vast amount of data generated has made computational analysis critical and has increased the demand for skills such as programming, database administration, network administration, and high-performance computing. Some programming experience in Python (or any high-level programming language) is needed to understand the chapters.

Artificial intelligence (AI)


According to Stuart Russell and Peter Norvig:

"[AI] has to do with smart programs, so let's get on and write some."

In other words, AI studies the algorithms that can simulate intelligent behavior. In data analysis, we use AI to perform activities that require intelligence, such as inference, similarity search, or unsupervised classification.
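
As a minimal sketch of one of these activities, similarity search can be reduced to ranking items by how close their feature vectors are to a query vector. The example below uses cosine similarity in plain Python. The function names (`cosine_similarity`, `most_similar`) and the three-item `catalog` are hypothetical, chosen only for illustration, and are not taken from the book.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as dissimilar to everything.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, items):
    """Return the (name, score) pair whose vector is closest to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical feature vectors, invented for this illustration.
catalog = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.0, 1.0, 0.0],
    "item_c": [0.7, 0.7, 0.0],
}

best_name, best_score = most_similar([0.9, 0.1, 0.0], catalog)
print(best_name, round(best_score, 3))
```

Cosine similarity ignores vector length and compares direction only, which is one common choice when features are counts or weights. Other distance measures would fit the same structure.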


Key benefits

  • Explore how to analyze your data in various innovative ways and turn it into insight
  • Learn to use the D3.js visualization tool for exploratory data analysis
  • Understand how to work with graphs and social data analysis
  • Discover how to perform advanced query techniques and run MapReduce on MongoDB

Description

Plenty of small businesses face big amounts of data but lack the internal skills to support quantitative analysis. Understanding how to harness the power of data analysis using the latest open source technology can help them provide better customer service, visualize customer needs, or even obtain fresh insights into the performance of previous products. Practical Data Analysis is a book ideal for home and small business users who want to slice and dice the data they have on hand with minimum hassle.

Practical Data Analysis is a hands-on guide to understanding the nature of your data and turning it into insight. It introduces machine learning techniques, social network analytics, and econometrics to help your clients get insights from the pool of data they have at hand. It also covers data preparation and processing for several kinds of data, such as text, images, graphs, documents, and time series.

Practical Data Analysis presents a detailed exploration of current work in data analysis through self-contained projects. First you will explore the basics of data preparation and transformation through OpenRefine. Then you will get started with exploratory data analysis using the D3.js visualization framework. You will also be introduced to machine learning techniques such as classification, regression, and clustering through practical projects such as spam classification, predicting gold prices, and finding clusters in your Facebook friends' network. You will learn how to solve problems in text classification, simulation, time series forecasting, social media, and MapReduce through detailed projects. Finally, you will work with large amounts of Twitter data using MapReduce to perform a sentiment analysis implemented in Python and MongoDB. Practical Data Analysis contains a combination of carefully selected algorithms and data scrubbing that enables you to turn your data into insight.

Who is this book for?

This book is for developers, small business users, and analysts who want to implement data analysis and visualization for their company in a practical way. You need no prior experience with data analysis or data processing; however, basic knowledge of programming, statistics, and linear algebra is assumed.

What you will learn

  • Work with data to get meaningful results from your data analysis projects
  • Visualize your data to find trends and correlations
  • Build your own image similarity search engine
  • Learn how to forecast numerical values from time series data
  • Create an interactive visualization for your social media graph
  • Explore the MapReduce framework in MongoDB
  • Create interactive simulations with D3.js

Product Details

Publication date : Oct 22, 2013
Length: 360 pages
Edition : 1st
Language : English
ISBN-13 : 9781783280995

Packt Subscriptions

See our plans and pricing

€18.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

€189.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

€264.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

Frequently bought together


  • Practical Data Analysis: €41.99
  • Machine Learning with R: €45.99
  • Building Machine Learning Systems with Python: €41.99
Total: €129.97

Table of Contents

14 Chapters
Getting Started
Working with Data
Data Visualization
Text Classification
Similarity-based Image Retrieval
Simulation of Stock Prices
Predicting Gold Prices
Working with Support Vector Machines
Modeling Infectious Disease with Cellular Automata
Working with Social Graphs
Sentiment Analysis of Twitter Data
Data Processing and Aggregation with MongoDB
Working with MapReduce
Online Data Analysis with IPython and Wakari

Customer reviews

Rating: 3.6 out of 5 (7 ratings)
5 star 28.6%
4 star 42.9%
3 star 0%
2 star 14.3%
1 star 14.3%

Pedram, Jun 23, 2014 (2 out of 5 stars)
This is not a finished book. It is not well-written, the examples are hard to follow, they seem to shift quickly from one subject to another without explaining any of the logic or theory behind it. The code on the author's website doesn't match the code in the book, and some of the code is simply missing. I was very interested in this topic, but found it really hard to get through past Chapter 3.

This book clearly needs more time and effort to make it something enjoyable and understandable for the reader.
Amazon verified review
Carlos Rodriguez Contreras, Feb 19, 2014 (5 out of 5 stars)
This is a very useful text for all people trying to get into Big Data Analysis. Concepts are clearly explained and readers do not need to be experts in any topic covered; this is why I chose Cuesta's book over a lot of books on Big Data that apparently try to show mainly the expertise of authors. If you, like me, are interested in Big Data, this is a must on your shelf.
Amazon verified review
R. Friesel Jr., Dec 09, 2013 (4 out of 5 stars)
I just finished up reading "Practical Data Analysis" by Hector Cuesta (Packt Publishing, 2013) and overall, it was a pretty good overview and recommends some good tools. I would say that the book is a good place for someone to get started if they have no real experience performing these kinds of analyses, and though Cuesta doesn't go deep into the math behind it all, he isn't afraid to use the technical names for different formulae, which should make it easy for you to do your own follow-up research.

Jeff Leek's Data Analysis on Coursera provides the lens through which I read this book. That being said, I found myself doing a lot of comparing and contrasting between the two. For example, they both use practical, reasonably small "real world" sample problems to highlight specific analytical techniques and/or features of their chosen toolkits. However, whereas Leek's course focused exclusively on using R, Cuesta assembles his own all-star team of tools using Python and D3.js. Perhaps it goes without saying, but there are pros and cons to each approach (e.g., Leek's "pure R" vs. Cuesta's "Python plus D3.js"), and I felt that it was best to consider them together.

Cuesta's approach with this book is to present a sample scenario in each chapter that introduces a class of problem, a solution to that problem, and his recommended toolkit. For example, chapter six creates a stock price simulation, introducing simple simulation problems (especially for apparently stochastic data), time series data and Monte Carlo methods, and then how to simulate the data using Python and visualizing it in D3.js. Although the book is not strictly a "cookbook", the chapters very much feel like macro-level "recipes". There's quite a bit of code and some decent discussion around the concepts that govern the analytical model, and (true to the "practical" in the title) the emphasis is on the "how" and not the "why".

While I did not read the entire book cover-to-cover, I would definitely recommend it to anyone that wants an introduction to some basic data analysis techniques and tools. You'll get more out of this book if you have some base to compare it to -- e.g., some experience in R (academic or otherwise); and you'll get the most out of this book if you also have a solid foundation in the mathematics and/or statistics that underlie these analytical approaches.

DISCLOSURE: I was given an electronic copy of this book from the publisher in exchange for writing a review.
Amazon verified review
José Carlos, Dec 07, 2013 (5 out of 5 stars)
This book is not about theories of data analysis; it is about how to move your hacking skills into the data analysis world.

If you are a programmer/hacker who wants to understand a problem from a data-oriented perspective, this book is for you.

This book is a fast introduction to data analysis methods, including some of the most used techniques for classification, regression and clustering. The book provides a wide range of tools like Python, mlpy, Pandas, D3js and MongoDB. The recipes are clear and easy to follow; you can get into data analysis in a fast way if you already have some programming skills.

I can highly recommend chapters 10 and 11, which focus on Social Networks Analytics and Social Networks Graph’s Visualization.
Amazon verified review
Joshua E. Simons, Dec 05, 2013 (1 out of 5 stars)
I am returning this book for two reasons. First, it is poorly-written and would have benefited greatly from editing to fix basic language issues. For me, this problem was bad enough that it prevented me from reading the book, despite my interest in the topic. The second problem is that this is much more of a cookbook than I was looking for, based on the content of the SVM chapter which was of most interest to me. There is only cursory coverage of the theory underlying the techniques mentioned, which for me is a problem because without that understanding one risks using poorly understood techniques in inappropriate ways to draw questionable conclusions.

For someone who already has a good grounding in data analysis, I imagine this book could be a great introduction to a variety of software tools that can be used to perform practical data analysis. But for someone like me, who wants to develop a solid understanding of the statistical principles underlying these techniques so I can apply them correctly and thoughtfully, this is not the right book.
Amazon verified review

FAQs

What is included in a Packt subscription?

A subscription provides you with full access to view all Packt and licensed content online; this includes exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription, go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. From there you will see the ‘cancel subscription’ button in the grey box that contains your subscription information.

What are credits?

Credits can be earned from reading 40 sections of any title within the payment cycle (a month starting from the day of subscription payment). You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. Your credits can be found on the subscription homepage, subscription.packtpub.com, by clicking on the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need a paid-for or active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult: new versions, new frameworks, new techniques. This feature gives you a head start on our content as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.

We created Early Access as a means of giving you the information you need as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.