Search icon CANCEL
Subscription
0
Cart icon
Close icon
You have no products in your basket yet
Save more on your purchases!
Savings automatically calculated. No voucher code required
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Clojure for Data Science

You're reading from  Clojure for Data Science

Product type Book
Published in Sep 2015
Publisher
ISBN-13 9781784397180
Pages 608 pages
Edition 1st Edition
Languages
Author (1):
Henry Garner Henry Garner
Profile icon Henry Garner

Table of Contents (18) Chapters

Clojure for Data Science
Credits
About the Author
Acknowledgments
About the Reviewer
www.PacktPub.com
Preface
1. Statistics 2. Inference 3. Correlation 4. Classification 5. Big Data 6. Clustering 7. Recommender Systems 8. Network Analysis 9. Time Series 10. Visualization Index

Chapter 1. Statistics

 

"The people who cast the votes decide nothing. The people who count the votes decide everything."

 
 --Joseph Stalin

Over the course of the following ten chapters of Clojure for Data Science, we'll attempt to discover a broadly linear path through the field of data science. In fact, we'll find as we go that the path is not quite so linear, and the attentive reader ought to notice many recurring themes along the way.

Descriptive statistics concern themselves with summarizing sequences of numbers and they'll appear, to some extent, in every chapter in this book. In this chapter, we'll build foundations for what's to come by implementing functions to calculate the mean, median, variance, and standard deviation of numerical sequences in Clojure. While doing so, we'll attempt to take the fear out of interpreting mathematical formulae.

As soon as we have more than one number to analyze it becomes meaningful to ask how those numbers are distributed. You've probably already heard expressions such as "long tail" and the "80/20 rule". They concern the spread of numbers throughout a range. We demonstrate the value of distributions in this chapter and introduce the most useful of them all: the normal distribution.

The study of distributions is aided immensely by visualization, and for this we'll use the Clojure library Incanter. We'll show how Incanter can be used to load, transform, and visualize real data. We'll compare the results of two national elections—the 2010 United Kingdom general election and the 2011 Russian presidential election—and see how even basic analysis can provide evidence of potentially fraudulent activity.

You have been reading a chapter from
Clojure for Data Science
Published in: Sep 2015 Publisher: ISBN-13: 9781784397180
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €14.99/month. Cancel anytime}