Julia for Data Science

You're reading from Julia for Data Science: high-performance computing simplified

Product type Paperback
Published in Sep 2016
Publisher Packt
ISBN-13 9781785289699
Length 346 pages
Edition 1st Edition
Author: Anshul Joshi
Table of Contents (12 chapters)

Preface
1. The Groundwork – Julia's Environment
2. Data Munging
3. Data Exploration
4. Deep Dive into Inferential Statistics
5. Making Sense of Data Using Visualization
6. Supervised Machine Learning
7. Unsupervised Machine Learning
8. Creating Ensemble Models
9. Time Series
10. Collaborative Filtering and Recommendation System
11. Introduction to Deep Learning

Sampling


In the previous example, we discussed calculating the mean height from 1,000 of the 10 million people living in New Delhi. Suppose the data for these 10 million people was gathered sequentially, for example by age group or by community. If we then take 1,000 people who are consecutive in the dataset, there is a high probability that they will resemble one another. A group that similar would not represent the dataset as a whole, so a small chunk of consecutive data points would not give us the insight we want. To overcome this, we use sampling.

Sampling is a technique for randomly selecting data points from a given dataset so that they are not related to each other. This lets us generalize the results computed on the selected data to the complete dataset. Sampling is done over a population.
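The difference between a consecutive chunk and a random sample can be sketched with a small simulation. This is a minimal illustration in Python using only the standard library, not code from the book. The population is synthetic: the group count, group sizes, and height values are made-up assumptions chosen so that neighbouring records resemble each other, and the function names are hypothetical.

```python
import random
import statistics


def build_population(n_groups=10, group_size=10_000, seed=0):
    """Return synthetic heights (cm) stored group by group.

    Assumption for illustration: each group (think age band or community)
    has its own mean height, so consecutive records are similar.
    """
    rng = random.Random(seed)
    population = []
    for g in range(n_groups):
        group_mean = 150 + 3 * g  # hypothetical per-group mean
        population.extend(rng.gauss(group_mean, 5) for _ in range(group_size))
    return population


def consecutive_chunk_mean(population, k):
    """Mean of the first k records, i.e. a chunk of consecutive data points."""
    return statistics.fmean(population[:k])


def random_sample_mean(population, k, seed=1):
    """Mean of k records drawn at random without replacement."""
    rng = random.Random(seed)
    return statistics.fmean(rng.sample(population, k))


population = build_population()
true_mean = statistics.fmean(population)
chunk_mean = consecutive_chunk_mean(population, 1000)
sample_mean = random_sample_mean(population, 1000)

print(f"population mean:        {true_mean:.2f}")
print(f"first 1,000 records:    {chunk_mean:.2f}")
print(f"random sample of 1,000: {sample_mean:.2f}")
```

In this setup the first 1,000 records all come from one group, so their mean reflects that group rather than the whole population, while the random sample draws from every group and lands much closer to the population mean.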

Population

A population in statistics refers to the set of all the...
