Mastering Social Media Mining with R

Extract valuable data from your social media sites and make better business decisions using R

Product type Paperback
Published in Sep 2015
ISBN-13 9781784396312
Length 248 pages
Edition 1st Edition

Table of Contents

Preface
1. Fundamentals of Mining
2. Mining Opinions, Exploring Trends, and More with Twitter
3. Find Friends on Facebook
4. Finding Popular Photos on Instagram
5. Let's Build Software with GitHub
6. More Social Media Websites
Index

Exploratory data analysis


Exploratory data analysis (EDA) techniques are used to discover patterns in the data, to summarize it, and to visualize it. EDA is an essential step in the data analysis process because it helps to formulate hypotheses about the data.

EDA techniques can be broadly classified into three types: univariate, bivariate, and multivariate analysis. Let's implement a few of the EDA techniques on our dataset.
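To make the three categories concrete, here is a minimal sketch on a small made-up data frame. The data frame `toy` and its columns (`followers`, `posts`, `likes`) are hypothetical and are not taken from the book's dataset; the choice of the median and of correlation as the illustrative statistics is also an assumption, since many other techniques fall under each category.

```r
# Hypothetical toy data; these columns are not from the book's dataset.
toy <- data.frame(
  followers = c(10, 250, 40, 1200, 85),
  posts     = c(3, 40, 12, 150, 20),
  likes     = c(25, 900, 110, 5000, 300)
)

# Univariate: describe one variable at a time.
followers_median <- median(toy$followers)

# Bivariate: examine the relationship between two variables.
followers_posts_cor <- cor(toy$followers, toy$posts)

# Multivariate: examine three or more variables together,
# here through the full pairwise correlation matrix.
cor_matrix <- cor(toy)

print(followers_median)
print(followers_posts_cor)
print(cor_matrix)
```

Plots follow the same split: a histogram is univariate, a scatter plot is bivariate, and a scatter plot matrix such as `pairs(toy)` is multivariate.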

First, let's see what kind of data we are analyzing. Using the sapply function to apply class to every column, we can list the columns present in the dataset along with the data type of each:

sapply(ausersubset, class)

We get the following output:

Note

Note that the preceding screenshot is just a part of the output.
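Since the screenshot is not reproduced here, the following sketch shows the kind of result the `sapply(..., class)` call produces. The data frame `users` and its columns are hypothetical stand-ins, not the actual contents of `ausersubset`.

```r
# Hypothetical stand-in for the book's dataset; column names are assumptions.
users <- data.frame(
  username  = c("alice", "bob", "carol"),
  followers = c(120L, 45L, 980L),
  avg_likes = c(10.5, 3.2, 87.0),
  verified  = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Apply class() to each column; the result is a named character vector
# whose names are the column names and whose values are the column classes.
col_classes <- sapply(users, class)
print(col_classes)
```

When every column has a single class, as here, `sapply` simplifies the result to a named character vector. If some column carries more than one class (a date-time column, for example), the result may come back as a list instead.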

To get a basic understanding of the whole dataset, such as the distribution of the values in each column, we can use the summary function to get the highlights of the dataset. For example, it reports the minimum, mean, median, maximum, and quartile values for each column...
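As a rough illustration of what `summary` reports, the sketch below runs it on a small made-up data frame. The data frame `users` and its columns are hypothetical and stand in for the book's dataset, which is not available here.

```r
# Hypothetical data; not the book's dataset.
users <- data.frame(
  username  = c("alice", "bob", "carol", "dave", "erin"),
  followers = c(10, 40, 85, 250, 1200),
  stringsAsFactors = FALSE
)

# For a numeric column, summary() returns the minimum, first quartile,
# median, mean, third quartile, and maximum.
followers_summary <- summary(users$followers)
print(followers_summary)

# Called on the whole data frame, summary() produces one block per column.
print(summary(users))
```

For non-numeric columns the output differs: a character column is described by its length, class, and mode instead of quartiles.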
