Learning Social Media Analytics with R: Transform data from social media platforms into actionable business insights

Product type: Paperback
Published in: May 2017
Publisher: Packt
ISBN-13: 9781787127524
Length: 394 pages
Edition: 1st Edition
Authors (4): Raghav Bali, Dipanjan Sarkar, Karthik Ganapathy, Tushar Sharma
Table of Contents

  • Preface
  • 1. Getting Started with R and Social Media Analytics
  • 2. Twitter – What's Happening with 140 Characters
  • 3. Analyzing Social Networks and Brand Engagements with Facebook
  • 4. Foursquare – Are You Checked in Yet?
  • 5. Analyzing Software Collaboration Trends I – Social Coding with GitHub
  • 6. Analyzing Software Collaboration Trends II – Answering Your Questions with StackExchange
  • 7. Believe What You See – Flickr Data Analysis
  • 8. News – The Collective Social Media!
  • Index

Getting started with R

This section helps you set up your analysis and development environment and acquaints you with the syntax, data structures, constructs, and other important concepts of the R programming language. Feel free to skim it if you already consider yourself a master of R! We will focus mainly on the following topics:

  • Environment setup
  • Data types
  • Data structures
  • Functions
  • Controlling code flow
  • Advanced operations
  • Visualizing data
  • Next steps

We will explain each construct or concept with hands-on examples and code, so that it is easier to understand and you can learn by doing. Before we dive into the details, let us briefly get to know R. R is a scripting language that is used extensively for statistical modeling and analysis. Its roots lie in S, a statistical programming language developed at AT&T. R is a community-driven language and has grown by leaps and bounds over the years. It now has a vast arsenal of tools, frameworks, and packages for processing, analyzing, and visualizing almost any type of data. Because it is open source, the community constantly contributes improvements to the base language and publishes powerful packages capable of performing complex analyses and visualizations.

R and Python are perhaps the two most popular languages for statistical analysis, and R is often preferred by statisticians, mathematicians, and data scientists because of its extensive capabilities for statistical modeling, learning, and algorithms. R is distributed through the Comprehensive R Archive Network (CRAN), which hosts the latest and past versions, binaries, and source code for R and its packages for different operating systems. R can also be connected to other frameworks, including big data frameworks such as Hadoop and Spark, to computing platforms and languages such as Python, Matlab, and SPSS, and to data interfaces for a wide range of sources such as social media platforms, news portals, Internet of Things devices, web traffic data, and so on.

Environment setup

We will discuss the steps needed to set up a proper analysis environment: installing the necessary dependencies from the R ecosystem, along with the code snippets, functions, and modules that we will use across all the chapters. Every code snippet used in a chapter is available in the code files provided for that chapter along with this book. You can also access our GitHub repository at https://github.com/dipanjanS/learning-social-media-analytics-with-r for the code modules, snippets, and functions used in the book, and adapt them for your own analyses!

As mentioned earlier, R is free and open source, and it is available for all major operating systems. At the time of writing this book, the latest version of R is 3.3.1 (code named Bug in Your Hair), available for download at https://www.r-project.org/. That page includes detailed steps; the direct download page is at https://cloud.r-project.org/ if you are interested. Download the binary distribution for your operating system. On Windows, run the setup executable and follow its instructions. If you are using Unix or any *nix-like environment, you can also install R directly from the terminal.
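
One way to confirm which version of R ended up installed is to ask the interpreter itself. The following is a minimal sketch using the built-in R.version object; the variable names are illustrative, and the values printed depend on your installation, so they will differ from 3.3.1 on newer releases.

```r
# R.version is a built-in list describing the running R installation.
installed_version <- R.version.string   # full version string of this installation
code_name <- R.version$nickname         # release code name

print(installed_version)
print(code_name)
```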

Once R is installed, you can fire up the R interpreter directly. It comes with a graphical user interface (GUI) containing an editor where you can write your code and then execute it. We recommend using an Integrated Development Environment (IDE) instead, which eases development and helps you maintain code in a more structured way. An IDE also offers other capabilities, such as generating R Markdown documents, R notebooks, and Shiny web applications. We recommend RStudio, which provides a user-friendly interface for working with R. You can download and install it from https://www.rstudio.com/products/rstudio/download3/, which provides installers for various operating systems.

Once installed, you can start RStudio and use R directly from the IDE. It usually shows a code editor window at the top and the interactive R interpreter at the bottom. The interactive interpreter is often called a Read-Eval-Print Loop (REPL): it asks for input, evaluates it, and immediately prints the output, if any, in the interpreter window. The prompt shows the > symbol when waiting for input and the + symbol when you are entering code that spans multiple lines. Almost everything in R is a vector, and printed output is prefixed with a number in square brackets, such as [1], which gives the index of the first element shown on that line; even a single value is a vector of length one. Comments are used to describe functions or sections of code, and are written with the # symbol followed by text. A sample execution in the R interpreter is shown in the following code for convenience:

> 10 + 5
[1] 15
> c(1,2,3,4)
[1] 1 2 3 4
> 5 == 5
[1] TRUE
> fruit = 'apple'
> if (fruit == 'apple'){
+     print('Apple')
+ }else{
+     print('Orange')
+ }
[1] "Apple"

You can see various operations being performed in the R interpreter in the preceding code snippet, including some conditional evaluations and basic arithmetic. We will now delve deeper into the various constructs of R.
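
The comment syntax and the bracketed index prefix described earlier can be illustrated with a small sketch. The variable names below are made up for illustration, and where the printed vector wraps depends on the width of your console.

```r
# Lines starting with '#' are comments and are ignored by the interpreter.
x <- 101:130          # a vector of 30 integers
print(x)              # long output wraps; each printed line starts with the
                      # index of its first element, such as [1]

n_elements <- length(x)           # number of elements in x
single <- 15                      # a lone number is still a vector
single_length <- length(single)   # length of that one-element vector

print(n_elements)
print(single_length)
print(is.vector(single))
```

Printing a longer vector like this makes it easier to see that the bracketed number is a position within the vector rather than a count of its elements.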
