Big Data Analytics with R

You're reading from Big Data Analytics with R: Leverage R Programming to uncover hidden patterns in your Big Data

Product type: Paperback
Published in: Jul 2016
Publisher: Packt
ISBN-13: 9781786466457
Length: 506 pages
Edition: 1st Edition
Author (1):
Simon Walkowiak

What this book covers

Chapter 1, The Era of "Big Data", gently introduces the concept of Big Data, the growing landscape of large-scale analytics tools, and the origins of the R programming language and statistical environment.

Chapter 2, Introduction to R Programming Language and Statistical Environment, explains the most essential data management and processing functions available to R users. This chapter also guides you through various methods of Exploratory Data Analysis and hypothesis testing in R, including correlations, tests of differences, ANOVAs, and Generalized Linear Models.

Chapter 3, Unleashing the Power of R From Within, explores the possibilities of using the R language for large-scale analytics and out-of-memory data on a single machine. It presents a number of third-party packages and core R methods that address the traditional limitations of Big Data processing in R.

Chapter 4, Hadoop and MapReduce Framework for R, explains how to create a cloud-hosted virtual machine with Hadoop and how to integrate its HDFS and MapReduce frameworks with the R programming language. In the second part of the chapter, you will carry out a large-scale analysis of electricity meter data on a multinode Hadoop cluster directly from the R console.

Chapter 5, R with Relational Database Management Systems (RDBMSs), guides you through the process of setting up and deploying traditional SQL databases, for example, SQLite, PostgreSQL, and MariaDB/MySQL, which can be easily integrated with your current R-based data analytics workflows. The chapter also provides detailed information on how to build and benefit from a highly scalable Amazon Relational Database Service instance and query its records directly from R.

Chapter 6, R with Non-Relational (NoSQL) Databases, builds on the skills acquired in the previous chapters and allows you to connect R with two popular non-relational databases: (a) MongoDB, a fast and user-friendly database installed on a Linux-run virtual machine, and (b) HBase, a database operated on a Hadoop cluster run as part of the Azure HDInsight service.

Chapter 7, Faster than Hadoop: Spark with R, presents a practical example and a detailed explanation of R integration with the Apache Spark framework for faster Big Data manipulation and analysis. Additionally, the chapter shows how to use a Hive database as a data source for Spark on a multinode cluster with Hadoop and Spark installed.

Chapter 8, Machine Learning Methods for Big Data in R, takes you on a journey through the most cutting-edge predictive analytics available in R. First, you will fit fast and highly optimized Generalized Linear Models using the Spark MLlib library on a multinode Spark HDInsight cluster. In the second part of the chapter, you will implement Naïve Bayes and multilayered Neural Network algorithms using R’s connectivity with H2O, an award-winning, open source, distributed machine learning platform for big data.

Chapter 9, The Future of R: Big, Fast and Smart Data, wraps up the contents of the earlier chapters by discussing potential areas of development for the R language and its opportunities in the landscape of emerging Big Data tools.

Online Chapter, Pushing R Further, available at https://www.packtpub.com/sites/default/files/downloads/5396_6457OS_PushingRFurther.pdf, enables you to configure and deploy your own scaled-up, cloud-based virtual machine with a fully operational R and RStudio Server installed and ready to use.
