Data Wrangling with R: Load, explore, transform and visualize data for modeling with tidyverse libraries

Product type Paperback

Published in Feb 2023

Publisher Packt

ISBN-13 9781803235400

Length 384 pages

Edition 1st Edition

Languages

Tools

Power BI

Concepts

Data Mining

Author (1):

Gustavo Santos

Table of Contents (21) Chapters

Preface

1. Part 1: Load and Explore Data

2. Chapter 1: Fundamentals of Data Wrangling

3. Chapter 2: Loading and Exploring Datasets

4. Chapter 3: Basic Data Visualization

5. Part 2: Data Wrangling

6. Chapter 4: Working with Strings

7. Chapter 5: Working with Numbers

8. Chapter 6: Working with Date and Time Objects

9. Chapter 7: Transformations with Base R

10. Chapter 8: Transformations with Tidyverse Libraries

11. Chapter 9: Exploratory Data Analysis

12. Part 3: Data Visualization

13. Chapter 10: Introduction to ggplot2

14. Chapter 11: Enhanced Visualizations with ggplot2

15. Chapter 12: Other Data Visualization Options

16. Part 4: Modeling

17. Chapter 13: Building a Model with R

18. Chapter 14: Build an Application with Shiny in R

19. Conclusion

20. Other Books You May Enjoy

What this book covers

Chapter 1, Fundamentals of Data Wrangling, introduces this book’s main theme, explaining what data wrangling is and why and when to use it. It also shows the main steps of a data science project and covers three well-known frameworks for data science projects.

Chapter 2, Loading and Exploring Datasets, provides different ways to load datasets into RStudio. Every project begins with data, so it is important to know how to load it into your session. The chapter also begins exploring that data to familiarize you with exploratory data analysis.

Chapter 3, Basic Data Visualization, is the first touch point with data visualization, which is an important component of any data science project. In this chapter, we will learn the first steps to creating compelling and meaningful graphics using only R's built-in plotting functions.

Chapter 4, Working with Strings, starts our journey of learning about the wrangling functions for each major variable type. In this chapter, we study many possible transformations with text, from detecting words in a phrase or in a dataset to some highly customized functions that involve regular expressions and text mining concepts.

Chapter 5, Working with Numbers, covers the transformations for numerical variables. The chapter covers operations with vectors, matrices, and data frames, as well as the apply functions and how to interpret the descriptive statistics of a dataset.

Chapter 6, Working with Date and Time Objects, is where we will learn more about this fascinating object type, date and time. It introduces concepts from the basics of creating a date and time object to a practical project that shows how it can be used in an analysis.

Chapter 7, Transformations with Base R, is the core of the book, exploring the most important transformations to be performed on a dataset. This chapter covers tasks such as slicing, grouping, replacing, arranging, binding data, and more. The most commonly used transformations are covered here, mostly with built-in functions and without the need to load extra libraries.

Chapter 8, Transformations with Tidyverse Libraries, follows the same idea as Chapter 7, but this time, the transformations are performed with tidyverse, which is a widely used collection of R packages for data science.

Chapter 9, Exploratory Data Analysis, is all about practice. After going over many transformation functions for different types of variables, it’s time to put the acquired knowledge into practice and work on a complete exploratory data analysis project.

Chapter 10, Introduction to ggplot2, introduces the visualization library ggplot2, which is the most used library for data visualization in the R language, given its flexibility and robustness. In this chapter, we will learn more about the grammar of graphics and how ggplot2 is built on this concept. We will also cover many kinds of plots and how to create them.

Chapter 11, Enhanced Visualizations with ggplot2, covers more advanced types of graphics that can be created with ggplot2, such as facet grids, maps, and 3D graphics.

Chapter 12, Other Data Visualization Options, is where we will see yet more options to visualize data, such as creating a basic plot in Microsoft Power BI but using the R language. We will also cover how to create word clouds and when that kind of visualization can be useful.

Chapter 13, Building a Model with R, is all about an end-to-end data science project. We will get a dataset and start exploring it, then clean the data and create some visualizations that help us explain the steps taken and that lead us to the best model to create.

Chapter 14, Build an Application with Shiny in R, is the final chapter, where we will take the model created in Chapter 13 and put it into production using a web application created with Shiny for R.