You're reading from Extending Power BI with Python and R: Ingest, transform, enrich, and visualize data using the power of analytical languages

Product type: Paperback

Published in: Nov 2021

Publisher: Packt

ISBN-13: 9781801078207

Length: 558 pages

Edition: 1st Edition

Languages: Python

Tools: Power BI

Concepts: Business Intelligence

Author (1): Luca Zavarella

Table of Contents (22 chapters)

Preface

1. Section 1: Best Practices for Using R and Python in Power BI

2. Chapter 1: Where and How to Use R and Python Scripts in Power BI

3. Chapter 2: Configuring R with Power BI

4. Chapter 3: Configuring Python with Power BI

5. Section 2: Data Ingestion and Transformation with R and Python in Power BI

6. Chapter 4: Importing Unhandled Data Objects

7. Chapter 5: Using Regular Expressions in Power BI

8. Chapter 6: Anonymizing and Pseudonymizing Your Data in Power BI

9. Chapter 7: Logging Data from Power BI to External Sources

10. Chapter 8: Loading Large Datasets beyond the Available RAM in Power BI

11. Section 3: Data Enrichment with R and Python in Power BI

12. Chapter 9: Calling External APIs to Enrich Your Data

13. Chapter 10: Calculating Columns Using Complex Algorithms

14. Chapter 11: Adding Statistics Insights: Associations

15. Chapter 12: Adding Statistics Insights: Outliers and Missing Values

16. Chapter 13: Using Machine Learning without Premium or Embedded Capacity

17. Section 3: Data Visualization with R in Power BI

18. Chapter 14: Exploratory Data Analysis

19. Chapter 15: Advanced Visualizations

20. Chapter 16: Interactive R Custom Visuals

21. Other Books You May Enjoy

Implementing missing value imputation algorithms

From here on, all missing value analysis will be done in R, because R offers statistically specialized, easy-to-use packages for this task that have no equivalent in the Python ecosystem.

Suppose we need to calculate the Pearson correlation coefficient between the two numerical variables, Age and Fare, of the Titanic disaster dataset. Let's first consider the case where missing values are eliminated.

Removing missing values

The impact of applying listwise and pairwise deletion techniques is evident in the calculation of Pearson's correlation between numerical variables in the Titanic dataset. Let's load the data and select only numeric features:

library(dplyr)

# Load the Titanic dataset from the URL
dataset_url <- 'http://bit.ly/titanic-data-csv'
tbl <- readr::read_csv(dataset_url)

# Keep only the numeric columns
tbl_num <- tbl %>%
  select(where(is.numeric))

If you now calculate the correlation matrix for the two techniques...
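As a minimal sketch of how the two deletion techniques can differ, the following uses base R's cor() with its use argument on a small made-up table. The toy data frame and its values are hypothetical (they are not taken from the Titanic dataset), and this is one possible way to compare the techniques, not necessarily the approach the chapter goes on to use.

```r
# Toy data with hypothetical values (NOT the Titanic dataset);
# NA marks a missing value.
toy <- data.frame(
  Age   = c(22, 38, NA, 35, 54, 27),
  Fare  = c(7.25, 71.28, 7.92, 53.10, NA, 8.46),
  SibSp = c(1, 1, 0, 0, 0, NA)
)

# Listwise deletion: drop every row that contains at least one NA,
# then correlate all columns on the same remaining rows.
cor_listwise <- cor(toy, use = "complete.obs", method = "pearson")

# Pairwise deletion: for each pair of columns, use every row in which
# both columns of that pair are observed.
cor_pairwise <- cor(toy, use = "pairwise.complete.obs", method = "pearson")

# Number of rows behind the estimates:
# a single count for listwise, one count per column pair for pairwise.
n_listwise <- sum(complete.cases(toy))
observed   <- !is.na(toy)
n_pairwise <- crossprod(observed * 1)

print(cor_listwise)
print(cor_pairwise)
print(n_listwise)
print(n_pairwise)
```

Printing n_pairwise next to cor_pairwise shows that each cell of the pairwise matrix may be estimated from a different subset of rows, whereas every cell of the listwise matrix rests on the same rows. Because of this, a pairwise correlation matrix is not guaranteed to be positive semidefinite.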
