Extending Power BI with Python and R: Perform advanced analysis using the power of analytical languages

Product type Paperback

Published in Mar 2024

Publisher Packt

ISBN-13 9781837639533

Length 814 pages

Edition 2nd Edition

Languages

Python

Tools

Power BI

Concepts

Business Intelligence

Author (1):

Luca Zavarella


Table of Contents (27) Chapters

Preface

1. Where and How to Use R and Python Scripts in Power BI

2. Configuring R with Power BI

3. Configuring Python with Power BI

4. Solving Common Issues When Using Python and R in Power BI

5. Importing Unhandled Data Objects

6. Using Regular Expressions in Power BI

7. Anonymizing and Pseudonymizing Your Data in Power BI

8. Logging Data from Power BI to External Sources

9. Loading Large Datasets Beyond the Available RAM in Power BI

10. Boosting Data Loading Speed in Power BI with Parquet Format

11. Calling External APIs to Enrich Your Data

12. Calculating Columns Using Complex Algorithms: Distances

13. Calculating Columns Using Complex Algorithms: Fuzzy Matching

14. Calculating Columns Using Complex Algorithms: Optimization Problems

15. Adding Statistical Insights: Associations

16. Adding Statistical Insights: Outliers and Missing Values

17. Using Machine Learning without Premium or Embedded Capacity

18. Using SQL Server External Languages for Advanced Analytics and ML Integration in Power BI

19. Exploratory Data Analysis

20. Using the Grammar of Graphics in Python with plotnine

21. Advanced Visualizations

22. Interactive R Custom Visuals

23. Other Books You May Enjoy

24. Index

Appendix 1: Answers

Appendix 2: Glossary

Importing large datasets with R

The scalability limitations described for the Python data manipulation packages also apply to the R packages in the Tidyverse ecosystem: in R, too, you cannot work with a dataset larger than the RAM available on the machine. As with Python, the first solution usually adopted is to switch to a Spark-based distributed system, which offers the SparkR package. SparkR provides a distributed implementation of the DataFrame you are used to in R and supports filtering, aggregation, and selection operations much as the dplyr package does. For fans of the Tidyverse, RStudio is actively developing the sparklyr package, which lets you use the functionality of dplyr on distributed DataFrames as well. However, using a Spark-based system to process CSV files that together take up only slightly more than the RAM available on your machine may be overkill, because of the overhead of all the Java infrastructure needed to run it. In the...

