Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
Extending Power BI with Python and R

You're reading from   Extending Power BI with Python and R Perform advanced analysis using the power of analytical languages

Arrow left icon
Product type Paperback
Published in Mar 2024
Publisher Packt
ISBN-13 9781837639533
Length 814 pages
Edition 2nd Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Luca Zavarella Luca Zavarella
Author Profile Icon Luca Zavarella
Luca Zavarella
Arrow right icon
View More author details
Toc

Table of Contents (27) Chapters Close

Preface 1. Where and How to Use R and Python Scripts in Power BI FREE CHAPTER 2. Configuring R with Power BI 3. Configuring Python with Power BI 4. Solving Common Issues When Using Python and R in Power BI 5. Importing Unhandled Data Objects 6. Using Regular Expressions in Power BI 7. Anonymizing and Pseudonymizing Your Data in Power BI 8. Logging Data from Power BI to External Sources 9. Loading Large Datasets Beyond the Available RAM in Power BI 10. Boosting Data Loading Speed in Power BI with Parquet Format 11. Calling External APIs to Enrich Your Data 12. Calculating Columns Using Complex Algorithms: Distances 13. Calculating Columns Using Complex Algorithms: Fuzzy Matching 14. Calculating Columns Using Complex Algorithms: Optimization Problems 15. Adding Statistical Insights: Associations 16. Adding Statistical Insights: Outliers and Missing Values 17. Using Machine Learning without Premium or Embedded Capacity 18. Using SQL Server External Languages for Advanced Analytics and ML Integration in Power BI 19. Exploratory Data Analysis 20. Using the Grammar of Graphics in Python with plotnine 21. Advanced Visualizations 22. Interactive R Custom Visuals 23. Other Books You May Enjoy
24. Index
Appendix 1: Answers
1. Appendix 2: Glossary

Importing large datasets with R

The same scalability limitations illustrated for Python packages used to manipulate data also exist for R packages in the Tidyverse ecosystem. Even in R, it is not possible to use a dataset larger than the available RAM on the machine. The first solution that is adopted in these cases is also to switch to Spark-based distributed systems that provide the SparkR language. It provides a distributed implementation of the DataFrame you are used to in R, supporting filtering, aggregation, and selection operations as you do with the dplyr package. For those of us who are fans of the Tidyverse world, RStudio is actively developing the sparklyr package, which allows you to use all the functionality of dplyr, even for distributed DataFrames. However, using Spark-based systems to process CSVs that together take up little more than the RAM you have available on your machine may be overkill due to the overhead of all the Java infrastructure needed to run them.In the...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image