Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Extending Excel with Python and R

You're reading from   Extending Excel with Python and R Unlock the potential of analytics languages for advanced data manipulation and visualization

Arrow left icon
Product type Paperback
Published in Apr 2024
Publisher Packt
ISBN-13 9781804610695
Length 344 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Authors (2):
Arrow left icon
Steven Sanderson Steven Sanderson
Author Profile Icon Steven Sanderson
Steven Sanderson
David Kun David Kun
Author Profile Icon David Kun
David Kun
Arrow right icon
View More author details
Toc

Table of Contents (20) Chapters Close

Preface 1. Part 1:The Basics – Reading and Writing Excel Files from R and Python FREE CHAPTER
2. Chapter 1: Reading Excel Spreadsheets 3. Chapter 2: Writing Excel Spreadsheets 4. Chapter 3: Executing VBA Code from R and Python 5. Chapter 4: Automating Further – Task Scheduling and Email 6. Part 2: Making It Pretty – Formatting, Graphs, and More
7. Chapter 5: Formatting Your Excel Sheet 8. Chapter 6: Inserting ggplot2/matplotlib Graphs 9. Chapter 7: Pivot Tables and Summary Tables 10. Part 3: EDA, Statistical Analysis, and Time Series Analysis
11. Chapter 8: Exploratory Data Analysis with R and Python 12. Chapter 9: Statistical Analysis: Linear and Logistic Regression 13. Chapter 10: Time Series Analysis: Statistics, Plots, and Forecasting 14. Part 4: The Other Way Around – Calling R and Python from Excel
15. Chapter 11: Calling R/Python Locally from Excel Directly or via an API 16. Part 5: Data Analysis and Visualization with R and Python for Excel Data – A Case Study
17. Chapter 12: Data Analysis and Visualization with R and Python in Excel – A Case Study 18. Index 19. Other Books You May Enjoy

Performing a simple ML model with Python

In this section, we create a simple ML model in Python. Python has grown to be the primary go-to language for ML work (with R as the obvious alternative) and the number of packages implementing ML algorithms is difficult to overestimate. Having said that, sklearn remains the most widely used so we will also choose it for this section. Similarly to the R part of the chapter, we will use the xgboost model because it has a great balance between performance and explainability.

We will use the data loaded in the previous section.

Data preprocessing

The first thing to do for the modeling phase is to prepare the data. Fortunately, sklearn comes with a preprocessing functionality built-in!

Let’s review the steps involved in data preprocessing:

  • Handling missing values: Before training a model, it’s essential to address missing values in the dataset. sklearn provides methods for imputing missing values or removing rows...
lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime