You're reading from Data Science with .NET and Polyglot Notebooks Programmer's guide to data science using ML.NET, OpenAI, and Semantic Kernel

Product type Paperback

Published in Aug 2024

Publisher Packt

ISBN-13 9781835882962

Length 404 pages

Edition 1st Edition

Languages

Tools

Codespaces

Concepts

Artificial Intelligence

Author (1):

Matt Eland

Table of Contents (22) Chapters

Preface

1. Part 1: Data Analysis in Polyglot Notebooks

2. Chapter 1: Data Science, Notebooks, and Kernels

3. Chapter 2: Exploring Polyglot Notebooks

4. Chapter 3: Getting Data and Code into Your Notebooks

5. Chapter 4: Working with Tabular Data and DataFrames

6. Chapter 5: Visualizing Data

7. Chapter 6: Variable Correlations

8. Part 2: Machine Learning with Polyglot Notebooks and ML.NET

9. Chapter 7: Classification Experiments with ML.NET AutoML

10. Chapter 8: Regression Experiments with ML.NET AutoML

11. Chapter 9: Beyond AutoML: Pipelines, Trainers, and Transforms

12. Chapter 10: Deploying Machine Learning Models

13. Part 3: Exploring Generative AI with Polyglot Notebooks

14. Chapter 11: Generative AI in Polyglot Notebooks

15. Chapter 12: AI Orchestration with Semantic Kernel

16. Part 4: Polyglot Notebooks in the Enterprise

17. Chapter 13: Enriching Documentation with Mermaid Diagrams

18. Chapter 14: Extending Polyglot Notebooks

19. Chapter 15: Adopting and Deploying Polyglot Notebooks

20. Index

Why subscribe?

21. Other Books You May Enjoy

Controlling AutoML pipelines

In this section, we’ll explore a few ways to improve the Featurizer and Regression calls in our pipeline by telling AutoML more about our data and what we want it to try.

We’ll start with Featurizer.

Customizing the Featurizer

Right now, we’re getting the default behavior, where Featurizer uses all of our columns except for the name and label columns. In this setup, Featurizer has to guess what each column means.

However, Featurizer also accepts parameters that let us state explicitly which columns are numeric, text, or categorical. For most DataFrame objects, we can work these values out by looking at the column types.
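To make the idea concrete, here is a minimal sketch of sorting column names into numeric, text, and categorical groups based on their .NET types. Everything in it is an assumption for illustration: the `columns` array and its column names are hypothetical stand-ins for the name and type pairs a real DataFrame would report, and the `knownCategorical` set reflects a manual judgment that a type check alone cannot make, since free text and categories are both usually stored as strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical schema: (column name, .NET type) pairs standing in for
// what a real DataFrame would report about its columns.
var columns = new (string Name, Type DataType)[]
{
    ("Position", typeof(string)),
    ("Biography", typeof(string)),
    ("Age", typeof(int)),
    ("Salary", typeof(float)),
};

// Assumption: we know from the data itself which string columns
// hold a small set of repeated values rather than free text.
var knownCategorical = new HashSet<string> { "Position" };

// The .NET types this sketch treats as numeric.
var numericTypes = new HashSet<Type>
{
    typeof(int), typeof(long), typeof(float), typeof(double),
};

// Numeric columns are identified purely by their type.
string[] numeric = columns
    .Where(c => numericTypes.Contains(c.DataType))
    .Select(c => c.Name)
    .ToArray();

// String columns we flagged by hand are treated as categorical.
string[] categorical = columns
    .Where(c => c.DataType == typeof(string) && knownCategorical.Contains(c.Name))
    .Select(c => c.Name)
    .ToArray();

// Any remaining string columns are treated as free text.
string[] text = columns
    .Where(c => c.DataType == typeof(string) && !knownCategorical.Contains(c.Name))
    .Select(c => c.Name)
    .ToArray();

Console.WriteLine($"Numeric: {string.Join(", ", numeric)}");
Console.WriteLine($"Categorical: {string.Join(", ", categorical)}");
Console.WriteLine($"Text: {string.Join(", ", text)}");
```

Arrays like these could then be handed to Featurizer in place of letting it guess. The manual categorical list is a deliberate design choice in this sketch: it keeps the type check simple and leaves the text-versus-category decision to someone who knows the data.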

Let’s start by declaring a small ColumnData class to contain our column names:

public class ColumnData {
    public List<string> Text {get; set;} = new();
    public List...
