Serverless Machine Learning with Amazon Redshift ML: Create, train, and deploy machine learning models using familiar SQL commands

Product type Paperback

Published in Aug 2023

Publisher Packt

ISBN-13 9781804619285

Length 290 pages

Edition 1st Edition

Languages

Python

Tools

Amazon Redshift

Concepts

Machine Learning

Authors (4):

Phil Bates

Sumeet Joshi

Debu Panda

Bhanu Pittampally


Table of Contents (19) Chapters

Preface

1. Part 1: Redshift Overview: Getting Started with Redshift Serverless and an Introduction to Machine Learning

2. Chapter 1: Introduction to Amazon Redshift Serverless

3. Chapter 2: Data Loading and Analytics on Redshift Serverless

4. Chapter 3: Applying Machine Learning in Your Data Warehouse

5. Part 2: Getting Started with Redshift ML

6. Chapter 4: Leveraging Amazon Redshift ML

7. Chapter 5: Building Your First Machine Learning Model

8. Chapter 6: Building Classification Models

9. Chapter 7: Building Regression Models

10. Chapter 8: Building Unsupervised Models with K-Means Clustering

11. Part 3: Deploying Models with Redshift ML

12. Chapter 9: Deep Learning with Redshift ML

13. Chapter 10: Creating a Custom ML Model with XGBoost

14. Chapter 11: Bringing Your Own Models for Database Inference

15. Chapter 12: Time-Series Forecasting in Your Data Warehouse

16. Chapter 13: Operationalizing and Optimizing Amazon Redshift ML Models

17. Index

Why subscribe?

18. Other Books You May Enjoy

Determining the optimal number of clusters

A popular way to choose K is the Elbow method. The idea is to run the K-means algorithm with different values of K (for example, from 1 cluster up to 10) and, for each value of K, calculate the sum of squared deviations (SSD): the sum of the squared distances between each data point and the centroid of its assigned cluster, which measures the variance within clusters. Then, plot the SSD values against K as a line chart. If the line looks like an arm, the elbow of the arm marks the best value of K among those tried. The reasoning behind this approach is that SSD tends to decrease as K increases, and a lower SSD or mean squared deviation (MSD) is the goal of the evaluation. The elbow is the point beyond which further increases in K yield diminishing reductions in SSD.
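To make the Elbow method concrete, here is a minimal sketch in plain Python, run outside Redshift and not taken from the book. It rests on several assumptions of ours: the data is synthetic (three well-separated 2-D blobs), the helper names (`kmeans_ssd`, `farthest_point_init`, `make_blobs`) are hypothetical, the K-means loop starts from a deterministic farthest-point initialization instead of the usual random starts so that results are reproducible, and the elbow is picked with a simple "largest second difference" heuristic, which is only one of several ways to locate the bend.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic start (an assumption of this sketch): begin with the
    first point, then repeatedly add the point farthest from all chosen
    centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_ssd(points, k, iterations=100):
    """Run a basic K-means loop and return the sum of squared deviations."""
    centroids = farthest_point_init(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    # SSD: squared distance from each point to its nearest centroid, summed.
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def make_blobs(centers, per_cluster=30, spread=0.5, seed=1):
    """Synthetic test data: Gaussian blobs around the given centers."""
    rng = random.Random(seed)
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centers
        for _ in range(per_cluster)
    ]


points = make_blobs([(0, 0), (10, 0), (5, 9)])

# Run K-means for K = 1..6 and record the SSD for each K.
ssd_by_k = {k: kmeans_ssd(points, k) for k in range(1, 7)}

# Heuristic elbow: the K where the SSD curve bends most sharply, measured
# by the discrete second difference SSD(K-1) - 2*SSD(K) + SSD(K+1).
elbow = max(
    range(2, 6),
    key=lambda k: ssd_by_k[k - 1] - 2 * ssd_by_k[k] + ssd_by_k[k + 1],
)

for k, ssd in ssd_by_k.items():
    print(f"K={k}  SSD={ssd:10.2f}  MSD={ssd / len(points):8.3f}")
print("Elbow at K =", elbow)
```

On this synthetic data the SSD falls steeply up to K=3 and then flattens, so the heuristic reports 3, matching the number of blobs that were generated. Real data rarely has such a clean bend, so the plotted curve is usually worth inspecting by eye as well.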

In the following chart, you can see that the MSD value, when charted over different K...
