You're reading from Training Systems Using Python Statistical Modeling: Explore popular techniques for modeling your data in Python

Product type: Paperback

Published: May 2019

Publisher: Packt

ISBN-13: 9781838823733

Length: 290 pages

Edition: 1st Edition

Languages: Python

Tools: Pandas

Concepts: Machine Learning

Author: Curtis Miller

Table of Contents (9 chapters)

Preface

1. Classical Statistical Analysis

2. Introduction to Supervised Learning

3. Binary Prediction Models

4. Regression Analysis and How to Use It

5. Neural Networks

6. Clustering Techniques

7. Dimensionality Reduction

8. Other Books You May Enjoy

Hierarchical clustering

In this section, we will first look at similarity measures. Then, we will learn about hierarchical clustering.

We discussed different notions of distance earlier, in the Computing distances section. Now, I want to talk about the idea of similarity. A similarity score describes how similar two objects are. There is no universal definition of the properties a similarity score must have, but the convention is that similar objects receive a high score and dissimilar objects a low one. Dissimilarity is the opposite of similarity, and distance is one form of dissimilarity. Hierarchical clustering uses dissimilarity to form clusters. Since a similarity score can be turned into a dissimilarity, this means that if we can come up with similarity scores that make sense, we can cluster just about any type of data in a meaningful way.
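To make the link between similarity, dissimilarity, and hierarchical clustering concrete, here is a minimal sketch in plain Python. It is not the book's own code. It assumes Jaccard similarity on sets as the similarity score, takes one minus that score as the dissimilarity, and uses a naive average-linkage merge rule; the function names and the `baskets` data are hypothetical and exist only for illustration. In practice, a library routine such as `scipy.cluster.hierarchy.linkage` would usually be preferred over a hand-written loop.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union.

    Two empty sets are treated as identical (similarity 1.0); that
    convention is an assumption made for this sketch.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def jaccard_dissimilarity(a, b):
    """Turn the similarity score into a dissimilarity."""
    return 1.0 - jaccard_similarity(a, b)


def agglomerative_clusters(items, dissimilarity, n_clusters):
    """Naive agglomerative clustering with average linkage.

    Every item starts in its own cluster. The two clusters with the
    smallest average pairwise dissimilarity are merged repeatedly
    until n_clusters remain. Returns lists of item indices.
    """
    clusters = [[i] for i in range(len(items))]

    def average_linkage(c1, c2):
        total = sum(dissimilarity(items[i], items[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the pair of clusters that are closest to each other.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)

    return [sorted(c) for c in clusters]


# Hypothetical data: each "basket" is a set of purchased items.
baskets = [
    {"apple", "banana", "cherry"},
    {"apple", "banana"},
    {"banana", "cherry"},
    {"hammer", "nails", "saw"},
    {"hammer", "nails"},
]

result = sorted(agglomerative_clusters(baskets, jaccard_dissimilarity, 2))
print(result)
```

Nothing in `agglomerative_clusters` depends on the data being numeric: it only calls the dissimilarity function, so swapping in a different score changes what can be clustered without touching the merge loop.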

In this section, I will be focusing on Jaccard similarity, which is related...

Authors (1)

Curtis Miller

Curtis Miller is a doctoral candidate at the University of Utah studying mathematical statistics. He writes software for both research and personal interest, including the R package CPAT, available on the Comprehensive R Archive Network (CRAN). His publications include academic papers, as well as books and video courses published by Packt Publishing. His video courses include Unpacking NumPy and Pandas, Data Acquisition and Manipulation with Python, Training Your Systems with Python Statistical Modelling, and Applications of Statistical Learning with Python. His books include Hands-On Data Analysis with NumPy and Pandas.