You're reading from Machine Learning with Apache Spark Quick Start Guide: Uncover patterns, derive actionable insights, and learn from big data using MLlib

Product type Paperback

Published in Dec 2018

Publisher Packt

ISBN-13 9781789346565

Length 240 pages

Edition 1st Edition

Languages

Java

Tools

Apache Spark

Concepts

Big Data

Author (1):

Jillur Quddus

Table of Contents (10) Chapters

Preface

1. The Big Data Ecosystem

2. Setting Up a Local Development Environment

3. Artificial Intelligence and Machine Learning

4. Supervised Learning Using Apache Spark

5. Unsupervised Learning Using Apache Spark

6. Natural Language Processing Using Apache Spark

7. Deep Learning Using Apache Spark

8. Real-Time Machine Learning Using Apache Spark

9. Other Books You May Enjoy

Clustering

As described in Chapter 3, Artificial Intelligence and Machine Learning, the goal of unsupervised learning is to uncover hidden relationships, trends, and patterns given only the input data, x_i, with no output, y_i. In other words, the input dataset contains only the observations x_i, with no corresponding labels y_i.
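
As a sketch of that form, assuming N observations (the symbols N and \mathcal{D} are notation introduced here for illustration), an unlabeled dataset can be written as:

$$\mathcal{D} = \{x_1, x_2, \ldots, x_N\}$$

By contrast, a supervised dataset would pair each input with an output: $\{(x_i, y_i)\}_{i=1}^{N}$.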

Clustering is a well-known example of unsupervised learning, where the goal is to segment data points into groups such that the data points within a group share similar features or attributes. By the nature of clustering, however, it is recommended that clustering models be trained on large datasets to avoid overfitting. The two most commonly used clustering algorithms are hierarchical clustering and k-means clustering, which differ in how they construct clusters. We shall study both of these algorithms...
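
To make the idea of segmenting points into groups concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration only, not the MLlib API or the book's own code. The function name `kmeans`, the toy data, and the farthest-first initialization (chosen here so the run is deterministic) are all assumptions made for this example.

```python
import math


def kmeans(points, k, max_iters=100):
    """Group points (tuples of floats) into k clusters using Lloyd's algorithm."""
    # Assumed initialization: start from the first point, then repeatedly
    # add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    assignments = None
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the algorithm has converged.
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # Keep the old centroid if a cluster is empty.
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignments


# Hypothetical toy data: two well-separated groups in two dimensions.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = kmeans(data, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The first three points end up in one cluster and the last three in the other, with no labels supplied. Hierarchical clustering would reach a grouping by a different process, typically by successively merging or splitting clusters rather than iterating on centroids.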

Jillur Quddus is a lead technical architect, polyglot software engineer and data scientist with over 10 years of hands-on experience in architecting and engineering distributed, scalable, high-performance, and secure solutions used to combat serious organized crime, cybercrime, and fraud. Jillur has extensive experience of working within central government, intelligence, law enforcement, and banking, and has worked across the world including in Japan, Singapore, Malaysia, Hong Kong, and New Zealand. Jillur is both the founder of Keisan, a UK-based company specializing in open source distributed technologies and machine learning, and the lead technical architect at Methods, the leading digital transformation partner for the UK public sector.