Subscription

Explore Products

Best Sellers

New Releases

Books

Videos

Audiobooks

Learning Hub

Newsletter Hub

Free Learning

You're reading from Machine Learning with Scala Quick Start Guide Leverage popular machine learning algorithms and techniques and implement them in Scala

Product type Paperback

Published in Apr 2019

Publisher Packt

ISBN-13 9781789345070

Length 220 pages

Edition 1st Edition

Languages

Scala

Tools

Construct

Concepts

Machine Learning

Authors (2):

Md. Rezaul Karim

Ajay Kumar N

View More author details

Table of Contents (9) Chapters

Preface

1. Introduction to Machine Learning with Scala FREE CHAPTER

2. Scala for Regression Analysis

3. Scala for Learning Classification

4. Scala for Tree-Based Ensemble Techniques

5. Scala for Dimensionality Reduction and Clustering

6. Scala for Recommender System

7. Introduction to Deep Learning with Scala

8. Other Books You May Enjoy

Leave a review - let other readers know what you think

Summary

In this chapter, we discussed some clustering analysis techniques, such as k-means, bisecting k-means, and GMM. We saw a step-by-step example of how to cluster ethnic groups based on their genetic variants. In particular, we used the PCA for dimensionality reduction, k-means for clustering, and H2O and ADAM for handling large-scale genomics datasets. Finally, we learned about the elbow and silhouette methods for finding the optimal number of clusters.

Clustering is the key to most data-driven applications. Readers can try to apply clustering algorithms on higher-dimensional datasets, such as gene expression or miRNA expression, in order to cluster similar and correlated genes. A great resource is the gene expression cancer RNA-Seq dataset, which is open source. This dataset can be downloaded from the UCI machine learning repository at https://archive.ics.uci.edu/ml/datasets...

The rest of the chapter is locked

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $19.99/month. Cancel anytime

Authors (2)

1 of 2

Karim

Md. Rezaul Karim is a researcher, author, and data science enthusiast with a strong computer science background, coupled with 10 years of research and development experience in machine learning, deep learning, and data mining algorithms to solve emerging bioinformatics research problems by making them explainable. He is passionate about applied machine learning, knowledge graphs, and explainable artificial intelligence (XAI). Currently, he is working as a research scientist at Fraunhofer FIT, Germany. He is also a PhD candidate at RWTH Aachen University, Germany. Before joining FIT, he worked as a researcher at the Insight Centre for Data Analytics, Ireland. Previously, he worked as a lead software engineer at Samsung Electronics, Korea.

See other products by Karim

Kumar N

Ajay Kumar N has experience in big data, and specializes in cloud computing and various big data frameworks, including Apache Spark and Apache Hadoop. His primary language of choice is Python, but he also has a special interest in functional programming languages such as Scala. He has worked extensively with NumPy, pandas, and scikit-learn, and often contributes to open source projects related to data science and machine learning.

See other products by Kumar N