You're reading from Hands-On Big Data Modeling Effective database design techniques for data architects and business intelligence professionals

Product type Paperback

Published in Nov 2018

Publisher Packt

ISBN-13 9781788620901

Length 306 pages

Edition 1st Edition

Languages

Python

Tools

Bitcoin

Concepts

Big Data

Authors (3):

James Lee

Tao Wei

Suresh Kumar Mukhiya

Table of Contents (17) Chapters

Preface

1. Introduction to Big Data and Data Management

2. Data Modeling and Management Platforms

3. Defining Data Models

4. Categorizing Data Models

5. Structures of Data Models

6. Modeling Structured Data

7. Modeling with Unstructured Data

8. Modeling with Streaming Data

9. Streaming Sensor Data

10. Concept and Approaches of Big-Data Management

11. DBMS to BDMS

12. Modeling Bitcoin Data Points with Python

13. Modeling Twitter Feeds Using Python

14. Modeling Weather Data Points with Python

15. Modeling IMDb Data Points with Python

16. Other Books You May Enjoy

Theory

Clustering is the machine learning task of grouping a set of objects so that objects in the same group are more similar to each other than to objects in other groups. Given a set of data points, we can use a clustering algorithm to assign each data point to a specific group. In theory, data points in the same group should have similar properties or features, while data points in different groups should have highly distinct properties or features. Clustering is a common technique for statistical data analysis and is used in many fields.
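To make the idea of similarity concrete, here is a minimal sketch that compares the average distance between points inside a group with the average distance across groups. The two groups and the helper names `mean_within` and `mean_between` are made up for illustration, and Euclidean distance is only one possible similarity measure.

```python
from itertools import combinations, product
from math import dist

# Hypothetical toy data: two hand-labelled groups of 2-D points.
group_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
group_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]


def mean_within(points):
    """Average Euclidean distance over all pairs inside one group."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_between(first, second):
    """Average Euclidean distance over all cross-group pairs."""
    pairs = list(product(first, second))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


within_a = mean_within(group_a)
within_b = mean_within(group_b)
between = mean_between(group_a, group_b)

print(f"within A: {within_a:.2f}, within B: {within_b:.2f}, between: {between:.2f}")
```

For a good grouping, both within-group averages should come out clearly smaller than the between-group average.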

Clustering algorithms come in several types. The following are the most common:

K-means clustering
Mean-shift clustering
Agglomerative hierarchical clustering
Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
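As an illustration of the first algorithm in the list, the following is a minimal pure-Python sketch of k-means. It is not the chapter's own implementation; the function name `k_means`, its parameters, and the `films` sample values are assumptions invented for this example.

```python
import random
from math import dist


def k_means(points, k, iterations=20, seed=0):
    """Plain k-means: returns (centroids, labels) for a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # centroids stopped moving, so the assignment is stable
        centroids = new_centroids
    return centroids, labels


# Hypothetical (rating, runtime in hours) pairs, invented for illustration only.
films = [(8.1, 2.5), (7.9, 2.4), (8.3, 2.7), (4.2, 1.4), (4.0, 1.5), (3.8, 1.3)]
centroids, labels = k_means(films, k=2)
print(labels)
```

The number of clusters `k` has to be chosen up front, which is the main design constraint of k-means. In practice, a library implementation such as scikit-learn's `KMeans` would usually be preferred over a hand-written loop like this one.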

We use clustering for IMDb because similar datasets are very close to each other...

Authors (3)

Lee

Joanna Lee has more than 8 years of experience in game development. She has designed and programmed various video games. She first started working with Unreal's game engine in 2005 and is very excited to be able to author a book about the newest Unreal Engine 4. She has also worked with many other engines and has reviewed books and videos on Cry Engine 4.

Kumar Mukhiya

Suresh Kumar Mukhiya is a PhD candidate, currently affiliated with the Western Norway University of Applied Sciences (HVL). He is a big data enthusiast, specializing in information systems, model-driven software engineering, big data analysis, artificial intelligence, and frontend development. He completed a master's degree in information systems at the Norwegian University of Science and Technology (NTNU, Norway), with a thesis on process mining. He also holds a bachelor's degree in computer science and information technology (BSc.CSIT) from Tribhuvan University, Nepal, where he was decorated with the Vice-Chancellor's Award for obtaining the highest score. He is a passionate photographer and a resilient traveler.

Wei

Tao Wei is a passionate software engineer who works at a leading Silicon Valley-based big data analysis company. Previously, Tao worked at large IT companies, including IBM and Cisco. He has extensive experience in designing and building distributed, large-scale systems with proven high availability and reliability. Tao has an MS degree in computer science from McGill University and many years' experience as a teaching assistant in a variety of computer science classes. In his spare time, he enjoys reading and swimming, and is a passionate photographer.
