You're reading from Learning Apache Cassandra: Managing fault-tolerant, scalable data with high performance

Product type Paperback

Published in Apr 2017

Publisher

ISBN-13 9781787127296

Length 360 pages

Edition 2nd Edition

Languages

Java

Tools

Cassandra

Concepts

Databases

Author (1):

Sandeep Yarabarla

Table of Contents (15) Chapters

Preface

1. Getting Up and Running with Cassandra

2. The First Table

3. Organizing Related Data

4. Beyond Key-Value Lookup

5. Establishing Relationships

6. Denormalizing Data for Maximum Performance

7. Expanding Your Data Model

8. Collections, Tuples, and User-Defined Types

9. Aggregating Time-Series Data

10. How Cassandra Distributes Data

11. Cassandra Multi-Node Cluster

12. Application Development Using the Java Driver

13. Peeking under the Hood

14. Authentication and Authorization

Partial denormalization

Our initial approach to home timelines, which used the existing, fully normalized data structure that we've already built, is technically viable but will perform very poorly at scale. If I follow F users and want a page of P rows for my home timeline, Cassandra will need to do the following:

Query F partitions for P rows each
Perform an ordered merge of F × P rows in order to retrieve only the most recent P
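
The two steps above can be sketched in plain Python. This is only an in-memory illustration under stated assumptions, not Cassandra's actual read path: each followed user's timeline is modeled as a list of hypothetical `(timestamp, username, body)` tuples already sorted newest first, and the function name `home_timeline_read_time` is made up for this example.

```python
import heapq
from itertools import islice


def home_timeline_read_time(followed_timelines, page_size):
    """Build one home timeline page by merging per-user timelines at read time.

    followed_timelines: one list per followed user (F lists in total), each
    holding (timestamp, username, body) tuples sorted newest first.
    Returns (page, partitions_read, rows_fetched).
    """
    # Step 1: read the newest P rows from each of the F per-user timelines.
    pages = [timeline[:page_size] for timeline in followed_timelines]
    rows_fetched = sum(len(page) for page in pages)

    # Step 2: ordered merge of up to F x P rows, keeping only the newest P.
    merged = heapq.merge(*pages, key=lambda row: row[0], reverse=True)
    page = list(islice(merged, page_size))
    return page, len(pages), rows_fetched


# Hypothetical data: I follow three users (F = 3) and want two rows (P = 2).
timelines = [
    [(9, "alice", "a3"), (5, "alice", "a2"), (1, "alice", "a1")],
    [(8, "bob", "b3"), (7, "bob", "b2"), (2, "bob", "b1")],
    [(6, "carol", "c2"), (3, "carol", "c1")],
]
page, partitions_read, rows_fetched = home_timeline_read_time(timelines, page_size=2)
print(page)             # the two newest rows across all followed users
print(partitions_read)  # grows with F
print(rows_fetched)     # grows with F x P
```

In this toy run, three timelines are read and six rows are fetched just to return two of them; both counts rise as the number of followed users rises.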

The most distressing part is that the cost of both operations grows in proportion to the number of people I follow. Let's start by trying to fix this.

The basic goal of the home timeline is to show me the most recent status updates that matter to me. Instead of doing all the work to find out what status updates matter to me, based on whom I follow, at read time, let's shift some of the work to write time.

I'll create a table that stores references...
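
The general idea of shifting work from read time to write time can be illustrated with a small in-memory sketch. It is not the table design the chapter goes on to describe, which is not shown in this excerpt; the dictionaries and the `follow`, `post_status`, and `read_home_timeline` functions are hypothetical names, and the sketch assumes status updates are posted in chronological order.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for stored data.
followers = defaultdict(set)         # author -> users who follow that author
home_timelines = defaultdict(list)   # reader -> (author, status_id), newest first


def follow(follower, author):
    """Record that `follower` follows `author`."""
    followers[author].add(follower)


def post_status(author, status_id):
    """Write time: store a reference to the new status once per follower."""
    for reader in followers[author]:
        # Assumes statuses arrive in chronological order, so inserting at
        # the front keeps each home timeline sorted newest first.
        home_timelines[reader].insert(0, (author, status_id))


def read_home_timeline(reader, page_size):
    """Read time: a single lookup, regardless of how many users are followed."""
    return home_timelines[reader][:page_size]


follow("dave", "alice")
follow("dave", "bob")
follow("erin", "alice")
post_status("alice", 1)
post_status("bob", 2)
post_status("alice", 3)
print(read_home_timeline("dave", 2))
print(read_home_timeline("erin", 2))
```

The trade-off in this sketch is that each post now costs one write per follower, while each home timeline read touches only the reader's own list.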

Authors (1)

Yarabarla

Sandeep Yarabarla is a professional software engineer working for Verizon Labs, based out of Palo Alto, CA. After graduating from Carnegie Mellon University, he has worked on several big data technologies for a spectrum of companies. He has developed applications primarily in Java and Go. His experience includes handling large amounts of unstructured and structured data in Hadoop, and developing data processing applications using Spark and MapReduce. Right now, he is working with some cutting-edge technologies such as Cassandra, Kafka, Mesos, and Docker to build fault-tolerant and highly scalable applications.
