DynamoDB Applied Design Patterns

Apply efficient DynamoDB design patterns for high performance of applications

Product type: Paperback
Published in: Sep 2014
ISBN-13: 9781783551897
Length: 202 pages
Edition: 1st Edition
Author: Uchit Hamendra Vyas

Efficient use of primary keys

DynamoDB is a NoSQL database used with scalable applications, so table data can grow very quickly. If this growth is not managed efficiently, it can reduce the read and write throughput available to the application (throughput is provisioned in capacity units: one read unit covers one strongly consistent read per second of an item up to 4 KB, and one write unit covers one write per second of an item up to 1 KB). This management starts with choosing the correct primary key and its parameters. Take a look at the following table:

[Table image: the library catalogue table, with BookTitle as the hash key]
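As a side note on the throughput units mentioned above, the arithmetic can be sketched as follows. This is a minimal illustration, assuming the unit sizes in DynamoDB's current documentation (4 KB per strongly consistent read, 1 KB per write, and half the read cost for eventually consistent reads); the function names and the item size are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # assumption: one read unit covers up to 4 KB
WRITE_UNIT_BYTES = 1024      # assumption: one write unit covers up to 1 KB


def read_units(item_bytes: int, per_second: int,
               strongly_consistent: bool = True) -> int:
    """Read capacity units needed to read `per_second` items each second."""
    # Item size is rounded up to the next multiple of the unit size.
    units_per_item = math.ceil(item_bytes / READ_UNIT_BYTES)
    total = units_per_item * per_second
    # An eventually consistent read costs half of a strongly consistent one.
    return total if strongly_consistent else math.ceil(total / 2)


def write_units(item_bytes: int, per_second: int) -> int:
    """Write capacity units needed to write `per_second` items each second."""
    units_per_item = math.ceil(item_bytes / WRITE_UNIT_BYTES)
    return units_per_item * per_second


if __name__ == "__main__":
    # Hypothetical workload: 3 KB items, 10 requests per second.
    print(read_units(3 * 1024, 10))   # 10
    print(write_units(3 * 1024, 10))  # 30
```

Writes of the same item cost more units than reads here because the write unit is smaller, which is one reason to keep items and key attributes compact.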

As soon as the table is created, its data is partitioned on the hash key attribute. In this example, if the table has three partitions, the first two items go to the first partition, the third item goes to the second partition, and the last item goes to the third partition. This placement is based purely on a hash of the hash key value; the hash logic itself is not discussed here.
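The idea of hash-based placement can be illustrated with a toy sketch. DynamoDB's real hash function and partition layout are internal and are not what is shown here; the MD5-and-modulo scheme, the function name, and the sample titles are all assumptions chosen only to show that the same hash key value always maps to the same partition.

```python
import hashlib


def partition_for(hash_key: str, partition_count: int) -> int:
    """Map a hash key value to a partition index (toy scheme, not DynamoDB's)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it into the range.
    return int.from_bytes(digest[:8], "big") % partition_count


if __name__ == "__main__":
    # Hypothetical BookTitle values.
    for title in ["Title A", "Title B", "Title C", "Title D"]:
        print(title, "-> partition", partition_for(title, 3))
```

Because placement depends only on the key value, items that share a hash key always land together, while the order in which items were inserted plays no role.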

In our library catalogue example, we are always looking for a certain book, and we assume that the first thing that comes to mind when identifying a book is its title. That is why we set the BookTitle attribute as the hash key. Another reason for choosing this attribute is the assumption that most read operations on the table will specify the BookTitle attribute; a query must supply the hash key value, whereas a scan reads every item in the table.

DynamoDB does not allow duplicate hash key values when the table has no range attribute. With a simple hash primary key, we would therefore be enforcing that no two entries in the previous table can share a book title. In a real-world scenario this does not hold, so we need a range key attribute as well. The next decision is which attribute to use as the range attribute. We will assume that the second attribute that comes to mind when identifying a book is the name of its author. Unlike hash key values, range key values are ordered: items that share a hash key are grouped together and sorted by the range key. Here, too, we are imposing a constraint through DynamoDB: the same author can never have two books with the same title.
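The hash-and-range key choice described above could be expressed as a CreateTable request along the following lines. This is a sketch that only builds the request parameters; the table name `Library` and the capacity values are hypothetical, and both key attributes are assumed to be of type String.

```python
def library_table_definition(table_name: str = "Library",
                             read_units: int = 5,
                             write_units: int = 5) -> dict:
    """Build CreateTable parameters with BookTitle (hash) and Author (range)."""
    return {
        "TableName": table_name,  # hypothetical name
        # Only key attributes are declared; other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": "BookTitle", "AttributeType": "S"},
            {"AttributeName": "Author", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "BookTitle", "KeyType": "HASH"},
            {"AttributeName": "Author", "KeyType": "RANGE"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,    # hypothetical values
            "WriteCapacityUnits": write_units,
        },
    }


if __name__ == "__main__":
    params = library_table_definition()
    print(params["KeySchema"])
```

With boto3 installed and AWS credentials configured, a dictionary of this shape could be passed as `boto3.client("dynamodb").create_table(**params)`; that call is left out here so the sketch runs without an AWS account.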

Take a look at the following table (which is incorrect and is shown only to understand the concept):

[Table image: the library catalogue table with BookTitle as the hash key and Author as the range key]

But this can fail in several cases, because later editions of a book may be written by the same author. In that case, inserting the second item simply overwrites the first, because the primary key is duplicated. As a solution, I recommend concatenating the Author attribute with the Edition attribute, separated by # (or any other acceptable delimiter). The table will then look as follows:

[Table image: the library catalogue table with BookTitle as the hash key and Author#Edition as the range key]

Observe the String range key attribute Author#Edition. If some items do not include the edition in the range key attribute, this causes no trouble on the DynamoDB side, but the application code must handle both forms.
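The overwrite problem and the Author#Edition workaround can be sketched in plain Python. The `ToyTable` class below is a hypothetical in-memory stand-in (not the DynamoDB API) that keeps one item per (hash key, range key) pair, and the helper names and sample values are assumptions made for illustration.

```python
from typing import Optional, Tuple

DELIMITER = "#"


def author_edition_key(author: str, edition: Optional[object] = None) -> str:
    """Build the Author#Edition range key; the edition part is optional."""
    if DELIMITER in author:
        raise ValueError("author must not contain the delimiter")
    return author if edition is None else f"{author}{DELIMITER}{edition}"


def split_author_edition(range_key: str) -> Tuple[str, Optional[str]]:
    """Split a range key back into (author, edition or None)."""
    author, sep, edition = range_key.partition(DELIMITER)
    return author, (edition if sep else None)


class ToyTable:
    """In-memory stand-in: one item per (hash key, range key) pair."""

    def __init__(self) -> None:
        self.items = {}

    def put_item(self, hash_key: str, range_key: str, attributes: dict) -> None:
        # A put with an existing primary key silently replaces the old item.
        self.items[(hash_key, range_key)] = attributes


if __name__ == "__main__":
    by_author = ToyTable()
    by_author.put_item("Example Title", "A. Author", {"Edition": 1})
    by_author.put_item("Example Title", "A. Author", {"Edition": 2})
    print(len(by_author.items))  # 1: the second put replaced the first

    by_author_edition = ToyTable()
    for edition in (1, 2):
        key = author_edition_key("A. Author", edition)
        by_author_edition.put_item("Example Title", key, {"Edition": edition})
    print(len(by_author_edition.items))  # 2: both editions are kept
```

`split_author_edition` returns `None` for the edition when the delimiter is absent, which is one way for application code to handle items stored without an edition.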

Some of you might have thought of making the range key attribute a StringSet, but remember that hash and range key attributes cannot be of a Set type.

There are a few things to be kept in mind before choosing the correct hash and range attributes:

  • Since the table is partitioned on the hash key attribute, do not choose an attribute whose values repeat heavily and that has only a handful (single digits) of unique values. For example, the Language attribute of our table has only three distinct values. Choosing it as the hash key would concentrate items and requests on very few partitions and eat up a lot of throughput.
  • Use the most restrictive data type. For example, if a numeric attribute is to be part of the primary key, use the Number data type even though String can also store numbers, because the hash and ordering logic differ for each data type. Other advantages are discussed in Chapter 5, Query and Scan Operations in DynamoDB.
  • Do not put too many attributes, or overly long attributes, into the primary key (using a delimiter as discussed earlier), because every item must then carry these attributes and all of them become part of every query operation, which is inefficient.
  • Make the attribute to be ordered as the range key attribute.
  • Make the attribute to be grouped (or partitioned) as the hash key attribute.
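Two of the points in the list above can be checked with a short sketch: how ordering differs between numbers and numeric strings, and how many distinct values a candidate hash key attribute has. The helper names and the sample items are hypothetical and are not taken from the book's table.

```python
def sort_as_strings(values):
    """Order values the way a String key would: character by character."""
    return sorted(str(v) for v in values)


def sort_as_numbers(values):
    """Order values the way a Number key would: by numeric value."""
    return sorted(values)


def distinct_count(items, attribute):
    """Count distinct values of `attribute` across the given items."""
    return len({item[attribute] for item in items})


if __name__ == "__main__":
    print(sort_as_strings([9, 10, 100]))  # ['10', '100', '9']
    print(sort_as_numbers([9, 10, 100]))  # [9, 10, 100]

    # Hypothetical items: titles are all different, languages repeat.
    items = [
        {"BookTitle": "Title A", "Language": "English"},
        {"BookTitle": "Title B", "Language": "English"},
        {"BookTitle": "Title C", "Language": "German"},
        {"BookTitle": "Title D", "Language": "German"},
    ]
    print(distinct_count(items, "BookTitle"))  # 4
    print(distinct_count(items, "Language"))   # 2
```

An attribute whose distinct count stays small as the table grows, like Language here, spreads items over few hash key values and is a weaker hash key candidate than one such as BookTitle.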