Non-relational databases
After the break, Berta wants to finish her explanation with non-relational databases. As there are several types she wants to cover, she starts right away.
Berta: AWS also offers many different non-relational databases. Each one is suited to a specific use case: graph databases, for example.
Alex: These are for graphics?
Berta: No, graph, not graphics. You can store different entities and the relationships between them, building a complex graph. They can be used in recommendation engines, fraud detection, identity graphs, and similar applications. For that purpose, Amazon Neptune would be the right choice.
You can also have in-memory databases, for extreme performance, or time-series databases, where you analyze data over time, such as for the Internet of Things, stock prices, or measurements such as temperature or pressure; anything where the time sequence is important. You can use Amazon ElastiCache as an in-memory cache and Amazon Timestream for time-series storage and analysis.
Harold: I think I have read something about an immutable database?
Berta: Yes, that is QLDB, Amazon Quantum Ledger Database. It is a database where the change log cannot be modified or tampered with. It’s very useful for legal proof, or just to maintain the sequence of changes as a fully historical log of all activity that can be verified.
Alex: This is great. I like the idea of having purpose-built databases, rather than trying to use only one type of database. But these databases seem too specialized. Is there any general-purpose database that is also non-relational?
Berta: Sure. There is one called Amazon DynamoDB. It is a non-relational database, supporting both key-value and document data models. Being non-relational, it doesn't enforce a fixed schema. So, each row in DynamoDB, called an item, can have any number of attributes (its equivalent of columns) at any moment. That means your tables can adapt to your changing business requirements, without having to stop the database to modify the previous schema:
Figure 6.15 — Key-value and document model
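To make the schema flexibility concrete, here is a minimal sketch using boto3, the AWS SDK for Python. The Customers table, its key, and its attributes are hypothetical; note that the two items share only the key attribute:

```python
import boto3

# Assumes AWS credentials are configured and a hypothetical table named
# "Customers" already exists, with CustomerId as its partition key
table = boto3.resource("dynamodb").Table("Customers")

# Two items in the same table with completely different attribute sets;
# only the key attribute (CustomerId) is mandatory
table.put_item(Item={
    "CustomerId": "C001",
    "Name": "Alice",
    "Email": "alice@example.com",
})
table.put_item(Item={
    "CustomerId": "C002",
    "Name": "Bob",
    "LoyaltyPoints": 1200,
    "Address": {"City": "Madrid", "Country": "Spain"},
})
```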
Harold: So, if the data is stored as tables, how is it different from relational databases?
Berta: The key here is that there’s only one table, without any relationships. If your application requires them, you can surely create multiple tables, but they will be completely independent, in separate databases. The application will have to perform the joining logic. You usually have to design the table for all the queries you might anticipate, including all the data and possible indexes.
Also, traditional relational databases have one endpoint for control operations, used to create and manage tables, and a separate endpoint for data operations: the create, read, update, and delete (CRUD) actions on the data in a table. DynamoDB simplifies this: it offers a single endpoint that accepts all types of requests. Amazon DynamoDB is serverless, so you don't have to worry about any of the operational overhead. Also, DynamoDB supports both eventual and strong consistency.
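A short boto3 sketch illustrates the single endpoint: both the control operation (create_table) and the data operation (put_item) go through the same client. The table name and region are again hypothetical:

```python
import boto3

# One client, one endpoint, for both control-plane and data-plane calls
client = boto3.client("dynamodb", region_name="eu-west-1")

# Control operation: create a table; only the key schema is declared up front
client.create_table(
    TableName="Customers",
    KeySchema=[{"AttributeName": "CustomerId", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "CustomerId", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)
client.get_waiter("table_exists").wait(TableName="Customers")

# Data operation (CRUD) through the very same client
client.put_item(
    TableName="Customers",
    Item={"CustomerId": {"S": "C001"}, "Name": {"S": "Alice"}},
)
```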
Harold: Could you please explain it?
Berta: Sure. A database consistency model defines the mode and timing in which a successful write or update is reflected in a later read operation of that same value. Let us consider an example to explain it. Do you use a credit or debit card?
Harold: I use both. For cash withdrawals, I use a debit card, and for purchases, a credit card.
Berta: Good. So, if you have withdrawn some money using your debit card and you immediately check your account balance again, will the recent withdrawal reflect in the account statement?
Harold: Yes. It will.
Berta: And if you made a purchase with your credit card, will it also reflect it at the same time?
Harold: I think it shows the transaction in a pending state; it doesn’t show as completed immediately.
Berta: Correct. Credit card processing works slightly differently. The vendor from whom you have purchased the product has to claim a settlement of the transaction. Eventually—by that, I mean after some time—the transaction will show as completed in your account.
DynamoDB always stores multiple copies of your data. Let's assume it keeps three copies. At any time, one of the copies is chosen as the leader. Every time a write or update request is initiated, DynamoDB will ensure that at least two copies (the leader and one more) are immediately updated to reflect the change. The third copy will hold stale data for some time, but eventually it will also be updated to reflect the change.
Harold: But why not update all the copies in the first place?
Berta: Because of the performance impact. If DynamoDB had to wait for all three copies to confirm the write, the application that issued it would have to wait for the slowest node. Imagine you want to host a meeting with three people in different time zones; you would have to find a common timeslot that suits all three participants. The problem becomes simpler when the meeting requires only two of them to attend.
Harold: Oh, I get it now. It's another quorum algorithm, similar to the one used in Aurora. A majority of storage nodes makes the decision. So, the third copy is still waiting to be updated, but the acknowledgment of the write has already been sent to the application. This means my data in different copies is not consistent, but at least two copies will have the latest data.
Berta: Yes, for some time. This is sometimes referred to as data being in a soft state. But there are options available in DynamoDB if you want to always read the most up-to-date data.
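Here is a small, purely illustrative Python simulation of the idea. It is not how DynamoDB is actually implemented; it is just a toy model of a two-out-of-three write quorum and the stale reads that can follow:

```python
import random

# Toy model: three replicas of one item, all starting in sync
replicas = [{"price": 100}, {"price": 100}, {"price": 100}]

def quorum_write(new_price):
    """Acknowledge once the leader plus one follower (2 of 3) are updated."""
    leader, follower, straggler = random.sample(range(3), 3)
    replicas[leader]["price"] = new_price
    replicas[follower]["price"] = new_price
    # The straggler catches up later; we leave it stale here to make
    # the "soft state" visible
    return straggler

def eventually_consistent_read():
    """Read from any replica at random: the result may be stale."""
    return random.choice(replicas)["price"]

stale_index = quorum_write(105)
print(eventually_consistent_read())    # prints 105 or, one time in three, 100
print(replicas[stale_index]["price"])  # the straggler still holds 100
```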
Alex: But why would someone be interested in reading stale data in the first place?
Berta: For better performance. Let me give you an example. Let's say you have stored stock prices for a company in DynamoDB. Consider the price of the stock to be $100, and all three copies currently have the same value of $100. Now, you read this data in two different applications. Application one is a news scroll that displays the current price of the stock, and application two is a financial application with which you can buy or sell stocks.
Alex: Okay.
Berta: If you have a news scroll application, you could add a disclaimer such as "This data is delayed by 15 minutes" and display the data from DynamoDB. In this case, accuracy is not that important, as you are okay with data delayed by 15 minutes. DynamoDB will never supply you with wrong or random data, but it might give you data that is stale: it was accurate a while ago, but currently it may or may not be. As there are three copies, and your read request can land on any copy, there is a one-in-three chance that you may get stale data. But this method will always return the data with the lowest latency. This is called eventual consistency.
Alex: Agreed – if you don’t specify the node, your query might end up in any of them.
Berta: Now, if you want to use the same stock data for a financial application – this means your priority is accuracy rather than speed. You always need to get the most up-to-date data for any financial transaction. In DynamoDB, you can indicate—by using a parameter in your query—that the reader needs the most recent, accurate data. This time, DynamoDB will find out which is the leader and will deliver data from it; this way, you’ll get the most up-to-date data. This is called strong consistency.
Alex: That’s nice. Based on your read requirement you can choose to have eventually consistent data or strongly consistent data. I like the flexibility it offers.
Berta: Eventual consistency is the default mechanism. If a requester doesn't specify any parameter and just issues a read request, DynamoDB treats it as an eventually consistent read.
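In boto3, the choice is a single parameter on the read call. A minimal sketch, assuming a hypothetical StockPrices table keyed by Ticker:

```python
import boto3

table = boto3.resource("dynamodb").Table("StockPrices")  # hypothetical table

# Default: eventually consistent read; lowest latency, may return stale data
news_scroll = table.get_item(Key={"Ticker": "ACME"})

# Strongly consistent read: served from the leader, always up to date
trading_app = table.get_item(Key={"Ticker": "ACME"}, ConsistentRead=True)
```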
Harold: So, all write requests are always consistent; it’s when you read that you select eventual consistency or strong consistency.
Berta: That’s correct.
Charles: In the traditional world, the performance of a database is based on the server it is running on. How is the performance of DynamoDB controlled?
Berta: In the case of DynamoDB, you have to configure Read Capacity Units (RCUs) and Write Capacity Units (WCUs) to achieve a specific performance. These are table-level settings. An RCU defines the number of strongly consistent reads per second of items up to 4 KB in size. Eventually consistent reads use half the provisioned read capacity. So, if you configured your table for 10 RCUs, you could perform 10 strongly consistent read operations, or 20 eventually consistent read operations (double the amount), of 4 KB each, per second. A WCU is the number of writes per second of items up to 1 KB in size.
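The arithmetic behind these units can be captured in a few lines of Python. This is a simplified model of the sizing rules just described, ignoring refinements such as transactional operations:

```python
import math

def rcus_needed(item_size_kb, reads_per_second, strongly_consistent=True):
    # One RCU = one strongly consistent read/second of an item up to 4 KB;
    # eventually consistent reads need half as much capacity
    units = math.ceil(item_size_kb / 4) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)

def wcus_needed(item_size_kb, writes_per_second):
    # One WCU = one write/second of an item up to 1 KB
    return math.ceil(item_size_kb / 1) * writes_per_second

print(rcus_needed(4, 10))         # 10 RCUs: 10 strongly consistent 4 KB reads/s
print(rcus_needed(4, 20, False))  # 10 RCUs also cover 20 eventual 4 KB reads/s
print(wcus_needed(1, 10))         # 10 WCUs: 10 writes/s of 1 KB items
```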
Charles: Okay. What I don’t understand is how many RCUs or WCUs are required for a new application, or for an application with spiky or unpredictable access patterns.
Berta: Amazon DynamoDB has you covered. It has two capacity modes: on-demand and provisioned. In on-demand mode, DynamoDB instantly accommodates your workloads as they ramp up or down. So, if you have a new table with an unknown workload, an application with unpredictable traffic, or you want to pay only for what you actually use, on-demand mode is a great option. If you choose provisioned mode, you have to specify the number of reads and writes per second your application needs. So, if your application has predictable, consistent traffic and you want to control costs and pay a specific amount, provisioned mode is better.
Charles: And this mode has to be selected at table creation time or can it be modified later?
Berta: You can set the read/write capacity mode at table creation, or you can change it later too, either manually or programmatically.
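Programmatically, the capacity mode is the BillingMode setting on the table. A small boto3 sketch, switching a hypothetical table from on-demand to provisioned capacity:

```python
import boto3

client = boto3.client("dynamodb")

# Switch an existing table from on-demand (PAY_PER_REQUEST) to provisioned;
# note that AWS limits how frequently a table can switch between modes
client.update_table(
    TableName="Customers",
    BillingMode="PROVISIONED",
    ProvisionedThroughput={"ReadCapacityUnits": 10, "WriteCapacityUnits": 5},
)
```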
Harold: By the way, you mentioned some non-relational databases also support querying through SQL?
Berta: Yes. DynamoDB supports PartiQL, an open source, SQL-compatible query language. Furthermore, you can use a client-side GUI tool called NoSQL Workbench for Amazon DynamoDB, which provides data modeling, data visualization, and query development features for DynamoDB tables.
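As a quick illustration, a PartiQL statement can be run through boto3's execute_statement call; the Customers table remains hypothetical:

```python
import boto3

client = boto3.client("dynamodb")

# PartiQL: SQL-compatible syntax against a DynamoDB table
response = client.execute_statement(
    Statement="SELECT * FROM Customers WHERE CustomerId = ?",
    Parameters=[{"S": "C001"}],
)
print(response["Items"])
```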
I think we now have enough information to map our existing databases to AWS services. Let’s start listing all the databases that we plan to migrate to AWS and work as a team to identify possible migration methods.