AWS Cloud Computing Concepts and Tech Analogies

Non-relational databases

After the break, Berta wants to finish her explanation with non-relational databases. As there are several types she wants to cover, she starts as soon as possible.

Berta: AWS also offers many different non-relational databases. Each one is suited to a specific use case; for example, graph databases.

Alex: These are for graphics?

Berta: No, graph, not graphics. You can store different entities and the relationships between them, building a complex graph. They can be used in recommendation engines, fraud detection, identity graphs, and similar applications. For that purpose, Amazon Neptune would be the right choice.

You can also have databases in RAM, for extreme performance, or time-series databases, where you analyze data over time, such as for the Internet of Things, stock prices, or measurements such as temperature or pressure; anything where the time sequence is important. You can use ElastiCache as a memory cache, and Timestream for time-series storage and analysis.

Harold: I think I have read something about an immutable database?

Berta: Yes, that is QLDB, Amazon Quantum Ledger Database. It is a database where the change log cannot be modified or tampered with. It's very useful for legal proof, or just to maintain the sequence of changes as a complete historical log of all activity that can be verified.

Alex: This is great. I like the idea of having purpose-built databases, rather than trying to use only one type of database. But these databases seem too specialized. Is there any general-purpose database that is also non-relational?

Berta: Sure. There is one called Amazon DynamoDB. It is a non-relational database, supporting both key-value and document data models. Being non-relational, nobody enforces a fixed schema. So, each row in DynamoDB—called an Item—can have any number of columns at any moment. That means your tables can adapt to your changing business requirements, without having to stop the database to modify the previous schema:

Figure 6.15 — Key-value and document model
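
To see what that flexibility looks like in code, here is a minimal sketch using Python and boto3. The Customers table, its CustomerId key, and all the attribute names are invented for this illustration, and the table is assumed to already exist:

```python
import boto3

# Assumes a hypothetical "Customers" table with partition key "CustomerId"
# already exists and AWS credentials are configured.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Customers")

# Two items in the same table with different sets of attributes:
# no fixed schema is enforced beyond the key.
table.put_item(Item={
    "CustomerId": "C-001",
    "Name": "Alice",
    "Email": "alice@example.com",
})
table.put_item(Item={
    "CustomerId": "C-002",
    "Name": "Bob",
    "LoyaltyPoints": 1200,
    "Preferences": {"newsletter": True, "language": "en"},  # document-style nested attribute
})
```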

Harold: So, if the data is stored as tables, how is it different from relational databases?

Berta: The key here is that there's only one table, without any relationships. If your application requires them, you can surely create multiple tables, but they will be completely independent, in separate databases. The application will have to perform the joining logic. You usually have to design the table for all the queries you might anticipate, including all the data and possible indexes.

Also, traditional relational databases have separate endpoints: one for control operations, to create and manage tables, and another for data operations, that is, create, read, update, and delete (also called CRUD) actions on the data in a table. DynamoDB simplifies this: it offers a single endpoint that accepts all types of requests. Amazon DynamoDB is also serverless, so you don't have to worry about any of the operational overhead. In addition, DynamoDB supports both eventual and strong consistency.
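
As a rough boto3 sketch of that idea, the same client object issues a control operation (creating a table) and then data operations on its items; the Orders table, its key, and the item attributes are invented for the example:

```python
import boto3

client = boto3.client("dynamodb")

# Control operation: create and manage the table itself.
client.create_table(
    TableName="Orders",
    KeySchema=[{"AttributeName": "OrderId", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "OrderId", "AttributeType": "S"}],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
client.get_waiter("table_exists").wait(TableName="Orders")

# Data operations (CRUD) go through the same client and endpoint.
client.put_item(
    TableName="Orders",
    Item={"OrderId": {"S": "O-123"}, "Total": {"N": "42.50"}},
)
response = client.get_item(TableName="Orders", Key={"OrderId": {"S": "O-123"}})
print(response.get("Item"))
```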

Harold: Could you please explain it?

Berta: Sure. A database consistency model defines the mode and timing in which a successful write or update is reflected in a later read operation of that same value. Let us consider an example to explain it. Do you use a credit or debit card?

Harold: I use both. For cash withdrawals, I use a debit card, and for purchases, a credit card.

Berta: Good. So, if you have withdrawn some money using your debit card and you immediately check your account balance, will the recent withdrawal be reflected in the account statement?

Harold: Yes. It will.

Berta: And if you made a purchase with your credit card, would it also be reflected immediately?

Harold: I think it shows the transaction in a pending state; it doesn't show as completed immediately.

Berta: Correct. Credit card processing works slightly differently. The vendor from whom you have purchased the product has to claim a settlement of the transaction. Eventually—by that, I mean after some time—the transaction will show as completed in your account.

DynamoDB always stores multiple copies of your data. Let’s assume it keeps three copies. At any time, one of the copies is chosen as a Leader. Every time a write or update request is initiated, DynamoDB will ensure that at least two copies (the leader and one more copy) are immediately updated to reflect the change. The third copy will have stale data for some time, but finally, it will also be updated to reflect the change.

Harold: But why not update all the copies in the first place?

Berta: Because of the performance impact. If DynamoDB had to wait for all three copies to confirm the write, the application that requested it would have to wait for the slowest node. Imagine you want to host a meeting with three people in different time zones; you would have to find a common timeslot that suits all three participants. This problem is somewhat simpler when the meeting requires only two people to attend.

Harold: Oh, I get it now. It's another quorum algorithm, similar to the one used in Aurora. A majority of storage nodes makes the decision. So, the third copy is still waiting to be updated, but the acknowledgment of the write has already been sent to the application. This means my data in different copies is not consistent, but at least two copies will have the latest data.

Berta: Yes, for some time. This is sometimes referred to as data being in a soft state. But there are options available in DynamoDB if you want to always read the most up-to-date data.
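
This acknowledge-after-a-quorum behaviour can be sketched conceptually. The following Python snippet is not DynamoDB's implementation, only a toy simulation: the write is acknowledged as soon as two of three simulated replicas confirm it, while the slowest replica stays briefly stale.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Conceptual sketch only (not DynamoDB's actual implementation):
# a write is acknowledged once a majority of replicas (2 of 3)
# confirms it; the slowest replica catches up a little later.
REPLICAS = ["replica-1", "replica-2", "replica-3"]
QUORUM = 2

def replicate(replica, value):
    time.sleep(random.uniform(0.01, 0.2))    # simulated, variable replication latency
    return replica, time.perf_counter()

pool = ThreadPoolExecutor(max_workers=len(REPLICAS))
start = time.perf_counter()
futures = [pool.submit(replicate, r, {"stock": "XYZ", "price": 100}) for r in REPLICAS]

confirmed = []
for future in as_completed(futures):
    replica, finished_at = future.result()
    confirmed.append(replica)
    if len(confirmed) == QUORUM:
        # The acknowledgment could be sent now; the remaining replica is still stale.
        print(f"ACK after {finished_at - start:.3f}s, confirmed by {confirmed}")

# Once the loop finishes, every replica has applied the write.
print("all replicas up to date:", [f.result()[0] for f in futures])
pool.shutdown()
```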

Alex: But why would someone be interested in reading stale data in the first place?

Berta: For better performance. Let me give you an example. Let's say you have stored stock prices for a company in DynamoDB. Consider the price of the stock to be $100, and all three copies currently have the same value of $100. Now, you read this data in two different applications: application one is a news scroll that displays the current price of the stock, and application two is a financial application with which you can buy or sell stocks.

Alex: Okay.

Berta: If you have a news scroll application, you could add a disclaimer such as "This data is delayed by 15 minutes" and display the data from DynamoDB. In this case, accuracy is not that important, as you are okay with having data delayed by 15 minutes. DynamoDB will never supply you with wrong data or random data, but it might give you data that is stale. It was accurate a while ago, but currently, it may or may not be accurate. As there are three copies, and your read request can land on any copy, there is a one-in-three chance that you may get stale data. But this method will always return the data with the lowest latency. This is called eventual consistency.

Alex: Agreed – if you don't specify the node, your query might end up in any of them.

Berta: Now, if you want to use the same stock data for a financial application – this means your priority is accuracy rather than speed. You always need to get the most up-to-date data for any financial transaction. In DynamoDB, you can indicate—by using a parameter in your query—that the reader needs the most recent, accurate data. This time, DynamoDB will find out which is the leader and will deliver data from it; this way, you'll get the most up-to-date data. This is called strong consistency.

Alex: That's nice. Based on your read requirement, you can choose to have eventually consistent data or strongly consistent data. I like the flexibility it offers.

Berta: Eventual consistency is the default mechanism. If a requester doesn't specify any parameter and just issues a request to read, DynamoDB interprets it as an eventually consistent read request.
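
In boto3, that choice is a single parameter on the read call. ConsistentRead is the actual request parameter; the StockPrices table and its Ticker key are invented for this sketch:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("StockPrices")   # illustrative table name

# Default: eventually consistent read (lowest latency, may return
# a value that is a moment out of date), fine for the news scroll.
news_scroll = table.get_item(Key={"Ticker": "XYZ"})

# Strongly consistent read: pass ConsistentRead=True to always get
# the most recent committed value, as the trading application needs.
trading_app = table.get_item(Key={"Ticker": "XYZ"}, ConsistentRead=True)

print(news_scroll.get("Item"), trading_app.get("Item"))
```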

Harold: So, all write requests are always consistent; it's when you read that you select eventual consistency or strong consistency.

Berta: That's correct.

Charles: In the traditional world, the performance of a database is based on the server it is running on. How is the performance of DynamoDB controlled?

Berta: In the case of DynamoDB, you have to configure Read Capacity Units (RCUs) and Write Capacity Units (WCUs) to achieve specific performance. These are table-level settings. An RCU defines the number of strongly consistent reads per second of items up to 4 KB in size. Eventually consistent reads use half the provisioned read capacity. So, if you configured your table for 10 RCUs, you could perform 10 strongly consistent read operations, or 20 eventually consistent read operations (double the number of strongly consistent reads), of 4 KB each, per second. A WCU is the number of 1 KB writes per second.
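
As a back-of-the-envelope helper based on those unit sizes (this is just the arithmetic described above, not an official AWS calculator):

```python
import math

# 1 RCU = one strongly consistent 4 KB read per second
#         (or two eventually consistent reads of that size).
# 1 WCU = one 1 KB write per second.
def rcus_needed(reads_per_sec, item_kb, strongly_consistent=True):
    units_per_read = math.ceil(item_kb / 4)
    rcus = reads_per_sec * units_per_read
    return rcus if strongly_consistent else math.ceil(rcus / 2)

def wcus_needed(writes_per_sec, item_kb):
    return writes_per_sec * math.ceil(item_kb / 1)

# Example: 100 eventually consistent reads/s and 20 writes/s of 3 KB items.
print(rcus_needed(100, 3, strongly_consistent=False))  # -> 50 RCUs
print(wcus_needed(20, 3))                              # -> 60 WCUs
```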

Charles: Okay. What I don't understand exactly is how many RCUs or WCUs are required for a new application, or for an application that has spiky or unpredictable access?

Berta: Amazon DynamoDB has got you covered. It has two capacity modes for processing: on-demand and provisioned. In on-demand mode, DynamoDB instantly accommodates your workloads as they ramp up or down. So, if you have a new table with an unknown workload, an application with unpredictable traffic, or you want to pay only for what you actually use, on-demand mode is a great option. If you choose provisioned mode, you have to specify the number of reads and writes per second needed by your application. So, if your application has predictable and consistent traffic, and you want to control costs and pay a specific amount, provisioned mode is the better choice.
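
For illustration, a boto3 sketch of choosing each mode at table creation; the table names, keys, and capacity numbers are invented:

```python
import boto3

client = boto3.client("dynamodb")

# On-demand: no capacity planning, pay per request (new or unpredictable workloads).
client.create_table(
    TableName="ClickStream",
    KeySchema=[{"AttributeName": "EventId", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "EventId", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)

# Provisioned: you specify and pay for a fixed, predictable throughput.
client.create_table(
    TableName="ProductCatalog",
    KeySchema=[{"AttributeName": "ProductId", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "ProductId", "AttributeType": "S"}],
    BillingMode="PROVISIONED",
    ProvisionedThroughput={"ReadCapacityUnits": 10, "WriteCapacityUnits": 5},
)
```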

Charles: And does this mode have to be selected at table creation time, or can it be modified later?

Berta: You can set the read/write capacity mode at table creation, or you can change it later too, either manually or programmatically.
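
For example, the on-demand table from the previous sketch could later be switched to provisioned capacity (the numbers are arbitrary; note that the service limits how often a table's capacity mode can be changed):

```python
import boto3

client = boto3.client("dynamodb")

# Switch the existing on-demand table to provisioned capacity.
client.update_table(
    TableName="ClickStream",
    BillingMode="PROVISIONED",
    ProvisionedThroughput={"ReadCapacityUnits": 20, "WriteCapacityUnits": 10},
)
```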

Harold: By the way, you mentioned some non-relational databases also support querying through SQL?

Berta: Yes. DynamoDB supports PartiQL, an open source, SQL-compatible query language. Furthermore, you can use a client-side GUI tool called NoSQL Workbench for Amazon DynamoDB, which provides data modeling, data visualization, and query development features for DynamoDB tables.
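
A short example of a PartiQL read through boto3's ExecuteStatement API; the StockPrices table and its attributes are invented:

```python
import boto3

client = boto3.client("dynamodb")

# PartiQL: a SQL-compatible statement against a DynamoDB table.
response = client.execute_statement(
    Statement='SELECT Ticker, Price FROM "StockPrices" WHERE Ticker = ?',
    Parameters=[{"S": "XYZ"}],
)
for item in response["Items"]:
    print(item)
```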

I think we now have enough information to map our existing databases to AWS services. Let’s start listing all the databases that we plan to migrate to AWS and work as a team to identify possible migration methods.
