Database Design and Modeling with Google Cloud

Learn database design and development to take your data to applications, analytics, and AI

Product type Paperback
Published in Dec 2023
Publisher Packt
ISBN-13 9781804611456
Length 234 pages
Edition 1st Edition
Author Abirami Sukumaran

Business aspect

Business requirements are the starting point for your application and also for choosing your database system. There are four stages in the life cycle of data in its business application that help determine the choice of database system:

  • Data ingestion
  • Storage
  • Process
  • Visualize

The following diagram represents the attributes in the four stages of data and the categories of questions in each stage in the life cycle of your data:

Figure 1.1 – Representation of the four stages of data and the categories of questions in each stage

Let’s look at some of these attributes in detail. Some of them are in the business attributes category, while others are technical.

Ingestion

This is the first stage in the data life cycle, and it is all about acquiring data from different sources and bringing it into one place in your system. In this stage, the questions that arise fall into three categories:

  • What type of data are you bringing in?
  • What is the purpose of this data?
  • What is the structure of your data?

Let’s take a look at each in detail.

Types of data

Broadly, there are three types of data we will be dealing with, and they strongly influence the choice of database and storage.

Application data

This is the kind of data that is generated or downloaded as part of the application’s content. It can contain transactional data generated by users and applications, such as data from online retail applications, application log data, event data, and clickstream data. Let’s take a look at a specific example: consider a banking application in which user A transfers money from their account to user B’s account. In this case, the details of the transfer, such as the account ID, name, bank details, the recipient’s name, and transaction date, constitute the application data.
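As a minimal sketch of the banking example above, the application data for one transfer could be modeled as a single record. The class name, field names, and sample values below are hypothetical and chosen only for illustration; a real application would define its own.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class Transfer:
    """One money transfer. All field names are hypothetical."""
    sender_account_id: str
    sender_name: str
    recipient_account_id: str
    recipient_name: str
    amount_cents: int  # integer cents avoid floating-point rounding
    transaction_date: date


# Made-up sample values for user A sending money to user B.
transfer = Transfer(
    sender_account_id="ACC-001",
    sender_name="User A",
    recipient_account_id="ACC-002",
    recipient_name="User B",
    amount_cents=25000,
    transaction_date=date(2023, 12, 1),
)

# asdict() turns the record into a plain dict that a storage layer could accept.
record = asdict(transfer)
```

Each field here corresponds to one attribute of the transaction, which is what makes this kind of data a natural fit for a database with a well-defined schema.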

Live stream and real-time stream data

This data arrives continuously from real-time sources such as sensors. It can also consist of event data responses, and it arrives far more frequently than batch data. Real-time data is data that is available immediately, without being delayed by a system or process. The term real-time stream refers to streams of real-time data that are gathered and stored or processed as they come in. Examples include monitoring data such as CPU utilization and memory consumption, Internet of Things (IoT) device data such as humidity and pressure, and automated real-time environmental temperature monitoring data.

Batch data

This is data that arrives in bulk, either at scheduled intervals or when triggered by an event. For example, batch data can be transactional data that comes in from applications after a transaction and is stored for use in later stages of the data life cycle. It can also include data extracted from one application for use in another at a later point, data migration use cases, and file uploads for later processing. Such applications may not be designed for real-time operations on the data.
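The difference between stream and batch ingestion can be sketched in a few lines. This is only an illustration under simple assumptions: the function names and the CPU utilization values are made up, and a plain Python list stands in for a real data source.

```python
from typing import Iterable, Iterator


def process_stream(readings: Iterable[float]) -> Iterator[float]:
    """Handle each reading as it arrives and emit a running average."""
    total = 0.0
    for count, value in enumerate(readings, start=1):
        total += value
        yield total / count


def process_batch(readings: list[float]) -> float:
    """Handle a complete batch at once and return a single average."""
    return sum(readings) / len(readings)


cpu_utilization = [40.0, 60.0, 80.0]  # made-up sample values

# Streaming: a result is available after every reading.
running_averages = list(process_stream(cpu_utilization))

# Batch: one result, available only once all the data has arrived.
batch_average = process_batch(cpu_utilization)
```

The streaming function produces an up-to-date answer after each reading, while the batch function needs the whole data set before it can answer. That trade-off is one reason the ingestion type influences the choice of storage.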

The purpose of data

The specific use case and the nature of the applications that use the ingested data are critical factors in determining the choice and design of the database. There may be cases where the type and ingestion mode of the data suggest one database design, whereas its functional use case suggests another. For example, you could have data streamed in from live events or housekeeping data coming in real time from transactions, but the specific use case you are designing for might only involve visualization, analytical, or ML functionalities. So, make sure you understand what problem you are solving with the data that is being ingested in a specific mode and type.

The structure of data

The structure of the data is a crucial factor in deciding the choice and design of a database. There are three widely recognized categories:

  • Structured
  • Semi-structured
  • Unstructured

Let’s briefly explore these three categories.

Structured data

This type of data is typically composed of rows and columns; rows are entities or records and columns are attributes. Structured data is organized in such a way that you can be sure that the data structure will be consistent for the most part throughout the life cycle of that data, except for the possible addition or removal of some attributes altogether. This kind of data is mostly transactional or analytical.
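To make the row-and-column idea concrete, here is a small sketch that uses Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns, and the sample rows are assumptions made up for this example; they do not come from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema is declared once; every row must follow it.
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Asha", "Chennai"), (2, "Ben", "London")],
)

# Adding an attribute changes the structure for all rows at once.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
rows = conn.execute(
    "SELECT id, name, city, email FROM customers ORDER BY id"
).fetchall()
conn.close()
```

Note that existing rows receive `NULL` for the new `email` column: the structure stays consistent across all records even when an attribute is added.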

Semi-structured data

Semi-structured data does not follow a fixed tabular format – that is, a column-row structure. Instead, it stores schema attributes along with the data. The attributes of semi-structured data can vary from record to record. The major differentiating factor between the kinds of semi-structured data is the way each is accessed.
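JSON is one common way to carry schema attributes along with the data. The sketch below uses two made-up records whose attributes differ; the field names and values are hypothetical and serve only to show that each record describes its own structure.

```python
import json

# Two made-up records: the attribute names travel with each record.
raw_records = [
    '{"id": 1, "name": "Asha", "email": "asha@example.com"}',
    '{"id": 2, "name": "Ben", "phones": ["555-0100", "555-0101"]}',
]

records = [json.loads(line) for line in raw_records]

# Each record has its own set of attributes.
attribute_sets = [sorted(record.keys()) for record in records]

# The union of attributes is only known after reading the data itself.
all_attributes = sorted(set().union(*(record.keys() for record in records)))
```

Unlike a table, there is no single schema declared up front: the second record has a `phones` list and no `email`, and both records are equally valid.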

Unstructured data

Unstructured data includes images, audio files, and so on. Unstructured data does not have a definite schema or data model. The amount of unstructured data is much larger than that of structured data. So, the methods by which we store such data are more important than ever. Here are some examples of unstructured data:

  • Text
  • Audio
  • Video
  • Images
  • Other binary large objects (BLOBs)

Now that we have had a sneak peek into the structure of data, be sure to include functional and design questions based on these categories while designing your database and application model.
