You're reading from AWS for Solutions Architects The definitive guide to AWS Solutions Architecture for migrating to, building, scaling, and succeeding in the cloud

Product type Paperback

Published in Apr 2023

Publisher Packt

ISBN-13 9781803238951

Length 692 pages

Edition 2nd Edition

Authors (4):

Neelanjali Srivastav

Saurabh Shrivastava

Alberto Artasanchez

Imtiaz Sayed


Data bucketing

Another way to organize data is to divide it into buckets within a single partition. With bucketing, the values of one or more columns are used to group rows together and "bucket", or categorize, them. The best columns to bucket on are those that queries frequently use as filters: when a query filters on a bucketed column, only the buckets that can contain matching rows need to be read, so less data is scanned.
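The effect described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Athena implements bucketing: the bucket count, the `employee`-style rows, and the helper names (`bucket_for`, `write_buckets`, `lookup`) are all hypothetical, and SHA-256 is used as a stand-in for whatever hash function the query engine actually applies.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count chosen for the illustration


def bucket_for(value: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a column value to a bucket number with a stable hash.

    SHA-256 is a stand-in here; Python's built-in hash() is randomized
    per process for strings, so it would not give repeatable buckets.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def write_buckets(rows, column, num_buckets=NUM_BUCKETS):
    """Group rows by the bucket of the chosen column (one list per bucket)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_for(row[column], num_buckets)].append(row)
    return buckets


def lookup(buckets, column, value, num_buckets=NUM_BUCKETS):
    """Filter on the bucketed column by reading only the one relevant bucket."""
    candidate = buckets.get(bucket_for(value, num_buckets), [])
    matches = [row for row in candidate if row[column] == value]
    return matches, len(candidate)


# Hypothetical sample data: 1,000 rows keyed by a unique id.
rows = [{"id": f"emp-{i}", "name": f"name-{i}"} for i in range(1000)]
buckets = write_buckets(rows, "id")

matches, scanned = lookup(buckets, "id", "emp-42")
print(f"matched {len(matches)} row(s) after scanning {scanned} of {len(rows)} rows")
```

Because the filter value hashes to exactly one bucket, the lookup reads roughly a quarter of the rows in this sketch instead of all of them. A filter on a column that was not used for bucketing would get no such benefit and would have to read every bucket.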

Another characteristic that makes a column a good candidate for bucketing is high cardinality, that is, a large number of unique values. Primary key columns are therefore ideal bucketing columns.
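To see why cardinality matters, the following sketch compares how evenly rows spread across buckets for a low-cardinality column and a high-cardinality one. Everything here is assumed for illustration: the department names, the row count, the bucket count, and the use of SHA-256 in place of the engine's real hash function.

```python
import hashlib
from collections import Counter


def bucket_for(value: str, num_buckets: int) -> int:
    """Stable stand-in hash: SHA-256 of the value, reduced modulo the bucket count."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def bucket_sizes(values, num_buckets):
    """Return the number of rows that land in each bucket, in bucket order."""
    counts = Counter(bucket_for(v, num_buckets) for v in values)
    return [counts.get(b, 0) for b in range(num_buckets)]


NUM_BUCKETS = 8  # hypothetical bucket count

# Low cardinality: only three distinct values across 1,200 rows.
departments = ["sales", "engineering", "finance"]
low_card = [departments[i % 3] for i in range(1200)]

# High cardinality: every row has a unique value, like a primary key.
high_card = [f"emp-{i}" for i in range(1200)]

low_sizes = bucket_sizes(low_card, NUM_BUCKETS)
high_sizes = bucket_sizes(high_card, NUM_BUCKETS)

print("low cardinality :", low_sizes)
print("high cardinality:", high_sizes)
```

With only three distinct values, at most three of the eight buckets can ever receive rows, so most buckets stay empty and the occupied ones are large. The unique ids spread across all buckets, which keeps each bucket small and makes reading a single bucket cheap.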

Amazon Athena lets you specify which columns to bucket at table creation time by using the CLUSTERED BY clause. An example of a table creation statement that uses this clause follows:

CREATE EXTERNAL TABLE employee (
id string,
name string,
salary double,
address string,
timestamp bigint)
PARTITIONED BY (
timestamp string...
