Every solution design revolves around data: storing it, updating it, and accessing it, whether the data describes customers or products. As internet adoption grows, so do data volumes and the need for data architects. Over the last decade, data has grown exponentially. Not long ago, gigabytes of data were considered big data; now even 100 terabytes is deemed normal, and a single computer hard disk can hold 1 terabyte.
Traditionally, data was stored in a structured, relational form. Now, most data is unstructured and is generated by sources such as social media, the Internet of Things (IoT), and application logs. Organizations need to store, process, and analyze this data to gain useful insights, and this is where the data architect role comes into the picture.
The data architect defines the rules, policies, standards, and models that govern which data is collected and how it is used in the organization's databases. They design, create, and manage the organization's data architecture. A data architect develops data models and data lake designs to capture the business's key performance indicators (KPIs) and enable data transformation. They also ensure consistent data performance and data quality across the organization.
The primary customers for a data architect are as follows:
- Business executives using Business Intelligence (BI) tools for data visualization
- Business analysts using a data warehouse to get more data insight
- Data engineers performing data wrangling using Extract, Transform, and Load (ETL) jobs
- Data scientists building machine learning models
- Development teams managing application data
To fulfill organizational needs, the data architect is responsible for the following:
- Selection of database technology
- A relational database schema for application development
- Data warehousing for data analysis and BI tools
- Data lake as the centralized datastore
- Data mart design
- Machine learning tools
- Data security and encryption
- Data compliance
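To make the relational schema responsibility more concrete, the following is a minimal sketch of how an application schema and a KPI query might look. It uses Python's built-in `sqlite3` module purely for illustration. The table names (`customers`, `products`, `orders`), the column names, the sample rows, and the revenue-per-customer KPI are all hypothetical assumptions and do not come from any particular system:

```python
import sqlite3


def build_schema(conn):
    """Create a small, hypothetical application schema."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE products (
            product_id       INTEGER PRIMARY KEY,
            name             TEXT NOT NULL,
            unit_price_cents INTEGER NOT NULL CHECK (unit_price_cents >= 0)
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            product_id  INTEGER NOT NULL REFERENCES products(product_id),
            quantity    INTEGER NOT NULL CHECK (quantity > 0)
        );
        """
    )


def load_sample_data(conn):
    """Insert a few made-up rows so the KPI query has something to aggregate."""
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(1, "Ada"), (2, "Grace")])
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)",
                     [(1, "Widget", 250), (2, "Gadget", 1000)])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 4), (2, 1, 2, 1), (3, 2, 1, 2)])
    conn.commit()


def revenue_per_customer(conn):
    """Example KPI: total revenue (in cents) per customer, ordered by name."""
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.quantity * p.unit_price_cents) AS revenue_cents
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        JOIN products  AS p ON p.product_id  = o.product_id
        GROUP BY c.customer_id
        ORDER BY c.name
        """
    )
    return rows.fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_schema(connection)
    load_sample_data(connection)
    for name, revenue_cents in revenue_per_customer(connection):
        print(name, revenue_cents)
```

In this sketch, the `NOT NULL`, `CHECK`, and foreign key constraints are one simple way to encode data quality rules in the schema itself. In a real organization, the choice of database engine, the data model, and the KPIs would depend on the workload and business requirements.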
You will learn more about data architectures in Chapter 13, Data Engineering and Machine Learning. Overall, the data architect needs to be familiar with different database technologies, BI tools, and data security and encryption options in order to select the right ones for the organization.