Technical aspect
Whether you are engineering or architecting, ask the right data questions. The question of who is responsible for what always comes up where data is concerned. As a data architect, you manage the business and technology requirements around the architecture, take part in designing data extraction, transformation, and loading, and direct the team on methods of organizing, formatting, and presenting data. These responsibilities are what define the architect's role.
As an engineer, you create applications and develop solutions that make data available for distribution, processing, and analysis, and you take part in one or more of those activities directly.
In either case, you are an expert. You need to ask the right questions and set the right expectations as you approach the technical aspects of data. The following questions will not always lead you to the best solution, but they will help you get started and rule out obvious mismatches early:
- Volume and scalability: Volume is the amount of data you need to store and process in a given period. Scalability is how much you expect your data to grow in the foreseeable future. Here are some questions you can ask in this area:
  - What is the size of the data you will be dealing with at design time and at each stage of the data life cycle?
  - How much do you expect it to scale with time?
- Velocity: This is the rate at which data is transferred or processed by your application. Some questions that you can ask concerning the velocity of data are as follows:
  - What is the rate/schedule at which the data needs to be sent and processed?
  - If you are ingesting, processing, and writing into storage, do you need to match the velocity at each of these stages?
- Variety: Variety is the amount of variation you can expect in the data structure and attributes:
  - What variation do you expect to see in the incoming data?
- Security: Access control, encryption, and security are key considerations at the database design stage:
  - How much access restriction (row-level, object-level, and fine-grained levels of access control), encryption, privacy, and compliance does your data need?
  - What kind of encryption is required for the data?
  - Do you need to design views based on the type of data and access control required for your data?
Other common areas of design consideration are availability, resilience, reliability, and portability.
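The volume, scalability, and velocity questions above lend themselves to a quick back-of-the-envelope check. The following is a minimal sketch in Python, not a sizing method prescribed here: it assumes compound monthly growth for the volume projection and treats the slowest stage as the cap on pipeline velocity. The function names, the `Stage` class, and every number are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def projected_volume_gb(current_gb: float, monthly_growth_rate: float, months: int) -> float:
    """Project data volume assuming compound growth: current * (1 + rate) ** months."""
    return current_gb * (1 + monthly_growth_rate) ** months


@dataclass
class Stage:
    name: str
    records_per_second: float  # sustained throughput this stage can handle


def bottleneck(stages: list[Stage]) -> Stage:
    """Return the slowest stage; it caps the velocity of the whole pipeline."""
    return min(stages, key=lambda s: s.records_per_second)


# Hypothetical figures, for illustration only.
size_in_two_years = projected_volume_gb(current_gb=500, monthly_growth_rate=0.05, months=24)

pipeline = [
    Stage("ingest", 10_000),
    Stage("process", 4_000),
    Stage("write", 6_000),
]
slowest = bottleneck(pipeline)

print(f"Projected volume after 24 months: {size_in_two_years:.0f} GB")
print(f"Pipeline velocity is capped by '{slowest.name}' "
      f"at {slowest.records_per_second:.0f} records/s")
```

With these made-up numbers, the processing stage is the one to revisit if ingestion is expected to run at full rate. Real growth is rarely a clean compound curve, so treat the projection as a starting estimate to refine with measured data.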
- Data retrieval: Data can be retrieved in many different modes, depending on your use case. The design aspects to keep in mind concerning retrieval are as follows:
  - What is the volume of the data being read?
  - What is the volume of the data being written?
  - What is the frequency of reads?
  - What is the frequency of writes?
This is a key technical aspect to address at design time. When it is skipped, engineers and architects often run into performance challenges and have to go back and reassess their foundational architecture and configuration at a much later stage of development.
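One way to capture the answers to the four retrieval questions is a small access profile recorded alongside the design. The sketch below is an illustrative assumption rather than an established method: the `AccessProfile` class, its field names, and the cutoff of 3 used to label a workload are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessProfile:
    reads_per_second: float   # frequency of reads
    writes_per_second: float  # frequency of writes
    avg_read_kb: float        # average size of one read
    avg_write_kb: float       # average size of one write

    def read_throughput_kb(self) -> float:
        """Volume of data read per second, in KB."""
        return self.reads_per_second * self.avg_read_kb

    def write_throughput_kb(self) -> float:
        """Volume of data written per second, in KB."""
        return self.writes_per_second * self.avg_write_kb

    def classify(self, threshold: float = 3.0) -> str:
        """Label the workload by comparing operation rates.

        The threshold is an arbitrary illustrative cutoff, not a standard value.
        """
        if self.reads_per_second == 0 and self.writes_per_second == 0:
            return "idle"
        if self.reads_per_second >= threshold * self.writes_per_second:
            return "read-heavy"
        if self.writes_per_second >= threshold * self.reads_per_second:
            return "write-heavy"
        return "mixed"


# Hypothetical workload, for illustration only.
profile = AccessProfile(reads_per_second=2000, writes_per_second=100,
                        avg_read_kb=4, avg_write_kb=16)
print(profile.classify())
print(profile.read_throughput_kb(), profile.write_throughput_kb())
```

Writing the profile down early gives the team a concrete baseline to test the chosen storage layout and configuration against before development matures.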