What this book covers
- Chapter 1, Overview of the Business Problem Statement, defines the business problem faced by the data engineer and introduces the rest of the book.
- Chapter 2, A Data Engineer’s Journey – Background Challenges, elaborates on the challenges faced when building a modern data system.
- Chapter 3, A Data Engineer’s Journey – IT’s Vision and Mission, illustrates various mission and vision statements and urges you to develop your own if none exist yet. This way, you can keep your focus on the end goal and not deviate from your strategy.
- Chapter 4, Architecture Principles, elaborates on the need to develop principles that keep you solidly grounded in reality. Many examples are provided and explained because they drive the best practices.
- Chapter 5, Architecture Framework – Conceptual Architecture Best Practices, depicts architecture as the framework for design engineering. Too often, projects go off the rails because the architecture shifts and the structure of the engineering design falls apart. Architecture is a communication tool for maintaining consensus, especially when things go wrong, as they always do in any engineering effort.
- Chapter 6, Architecture Framework – Logical Architecture Best Practices, describes the need to formally define and document the architecture for all, thus tying the conceptual level to the physical level of the architecture.
- Chapter 7, Architecture Framework – Physical Architecture Best Practices, defines what will be built, eventually what was built, and where it all operates.
- Chapter 8, Software Engineering Best Practice Considerations, elaborates on the software best practices needed for the data engineering effort to succeed.
- Chapter 9, Key Considerations for Agile SDLC Best Practices, discusses the project management and development processes needed to deliver a data solution.
- Chapter 10, Key Considerations for Quality Testing Best Practices, provides testing best practices for a data factory.
- Chapter 11, Key Considerations for IT Operational Service Best Practices, defines operational requirements for a data solution.
- Chapter 12, Key Considerations for Data Service Best Practices, elaborates on data services, where the focus is on refining raw data into a gem with facets, like a diamond, rather than servicing data as a blob. Examples are provided to illustrate this important message.
- Chapter 13, Key Considerations for Management Best Practices, gets into the details of data factory curation and processing with a focus on difficult problems to solve.
- Chapter 14, Key Considerations for Data Delivery Best Practices, continues Chapter 13’s theme but addresses difficult problem areas for a business and the impediments that the best practices presented can help overcome.
- Chapter 15, Other Considerations – Measures, Calculations, Restatements and Data Science Best Practices, defines the analysis workbench and the various tools and processes for the data consumer. These are what is needed to deliver data at the end of the data factory.
- Chapter 16, Machine Learning Pipeline Best Practices and Processes, dives deeper into machine learning/deep learning, Generative AI (GenAI), and ways to apply knowledge engineering cooperatively to address a future vision in which AI takes center stage.
- Chapter 17, Takeaway Summary – Putting It All Together, presents the book’s conclusion and parting wishes for the development of your future-proof data engineering designs.
- Chapter 18, Appendix and Use Cases, delivers on the promise to elaborate on a few high-level use cases with a primer on the technologies used in those use cases.