Data Engineering Best Practices

You're reading from Data Engineering Best Practices: Architect robust and cost-effective data solutions in the cloud era

Product type Paperback
Published in Oct 2024
Publisher Packt
ISBN-13 9781803244983
Length 550 pages
Edition 1st Edition
Authors (2): David Larochelle, Richard J. Schiller

Table of Contents (21)

Preface
1. Chapter 1: Overview of the Business Problem Statement
2. Chapter 2: A Data Engineer’s Journey – Background Challenges
3. Chapter 3: A Data Engineer’s Journey – IT’s Vision and Mission
4. Chapter 4: Architecture Principles
5. Chapter 5: Architecture Framework – Conceptual Architecture Best Practices
6. Chapter 6: Architecture Framework – Logical Architecture Best Practices
7. Chapter 7: Architecture Framework – Physical Architecture Best Practices
8. Chapter 8: Software Engineering Best Practice Considerations
9. Chapter 9: Key Considerations for Agile SDLC Best Practices
10. Chapter 10: Key Considerations for Quality Testing Best Practices
11. Chapter 11: Key Considerations for IT Operational Service Best Practices
12. Chapter 12: Key Considerations for Data Service Best Practices
13. Chapter 13: Key Considerations for Management Best Practices
14. Chapter 14: Key Considerations for Data Delivery Best Practices
15. Chapter 15: Other Considerations – Measures, Calculations, Restatements, and Data Science Best Practices
16. Chapter 16: Machine Learning Pipeline Best Practices and Processes
17. Chapter 17: Takeaway Summary – Putting It All Together
18. Chapter 18: Appendix and Use Cases
19. Index
20. Other Books You May Enjoy

Summary

In this overview of the business problem, you have learned a number of foundational elements that will be elaborated on in subsequent chapters. This chapter introduced the topics needed to understand the current state of data engineering and the creation of future-proof designs. You have learned that businesses face an ever-changing technological landscape, and that competition requires them to innovate at scale to remain relevant. The result is a constant stream of total-cost-of-ownership (TCO) budget allocations for refactoring and re-envisioning during what would normally be the run/manage phase of a system’s lifespan.

In this chapter, and in subsequent chapters, we make many references to the engineering solution’s TCO. These references remind all stakeholders that solutions are developed within a real-world business setting. They are not created in some abstract vacuum, devoid of the budgeting constraints that will, at times, limit possibilities. It is important to note that when the TCO is clear, yet constrained by budgets, these constraints appear repeatedly on the monthly and quarterly radar reports presented to the enterprise. The constraints will most likely have introduced risk, and without a constant stream of reminders, the business will forget how they have affected the solution.

Additionally, building a system that perpetuates falsehoods, even when they are presented as facts, is foolish. Make the future data solution smart! We are entering an exciting future in which data and information solutions will become smarter and will support knowledge and intelligence capabilities. Embrace the change and understand its implications for your data engineering choices. Data professionals need to adopt DataOps as a critical approach to managing data in today’s complex, data-driven world.

One size does not fit all. Building with data contracts in mind will therefore force the same logical data to be instantiated in the physical data architecture as parallel, fit-for-purpose data stores. Building data solutions that are future-proof requires a vision, a strategy, a mission, and an architectural approach; without them, the implementation risks dying an untimely death from the juggling needed to make the solution serviceable for the business.

Third-party vendors and cloud providers will produce well-architected solutions that do not integrate or, worse yet, that foster architectural anti-patterns that must be avoided. As such, the data mesh and the cloud provider’s data fabrics are only buzzwords until the concepts are fully understood and rationalized into your architecture and your organization’s objectives. Design data solutions consistent with the architecture you develop, develop use cases across the system, and test, regression test, and monitor them for continual service in order to maintain the trust established through data contracts.
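
To make the idea of testing against a data contract more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' method: the `FieldRule` class, the `ORDERS_CONTRACT` definition, the `contract_violations` function, and the sample "orders" records are all hypothetical names invented for this example. It checks only field presence and type, which is a small part of what a real data contract would cover.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldRule:
    """One field of a hypothetical data contract."""
    name: str
    dtype: type
    required: bool = True


# Hypothetical contract for an illustrative "orders" data set.
ORDERS_CONTRACT = (
    FieldRule("order_id", str),
    FieldRule("amount", float),
    FieldRule("coupon_code", str, required=False),
)


def contract_violations(
    record: dict[str, Any], contract: tuple[FieldRule, ...]
) -> list[str]:
    """Return readable violations; an empty list means the record conforms."""
    problems = []
    for rule in contract:
        value = record.get(rule.name)
        if value is None:
            # A missing optional field is allowed; a missing required one is not.
            if rule.required:
                problems.append(f"missing required field: {rule.name}")
            continue
        if not isinstance(value, rule.dtype):
            problems.append(
                f"{rule.name}: expected {rule.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


# Illustrative records: one conforming, one violating the contract.
good = {"order_id": "A-1", "amount": 19.99}
bad = {"order_id": 42, "coupon_code": "SPRING"}

print(contract_violations(good, ORDERS_CONTRACT))
print(contract_violations(bad, ORDERS_CONTRACT))
```

A check like this could run in a regression test suite and again as a monitoring step on live data, so that a producer-side change that breaks the contract is caught before consumers lose trust in the data.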

Lastly, stay agile! Read! Learn! Be innovative! Once you grasp the big picture, a forward-looking perspective will give you the foresight to see beyond the obstacles you will encounter. You will be able to keep the data solution and its data fresh and current with a governed, agile architectural process.

In the next chapter, you will be presented with the architectural background challenges that build on this overview.
