Solutions Architect's Handbook

Kick-start your solutions architect career by learning architecture design principles and strategies

Product type: Paperback
Published in: Mar 2020
Publisher: Packt
ISBN-13: 9781838645649
Length: 490 pages
Edition: 1st Edition
Authors (2):
Neelanjali Srivastav
Saurabh Shrivastava
Table of Contents (18 chapters)

Preface
1. The Meaning of Solution Architecture
2. Solution Architects in an Organization
3. Attributes of the Solution Architecture
4. Principles of Solution Architecture Design
5. Cloud Migration and Hybrid Cloud Architecture Design
6. Solution Architecture Design Patterns
7. Performance Considerations
8. Security Considerations
9. Architectural Reliability Considerations
10. Operational Excellence Considerations
11. Cost Considerations
12. DevOps and Solution Architecture Framework
13. Data Engineering and Machine Learning
14. Architecting Legacy Systems
15. Solution Architecture Document
16. Learning Soft Skills to Become a Better Solution Architect
17. Other Books You May Enjoy

Unstructured data stores

At first glance, Hadoop looks like a perfect fit for an unstructured data store: it is scalable, extensible, and very flexible. It can run on commodity hardware, has a vast ecosystem of tools, and appears to be cost-effective to run. Hadoop uses a master-and-child-node model: data is distributed across multiple child nodes, and the master node coordinates the jobs that run queries on that data. The Hadoop system is based on massively parallel processing (MPP), which makes it fast to query all types of data, whether structured or unstructured.
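The master-and-child-node model described above can be illustrated with a small toy sketch. It is not Hadoop code: the `MasterNode` and `ChildNode` classes, the round-robin placement, and the word-count query are all assumptions made for illustration, and a real cluster adds block replication, scheduling, and fault tolerance that are omitted here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class ChildNode:
    """Hypothetical child node: holds one partition and processes it locally."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def count_words(self):
        # Each child works only on the records stored on it.
        counts = Counter()
        for line in self.records:
            counts.update(line.lower().split())
        return counts


class MasterNode:
    """Hypothetical master node: distributes data and coordinates the job."""

    def __init__(self, children):
        self.children = children

    def load(self, records):
        # Assumed placement policy: simple round-robin across child nodes.
        for i, record in enumerate(records):
            self.children[i % len(self.children)].records.append(record)

    def run_word_count(self):
        # Run the same task on every child in parallel, then merge the
        # partial results on the master.
        with ThreadPoolExecutor(max_workers=len(self.children)) as pool:
            partials = list(pool.map(ChildNode.count_words, self.children))
        total = Counter()
        for partial in partials:
            total.update(partial)
        return total


master = MasterNode([ChildNode(f"child-{i}") for i in range(3)])
master.load([
    "error disk full",
    "info job started",
    "error network timeout",
    "info job finished",
])
result = master.run_word_count()
print(result["error"], result["job"])
```

The point of the sketch is the division of labor: the master never scans the data itself; it only assigns work and combines the partial results that each child computes over its own partition.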

When a Hadoop cluster is created, each child node created from the server comes with a block of attached disk storage, called a local Hadoop Distributed File System (HDFS) disk store. You can query the stored data using common processing frameworks such as Hive, Pig, and Spark. However, data on the local disk persists only for the life of the associated...
