Apache Hive Essentials: Essential techniques to help you process, and get unique insights from, big data

Product type Paperback
Published in Jun 2018
Publisher Packt
ISBN-13 9781788995092
Length 210 pages
Edition 2nd Edition
Author (1):
Dayong Du
Table of Contents

Preface
1. Overview of Big Data and Hive
2. Setting Up the Hive Environment
3. Data Definition and Description
4. Data Correlation and Scope
5. Data Manipulation
6. Data Aggregation and Sampling
7. Performance Considerations
8. Extensibility Considerations
9. Security Considerations
10. Working with Other Tools
11. Other Books You May Enjoy

A short history

In the 1960s, as computers became a cost-effective option for businesses, people started to use databases to manage data. In the 1970s, relational databases grew popular because they mapped physically stored data closely and easily onto the logical business model. In the following decade, Structured Query Language (SQL) became the standard query language for databases. The effectiveness and simplicity of SQL encouraged many people to adopt databases and made them accessible to a wide range of users and developers. For a long time afterward, databases remained the main tool for building data applications and managing data.

Once plenty of data had been collected, people started to think about how to deal with historical data. The term data warehousing emerged in the 1990s, and from then on people discussed how to evaluate current performance by reviewing historical data. Various data models and tools were created to help enterprises effectively manage, transform, and analyze their historical data. Traditional relational databases also evolved to provide more advanced aggregation and analytic functions, as well as optimizations for data warehousing. The leading query language was still SQL, but it had become more intuitive and powerful than in its earlier versions. The data was still well structured and the model was normalized.

As we entered the 2000s, the internet gradually became the leading source of data in terms of both variety and volume. Newer technologies, such as social media analytics, web mining, and data visualization, helped many businesses process massive amounts of data to better understand their customers, products, competition, and markets. Data volumes grew and data formats changed faster than ever before, which forced people to search for new solutions, especially in the research and open source communities. As a result, big data became a hot topic and a challenging field for many researchers and companies.

However, in every challenge there lies great opportunity. In the 2010s, Hadoop, one of the open source big data projects, started to gain wide attention because of its open source license, active community, and ability to handle large volumes of data. This was one of the few times that an open source project drove a change in technology trends ahead of any commercial software product. Soon after, NoSQL databases, real-time analytics, and machine learning followed and quickly became important components on top of the Hadoop big data ecosystem. Armed with these big data technologies, companies were able to review the past, evaluate the present, and grasp future opportunities.
