Search icon CANCEL
Subscription
0
Cart icon
Cart
Close icon
You have no products in your basket yet
Save more on your purchases!
Savings automatically calculated. No voucher code required
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Cloud Analytics with Microsoft Azure. - Second Edition

You're reading from  Cloud Analytics with Microsoft Azure. - Second Edition

Product type Book
Published in Jan 2021
Publisher Packt
ISBN-13 9781800202436
Pages 184 pages
Edition 2nd Edition
Languages
Authors (3):
Has Altaiar Has Altaiar
Profile icon Has Altaiar
Jack Lee Jack Lee
Profile icon Jack Lee
Michael John Peña Michael John Peña
Profile icon Michael John Peña
View More author details
Toc

Big data analytics

The term "big data" is often used to describe massive volumes of data that traditional tools cannot handle. It can be characterized by the five Vs:

  • Volume: This indicates the volume of data that needs to be analyzed for big data analytics. We are now dealing with larger datasets than ever before. This has been made possible because of the availability of electronic products such as mobile devices and IoT sensors that have been widely adopted all over the globe for commercial purposes.
  • Velocity: This refers to the rate at which data is being generated. Devices and platforms, such as those just mentioned, constantly produce data on a large scale and at rapid speed. This makes collecting, processing, analyzing, and serving data at rapid speeds necessary.
  • Variety: This refers to the structure of data being produced. Data sources are inconsistent, having a mix of structured, unstructured, and some semi-structured data (you will learn more about this in the Bringing your data together section).
  • Value: This refers to the value of the data being extracted. Accessible data may not always be valuable. With the right tools, you can derive value from the data in a cost-effective and scalable way.
  • Veracity: This is the quality or trustworthiness of data. A raw dataset will usually contain a lot of noise (or data that needs cleaning) and bias and will need cleaning. Having a large dataset is not useful if most of the data is not accurate.

Big data analytics is the process of finding patterns, trends, and correlations in unstructured data to derive meaningful insights that shape business decisions. This unstructured data is usually large in file size (images, videos, and social graphs, for instance).

This does not mean that relational databases are not relevant for big data. In fact, modern data warehouse platforms such as Azure Synapse Analytics (formerly known as Azure SQL Data Warehouse) support structured and semi-structured data (such as JSON) and can infinitely scale to support terabytes to petabytes of data. Using Microsoft Azure, you have the flexibility to choose any platform. These technologies can complement each other to achieve a robust data analytics pipeline.

Here are some of the best use cases of big data analytics:

  • Social media analysis: Through social media sites such as Twitter, Facebook, and Instagram, companies can learn what customers are saying about their products and services. Social media analysis helps companies to target their audiences by utilizing user preferences and market trends. The challenges here are the massive amount of data and the unstructured nature of tweets and posts.
  • Fraud prevention: This is one of the most familiar use cases of big data. One of the prominent features of big data analytics when used for fraud prevention is the ability to detect anomalies in a dataset. Validating credit card transactions by understanding transaction patterns such as location data and categories of purchased items is an example of this. The biggest challenge here is ensuring that the AI/ML models are clean and unbiased. There might be a chance that the model was trained just for a specific parameter, such as a user's country of origin, hence the model will focus on determining patterns on just the user's location and might miss out on other parameters.
  • Price optimization: Using big data analytics, you can predict what price points will yield the best results based on historical market data. This allows companies to ensure that they do not price their items too high or too low. The challenge here is that many factors can affect prices. Focusing on just a specific factor, such as a competitor's price, might eventually train your model to just focus on that area, and may disregard other factors such as weather and traffic data.

Big data for businesses and enterprises is usually accompanied by the concept of having an IoT infrastructure, where hundreds, thousands, or even millions of devices are connected to a network that constantly sends data to a server.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime