NoSQL Data Models

You're reading from   NoSQL Data Models Addresses severe issues related to NoSQL data models

Product type Paperback
Published in Aug 2018
Publisher Wiley
ISBN-13 9781786303646
Length 278 pages
Edition 1st Edition
Author (1): Olivier Pivert
Table of Contents (11 chapters)

Preface
1 NoSQL Languages and Systems
2 Distributed SPARQL Query Processing: a Case Study with Apache Spark
3 Doing Web Data: from Dataset Recommendation to Data Linking
4 Big Data Integration in Cloud Environments: Requirements, Solutions and Challenges
5 Querying RDF Data: a Multigraph-based Approach
6 Fuzzy Preference Queries to NoSQL Graph Databases
7 Relevant Filtering in a Distributed Content-based Publish/Subscribe System
List of Authors
Index
End User License Agreement

2.4. SPARQL and MapReduce

The features expected from modern RDF triple stores are reminiscent of the Big Data trend in which solutions implementing specialized data stores from scratch are rare due to the enormous development effort they require. Instead, many RDF triple stores prefer to rely on existing infrastructures based on MapReduce [DEA 04] and clusters of distributed data and computation nodes for achieving efficient parallel processing over massively distributed data sets (see section 2.4.2.1). However, these cluster infrastructures are not designed as fully-fledged data management systems [STO 10] and integrating an efficient query processor on top of them is a challenging task. In particular, data storage and communication costs generated by the evaluation of joins (including data preprocessing and indexing) over distributed data need to be addressed cautiously. This section mainly reflects the work published in [NAA 17, NAA 16].
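
To make the join-cost concern concrete, here is a minimal single-process sketch of how a MapReduce-style (repartition) join of two triple patterns could be evaluated. It is an illustration under stated assumptions, not the method of [NAA 17, NAA 16]: the toy triples, the predicate names knows and worksAt, and the helper names map_phase, shuffle, reduce_phase and run_join are all hypothetical. The shuffle counter stands in for the communication cost, since every mapped record would be sent across the cluster to the node responsible for its join key.

```python
from collections import defaultdict

# Toy RDF data set of (subject, predicate, object) triples (hypothetical values).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "initech"),
]


def map_phase(triples):
    """Emit (join_key, tagged_value) pairs for the patterns
    ?x knows ?y  and  ?y worksAt ?z, which join on ?y."""
    for s, p, o in triples:
        if p == "knows":
            yield o, ("knows", s)      # ?y is the object of "knows"
        elif p == "worksAt":
            yield s, ("worksAt", o)    # ?y is the subject of "worksAt"


def shuffle(pairs):
    """Group mapped pairs by join key and count them.
    In a real cluster, each counted pair is a record moved over the network."""
    groups = defaultdict(list)
    shuffled = 0
    for key, value in pairs:
        groups[key].append(value)
        shuffled += 1
    return groups, shuffled


def reduce_phase(groups):
    """For each join key ?y, combine every ?x with every ?z."""
    for y, values in groups.items():
        xs = [v for tag, v in values if tag == "knows"]
        zs = [v for tag, v in values if tag == "worksAt"]
        for x in xs:
            for z in zs:
                yield (x, y, z)


def run_join(triples):
    """Run map, shuffle and reduce; return sorted bindings and shuffle count."""
    groups, shuffled = shuffle(map_phase(triples))
    return sorted(reduce_phase(groups)), shuffled


if __name__ == "__main__":
    bindings, shuffled = run_join(TRIPLES)
    print(bindings, shuffled)
```

In this sketch all four matching triples are shuffled even though only one (?x, ?y, ?z) binding survives the join, which is the kind of overhead that preprocessing and indexing aim to reduce.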

2.4.1. MapReduce-based SPARQL processing

Given...
