Scaling Big Data with Hadoop and Solr, Second Edition

Understand, design, build, and optimize your big data search engine with Hadoop and Apache Solr

Product type Paperback
Published in Apr 2015
Publisher
ISBN-13 9781783553396
Length 166 pages
Edition 1st Edition
Author (1):
Hrishikesh Vijay Karambelkar

Loading data in Apache Solr

Once Apache Solr is configured, the next step is to load data into it and run queries. There are several ways to load data into Apache Solr; the following diagram depicts the most commonly used ones:

[Diagram: Loading data in Apache Solr]

We have already seen the simple post tool while setting up Apache Solr. Next, we will look at the Extracting Request Handler.

Extracting request handler – Solr Cell

Solr Cell is one of the most powerful handlers for uploading data in many different formats. It is particularly useful if you wish to run Solr on a set of files or unstructured data in formats such as office documents, PDFs, eBooks, emails, and plain text. Solr Cell relies on Apache Tika to extract text, and in Apache Tika, extraction is based purely on file type and content. So, if you have a PDF of scanned images containing text, Apache Tika won't be able to extract any of that text. In such cases, you need OCR-based software to add this capability to Solr. You can simply try this by downloading the curl utility and then...
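
To make the upload step more concrete, here is a minimal Python sketch of the kind of HTTP request the extracting request handler accepts. It is not the book's curl command, which is cut off above. It only builds the request, using the handler's `/update/extract` path and the `literal.id` and `commit` parameters. The host, the core name `collection1`, the document ID, and the helper name `build_extract_request` are assumptions chosen for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(base_url, core, doc_id, file_bytes, content_type):
    """Build (but do not send) a POST request for Solr's /update/extract handler.

    base_url and core are assumptions about the local installation,
    for example "http://localhost:8983/solr" and "collection1".
    """
    params = urlencode({
        "literal.id": doc_id,  # value stored in the document's unique key field
        "commit": "true",      # ask Solr to commit so the document is searchable
    })
    url = f"{base_url}/{core}/update/extract?{params}"
    # The raw file content is the request body; Tika parses it on the server.
    return Request(
        url,
        data=file_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the file to a running Solr instance. Keeping construction separate from sending makes it easy to inspect the URL and parameters first.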
