Scaling Apache Solr

Optimize your searches using high-performance enterprise search repositories with Apache Solr

Product type Paperback

Published in Jul 2014

Publisher Packt Publishing

ISBN-13 9781783981748

Length 298 pages

Edition 1st Edition

Languages

Java

Tools

Solr

Concepts

SEO

Author (1):

Hrishikesh Vijay Karambelkar

Table of Contents (13) Chapters

Preface

1. Understanding Apache Solr

2. Getting Started with Apache Solr

3. Analyzing Data with Apache Solr

4. Designing Enterprise Search

5. Integrating Apache Solr

6. Distributed Search Using Apache Solr

7. Scaling Solr through Sharding, Fault Tolerance, and Integration

8. Scaling Solr through High Performance

9. Solr and Cloud Computing

10. Scaling Solr Capabilities with Big Data

A. Sample Configuration for Apache Solr

Index

Working with rich documents

In the last section, we saw that Apache Solr has built-in handlers for the CSV, JSON, and XML formats. In an organization's content management system, however, a data item may reside in documents of other formats, such as PDF, DOC, PPT, and XLS. The biggest challenge with these formats is that they are all semi-structured. Apache Solr can extract information from many of these formats directly, thanks to Apache Tika. Solr uses code from the Apache Tika project to provide a framework for incorporating many different file-format parsers, such as Apache PDFBox and Apache POI, into Solr itself.

Note

The framework that extracts content from different data sources in Apache Solr is called the Solr Content Extraction Library (Solr CEL), also written solr-cell or, more commonly, Solr Cell.
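As a rough illustration of how a rich document might be handed to Solr Cell, the sketch below builds a request URL for the extracting handler and posts a file to it. It assumes the handler is registered at `/update/extract` (as in Solr's example configuration) and that the unique key field is `id`; the host, port, core name `collection1`, and file name are hypothetical placeholders, and the helper names are illustrative only, not part of Solr.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(base_url, core, doc_id, commit=True):
    """Build the URL for Solr's extracting handler (Solr Cell).

    Assumes the handler is mapped to /update/extract and that the
    schema's unique key field is named "id".
    """
    # literal.<field> sets a field value directly on the indexed document.
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return f"{base_url.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


def post_rich_document(url, path, content_type="application/pdf"):
    """POST the raw file bytes to the handler; Tika parses them server-side."""
    with open(path, "rb") as handle:
        body = handle.read()
    request = Request(url, data=body, headers={"Content-Type": content_type})
    with urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical local Solr instance and core name.
    url = build_extract_url("http://localhost:8983/solr", "collection1", "doc1")
    print(url)
    # Uncomment when a Solr server is running and report.pdf exists:
    # print(post_rich_document(url, "report.pdf"))
```

The document is sent as raw bytes so that parsing happens inside Solr rather than in the client; only the identifier is supplied by the caller.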

Understanding Apache Tika

Apache Tika is a SAX-based parser for extracting the metadata from different types of documents. Apache Tika uses the...

Authors (1)

Vijay Karambelkar

Hrishikesh Vijay Karambelkar is an innovator and an enterprise architect with 16 years of software design and development experience, specifically in the areas of big data, enterprise search, data analytics, text mining, and databases. He is passionate about architecting new software implementations for the next generation of software solutions for various industries, including oil and gas, chemicals, manufacturing, utilities, healthcare, and government infrastructure. In the past, he has authored three books for Packt Publishing: two editions of Scaling Big Data with Hadoop and Solr and one of Scaling Apache Solr. He has also worked with graph databases, and some of his work has been published at international conferences such as VLDB and ICDE.