Scaling Apache Solr: Optimize your searches using high-performance enterprise search repositories with Apache Solr

Vijay Karambelkar

€18.99 per month

4.7 (3 Ratings)

Paperback Jul 2014 298 pages 1st Edition

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!

Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!

50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.

Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.

Thousands of reference materials covering every tech concept you need to stay up to date.

Description

This book is a step-by-step guide for readers who would like to learn how to build complete enterprise search solutions, with ample real-world examples and case studies. If you are a developer, designer, or architect who would like to build enterprise search solutions for your customers or organization, but have no prior knowledge of Apache Solr/Lucene technologies, this is the book for you.

What you will learn

Gain a complete understanding of Apache Solr and its ecosystem
Develop scalable, high-performance search applications using Apache Solr
Customize Apache Solr-based search for different requirements
Discover different techniques to build high-speed enterprise searches
Design enterprise-ready search engines and implement scalable enterprise search functionality
Integrate an Apache Solr-based search with different subsystems and legacy systems
Scale Apache Solr through sharding, replication, and fault tolerance
Learn about performance tuning for your Solr-based application while scaling your data
Make your enterprise search cloud-ready to be able to work with multiple clients

Sai Nov 18, 2014

With 10 chapters fitted into about 300 pages, the book is not only well written but, most importantly, well structured. I've always loved books that go a step further in demonstrating how the technology could be scaled for an enterprise problem, and this book does exactly that. In doing so, the author is mindful that he could have a naive audience testing the waters of Big Data. It starts by building a use case for Solr, then dives into the Solr architecture, rounding it off with use cases for Solr applications.

The book covers the following topics:

Installation and configuration
Data analysis with Apache Solr
Enterprise search design patterns
Solr integration with Java, CMS frameworks, PHP, JS, Drupal
Distributed search, leveraging IaaS and PaaS cloud services (Amazon Cloud, Windows Azure, etc.)
Scaling Solr using sharding, fault tolerance, MongoDB, Storm
Optimization and fine-tuning of Solr applications

Integration with Hadoop, Katta, Cassandra and R is covered in the final chapter of the book.

What I liked most about the book were the case studies presented with some of the chapters, which demonstrate real-world enterprise problems and the ways Solr was used to provide an effective solution to them.

All I can say is Great Job, Hrishikesh!

Amazon Verified review

A. Zubarev Oct 04, 2014

Reading Scaling Apache Solr by Hrishikesh Karambelkar turned out to be a great surprise. I had initially set my expectations low (for no apparent reason), but the book unfolded into something huge that I would have regretted passing by. It is extremely thoughtful and full of practical examples. I do not know whether to call it a cookbook or a handbook, but it is definitely a wonderful masterpiece!

The book may be a good source of wisdom or advice for new projects, and can even serve as a guide to resolving issues or improving poorly performing search applications.

Hrishikesh covers a wide (be warned, it is wide, in a good sense of the word) variety of topics in his work:

Data processing with Apache Solr (techniques)
Enterprise search design principles
Integration examples (Java, Drupal and more)
Distributed search, leveraging the Cloud
Making Solr scalable
Monitoring of Solr, including optimization
Integration with Big Data pillars such as Hadoop, Zookeeper, Katta
NoSQL: MongoDB and Cassandra
Data analysis with R

As a bonus, the author covers fixes to the most common pitfalls or errors.

Blew my expectations! THIS IS the book you want to keep on your bookshelf and electronic media.

After reading this book you can be assured to sail without fear the high waters of the Big Data ocean!

5 stars+ out of five!

Tomasz Sobczak Oct 13, 2014

We live in a world flooded by data and information, and we all realize that if we can’t find what we’re looking for (e.g. a specific document), there’s no benefit from all these data stores. When your data sets become enormous or your systems need to process thousands of messages a second, you need an environment that is efficient, tunable and ready for scaling. We all need well-designed search technology.

A few days ago, a book called "Scaling Apache Solr" landed on my desk. The author, Hrishikesh Vijay Karambelkar, has written an extremely useful guide to one of the most popular open-source search platforms, Apache Solr. Solr is a full-text, standalone, Java search engine based on Lucene, another successful Apache project. For people working with Solr, like myself, this book should be on their Christmas shopping list! It’s one of the best on this subject.

Karambelkar is an enterprise architect with a long history in both commercial products and open source technology. As he says, he currently spends most of his time solving problems for the software industry and developing the next generation of products.

The book is divided into 10 chapters. Basically, the first three are an introduction to Apache Solr and cover its architecture, features, configuration and setup. Chapter One contains many practical cases of Apache Solr, to help beginners understand the topic.

Chapter Four is very interesting and describes common patterns for enterprise search solutions. These patterns focus on data processing/integration and how to meet the requirements of users (interface, relevancy, general experience).

The rest of the book mainly addresses the central topic, that is, distributing search queries and how to scale/optimize a system. The book discusses all Apache Solr concepts like replication, fault tolerance and sharding, and illustrates them with helpful examples. The book precisely explains SolrCloud, a bundle of built-in distributed capabilities available from version 4.0.

Chapter 8, dedicated to optimization, drew my attention. It is full of useful tips concerning JVM parameters as well as manipulating data structures and caching layers.

"Scaling Apache Solr" covers both basic and advanced subjects. The information is well organised, clear and concise. Lots of examples and cases in this book can be absorbed by beginners. I was nicely surprised by the chapter describing integration possibilities. There’s some great information about using Solr with Cassandra, the MapReduce paradigm or R (a programming language for computational statistics), although I would have preferred this subject to be covered in more detail. The book has two more advantages: first, it discusses designing an enterprise search system in general terms and second, it can be treated as an introduction to large-volume data processing.

I believe I need to emphasize that many sections related to defining a schema, importing data, running SolrCloud or searching in near real time (NRT) are not just raw documentation; they also contain the author's well-judged advice and comments.

Unfortunately, I felt some of the more advanced topics were not described in enough detail, for example index merging, document relevance or using dynamic fields in the data structure. Moreover, while reading the book, I had a feeling that some parts do not fit the title, such as the section about clustering with Carrot2 or integration with a PHP web portal.

In summary, I can say that I have read this book with pleasure and satisfaction, which in fact is rare with technology publications! For me, as a person who has been working with Solr since version 1.3, it was a great way to review and sort out some of its aspects. On the other hand, I'm pretty sure that people starting their experience with Apache Solr will take a lot from this book. Although it is mainly focused on advanced problems, it starts with the basics.

Despite some little imperfections I can truly recommend this book, especially because it describes the concrete technology in an easy-to-read way and also refers to some general architectural patterns.
