Apache Solr for Indexing Data

Enhance your Solr indexing experience with advanced techniques and the built-in functionalities available in Apache Solr

Product type Paperback
Published in Dec 2015
Publisher
ISBN-13 9781783553235
Length 160 pages
Edition 1st Edition
Author (1):
Anshul Johri

Table of Contents

Preface
1. Getting Started
2. Understanding Analyzers, Tokenizers, and Filters
3. Indexing Data
4. Indexing Data – The Basic Technique and Using Index Handlers
5. Indexing Data with the Help of Structured Datasources – Using DIH
6. Indexing Data Using Apache Tika
7. Apache Nutch
8. Commits, Real-Time Index Optimizations, and Atomic Updates
9. Advanced Topics – Multilanguage, Deduplication, and Others
10. Distributed Indexing
11. Case Study of Using Solr in E-Commerce
Index

Running Solr

To test whether your installation was completed successfully, you need to run Solr. Type these commands in the terminal to run it:

$ cd /usr/local/Cellar/solr/4.4.0/libexec/example/
$ java -jar start.jar

After you run the preceding commands, a large volume of log messages will scroll past in the terminal. Don't worry! This is normal. If any errors appear, fix them before continuing. Once the output stops and no errors are reported, open any web browser and go to http://localhost:8983/solr/#/.
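Instead of watching the log output, you can poll the server from a second terminal. The following is a minimal sketch, assuming that curl is installed and that Solr listens on the default port 8983 used above. The function name wait_for_solr and its retry settings are illustrative and not part of Solr.

```shell
#!/bin/sh
# Poll a URL until it answers with HTTP 200, or give up after N attempts.
# wait_for_solr is a hypothetical helper name, not a Solr command.
wait_for_solr() {
    url=${1:-http://localhost:8983/solr/}
    attempts=${2:-30}   # how many times to try (assumed default)
    delay=${3:-1}       # seconds to wait between tries (assumed default)
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: silent, -o /dev/null: discard the body, -w: print only the status code
        status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
        if [ "$status" = "200" ]; then
            echo "Solr is up at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Solr did not respond at $url after $attempts attempts" >&2
    return 1
}
```

Called with no arguments, wait_for_solr checks http://localhost:8983/solr/ up to 30 times, one second apart, and returns a non-zero exit status if the server never answers.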

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

When the page loads, you will see the following screen in your browser:

[Screenshot: Running Solr]

A fresh Solr installation does not contain any data. In Solr terminology, a unit of data is called a document. You will learn how to index data in Solr in upcoming chapters.

Installing Solr in Windows

There are multiple ways of installing Solr on a Windows machine. The steps here set up Solr with Jetty running as a service via NSSM:

  1. Install the latest Java JDK from http://www.oracle.com/technetwork/java/javase/downloads/index.html.
  2. Download the latest Solr release (ZIP version) from http://www.apache.org/dyn/closer.cgi/lucene/solr/. At the time of writing this book, the latest Solr release was 4.10.1.
  3. Unzip the Solr download. You should have files as shown in the following screenshot. Open the example folder.
    [Screenshot: Installing Solr in Windows]
  4. Copy the etc, lib, logs, solr, and webapps folders and start.jar to C:\solr (you will need to create the folder at C:\solr), as shown in the following screenshot:
    [Screenshot: Installing Solr in Windows]
  5. Now open the C:\solr\solr folder and copy its contents back to the root C:\solr folder. When you are done, you can delete the C:\solr\solr folder. The folder selected in the following screenshot is the one you can now delete:
    [Screenshot: Installing Solr in Windows]

    At this point, your C:\solr directory should look like what is shown in the following screenshot:

    [Screenshot: Installing Solr in Windows]
  6. Solr can be run at this point if you start it from the command line. Change your directory to C:\solr and then run java -Dsolr.solr.home=C:/solr/ -jar start.jar.
  7. If you go to http://localhost:8983/solr/, you should see the Solr dashboard.
  8. Now that Solr is up and running, we can work on getting Jetty to run as a Windows service. Since Jetty comes bundled with Solr, all that we need to do is run it as a service. There are several options for doing this, but the one I prefer is the Non-Sucking Service Manager (NSSM) program, which is the most compatible service manager across Windows environments. NSSM can be downloaded from http://nssm.cc/download.
  9. Once you have downloaded NSSM, open the win32 or win64 folder as appropriate and copy nssm.exe to your C:\solr folder.
  10. Open Command Prompt, change the directory to C:\solr, and then run nssm install Solr.
  11. A dialog will open. Select java.exe as the application located at C:\Windows\System32\.
  12. In the options input box, enter the following: -Dsolr.solr.home=C:/solr/ -Djetty.home=C:/solr/ -Djetty.logs=C:/solr/logs/ -cp C:/solr/lib/*.jar;C:/solr/start.jar -jar C:/solr/start.jar
  13. Click on Install service. You should get a service successfully installed message.
  14. Finally, run net start Solr.
  15. Jetty should now be running as a service. Check this by going to http://localhost:8983/solr/.

Installing Solr on Linux

To install Solr on Linux/Unix, you will need Java Runtime Environment (JRE) version 1.7 or higher. Then follow these steps:

  1. Download the latest Solr release (.tgz) from http://www.apache.org/dyn/closer.cgi/lucene/solr/. At the time of writing this book, the latest release was 4.10.1.
  2. Unpack the file to your desired location.
  3. Solr runs inside a Java servlet container such as Tomcat or Jetty. The Solr distribution includes a working demo server in the example directory, which runs in Jetty. You can use the Jetty servlet container or your preferred servlet container. If you are using a servlet container other than Jetty and it is already running, stop that server.
  4. Copy the solr-4.10.1.war file from the Solr distribution under the dist directory to the webapps directory of your servlet container. Change the name of this file; it must be named solr.war.
  5. Copy the Solr home directory, solr-4.x.0/example/solr/, from the distribution to your desired Solr home location.
  6. Start your servlet container, passing to it the location of your Solr home in one of these ways:
    1. Set the solr.solr.home Java system property to your Solr home (for example, with the bundled Jetty setup: java -Dsolr.solr.home=/some/dir -jar start.jar).
    2. Configure the servlet container so that a JNDI lookup of java:comp/env/solr/home by the Solr web app will point to your Solr Home.
    3. Start the servlet container in the directory containing ./solr. The default Solr Home is solr under the JVM's current working directory ($CWD/solr).
  7. To confirm the installation, go to http://localhost:8983/solr/ and you will see the Solr dashboard. Now your Solr is up and running.
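
Steps 4 and 5 above can be sketched as a small shell function. This is only an illustration under stated assumptions: deploy_solr is a hypothetical helper name, and the distribution directory, version, webapps directory, and Solr home are all supplied by you rather than taken from any real installation.

```shell
#!/bin/sh
# Sketch of steps 4 and 5: install the WAR as solr.war and copy the Solr home.
# deploy_solr is a hypothetical helper; every path comes from the caller.
deploy_solr() {
    dist_dir=$1      # unpacked Solr distribution (assumed location)
    version=$2       # for example 4.10.1
    webapps_dir=$3   # webapps directory of your servlet container
    solr_home=$4     # where the Solr home should live

    war="$dist_dir/dist/solr-$version.war"
    if [ ! -f "$war" ]; then
        echo "WAR file not found: $war" >&2
        return 1
    fi

    mkdir -p "$webapps_dir" "$solr_home" || return 1

    # Step 4: the file must be named solr.war inside webapps
    cp "$war" "$webapps_dir/solr.war" || return 1

    # Step 5: copy the contents of example/solr/ into the Solr home
    cp -R "$dist_dir/example/solr/." "$solr_home/" || return 1

    echo "Start the container with -Dsolr.solr.home=$solr_home"
}
```

The function only copies files. Starting the servlet container and passing it the Solr home, as described in step 6, is still up to you.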

At this point Solr is running, but because we have not fed it any data yet, its index is empty. Let's try to insert some example data into our server.

The Solr download comes with example data bundled in it. We can use the same data for indexing as an example. Go to the exampledocs directory under the example directory. Here, you will see a lot of files. Now go to the command line (terminal) and type the following commands:

$ cd $SOLR_HOME/example/exampledocs/
$ ./post.sh vidcard.xml

The post.sh script uses curl to post the XML data in the vidcard.xml file to http://localhost:8983/solr/update. When the import completes without any error, you will see a message that looks something like this:

[Screenshot: Installing Solr on Linux]
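
To make the posting step less of a black box, here is a rough sketch of what a script in the style of post.sh does. It is not the bundled script itself: post_xml is a hypothetical function name, and SOLR_UPDATE_URL is an environment variable invented for this example so that the update URL can be overridden.

```shell
#!/bin/sh
# Rough sketch of a post.sh-style script; not the script shipped with Solr.
# post_xml and SOLR_UPDATE_URL are hypothetical names used for illustration.
post_xml() {
    update_url=${SOLR_UPDATE_URL:-http://localhost:8983/solr/update}
    for f in "$@"; do
        echo "Posting file $f to $update_url"
        # Send the raw file contents as the request body, declared as XML
        curl -s "$update_url" --data-binary @"$f" \
            -H 'Content-Type: application/xml' || return 1
        echo
    done
    # Ask Solr to commit so the new documents become visible to searches
    curl -s "$update_url?commit=true" || return 1
    echo
}
```

With this function defined, post_xml vidcard.xml sends the same file as the command shown earlier. Several files can be passed at once, and the commit is issued only after the last one.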

Now let's check our imported data from a web browser. Open http://localhost:8983/solr/select?q=*:*&wt=json to fetch all of the data in your Solr instance, like this:

[Screenshot: Installing Solr on Linux]

When you see the preceding data, it means that your Solr server is running properly and is ready to index your desired feed. You will read about indexing in depth in upcoming chapters.
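
The same select query can also be run from the terminal. The sketch below assumes that curl is available and that the JSON response contains a numFound field, as Solr's standard response does. The helper names solr_num_found and solr_count_all are made up for this example, and the sed expression is a shortcut rather than a real JSON parser.

```shell
#!/bin/sh
# Extract the numFound value from a Solr JSON response read on stdin.
# solr_num_found is a hypothetical helper; a real JSON parser is more robust.
solr_num_found() {
    sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Fetch all documents as JSON and print how many matched the query.
# The base URL defaults to the local server used in this chapter.
solr_count_all() {
    base_url=${1:-http://localhost:8983/solr}
    curl -s "$base_url/select?q=*:*&wt=json" | solr_num_found
}
```

Running solr_count_all after posting the example file prints the number of documents that Solr reports for the match-all query, which is a quick way to confirm that the import reached the index.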
