Using core discovery
Until Solr 4.4, the solr.xml file had to include mandatory information such as the core definitions. Solr used this information to find and load the defined cores and their properties, so it was required for Solr to operate properly. Solr 4.4 introduced a new structure for the solr.xml file, along with a process called core discovery. Thanks to these changes, we no longer have to describe each core in the solr.xml file; instead, we can use simple text files, and Solr will automatically load the appropriate cores. This recipe will show you how to use the core discovery process.
How to do it...
Using the new core discovery process is very simple.
- We start by creating the solr.xml file, which should be put in the home directory of Solr. The contents of the file should look like the following:

      <?xml version="1.0" encoding="UTF-8" ?>
      <solr>
        <solrcloud>
          <str name="host">${host:}</str>
          <int name="hostPort">${jetty.port:8983}</int>
          <str name="hostContext">${hostContext:solr}</str>
          <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
          <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
        </solrcloud>
        <shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
          <int name="socketTimeout">${socketTimeout:0}</int>
          <int name="connTimeout">${connTimeout:0}</int>
        </shardHandlerFactory>
      </solr>

- After this, we are ready to use core discovery. For each core, apart from the standard configuration stored in the conf directory, we need to create a core.properties file, which should be placed next to the conf directory (that is, in the core's own directory, at the same level as conf). For example, if we have a core named sample_core, our very simple core.properties file will look like this:

      name=sample_core

That's all; during startup, Solr will load our core.
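The layout described in the second step can be sketched with a few lines of Python. This is only an illustration: the helper name create_core_layout is hypothetical, and a temporary directory stands in for the real Solr home directory. The conf directory is created empty here; a real core would also need its configuration files in it.

```python
import tempfile
from pathlib import Path


def create_core_layout(solr_home, core_name):
    """Create <solr_home>/<core_name>/ with conf/ and a core.properties file."""
    core_dir = Path(solr_home) / core_name
    # The conf directory would hold solrconfig.xml, schema.xml, and so on.
    (core_dir / "conf").mkdir(parents=True, exist_ok=True)
    # core.properties sits next to conf, not inside it.
    props = core_dir / "core.properties"
    props.write_text(f"name={core_name}\n", encoding="utf-8")
    return props


if __name__ == "__main__":
    # A temporary directory stands in for the Solr home directory.
    with tempfile.TemporaryDirectory() as tmp:
        props = create_core_layout(tmp, "sample_core")
        print(props.relative_to(tmp))
        print(props.read_text(encoding="utf-8"), end="")
```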
How it works...
The solr.xml file is the same one that is provided with the Solr example deployment, and it contains the default values related to Solr configuration. The host property specifies the hostname, and the hostPort property specifies the port on which Solr will run (it is taken from the jetty.port property and defaults to 8983). The hostContext property specifies the web application context under which Solr will be available (solr by default). In addition, we can specify the ZooKeeper client session timeout with the zkClientTimeout property (used only in SolrCloud mode, defaulting to 30,000 milliseconds). By default, we also tell Solr to use generic core node names in SolrCloud; we can change this by setting the genericCoreNodeNames property to false.
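The values in this file use the ${name:default} form, for example ${jetty.port:8983}: the named property is used when it is set, and the text after the colon is used otherwise. The following Python sketch only illustrates that rule. It is not Solr's own code, and the resolve function and the properties dictionary are hypothetical stand-ins for however the property values are supplied.

```python
import re

# Matches ${name} or ${name:default}; the default may be empty, as in ${host:}.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, properties):
    """Replace each placeholder with the property value or its default."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value and no default for {name!r}")
    return _PLACEHOLDER.sub(substitute, text)


if __name__ == "__main__":
    print(resolve("${jetty.port:8983}", {}))                      # 8983
    print(resolve("${jetty.port:8983}", {"jetty.port": "8080"}))  # 8080
    print(repr(resolve("${host:}", {})))                          # ''
```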
There are two additional properties that relate to shard handling. The socketTimeout property specifies the socket timeout, and the connTimeout property specifies the connection timeout. Both properties are used to create the clients that Solr uses to communicate between shards. The connection timeout limits how long Solr waits while establishing a connection to another shard; the socket timeout limits how long it waits for the response once the connection is established.
The simplest core.properties file is an empty file, in which case Solr will choose the core name for us (it uses the name of the directory that contains the file). However, in our case, we wanted to give the core a name of our own choosing, so we included a single name entry that defines the name Solr will assign to the core. You should remember that Solr will try to load every core that has a core.properties file present, and that a core doesn't have to live in a directory with the same name as the core.
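To make the discovery idea concrete, here is a simplified Python model of it. It is a sketch, not Solr's implementation: the discover_cores and read_properties helpers are hypothetical, the fallback to the directory name for an empty file is an assumption of this model, and so is the choice not to look for further cores inside a directory that already contains core.properties.

```python
import os
import tempfile
from pathlib import Path


def read_properties(path):
    """Parse simple key=value lines; blank lines and # comments are skipped."""
    props = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def discover_cores(solr_home):
    """Return {core_name: core_directory} for every core.properties found."""
    cores = {}
    for dirpath, dirnames, filenames in os.walk(solr_home):
        if "core.properties" in filenames:
            core_dir = Path(dirpath)
            props = read_properties(core_dir / "core.properties")
            # Assumption: an empty file falls back to the directory name.
            cores[props.get("name", core_dir.name)] = core_dir
            dirnames.clear()  # assumption: do not look for cores inside a core
    return cores


if __name__ == "__main__":
    home = Path(tempfile.mkdtemp())  # stands in for the Solr home directory
    (home / "dir_a").mkdir()
    (home / "dir_a" / "core.properties").write_text("name=sample_core\n")
    (home / "other").mkdir()
    (home / "other" / "core.properties").write_text("")
    print(sorted(discover_cores(home)))  # ['other', 'sample_core']
```

Note that the core named sample_core lives in a directory called dir_a, which mirrors the point that the directory name and the core name are independent.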
Of course, the name property is not the only property available. There are other properties, although in most cases you'll use only name:
- name: This is the name of the core.
- config: This is the configuration filename, which defaults to solrconfig.xml.
- dataDir: This is the directory where data is stored. By default, Solr will use a directory called data that is created on the same level as the conf directory.
- ulogDir: This is the directory where the transaction log entries are stored. For performance reasons, it might be good to store transaction log files on a disk other than the one holding the index files.
- schema: This is the name of the file describing the index structure, which defaults to schema.xml.
- shard: This is the identifier of the shard.
- collection: This is the name of the collection the core belongs to.
- roles: This is the core role definition.
- loadOnStartup: This can take a value of true or false. It defaults to true, which means Solr will load the core during startup.
- transient: This can take a value of true or false. It defaults to false, which means that the core can't be automatically unloaded by Solr.
- coreNodeName: This is the name of the core used by SolrCloud.
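The defaults mentioned in this list can be summarized in a short Python sketch that merges the entries of a core.properties file over them. The effective_properties helper is hypothetical and only models the defaults stated above; properties for which no default is given here (such as ulogDir or shard) are left out, and the dataDir path in the example is made up.

```python
# Defaults as listed above; properties without a stated default are omitted.
DEFAULTS = {
    "config": "solrconfig.xml",
    "schema": "schema.xml",
    "dataDir": "data",
    "loadOnStartup": "true",
    "transient": "false",
}


def effective_properties(core_properties_text):
    """Merge the entries of a core.properties file over the defaults."""
    props = dict(DEFAULTS)
    for line in core_properties_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


if __name__ == "__main__":
    # Hypothetical file contents; the dataDir path is only an example.
    text = "name=sample_core\ndataDir=/var/solr/data/sample_core\nloadOnStartup=false\n"
    for key, value in sorted(effective_properties(text).items()):
        print(f"{key}={value}")
```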
Finally, it is worth noting that the old solr.xml format will not be supported in Solr 5.0, so it is good to get familiar with the new format now.
There's more...
If you want to see all the properties and sections exposed by the new solr.xml format, refer to the official Apache Solr documentation located at https://cwiki.apache.org/confluence/display/solr/Format+of+solr.xml.