At some point, you may need to tune Solr to handle heavier indexing and searching loads; this is especially true for production environments with heavy search index usage. This page looks at some of the configuration options available to optimize Solr and make it production-ready. The tuning methods presented below are intended as a general guide to optimizing a Solr server; it is still important to consider your organization's use case before committing to a different configuration.
Before making any changes, it is important to know which parts of Solr may need further optimization. To give you a head start, you may check your Solr instance's performance statistics and log files, both of which are accessible via the Solr Admin UI (not available on the embedded version). To get to the Solr Admin UI, follow the steps below:
- Navigate to
- Select a Solr core to examine.
- Click Plugins/Stats.
- Select a plugin type to view its metrics.
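The statistics shown under Plugins/Stats can usually also be fetched as JSON through Solr's Metrics API, which may be handy for scripted checks. The sketch below only builds the request URL; the base URL `http://localhost:8983/solr` and the helper names `metrics_url` and `fetch_metrics` are assumptions for illustration, not part of Martini or Solr.

```shell
# Assumed base URL; adjust host and port to your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build a Metrics API URL for one metric group (e.g. jvm, core),
# optionally narrowed to metric names starting with a prefix.
metrics_url() {
  group="$1"
  prefix="$2"
  url="$SOLR_URL/admin/metrics?group=$group"
  if [ -n "$prefix" ]; then
    url="$url&prefix=$prefix"
  fi
  printf '%s\n' "$url"
}

# Fetch the metrics as JSON (hypothetical helper; needs curl and a running Solr).
fetch_metrics() {
  curl -s "$(metrics_url "$1" "$2")"
}
```

For example, `fetch_metrics jvm memory.heap` would request the JVM heap metrics, assuming the Solr version in use exposes the Metrics API.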
Among the most important configurations for tuning the JVM is memory allocation. While it is advisable to allocate more memory than the bare minimum that Solr needs, it is also important to ensure sufficient memory is available to the operating system (OS), which improves performance by caching files from the Solr index. A good rule of thumb is to give Solr the memory it needs, add some extra, and leave the rest to the OS. The dashboard of the Solr Admin UI provides information about how much memory the Solr instance uses. To configure memory allocation in the JVM, use the -Xms argument to set the initial heap size and the -Xmx argument to set the maximum heap size. These arguments can be configured in the include file (solr.in.cmd) located in
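As a rough illustration of the heap settings described above, the fragment below shows how they might look in the Unix include file, `solr.in.sh`, which is the shell counterpart of `solr.in.cmd`. The sizes are hypothetical placeholders; pick values based on the memory usage reported by your own Solr instance.

```shell
# Hypothetical sizes: 2 GB initial heap, 4 GB maximum heap.
# Leave enough physical memory free for the OS file cache.
SOLR_JAVA_MEM="-Xms2g -Xmx4g"
```

On Windows, the equivalent line in `solr.in.cmd` would be `set SOLR_JAVA_MEM=-Xms2g -Xmx4g`.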
A normal commit, also called a "hard commit", writes the index files to disk and opens a new searcher in which the newly committed data is available. This is an expensive operation due to the processes involved. A soft commit, on the other hand, is less expensive; it makes data searchable quickly by not writing to disk immediately. It is Solr's implementation of near real-time (NRT) search.
Whilst a soft commit seems more viable, at some point a hard commit is still needed to ensure the durability of data.
The timing and behavior of commits can affect the performance of a Solr server. To configure Solr's behavior on commits,
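One possible way to adjust commit timing without editing files by hand is Solr's Config API, which accepts `set-property` commands for the automatic hard and soft commit intervals. The sketch below is an illustration under assumptions: the base URL, the core name, the interval values, and the helper names `set_property_body` and `set_commit_property` are all hypothetical, and the Config API must be enabled for the core.

```shell
# Assumed base URL; adjust host and port to your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build the JSON body for a Config API set-property command.
# $1 = property name, $2 = numeric or boolean value.
set_property_body() {
  printf '{"set-property":{"%s":%s}}\n' "$1" "$2"
}

# POST one property change to a core's /config endpoint
# (hypothetical helper; needs curl and a running Solr).
set_commit_property() {
  core="$1"; name="$2"; value="$3"
  curl -s -X POST -H 'Content-type: application/json' \
    -d "$(set_property_body "$name" "$value")" \
    "$SOLR_URL/$core/config"
}

# Example calls (commented out; "mycore" and the intervals are placeholders):
# set_commit_property mycore updateHandler.autoCommit.maxTime 60000
# set_commit_property mycore updateHandler.autoSoftCommit.maxTime 5000
```

A longer hard commit interval paired with a shorter soft commit interval is a common starting point: documents become searchable quickly while the expensive disk flush happens less often.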
When Lucene performs incremental indexing, changes are written to new files, and Solr keeps most of these files open at the same time. Unfortunately, this can exceed the limit on open files and file descriptors on Unix-based operating systems, in which case the server will most likely crash. To fix this, raise the limit with the command:
ulimit -n [number].
It is recommended to configure this to at least
unlimited, depending on the limits of your OS.
The current limit of the system can be checked by running
ulimit -a or via the Solr Web UI Admin page.
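The check-and-raise procedure described above can be sketched as a small script to run in the shell that starts Solr. The target value `65000` and the helper name `limit_is_enough` are assumptions for illustration; choose a number suited to your index size and the limits of your OS.

```shell
# Hypothetical target for the open-file limit.
WANTED_NOFILE=65000

# Read the current soft and hard limits on open files.
current_soft=$(ulimit -Sn)
current_hard=$(ulimit -Hn)
echo "open files: soft=$current_soft hard=$current_hard"

# Succeeds when a limit ($1, a number or "unlimited") meets the target ($2).
limit_is_enough() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

if ! limit_is_enough "$current_soft" "$WANTED_NOFILE"; then
  # A non-root user can raise the soft limit only up to the hard limit.
  ulimit -n "$WANTED_NOFILE" 2>/dev/null ||
    echo "could not raise the limit; check the hard limit"
fi
```

A limit set this way applies only to the current shell and the processes it starts, so it has to be set before Solr is launched from that shell.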
Distributed indexing and searching
An index may become too large to fit on a single node or system. Solr supports horizontal scaling, which is also known as sharding. Sharding distributes the index across multiple systems, each of which is known as a shard. When performing a query, Solr handles the merging of results from each shard as if it were a single index. These capabilities are collectively referred to as SolrCloud, which specializes in distributed search and indexing. SolrCloud is recommended for scaling as it supports automatic load balancing, fault tolerance, and other important features that allow for a distributed architecture. Traditional sharding is also possible, but its configuration is more complex. Read these resources to learn more on how to configure sharding:
Learn how to migrate to SolrCloud
Martini features extensive documentation on how to shift to SolrCloud, from configuring the ZooKeeper ensemble, to configuring the cluster of Solr servers, and configuring Martini itself.
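To make the idea of sharding concrete, the sketch below builds a Collections API request that would create a SolrCloud collection split into several shards. It is an illustration only: the base URL, the collection name `products`, the shard and replica counts, and the helper name `create_collection_url` are assumptions, and a running SolrCloud cluster with ZooKeeper is required before such a request can succeed.

```shell
# Assumed base URL of any node in the SolrCloud cluster.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build a Collections API CREATE URL.
# $1 = collection name, $2 = number of shards, $3 = replicas per shard.
create_collection_url() {
  name="$1"; shards="$2"; replicas="$3"
  printf '%s/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s\n' \
    "$SOLR_URL" "$name" "$shards" "$replicas"
}

# Example (commented out; needs curl and a running cluster):
# curl -s "$(create_collection_url products 2 2)"
```

With two shards and a replication factor of two, the index would be split in half and each half stored on two nodes, which is what allows queries to survive the loss of a single node.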
Most settings in Solr can be calibrated via the solrconfig.xml file. The default configset directory, /server/solr/configsets/_default/conf, contains documentation of the fields that you can configure. Solr also has a detailed page called "Apache Solr Reference Guide: The Well-Configured Solr Instance".
For performance troubleshooting, Apache Solr has a wiki that lays out the factors that affect a Solr server's performance.