At some point, you may need to tune Solr to handle heavier indexing and searching loads; this is especially true for production environments with heavy search index usage. On this page, we will look at some of the configuration options you can use to optimize Solr and make it production-ready. The tuning methods presented below are intended as a general guide to optimizing a Solr server; it is still important to consider your organization's use case before committing to a different configuration.
Before making any changes, it is important to know which parts of Solr may need further optimization. To give you a head start, you may check your Solr instance's performance statistics and log files, both of which are accessible via the Solr Admin UI (not available on the embedded version). To get to the Solr Admin UI, follow the steps below:
- Navigate to
- Select a Solr core to examine.
- Click Plugins/Stats.
- Select a plugin type to view its metrics.
Among the most important configurations for tuning the JVM is memory allocation. While it is
advisable to allocate more memory than the bare minimum that Solr needs, it is also important to allot sufficient memory
to the operating system (OS) as it improves performance by
caching files from the Solr index. A good rule of thumb is to give Solr the memory it needs, add some extra, and leave
the rest to the OS. The dashboard of the Solr Admin UI provides information about how much memory the Solr instance
uses. To configure memory allocation in the JVM, set the
-Xms argument to set the initial heap size and the
-Xmx argument to set the maximum heap size. These arguments can be configured in the include file
solr.in.cmd) located in
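As a rough sketch of what such a setting might look like on a Unix-like system, the fragment below uses the `SOLR_JAVA_MEM` variable from `solr.in.sh`, the Unix counterpart of `solr.in.cmd`. The 2 GB figure is only an assumed placeholder; the right value depends on the memory usage your own Solr Admin UI dashboard reports.

```shell
# Assumed example values; size the heap from what the Admin UI dashboard shows.
# -Xms sets the initial heap size, -Xmx sets the maximum heap size.
# Setting both to the same value avoids heap resizing at runtime.
SOLR_JAVA_MEM="-Xms2g -Xmx2g"
```

Whatever memory is not assigned to the heap stays available to the OS for caching index files.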
A normal commit, also called a "hard commit", writes the index files to the storage disk and opens a new searcher in which the newly committed data is available. This is an expensive operation due to the processes involved. A soft commit, on the other hand, is a less expensive operation; it makes data searchable quickly by not writing to the disk immediately. It is Solr's implementation of near real-time (NRT) search.
Whilst a soft commit seems more viable, at some point, a hard commit is still needed to ensure the durability of data.
The timing and behavior of commits can affect the performance of a Solr server. To configure Solr's behavior on commits,
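One possible way to adjust commit timing, sketched below, is to pass the commit intervals as system properties from `solr.in.sh`. This assumes the core's `solrconfig.xml` still reads the `solr.autoCommit.maxTime` and `solr.autoSoftCommit.maxTime` properties, as Solr's stock configuration does; if the placeholders have been removed, these options have no effect. The intervals shown are assumed values, not recommendations.

```shell
# Assumed intervals, in milliseconds; tune them to your own indexing load.
# Hard commit every 60 s: flushes index files to disk for durability.
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoCommit.maxTime=60000"
# Soft commit every 5 s: makes new documents searchable without a disk flush.
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=5000"
```

A short soft commit interval keeps search results fresh, while a longer hard commit interval limits how often the expensive disk write happens.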
When Lucene performs incremental indexing, changes are written to new files. Solr keeps most of these
files open at the same time. Unfortunately, this can exceed the limit on open files and file descriptors on Unix-based
operating systems, in which case the server will most likely crash. To fix this, use the command:
ulimit -n [number].
It is recommended to configure this to at least
unlimited, depending on the limits of your OS.
The current limit of the system can be checked by running
ulimit -a or via the Solr Web UI Admin page.
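The check and the fix can be combined in a short shell session, sketched below. It reads the soft and hard limits on open files, then raises the soft limit to the hard limit. The variable names are illustrative, and on some systems (macOS, for example) the kernel may reject the new value even when the hard limit reports `unlimited`.

```shell
# Read the current soft and hard limits on open file descriptors.
soft_limit=$(ulimit -Sn)
hard_limit=$(ulimit -Hn)
echo "open files: soft=$soft_limit hard=$hard_limit"

# Raise the soft limit to the hard limit for this shell session only;
# any process started from this shell afterwards inherits the new value.
ulimit -Sn "$hard_limit"
```

The change lasts only for the current shell session, so Solr must be started from the same shell for it to take effect.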
Distributed indexing and searching
An index may become too large to fit on a single node or system. Solr supports horizontal scaling, which is also known as sharding. Sharding distributes the index across multiple systems, each of which is known as a shard. When running a query, Solr handles the merging of results from each shard as if they came from a single index. These capabilities are collectively referred to as SolrCloud, which specializes in distributed search and indexing. SolrCloud is recommended for scaling as it supports automatic load balancing, fault tolerance, and other important features that allow for a distributed architecture. Traditional sharding is also possible, but its configuration is more complex. Check out these resources to configure sharding:
Learn how to migrate to SolrCloud
Martini features extensive documentation on how to shift to SolrCloud, from configuring the ZooKeeper ensemble, to configuring the cluster of Solr servers, to configuring Martini itself.
Most settings on Solr can be calibrated via the
solrconfig.xml file. The default
/server/solr/configsets/_default/conf contains documentation of the fields that you can
configure. Another good place to check is the
"Apache Solr Reference Guide: The Well-Configured Solr Instance".
For performance troubleshooting, Apache Solr also has a
wiki that lays out the factors that affect a Solr server's performance.