
Configuring SolrCloud to Work with TORO Integrate and an External ZooKeeper Ensemble

Once the ZooKeeper ensemble is set up, we can install and configure SolrCloud. We're going to configure three Solr instances and connect them to the ZooKeeper ensemble. Solr comes with an embedded ZooKeeper, but using it in production systems is not recommended because it defeats the purpose of redundancy.

Learn more about shards and replication in SolrCloud by reading Distributed Search with Index Sharding.

Assumptions

  • In this guide, we will be using Solr 6.2.1.
  • We will store our universal configurations in /datastore/apps/solr/configs.
  • The IP addresses of our ZooKeeper hosts in our ZooKeeper ensemble are:

    • 192.168.21.71
    • 192.168.21.72
    • 192.168.21.73
  • We will configure three Solr servers for our SolrCloud cluster: solr1, solr2, and solr3.

    • solr1

      • Home directory: /datastore/apps/solr/instances/solr1
      • IP address: 192.168.21.74
    • solr2

      • Home directory: /datastore/apps/solr/instances/solr2
      • IP address: 192.168.21.75
    • solr3

      • Home directory: /datastore/apps/solr/instances/solr3
      • IP address: 192.168.21.76

Procedure

  1. Download Solr's portable installer (solr-6.2.1.zip in our case), extract it, then copy the extracted directory to each Solr instance's home folder.

    cd /datastore/apps/solr
    
    # Download, extract, and rename the installer.
    wget http://mirror.rise.ph/apache/lucene/solr/6.2.1/solr-6.2.1.zip
    unzip solr-6.2.1.zip
    mv solr-6.2.1 solr
    
    # Copy the directory to each instance's home folder.
    cp -r solr instances/solr1/
    cp -r solr instances/solr2/
    cp -r solr instances/solr3/
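
The three copy commands above can also be written as a loop. The following is a minimal sketch, not part of the original procedure: the function name copy_to_instances is hypothetical, and it additionally creates each instance's home folder if it does not exist yet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy an extracted Solr directory into each instance's
# home folder, creating the folder first if it is missing.
#   $1    path to the extracted Solr directory
#   $2    base directory that holds the instance home folders
#   $3..  instance names
copy_to_instances() {
    local src="$1" base="$2"
    shift 2
    local name
    for name in "$@"; do
        mkdir -p "$base/$name"
        cp -r "$src" "$base/$name/"
    done
}
```

With the paths used in this guide, the call would be `copy_to_instances /datastore/apps/solr/solr /datastore/apps/solr/instances solr1 solr2 solr3`.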
    
  2. Create a copy of TORO Integrate's Solr core configurations in Solr's global configuration folder.

    cp -r <toro-integrate-home>/solr/cores /datastore/apps/solr/configs
    
  3. Upload TORO Integrate's Solr core configurations, schema.xml and solrconfig.xml, to ZooKeeper. You can run the upload from any Solr server and point it at any ZooKeeper host; ZooKeeper will replicate the files to the other servers in the ensemble.

    1. Navigate to any Solr server's server/scripts/cloud-scripts directory which contains the zkcli.sh script.

      cd /datastore/apps/solr/instances/solr1/solr/server/scripts/cloud-scripts
      
    2. Use this script to upload the core_tracker and core_invoke_monitor Solr cores' configurations.

      ./zkcli.sh -zkhost 192.168.21.71:2181 -cmd upconfig -confname core_tracker -confdir /datastore/apps/solr/configs/cores/tracker/conf
      ./zkcli.sh -zkhost 192.168.21.71:2181 -cmd upconfig -confname core_invoke_monitor -confdir /datastore/apps/solr/configs/cores/invoke-monitor/conf
      

      The -confname arguments above set the names under which the configurations are stored in ZooKeeper. We will use these configuration names later when creating the Solr collections that TORO Integrate uses.

      The zkcli.sh script

      Learn more about Solr's zkcli.sh script by reading Command Line Utilities.
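
Before uploading, it can help to confirm that each configuration directory actually contains the two files named in this step. The following is a minimal sketch under that assumption; the function name check_confdir is hypothetical and is not part of Solr or TORO Integrate.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: succeed only if the directory holds both
# files this guide uploads (solrconfig.xml and schema.xml).
check_confdir() {
    local dir="$1" file
    for file in solrconfig.xml schema.xml; do
        if [ ! -f "$dir/$file" ]; then
            echo "missing: $dir/$file" >&2
            return 1
        fi
    done
}
```

One way to use it is to guard the upload, for example `check_confdir /datastore/apps/solr/configs/cores/tracker/conf && ./zkcli.sh -zkhost 192.168.21.71:2181 -cmd upconfig -confname core_tracker -confdir /datastore/apps/solr/configs/cores/tracker/conf`.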

  4. Start the Solr servers in cloud mode.

    To do that, we need to execute the following command for every Solr server we have:

    <solr-server-home>/bin/solr start -cloud -z <zookeeper-host-addresses>
    

    ... like so:

    /datastore/apps/solr/instances/solr1/solr/bin/solr start -cloud -z 192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181
    /datastore/apps/solr/instances/solr2/solr/bin/solr start -cloud -z 192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181
    /datastore/apps/solr/instances/solr3/solr/bin/solr start -cloud -z 192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181
    

    Solr should now start and connect to your ZooKeeper ensemble. To check that Solr has started in cloud mode, open the Solr Admin UI in your browser and confirm that the Cloud tab is present, as in the screenshot below:

    SolrCloud Admin Web UI
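
The start commands above repeat the same ZooKeeper host list for every instance. The following is a minimal sketch that builds that list once and prints the start command for each instance so it can be reviewed before running. The function names are hypothetical, and the sketch assumes the ZooKeeper client port 2181 and the instance paths used in this guide.

```shell
#!/usr/bin/env bash
# Join ZooKeeper hosts into the comma-separated host:port list passed to -z.
# Assumes every host listens on client port 2181, as in this guide.
zk_connect_string() {
    local out="" host
    for host in "$@"; do
        out="${out:+$out,}$host:2181"
    done
    printf '%s\n' "$out"
}

# Print (do not run) the start command for each named Solr instance.
#   $1    ZooKeeper connection string
#   $2..  instance names under /datastore/apps/solr/instances
print_start_commands() {
    local zk="$1" name
    shift
    for name in "$@"; do
        printf '%s\n' "/datastore/apps/solr/instances/$name/solr/bin/solr start -cloud -z $zk"
    done
}
```

For example, `print_start_commands "$(zk_connect_string 192.168.21.71 192.168.21.72 192.168.21.73)" solr1 solr2 solr3` prints the three commands shown in this step.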

  5. Finally, create the collections needed by TORO Integrate (core_invoke_monitor and core_tracker) using the Solr Collections API's create endpoint. You may use the create endpoint of any configured Solr server.

    Here's how we did ours:

    Example Request (core_invoke_monitor)

    curl -X GET \
      'http://192.168.21.74:8983/solr/admin/collections?action=CREATE&name=jte_core_invoke_monitor&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName=core_invoke_monitor&wt=json'
    

    Example Response

    {
        "responseHeader": {
            "status": 0,
            "QTime": 12825
        },
        "success": {
            "192.168.21.76:8983_solr": {
                "responseHeader": {
                    "status": 0,
                    "QTime": 3286
                },
                "core": "jte_core_invoke_monitor_shard3_replica3"
            },
            "192.168.21.75:8983_solr": {
                "responseHeader": {
                    "status": 0,
                    "QTime": 3014
                },
                "core": "jte_core_invoke_monitor_shard2_replica2"
            },
            "192.168.21.74:8983_solr": {
                "responseHeader": {
                    "status": 0,
                    "QTime": 3431
                },
                "core": "jte_core_invoke_monitor_shard3_replica1"
            }
        }
    }
    
    Example Request (core_tracker)

    curl -X GET \
      'http://192.168.21.74:8983/solr/admin/collections?action=CREATE&name=jte_core_tracker&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName=core_tracker&wt=json'
    

    Example Response

    {
        "responseHeader": {
            "status": 0,
            "QTime": 14724
        },
        "success": {
            "192.168.21.76:8983_solr": {
                "responseHeader": {
                    "status": 0,
                    "QTime": 5468
                },
                "core": "jte_core_tracker_shard3_replica1"
            },
            "192.168.21.75:8983_solr": {
                "responseHeader": {
                    "status": 0,
                    "QTime": 4846
                },
                "core": "jte_core_tracker_shard1_replica3"
            },
            "192.168.21.74:8983_solr": {
                "responseHeader": {
                    "status": 0,
                    "QTime": 4989
                },
                "core": "jte_core_tracker_shard2_replica2"
            }
        }
    }
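
The two requests differ only in the collection name and the configuration name. The following is a minimal sketch that assembles the collection name in the <prefix>_<package_name>_<core_name> format TORO Integrate expects and builds the matching CREATE URL. The function names are hypothetical, and the sketch assumes Solr listens on port 8983 as in the requests above.

```shell
#!/usr/bin/env bash
# Build "<prefix>_<package_name>_<core_name>", or "<package_name>_<core_name>"
# when the prefix is empty.
collection_name() {
    local prefix="$1" package="$2" core="$3"
    printf '%s\n' "${prefix:+${prefix}_}${package}_${core}"
}

# Build the Collections API CREATE URL used in this guide.
#   $1 Solr host   $2 collection name   $3 configuration name
#   $4 numShards   $5 replicationFactor $6 maxShardsPerNode
create_collection_url() {
    local host="$1" name="$2" config="$3" shards="$4" replicas="$5" max_per_node="$6"
    printf '%s\n' "http://$host:8983/solr/admin/collections?action=CREATE&name=$name&numShards=$shards&replicationFactor=$replicas&maxShardsPerNode=$max_per_node&collection.configName=$config&wt=json"
}
```

For example, `curl "$(create_collection_url 192.168.21.74 "$(collection_name jte core tracker)" core_tracker 3 3 3)"` would send the second request shown above.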
    

    In our requests above, we passed in several query parameters. The name and collection.configName parameters must be set to the values TORO Integrate expects; all other parameters can vary depending on your needs.

    • name

      The name of the collection. TORO Integrate requires a specific format for core names: <package_name>_<core_name>. If prefixed, it's <prefix>_<package_name>_<core_name>.

      Create unique collection names by using prefixes

      It is possible to connect multiple TORO Integrate instances to a single SolrCloud cluster. In this scenario, identical collection names result in shared Solr collections, which means that the Tracker and Invoke Monitor data of different TORO Integrate instances will all reside in the same Solr indexes.

      To prevent that from happening, it is recommended to use prefixes for your collections. This is what we did in our example above – we prefixed our collection names with jte (jte_core_tracker and jte_core_invoke_monitor).

    • replicationFactor

      The number of replicas SolrCloud will create for each shard.

    • maxShardsPerNode

      The maximum number of shard replicas SolrCloud may place on a single node.

    • collection.configName

      The name of the configuration to use for this collection. Pass in the value of the -confname argument used when uploading the configuration in Step 3.

    • wt

      The format of the response you want to receive (json in our requests).
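
The shard-related parameters are tied together by simple arithmetic. As we understand Solr's Collections API, a collection has numShards × replicationFactor replicas in total, and they must fit across the available nodes without exceeding maxShardsPerNode on any of them. The following is a minimal sketch of that calculation; the function name is hypothetical, and it assumes replicas are spread as evenly as possible.

```shell
#!/usr/bin/env bash
# Smallest maxShardsPerNode that can hold numShards * replicationFactor
# replicas on the given number of nodes (ceiling division).
#   $1 numShards   $2 replicationFactor   $3 number of Solr nodes
min_shards_per_node() {
    local shards="$1" replicas="$2" nodes="$3"
    local total=$(( shards * replicas ))
    echo $(( (total + nodes - 1) / nodes ))
}
```

With the values in our requests (3 shards, replication factor 3, 3 Solr servers), there are 9 replicas in total, so `min_shards_per_node 3 3 3` gives 3, which matches the maxShardsPerNode=3 we passed.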

    Once done, open the Solr Admin UI of any of the Solr servers in your browser. Click the Cloud tab and then Graph. The Graph page shows a graphical representation of how your Solr collections are distributed across your network. It should look similar to the screenshot below:

    SolrCloud Admin Graph View

    Accessing the same page in other configured Solr servers should still yield the same map.

At this point, your Solr cluster is ready. You can now proceed to setting up TORO Integrate.