Creating, Reading, Updating, and Deleting Documents in your Custom Search Index

After you have successfully linked your Solr core to your Integrate package, perhaps your next question is, "How do I add, edit, or delete documents in the index?"

In this guide, we will show you how to add, edit, search, and delete documents in a custom search index. We will be creating scripts that index and un-index movie data and we will discuss the objects and methods used in those scripts that make indexing and un-indexing possible. For simplicity's sake, the data we're going to index will be manually entered via parameters.

Stuff you need to know...

This guide assumes that you have gone through the basics of creating a custom Solr core or collection and Gloop/Groovy services.

Get the code!

The scripts mentioned in this guide are available in the examples package.

Preparation

Before we get to indexing and un-indexing documents, we must ensure that our custom Solr core is already exposed to the Integrate package we're going to use. These steps are discussed here.

Here's the outline of our set-up:

  • Our package is called examples. This is where our scripts will reside.
  • Our target Solr core is embedded and named movie-core. As a result, the directory structure of the examples package is:

    examples
    ├── classes
    ├── code
    ├── conf
    ├── web
    └── solr
        └── movie-core
            ├── core.properties
            └── conf
                ├── schema.xml
                └── solrconfig.xml
    
  • The examples package's esb-package.xml file has already been edited to make the embedded Solr core known:

    <esb-package>
        <!-- ... -->
        <solr-cores>
            <solr-core name="movie-core" enabled="true" />
            <!-- ... -->
        </solr-cores>
    </esb-package>
    

Procedures

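Now that our package and custom Solr core have been set up, we can proceed to creating our scripts. This guide is split into five sub-sections:

  • Creating the Model
  • Indexing the Model's Data
  • Updating Documents
  • Searching for Documents
  • Removing Documents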

We will use Gloop and Groovy services to do all those things for us.

Other examples...

Check out the examples package's search indices. There, you will see services that demonstrate the use of SolrMethods and other Solr related functionality.

Creating the Model

You can manually create your Gloop model from scratch, or you can extract the fields defined in the schema.xml file and create a model based on them. In our case, we will do the latter using the SchemaToGloopModelGenerator service:

Model-generating Gloop service

We'll place this script in the examples package's code directory, under the solr.customSolrCore.model package. You should be able to use this script to parse your own schema.xml file but, depending on your setup, you may need to tweak it a little. Here's a breakdown of its Gloop steps:

  • In Line 1, we have a map step that calls GroovyMethods.getPackage() to get the Integrate package where the script resides. The return value is then stored in a variable called esbPackage.

  • In Line 2, we have another map step that declares and initializes a Path variable that points to schema.xml's location. We'll use esbPackage#getHome() as the base path and from there, we can traverse to schema.xml's actual location, like so:

    Paths.get(esbPackage.getHome(), 'solr', 'movie-core', 'conf', 'schema.xml')
    
  • In Line 3, we have added a third map step but this time, we use it to declare and initialize a String variable containing schema.xml's content. We did that this way:

    Files.readAllBytes(movieCorePath);
    

    Gloop Conversion

    You may have noticed that the last line of code reads in a byte array, but the variable is a String. This is possible thanks to the Gloop ObjectToCharSequenceConverter.

  • In Line 4, we create an invoke step that calls SolrMethods.solrSchemaToGloopModel(String, String, String, String, List<GloopModel>). This method will create the Gloop model MovieDocument, based on the schema.xml file, in solr.customSolrCore.model.

    SolrMethods.solrSchemaToGloopModel("MovieDocument", schemaContent, null, "solr.customSolrCore.model", null)
    

All you have to do now is run the service and voilà! You now have your schema.xml-based Gloop model. If you're following along with our example, this will produce MovieDocument.model in solr.customSolrCore.model. We'll use this model later.

<package-name>
└── code
   └── solr
       └── customSolrCore
           └── model
               ├── MovieDocument.model
               └── SchemaToGloopModelGenerator.gloop
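
The path-building and file-reading steps of the generator service (Lines 2 and 3) can be sketched in plain Groovy as follows. This is a minimal sketch rather than the service itself: schemaPath and readSchema are hypothetical helper names, a temporary directory stands in for esbPackage.getHome(), and the bytes are decoded explicitly as UTF-8 instead of relying on Gloop's converter.

```groovy
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper: resolves schema.xml under a package home directory.
Path schemaPath(String packageHome, String coreName) {
    Paths.get(packageHome, 'solr', coreName, 'conf', 'schema.xml')
}

// Hypothetical helper: reads the schema as text, decoding the bytes as UTF-8.
String readSchema(Path path) {
    new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
}

// Stand-in for esbPackage.getHome(): a temporary directory with the same layout.
Path home = Files.createTempDirectory('examples')
Path schema = schemaPath(home.toString(), 'movie-core')
Files.createDirectories(schema.parent)
Files.write(schema, '<schema name="movie-core"/>'.getBytes(StandardCharsets.UTF_8))

String schemaContent = readSchema(schema)
println schemaContent
```

Decoding with an explicit charset avoids depending on the platform default when the schema contains non-ASCII characters.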

The MovieDocument Gloop model should have the following fields:

  • id (String)
  • movieTitle (String)
  • director (String)
  • cast (String[])

If you are writing Groovy services instead, the Groovy bean class MovieDocument.groovy will hold the movie data we want to index. We'll place it under the solr.customSolrCore.model package. Its content will be:

package solr.customSolrCore.model

import org.apache.solr.client.solrj.beans.Field

class MovieDocument {

    String id;

    @Field
    String movieTitle;

    @Field
    String director;

    @Field
    String[] cast;

}

The @Field annotations indicate which fields we want to index.
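
To see the effect of the annotations, the sketch below runs a bean of the same shape through SolrJ's own DocumentObjectBinder, which copies only @Field-annotated members into a SolrInputDocument. Whether the Integrate one-liners use this binder internally is an assumption; the SolrJ version in @Grab and the sample values are also assumptions made for illustration.

```groovy
// Assumed SolrJ version; match it to your Solr installation.
@Grab('org.apache.solr:solr-solrj:8.11.2')
import org.apache.solr.client.solrj.beans.DocumentObjectBinder
import org.apache.solr.client.solrj.beans.Field
import org.apache.solr.common.SolrInputDocument

// Same shape as the bean above: id carries no @Field annotation.
class MovieDocument {
    String id
    @Field String movieTitle
    @Field String director
    @Field String[] cast
}

def movie = new MovieDocument(
        id: 'not-indexed',
        movieTitle: 'Forrest Gump',
        director: 'Robert Zemeckis',
        cast: ['Tom Hanks', 'Robin Wright'] as String[])

// Only the annotated members end up in the resulting document.
SolrInputDocument document = new DocumentObjectBinder().toSolrInputDocument(movie)
println document.getFieldNames()
```

Because id has no annotation, it is left out of the bound document, which fits a core that generates the unique key on its own.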

The Groovy bean for movie documents in the examples package does not contain the id field

The id field is defined as the unique key for documents in movie-core and is auto-generated upon creation (see the UpdateRequestProcessorChain configuration in solrconfig.xml).

Fields defined in the schema

If you take a look at movie-core's schema.xml file, you will notice that its documents are defined with six fields: id, movieTitle, director, cast, _version_, and text.

  • id is the identifier for our documents; its value is automatically generated by Solr because of the UpdateRequestProcessorChain configuration in solrconfig.xml.
  • _version_ is also supplied automatically by Solr. It is an internal field used by the partial update procedure, the update log process, and SolrCloud, and it is required for optimistic concurrency.
  • text is a compilation of copied fields and is used as the default search field when clients run queries.

The other fields are provided by the client.

Indexing the Model's Data

Since our model is ready, we can now create a service that gets and indexes the model's data. We'll populate our models manually to make things simpler.

Insert in bulk

You can use the SolrMethods.insertMany(...) one-liners to insert documents in bulk.

The MovieIndexer Gloop service will be responsible for indexing our MovieDocument's data. Here's a preview of the steps we will have in this service:

The MovieIndexer service's steps

MovieIndexer's sole input parameter is called movieDocument, which is based on the MovieDocument Gloop model we created earlier. Because of this, we will be prompted to enter four fields when we run the service: id, movieTitle, director, and cast. TORO Integrate will build the movieDocument parameter from our inputs and, from there, we can index movieDocument via SolrMethods.index(String, String, GloopModel).

The bullet points below explain each step in the service:

  • In Line 1, we have a block step of type try-catch. This allows Gloop to mirror Java's try-catch: code that could throw an exception is wrapped in the try block, and the rescue is performed in the catch block.
  • In Line 3, under the try block, we have an invoke step that calls SolrMethods.index(String, String, GloopModel). This is where the actual indexing happens. It indexes movieDocument so that it becomes available for querying in the examples package's movie-core Solr core later.
  • In Line 5, under the catch block, we have another invoke step that calls LoggerMethods.error(String). This simply logs the exception if anything goes wrong while indexing.

Running the service will prompt you to populate the movieDocument input. You can enter whatever values you want to index. If the service is invoked successfully, it should return a response similar to the one below:

MovieIndexer's sample successful response

In Groovy, we'll instead create an endpoint whose parameters are mapped to the MovieDocument bean's fields. Calling this Spring-based endpoint triggers the indexing.

Simply create a Groovy file named MovieSolrAPI in solr.customSolrCore and edit it so that it contains the code below.

package solr.customSolrCore

import java.util.Map
import javax.servlet.http.HttpServletRequest

import org.springframework.web.bind.annotation.*
import org.springframework.web.bind.annotation.RequestMethod
import org.springframework.http.MediaType
import org.springframework.http.ResponseEntity
import org.apache.solr.servlet.SolrRequestParsers

import io.toro.integrate.core.service.annotation.InputType
import io.toro.integrate.core.service.annotation.InputTypeField

import solr.customSolrCore.model.MovieDocument

@RestController
@RequestMapping('/solr-package')
class MovieSolrAPI {
  @RequestMapping(produces = [MediaType.APPLICATION_JSON_VALUE, MediaType.APPLICATION_XML_VALUE], 
                  method = RequestMethod.POST)
  ResponseEntity<?> addDocument(@RequestParam String movieTitle,
                                @RequestParam String director,
                                @RequestParam String[] casts) {
    def document = new MovieDocument(movieTitle: movieTitle, director: director, cast: casts)
    'movie-core'.writeToIndex( null, document ).toString()
    return ResponseEntity.ok(document)
  }
}

The default solr.customSolrCore.MovieSolrAPI class calls the incorrect one-liner method for indexing

If you have the examples package, you will notice that the content of solr.customSolrCore.MovieSolrAPI is different from the snippet defined above.

This document shows the corrected class because the original class calls a non-existent method for indexing, so running it results in an error. The default solr.customSolrCore.MovieSolrAPI class is also not annotated with Spring annotations, so you cannot call it over HTTP or through the Service Invoker.

The corrected version (snippet above) fixes all these issues.

As you may notice, in:

  • Line 25, we constructed a MovieDocument object (the document variable) from the parameters of our request.
  • Line 26, we used the writeToIndex(...) one-liner to index the data for us. The snippet also calls toString() on the result, but that value is not used; the endpoint's response is the MovieDocument object passed to ResponseEntity.ok(...) in Line 27.

With that said, a call to the endpoint will trigger the indexing of your movie data. For example:

curl -X POST \
  'http://localhost:8080/api/solr-package?movieTitle=Forrest%20Gump&director=Robert%20Zemeckis&casts=Tom%20Hanks&casts=Robin%20Wright&casts=Gary%20Sinise&casts=Mykelti%20Williamson&casts=Sally%20Field' \
  -H 'accept: application/json' \
  -H 'cache-control: no-cache'
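
The query string in the call above repeats the casts parameter once per cast member and percent-encodes spaces. A minimal Groovy sketch of building such a string is shown below; toQueryString is a hypothetical helper written for this illustration and is not part of TORO Integrate.

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper: turns parameter names and values into a query string.
// Collection values become repeated parameters (casts=A&casts=B).
String toQueryString(Map<String, Object> params) {
    params.collectMany { name, value ->
        (value instanceof Collection ? value : [value]).collect { item ->
            // URLEncoder emits '+' for spaces; use '%20' to match the curl call above.
            String encoded = URLEncoder.encode(item.toString(), StandardCharsets.UTF_8.name())
                    .replace('+', '%20')
            "${name}=${encoded}".toString()
        }
    }.join('&')
}

String query = toQueryString(
        movieTitle: 'Forrest Gump',
        director: 'Robert Zemeckis',
        casts: ['Tom Hanks', 'Robin Wright'])

println "http://localhost:8080/api/solr-package?${query}"
```

Spring binds the repeated casts values to the String[] casts parameter of addDocument.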

Try out the service via the Service Invoker

You can click on the run button shown at the beginning of the signature of a method to run the method.

Invoking a Groovy service via the Service Invoker

Updating Documents

There are two ways to update a Solr document in TORO Integrate. An example is provided for each, and both assume that we need to update this particular entry in the index:

{
    "movieTitle": "Forrest Gump",
    "director": "Robert Zemeckis",
    "cast": [
        "Tom Hanks",
        "Robin Wright",
        "Gary Sinise",
        "Mykelti Williamson",
        "Sally Field"
    ],
    "id": "4fc1a64f-7226-4aad-bf27-5c399e2c453c",
    "_version_": 1617640267263770624
}
  • Using SolrMethods.writeToIndex(...) one-liner methods (preferred)

    True to its purpose, using a one-liner method is the easiest way to update an existing document.

    If the update is only partial, one must:

    1. Create a SolrInputDocument object.
    2. Set the id field of the SolrInputDocument object.
    3. Populate the SolrInputDocument object with fields and values that must be modified.
    4. Call the one-liner method to index the SolrInputDocument object.

    For example, to update the director property of the original document, we'll do something like:

    import org.apache.solr.common.SolrInputDocument
    
    def document = new SolrInputDocument()
    document.setField('id', '4fc1a64f-7226-4aad-bf27-5c399e2c453c')
    document.setField('director', [set:'Robert Lee Zemeckis'])
    
    'movie-core'.writeToIndex(null, document)
    

    Why use [set:"$newValue"]?

    In the example snippet above, we are updating the field director to have the value of "Robert Lee Zemeckis". You might notice that we passed a Map as the second argument of the call to SolrInputDocument#setField(String, Object), unlike what we did when setting the ID wherein we passed a String.

    The Map argument lets us define the modifier for the field that needs to be updated. In this case, our modifier is set which allows us to "set or replace the field value with the specified value". Solr provides other modifiers which you can use instead.

    If, however, you need to update all fields of the document:

    1. Create your bean object as usual.
    2. Set the bean object's id property (unique key property) so we know which document to update.
    3. Populate all fields of the object.
    4. Call the one-liner method to re-index the bean object.

    For example:

    MovieDocument movie = new MovieDocument()
    movie.setId('4fc1a64f-7226-4aad-bf27-5c399e2c453c')
    movie.setDirector('Robert Lee Zemeckis')
    
    'movie-core'.writeToIndex(null, movie)
    

    After the changes have been committed, you will notice that the director field has been updated, but the rest of the fields are left blank. This is because with the snippet above, we have only specified the value of the director property.

  • Using SolrJ's SolrClient

    A call to the one-liner method SolrMethods.solr(String) returns a SolrClient object which you can use to directly interact with the Solr core tied to it (specified by passing the name of the core as the argument). However, to use SolrClient, one must be familiar with SolrJ and Groovy.

    import org.apache.solr.client.solrj.SolrClient
    import org.apache.solr.common.SolrInputDocument
    
    // ...
    
    SolrClient solrClient = SolrMethods.solr( coreName )
    SolrInputDocument document = new SolrInputDocument()
    
    document.setField('id', '4fc1a64f-7226-4aad-bf27-5c399e2c453c')
    document.setField('director', [set:'Robert Lee Zemeckis'])
    
    solrClient.add( document )
    

    Similarly, a SolrInputDocument object is populated with fields that need updating.
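
The modifier maps used in the partial-update snippets above are plain field values on the SolrInputDocument, so several modifiers can be combined in one update. The sketch below only builds such a document and does not send it anywhere. It assumes that cast is multi-valued in the schema; the SolrJ version in @Grab and the added cast member are likewise assumptions for illustration.

```groovy
// Assumed SolrJ version; match it to your Solr installation.
@Grab('org.apache.solr:solr-solrj:8.11.2')
import org.apache.solr.common.SolrInputDocument

SolrInputDocument document = new SolrInputDocument()

// The unique key is passed as a plain value so Solr knows which document to update.
document.setField('id', '4fc1a64f-7226-4aad-bf27-5c399e2c453c')

// set: replace the current value of the field.
document.setField('director', [set: 'Robert Lee Zemeckis'])

// add: append a value to a multi-valued field (assumes cast is multi-valued).
document.setField('cast', [add: 'Haley Joel Osment'])

println document.getFieldValue('director')
```

Fields that are not mentioned in the document keep their existing values, which is what distinguishes this from re-indexing a whole bean.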

Searching for Documents

There are three ways to search for a Solr document in TORO Integrate:

  • Using one-liners (preferred)

    True to its purpose, using the one-liner method is the easiest way to search for documents. For example:

    SolrMethods.query(coreName, packageName, [q:"*:*"]).getResults()
    SolrMethods.query(coreName, packageName, [q:"movieTitle:*${titleSubstring}*"]).getResults()
    SolrMethods.query(coreName, packageName, [q:"id:${id}"]).getResults()
    
  • Using SolrJ's SolrClient

    A call to the one-liner method SolrMethods.solr(String) returns a SolrClient object which you can use to directly interact with the Solr core tied to it (specified by passing the name of the core as the argument). However, to use SolrClient, one must be familiar with SolrJ and Groovy.

    SolrClient solrClient = SolrMethods.solr(coreName)
    SolrDocumentList documents = solrClient.getById(ids)
    
  • Using the search API

    Alternatively, you can use TORO Integrate's Search REST API to search for your document. This will involve the use of Solr's SearchHandlers via the native or Solr-derived endpoint. For example:

    curl -X GET \
    'http://<host>:<port>/esbapi/v1/solr/<package name>/<core name>/select?q=id:<id>' \
    -H 'Authorization: Bearer <access token>'
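
When the search term comes from user input, characters such as : or * have a special meaning in Solr's query syntax. The sketch below shows one way to build the movieTitle substring query with SolrJ's ClientUtils.escapeQueryChars and a SolrQuery object. titleQuery is a hypothetical helper, and the row limit and SolrJ version are assumptions.

```groovy
// Assumed SolrJ version; match it to your Solr installation.
@Grab('org.apache.solr:solr-solrj:8.11.2')
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.util.ClientUtils

// Hypothetical helper: builds a substring query on movieTitle from raw user input.
SolrQuery titleQuery(String titleSubstring) {
    // Escape query-syntax characters so the input is treated as literal text.
    String escaped = ClientUtils.escapeQueryChars(titleSubstring)
    SolrQuery query = new SolrQuery("movieTitle:*${escaped}*".toString())
    query.setRows(10) // assumed page size
    query
}

SolrQuery query = titleQuery('Gump')
println query.getQuery()
```

A query built this way can be passed to SolrClient#query(...) on the client returned by SolrMethods.solr(String).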
    

Removing Documents

There are two ways to delete a Solr document in TORO Integrate:

  • Using one-liners (preferred)

    True to its purpose, using the one-liner method is the easiest way to delete an existing document. There are three one-liner methods you can use to delete a Solr document:

    The example below shows how to delete a document with the ID 4fc1a64f-7226-4aad-bf27-5c399e2c453c:

    'movie-core'.deleteById(null, ['4fc1a64f-7226-4aad-bf27-5c399e2c453c'])
    
    'movie-core'.deleteByQueryString(null, 'id:4fc1a64f-7226-4aad-bf27-5c399e2c453c')
    
  • Using SolrJ's SolrClient

    A call to the one-liner method SolrMethods.solr(String) returns a SolrClient object which you can use to directly interact with the Solr core tied to it (specified by passing the name of the core as the argument). However, to use SolrClient, one must be familiar with SolrJ and Groovy.

    The example below shows how to delete a document with the ID 4fc1a64f-7226-4aad-bf27-5c399e2c453c:

    SolrClient solrClient = 'movie-core'.solr()
    solrClient.deleteById(['4fc1a64f-7226-4aad-bf27-5c399e2c453c'])
    
    solrClient.deleteByQuery('id:4fc1a64f-7226-4aad-bf27-5c399e2c453c')