Crawler Configuration

As Seen in the User Interface

Before it runs, the Crawler must be configured with the information it needs to identify which elements in the web page to scan and which fields in the OpenAPI schema their values map to.

The Crawler is configured using the Crawler tab. Selecting this tab reveals a toolbar and, underneath it, four more tabs labeled (from left to right) Operation, Request Body, Parameter, and Response. These tabs contain the bulk of the configurable Crawler-related fields.

Where else can the Crawler be configured?

The Settings tab contains a couple of fields for changing the Crawler's behavior.

Crawler, Operation tab

Crawler, Request Body tab

Crawler, Parameter tab

Crawler, Response tab

Configuration Fields

All fields in the Crawler tab expect CSS selectors, which, as noted earlier, identify the elements to extract data from. The rules defined in all four tabs are applied to every HTTP operation. The Crawler expects a uniform HTML structure across operations, which is why the same set of rules is reused.

Each tab groups related fields together, and the tabs are independent of one another even though some fields seem to recur. The Operation tab contains fields for specifying where to get each API operation's general information. The Request Body tab expects information on how to extract acceptable request bodies. The Parameter tab is for defining rules on how to identify endpoint parameters. Lastly, as its name implies, the Response tab is for specifying the responses a client should expect from every endpoint.

To better understand each tab's fields, let's go through all of them in detail. The tables below list each field's:

  • label
  • unique key (used to access the configuration's value in custom scripts, which are typically created by users who want to implement their own custom crawling strategy)

    Config.get(key)
    
  • target element

Hover over field labels to see their description or purpose

`Operation Url` property's hint

Operation Fields

The Operation tab contains fields for identifying the name, description, HTTP method, and path of API operations.

All fields expect CSS selectors

The fields in the Crawler tab take CSS selectors. Each selector is expected to point to the elements containing the data required by the plugin. If a field's target sits inside a wrapper element, its selector should be relative to the wrapper's selector.

Uniform rules for all operations

The rules you enter for the fields below are assumed to be applicable to all HTTP operations. For example, if you set the Wrapper field's value to #tickets .scroll-spy, then the Crawler will assume that the HTML wrapper for every operation can be obtained through that selector.

| Label | Key | CSS Selector Target |
| --- | --- | --- |
| Document Url | documentUrl | <a> elements whose href attribute contains the URL of each API operation's documentation page. If provided, the Crawler scans the linked pages to acquire the fields below. If left blank, the Crawler assumes all operations are documented together on the current web page. |
| Wrapper | wrapper | The parent container of each operation, which contains the fields below. |
| Name | name | Elements containing the name or ID of each operation. |
| Description | description | Elements containing each operation's description. |
| Method | method | Elements indicating the HTTP method of each operation (GET, POST, PUT, DELETE, OPTIONS, TRACE, or PATCH). |
| Path | path | Elements containing each operation's relative path. If unset, the plugin assumes the operation is accessible at the default path (/). |
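
To make the wrapper-relative rule concrete, here is a minimal Python sketch. It is not the plugin's implementation. It assumes a hypothetical, well-formed HTML fragment and uses the standard library's ElementTree, with XPath expressions standing in for the CSS selectors `.operation` (Wrapper), `.name`, `.method`, and `.path`, because the standard library has no CSS selector engine. The markup, class names, and helper names are invented for illustration, and treating an unmatched Path selector like an unset one is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical documentation page; the markup and class names are assumptions.
PAGE = """
<body>
  <div id="tickets">
    <div class="operation">
      <h3 class="name">listTickets</h3>
      <span class="method">GET</span>
      <code class="path">/tickets</code>
    </div>
    <div class="operation">
      <h3 class="name">createTicket</h3>
      <span class="method">POST</span>
    </div>
  </div>
</body>
"""


def text_of(wrapper, class_name, default=None):
    """Return the stripped text of the first descendant with the given class."""
    node = wrapper.find(f".//*[@class='{class_name}']")
    if node is None or node.text is None:
        return default
    return node.text.strip()


def extract_operations(page):
    root = ET.fromstring(page.strip())
    operations = []
    # The same wrapper rule is applied to every operation on the page.
    for wrapper in root.findall(".//*[@class='operation']"):
        operations.append({
            # Field lookups are relative to the wrapper, not to the page.
            "name": text_of(wrapper, "name"),
            "method": text_of(wrapper, "method"),
            # Assumption: a missing path falls back to the default path "/".
            "path": text_of(wrapper, "path", default="/"),
        })
    return operations


operations = extract_operations(PAGE)
```

Because every lookup starts from the wrapper element, the second operation's missing path does not accidentally pick up the first operation's `/tickets` value.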

Request Body Fields

| Name | Key | CSS Selector Target |
| --- | --- | --- |
| Request Body Url | | <a> elements whose href attribute points to the documentation page of each API operation's supported request bodies. If provided, the Crawler scans the linked pages to acquire the fields below. If left blank, the Crawler assumes all supported request bodies are documented together on the current web page. |
| Operation Wrapper | | The parent container of each endpoint's supported request body, which should also hold a reference to the operation's name. It contains all of the fields below. |
| Operation Name | | Elements containing the name or ID of each operation. The value should match the Operation tab's Name field. |
| Request Body Payload | | Elements containing sample JSON or XML request payloads. If provided, the Crawler uses the sample payload to determine which fields should be present in every request payload. |
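
As an illustration of how a sample payload can stand in for field-by-field metadata, the sketch below infers an OpenAPI-style schema from a sample JSON request body. It only conveys the general idea and should not be read as the Crawler's actual inference logic. The function name `infer_schema` and the sample payload are hypothetical.

```python
import json


def infer_schema(value):
    """Infer a minimal OpenAPI-style schema from a decoded JSON value."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        # Use the first element as the representative item; an empty list
        # gives no hint about its item type.
        items = infer_schema(value[0]) if value else {}
        return {"type": "array", "items": items}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
        }
    # JSON null carries no type information.
    return {}


# Hypothetical sample payload as it might appear in a documentation page.
sample = '{"title": "Printer jam", "priority": 2, "tags": ["hardware"], "urgent": false}'
schema = infer_schema(json.loads(sample))
```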

Payload Metadata

If the Request Body Payload field is left blank, the Crawler will use the following fields in order to define the expected content of the request body.

| Name | Key | CSS Selector Target |
| --- | --- | --- |
| Wrapper | | Wrapper element for each payload property's information. It contains the fields below. |
| Name | | Elements whose values are the names of each payload property. |
| Description | | Elements containing the description of each request body. |
| Type | | Elements specifying the schema used as each request body's type. Defaults to object. |
| Array | | Elements indicating whether the schema is an array. If the schema is used as an array, the element's content must be true or array. |
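
The Type and Array rules above can be sketched as follows. This is a hedged illustration rather than the plugin's code: the row dictionaries, the function names, the case-insensitive comparison, and the use of primitive type names such as `string` are all assumptions. Only the `object` default and the `true`/`array` markers come from the table.

```python
def is_array(text):
    """The Array element marks an array when its content is 'true' or 'array'."""
    # Assumption: surrounding whitespace and letter case are ignored.
    return text is not None and text.strip().lower() in {"true", "array"}


def property_schema(type_text=None, array_text=None):
    # Type falls back to "object" when no value was extracted.
    base = {"type": (type_text or "").strip() or "object"}
    if is_array(array_text):
        return {"type": "array", "items": base}
    return base


def body_schema(rows):
    """Build an object schema from extracted payload metadata rows."""
    properties = {}
    for row in rows:
        schema = property_schema(row.get("type"), row.get("array"))
        if row.get("description"):
            schema["description"] = row["description"]
        properties[row["name"]] = schema
    return {"type": "object", "properties": properties}


# Hypothetical rows, one per match of the Wrapper selector.
rows = [
    {"name": "title", "type": "string", "description": "Short summary"},
    {"name": "tags", "type": "string", "array": "array"},
    {"name": "reporter"},
]
schema = body_schema(rows)
```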

Parameter Fields

| Name | Key | CSS Selector Target |
| --- | --- | --- |
| Parameter Url | | <a> elements whose href attribute points to the documentation page of each API operation's supported parameters. If provided, the Crawler scans the linked pages to acquire the fields below. If left blank, the Crawler assumes all supported parameters are documented together on the current web page. |
| Operation Wrapper | | The parent container of each endpoint's list of supported parameters, which should also hold a reference to the operation's name. It contains all of the fields below. |
| Operation Name | | Elements containing the name or ID of each operation. The value should match the Operation tab's Name field. |
| Parameter Payload | | Elements containing sample JSON or XML parameter payloads. If provided, the Crawler uses the sample payload to determine which parameters the endpoint supports. |

Parameter Metadata

If Parameter Payload is left blank, the Crawler will use the following fields in order to identify which parameters are supported by each endpoint.

| Name | Key | CSS Selector Target |
| --- | --- | --- |
| Wrapper | | Wrapper element for each endpoint's list of supported parameters. It contains the fields below. |
| Name | | Elements whose values are the names of each supported parameter. |
| Description | | Elements whose values are the descriptions of each supported parameter. |
| Location | | Elements indicating each parameter's location, which is one of PATH, QUERY, HEADER, or COOKIE. |
| Required | | Elements indicating whether a parameter is mandatory. If mandatory, the element's content should be either true or required. |
| Allow Empty Value | | Elements indicating whether a parameter can be set with an empty value. If it can, the element's content must be either true or allow empty value. |
| Type | | Elements specifying the schema used for each parameter's type. Defaults to object. |
| Array | | Elements indicating whether the schema is an array. If it is used as an array, the element's content must be either true or array. |
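
As a rough sketch of how these extracted values could map onto an OpenAPI parameter object, consider the Python below. The row dictionary, the helper names, and the case-insensitive matching are assumptions made for illustration. The accepted marker strings and the `object` default are taken from the table above, and `name`, `in`, `required`, `allowEmptyValue`, and `schema` are standard OpenAPI parameter fields.

```python
# Marker strings that count as "yes" for each flag, per the table above.
TRUE_WORDS = {
    "required": {"true", "required"},
    "allowEmptyValue": {"true", "allow empty value"},
    "array": {"true", "array"},
}
LOCATIONS = {"path", "query", "header", "cookie"}


def flag(text, key):
    """Return True when the element's text is one of the accepted markers."""
    # Assumption: surrounding whitespace and letter case are ignored.
    return (text or "").strip().lower() in TRUE_WORDS[key]


def build_parameter(row):
    """Turn one extracted metadata row into an OpenAPI-style parameter object."""
    location = row["location"].strip().lower()
    if location not in LOCATIONS:
        raise ValueError(f"unsupported parameter location: {row['location']!r}")
    # Type falls back to "object" when no value was extracted.
    schema = {"type": (row.get("type") or "").strip() or "object"}
    if flag(row.get("array"), "array"):
        schema = {"type": "array", "items": schema}
    return {
        "name": row["name"].strip(),
        "in": location,
        "description": (row.get("description") or "").strip(),
        "required": flag(row.get("required"), "required"),
        "allowEmptyValue": flag(row.get("allowEmptyValue"), "allowEmptyValue"),
        "schema": schema,
    }


# Hypothetical row, as it might be read from one Wrapper match.
parameter = build_parameter({
    "name": "ticketId",
    "location": "PATH",
    "required": "required",
    "type": "integer",
})
```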

Response Fields

| Name | Key | CSS Selector Target |
| --- | --- | --- |
| Response Url | | <a> elements whose href attribute points to the documentation page of each API operation's set of expected responses. If provided, the Crawler scans the linked pages to acquire the fields below. If left blank, the Crawler assumes all responses are documented together on the current web page. |
| Operation Wrapper | | Wrapper element containing information about each endpoint's response (see note 1), which should also hold a reference to the operation's name. It contains all of the fields below. |
| Operation Name | | Elements containing the name or ID of each operation. The value should match the Operation tab's Name field. |
| Status Code | | Elements containing the status code of each endpoint's response. If no status code is found, the value default is used. |
| Response Description | | Elements containing the description of each of the endpoint's supported responses. |
| Response Payload | | Elements containing sample JSON or XML response payloads. If provided, the Crawler uses the sample payload to piece together each response's expected content. |
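
Since each Operation Wrapper match describes a single response, several matches can belong to the same operation. The sketch below shows one plausible way to group them by operation name and to apply the `default` status code fallback. The row format and the function name `group_responses` are hypothetical and are not taken from the plugin.

```python
from collections import defaultdict


def group_responses(rows):
    """Group extracted response rows by operation name, keyed by status code."""
    responses = defaultdict(dict)
    for row in rows:
        # A missing status code is recorded under the "default" key.
        status = (row.get("status") or "").strip() or "default"
        responses[row["operation"]][status] = {
            "description": (row.get("description") or "").strip(),
        }
    return dict(responses)


# Hypothetical rows, one per individual response wrapper.
rows = [
    {"operation": "listTickets", "status": "200", "description": "A list of tickets."},
    {"operation": "listTickets", "description": "Unexpected error."},
    {"operation": "createTicket", "status": "201", "description": "Ticket created."},
]
responses = group_responses(rows)
```

The Operation Name value is what ties each response back to the operation found through the Operation tab, which is why the two fields need to yield the same text.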

Payload Metadata

| Name | Key | CSS Selector Target |
| --- | --- | --- |
| Wrapper | | Wrapper element for each response's set of fields. It contains the fields below. |
| Name | | Elements whose values are the names of each response payload field. |
| Description | | Elements whose values are the descriptions of each response payload field. |
| Required | | Elements indicating whether each payload field is mandatory. If mandatory, the element's content must be either true or required. |
| Type | | Elements specifying the schema used for each payload field's type. Defaults to object. |
| Array | | Elements indicating whether the schema is an array. If it is used as an array, the element's content must be either true or array. |

  1. An endpoint can have multiple responses. For this field, however, the wrapper element must point to each individual response's information.