
Custom Crawling Strategy

Crawling strategies determine how the plugin executes a crawl. Default strategies are provided for your convenience; however, there will be times when the out-of-the-box behavior simply does not suffice. In those cases, you can code your own crawling strategy with custom logic.

Prerequisite to creating your own custom crawling strategy

You should know how to code in JavaScript (preferably ES6) as this will be the language your strategy will be written in.

Toggling Custom Crawling Behavior

To configure a custom crawling strategy:

How to toggle the Strategy and Scripts tabs

  1. Click the Toggle strategy and scripts button along the Crawler toolbar.
  2. Select the tab whose strategy you want to customize, and then scroll down to the bottom.
  3. Provide the code for your custom crawling strategy in the Strategy tab.

    What's the Script tab for?

    The Script tab is where you can define reusable functions and variables for your strategy code. This means any of your strategies, regardless of which tab they belong to, can use components declared there.

Writing Your Own Custom Crawling Strategy

Each tab's strategy is distinct; they all serve different purposes. When writing your own crawling strategy, you must write it in a way that accomplishes that purpose. For the Operation tab, the strategy's goal is to register operations; the Request Body tab, to register supported request bodies; the Parameter tab, supported parameters; and lastly, the Response tab, to register expected responses.

There are methods reserved for registering an operation, request body, parameter, or response: addOperation(Operation), setRequestBody(String, RequestBody), addParameter(String, Parameter), and addResponse(String, Response), respectively. Your crawling strategy should call one of these methods (which one depends on the tab) whenever it finds an operation, request body, parameter, or response in the document.

However, these registration methods require special arguments. For example, the addOperation method requires an argument of type Operation. To build these special objects, you can use the Factory class's methods.

The Factory methods, however, will require operation/request body/parameter/response data to be passed as arguments. To get these data from the documentation page, you can use the Crawler class's methods. To further help you make transformations, the Crawler also exposes a bunch of other utility methods.
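The flow described above (read data with Crawler, build a model with Factory, then register it) can be sketched for the Operation tab as follows. This is a minimal sketch, not the plugin's own code: the stand-in objects at the top exist only so the snippet runs outside the plugin, and the '.endpoint', '.name', '.summary', '.method', and '.path' selectors are assumptions about how a documentation page might be marked up.

```javascript
// --- Hypothetical stand-ins so the sketch runs outside the plugin. ---
// Inside the plugin, targetEl, Crawler, Factory, and addOperation are provided.
const registeredOperations = [];
const addOperation = operation => registeredOperations.push(operation);

const Factory = {
    createOperation: (operationId, description, method, path) =>
        ({ operationId, description, method, path })
};

// A fake "page": each entry plays the role of one documented endpoint element.
const targetEl = {
    endpoints: [
        { '.name': 'getUser', '.summary': 'Fetch a user', '.method': 'GET', '.path': '/user/{id}' },
        { '.name': 'createUser', '.summary': 'Create a user', '.method': 'POST', '.path': '/user' }
    ]
};

const Crawler = {
    // Stand-in: returns the fake endpoint entries regardless of the selector.
    getElements: (element, selector) => element.endpoints,
    // Stand-in: looks the selector up as a key on the fake element.
    getElementText: (element, selector) => element[selector]
};

// --- The strategy itself (the part that would go in the Strategy tab). ---
Crawler.getElements(targetEl, '.endpoint').forEach(endpoint => {
    const operation = Factory.createOperation(
        Crawler.getElementText(endpoint, '.name'),
        Crawler.getElementText(endpoint, '.summary'),
        Crawler.getElementText(endpoint, '.method'),
        Crawler.getElementText(endpoint, '.path')
    );
    addOperation(operation);
});
```

Only the last block is strategy code; in the plugin, the selectors would need to match the actual structure of the page being crawled.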

Available Properties

The following properties will be available in the context of each tab's crawling strategy:

Name Description
metadata Read only. It contains all available models.
targetEl The target document element. It is usually the root element of the page.
DataType Available data types for schemas.
ParamLocation Available locations for parameters.
ResponseStatus Available HTTP statuses for responses.

Available Methods

You can call the following methods in any tab's crawling strategy code:

Name Description
info(String:message) Display a message of level INFO in the Status dialog.
warning(String:message) Display a message of level WARNING in the Status dialog.
error(String:message) Display a message of level ERROR in the Status dialog.
message(String:message) Display a primary message in the Status dialog.
test(String:message) Print a message in the browser's console.
addOperation(Operation:operation) Add an operation.
addResponse(String:operationId, Response:response) Add a response to an operation.
addParameter(String:operationId, Parameter:parameter) Add a parameter to an operation.
setRequestBody(String:operationId, RequestBody:requestBody) Set the request body of an operation.
addSchema(String:title, Schema:schema) Add a schema.
addTag(Tag:tag) Add a tag.

In addition to the custom methods above added by TORO, you can also call these:

Name Description
setTimeout(Function:func, Number:ms, Object:params) Call a function or evaluate an expression after a specified number of milliseconds.
setInterval(Function:func, Number:ms, Object:params) Repeatedly call a function or execute a code snippet, with a fixed time delay between each call. It returns an interval ID which uniquely identifies the interval, so you can remove it later by calling clearInterval(). This method is offered on the Window and Worker interfaces.
clearTimeout(Object:id) Prevent the function scheduled with setTimeout() from executing.
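As an illustration of the timer methods, the sketch below schedules a fallback and cancels it once the work is done. The finishedInTime flag is a hypothetical placeholder for whatever condition your strategy actually checks.

```javascript
let timedOut = false;

// Schedule a fallback that fires only if it is still pending after 5 seconds.
const timeoutId = setTimeout(() => {
    timedOut = true;
}, 5000);

// ... crawling work would happen here ...
const finishedInTime = true; // hypothetical outcome of that work

if (finishedInTime) {
    // Cancel the pending fallback so that it never runs.
    clearTimeout(timeoutId);
}
```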

Available Methods from Config

You can call the following methods from the Config class in your crawling strategy code:

Name Description
get(String:propertyName): Crawler Get a property's Crawler configuration.

Available Methods from Factory

You can call the following methods from the Factory class in your crawling strategy code:

Name Description
createOperation(String:operationId, String:description, String:method, String:path): Operation Create an operation model.
createParameter(String:name, String:location, String:description, Boolean:required, Boolean:deprecated, Boolean:allowEmptyValue, String:schema, Boolean:array): Parameter Create a parameter model.
createResponse(Integer:statusCode, String:mediaType, String:schema): Response Create a response model.
createRequestBody(String:description, Boolean:required, Map<String, String>:content): RequestBody Create a request body model.
createTag(String:name, String:description): Tag Create a tag model.
createSchema(String:title): Schema Create a schema model.
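To show how the Factory methods pair with the registration methods, here is a minimal sketch that sets a request body and adds two responses for one operation. The stand-ins at the top exist only so the snippet runs outside the plugin. The operation ID, the schema titles 'User' and 'Error', and the use of a plain object that maps a media type to a schema title for the content argument are all assumptions.

```javascript
// --- Hypothetical stand-ins so the sketch runs outside the plugin. ---
const responses = {};
const requestBodies = {};

const addResponse = (operationId, response) => {
    responses[operationId] = responses[operationId] || [];
    responses[operationId].push(response);
};
const setRequestBody = (operationId, requestBody) => {
    requestBodies[operationId] = requestBody;
};

const Factory = {
    createResponse: (statusCode, mediaType, schema) =>
        ({ statusCode, mediaType, schema }),
    createRequestBody: (description, required, content) =>
        ({ description, required, content })
};

// --- Strategy code. ---
const operationId = 'createUser'; // assumed to have been registered already
const mediaType = 'application/json';

// Assumption: content maps each media type to the title of a schema.
setRequestBody(
    operationId,
    Factory.createRequestBody('The user to create', true, { [mediaType]: 'User' })
);

addResponse(operationId, Factory.createResponse(201, mediaType, 'User'));
addResponse(operationId, Factory.createResponse(400, mediaType, 'Error'));
```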

Available Methods from Crawler

You can call the following methods from the Crawler class in your crawling strategy code:

Name Description
onProgressChanged(String:message) Display a message in the Progress dialog.
getElements(Element:targetElement, String:selector): Element[] Retrieve elements.
getElement(Element:targetElement, String:selector): Element Retrieve a single element.
getElementText(Element:targetElement, String:selector) Retrieve text content from element.
getDocument(String:url): Element Retrieve document element from provided URL.
getDocuments(String[]:url): Element[] Retrieve document elements from provided URLs.
getUrls(Element[]:targetElements, Config:config, String:operationId): String[] Retrieve URLs from provided elements. You can pass your config to change how it will extract the URL from the element.
filter(String:text, CrawlerFilter[]:filters): String Filters the text.
exclude(String:text, CrawlerExclude[]:excludes): Boolean Returns boolean whether to exclude or not.
evaluate(String:text, Element:targetElement, String:expression, String:operationId = null): String Evaluates the text using the provided expression.
setupText(Config:config, Element:targetElement, String:operationId = null): String Retrieve text from element and then evaluate and filter the text.
evalAndFilterText(String:text, Config:config, Element:targetElement, String:operationId = null) Evaluate and filter the text using the config provided.
findOperationMethod(String:text): String Find operation method from text.

Available Methods from StringUtils

You can call the following methods from the StringUtils class in your crawling strategy code:

Name Description
camelCase(String:text): String Apply camelCase to the provided text.
toUpperCaseFirst(String:text): String Convert the first character of the string to uppercase.
toLowerCaseFirst(String:text): String Convert the first character of the string to lowercase.
findVariables(String:text): String[] Find path variables in the provided path; enclosed curly braces {} indicate a path variable. Given the string '/user/{id}/', the output of this function would be ['id'].
findQueries(String:text): String[] Find query variables from the provided path. Given the string '?id=2&ref=home', this function will return ['id', 'ref'].
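The documented behavior of these helpers can be approximated in plain JavaScript. The implementations below are hypothetical stand-ins written to match the descriptions and examples in the table above; they are not the plugin's actual code, and its handling of edge cases may differ.

```javascript
// Hypothetical stand-in that mirrors the documented StringUtils behavior.
const StringUtils = {
    toUpperCaseFirst: text => text.charAt(0).toUpperCase() + text.slice(1),
    toLowerCaseFirst: text => text.charAt(0).toLowerCase() + text.slice(1),

    // Collect every name enclosed in curly braces, e.g. {id}.
    findVariables: text =>
        Array.from(text.matchAll(/\{([^{}]+)\}/g), match => match[1]),

    // Collect the key of each key=value pair that follows the '?'.
    findQueries: text => {
        const queryStart = text.indexOf('?');
        if (queryStart === -1) {
            return [];
        }
        return text
            .slice(queryStart + 1)
            .split('&')
            .filter(pair => pair.length > 0)
            .map(pair => pair.split('=')[0]);
    }
};

const variables = StringUtils.findVariables('/user/{id}/');   // ['id']
const queries = StringUtils.findQueries('?id=2&ref=home');    // ['id', 'ref']
const title = StringUtils.toUpperCaseFirst('user');           // 'User'
```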

Available Methods from SchemaUtils

You can call the following methods from the SchemaUtils class in your crawling strategy code:

Name Description
isArray(Schema:schema): Boolean Checks whether the schema is an array.
toArray(Schema:schema): Schema Transforms the schema from a regular model to an array model.
stripArray(Schema:schema): Schema Transforms the schema from an array model to a regular model.
fromJsonString(String:title, String:text): Schema Create a schema model from JSON text.
findDataType(Object:data): String Look for the data type from object data.
fetchType(String:key): Type Find the type.

Available Methods from JsonUtils

You can call the following methods from the JsonUtils class in your crawling strategy code:

Name Description
parse(String:text): Object Transforms the text data to a key-value object.
stringify(Object:data): String Transforms the object data to a text string.

Available Methods from Object

You can call the following methods from the Object class in your crawling strategy code:

Name Description
assign(Object:target, Object... sources): Object Used to copy the values of all enumerable own properties from one or more source objects to a target object. It will return the target object.
is(Object:value1, Object:value2) Determines whether two values are the same value.
keys(Object:obj): String[] Returns an array of a given object's property names, in the same order as we get with a normal loop.
values(Object:obj) Returns an array of a given object's own enumerable property values, in the same order as that provided by a for...in loop.
defineProperty(Object:obj, String:prop, Descriptor:descriptor): Object Defines a new property directly on an object, or modifies an existing property on an object, and returns the object.
defineProperties(Object:obj, Descriptor:props): Object Defines new or modifies existing properties directly on an object, returning the object.
create(Object:proto, Object:propertiesObject): Object Creates a new object, using an existing object as the prototype of the newly created object.
entries(Object:obj): Object[] Returns an array of a given object's own enumerable property [key, value] pairs, in the same order as that provided by a for...in loop.
freeze(Object:obj): Object Freezes an object. This means it prevents new properties from being added to it; prevents existing properties from being removed; and prevents existing properties, or their enumerability, configurability, or writability, from being changed. It also prevents the prototype from being changed. The method returns the passed object.
getOwnPropertyDescriptor(Object:obj, String:prop): Object Returns a property descriptor for an own property (that is, one directly present on an object and not in the object's prototype chain) of a given object.
getOwnPropertyDescriptors(Object:obj): Object Returns all own property descriptors of a given object.
getOwnPropertyNames(Object:obj): String[] Returns an array of all property names (including non-enumerable properties, except for those which use Symbol) found directly upon a given object.
getOwnPropertySymbols(Object:obj): Symbol[] Returns an array of all symbol properties found directly upon a given object.
getPrototypeOf(Object:obj): Object Returns the prototype (i.e. the value of the internal [[Prototype]] property) of the specified object.
isExtensible(Object:obj): Boolean Determines if an object is extensible (whether it can have new properties added to it).
isFrozen(Object:obj): Boolean Determines if an object is frozen.
isSealed(Object:obj): Boolean Determines if an object is sealed.
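A few of these standard Object methods are handy for keeping shared settings safe and merging them with per-item values. The sketch below uses only built-in JavaScript; the defaults and idParam names, and the fields they hold, are illustrative.

```javascript
// Shared defaults, frozen so that no strategy can modify them by accident.
const defaults = Object.freeze({ required: false, deprecated: false });

// Object.assign copies into its first argument, so pass a fresh {} as the
// target to leave the frozen defaults untouched.
const idParam = Object.assign({}, defaults, { name: 'id', required: true });

// Object.entries yields [key, value] pairs in insertion order.
const summary = Object.entries(idParam)
    .map(([key, value]) => `${key}=${value}`)
    .join(', ');
// summary is 'required=true, deprecated=false, name=id'
```

Passing an empty object as the target of Object.assign is a deliberate choice here: assigning into defaults directly would throw in strict mode or silently do nothing otherwise, because the object is frozen.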

Writing Your Own Reusable Methods

If you have logic that is reusable across strategies, you can declare it as methods and register them in the Script tab. Methods registered there will be callable in all strategies, regardless of which tab a strategy belongs to. This allows you to store your logic in one place and reduce the length of your code.

The Script tab's content is consistent

Even if you shuffle through the Operation, Request Body, Parameter, and Response tabs, the Script tab's content will remain the same. This is because the Script tab's content is shared by all tabs.

Consider the following snippet:

registerMethod('toUpperCase', text => {
    return text.toUpperCase();
});

Writing this in the Script tab prompts the plugin to register a method with the signature of toUpperCase(String). This function transforms passed-in strings to their uppercased version. toUpperCase(String) can then be used in any strategy code, like so:

toUpperCase('Hello, world!'); // returns 'HELLO, WORLD!'
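The same pattern works for logic that is more specific to crawling. As a further sketch, the hypothetical normalizePath method below trims whitespace and trailing slashes from a path. The registerMethod stand-in at the top exists only so the snippet runs outside the plugin; how the plugin actually exposes registered methods is an assumption.

```javascript
// Hypothetical stand-in so the sketch runs outside the plugin, which
// provides registerMethod itself.
const registerMethod = (name, method) => {
    globalThis[name] = method;
};

// --- Script tab ---
registerMethod('normalizePath', path => {
    // Drop surrounding whitespace and any trailing slashes.
    const trimmed = path.trim().replace(/\/+$/, '');
    // Keep the root path as '/' rather than an empty string.
    return trimmed === '' ? '/' : trimmed;
});

// --- Any Strategy tab ---
const cleaned = normalizePath('  /user/{id}/ '); // '/user/{id}'
```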