An Introduction to the Crawler
With Docs to OpenAPI, you can generate an API's corresponding OpenAPI schema from data in its documentation. This data can either be entered manually or extracted automatically for you by the Docs to OpenAPI plugin using a feature called, as you might have guessed, the Crawler.
What kind of data does the plugin expect?
The plugin requires information about the API's operations, as well as their request bodies, parameters, and responses. Each of these has its own respective tab under the Crawler tab.
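To make the mapping concrete, here is a minimal OpenAPI 3.x fragment, written as a plain Python dict, showing where each kind of crawled data ends up. The endpoint, operation ID, and parameter names are made up for illustration; they are not part of the plugin.

```python
# Hypothetical OpenAPI fragment: one operation with its parameters
# and responses, i.e. the fields the Crawler is meant to populate.
paths = {
    "/users": {
        "get": {
            "operationId": "listUsers",  # from the operations data
            "parameters": [              # from the parameters data
                {"name": "page", "in": "query", "schema": {"type": "integer"}}
            ],
            "responses": {               # from the responses data
                "200": {"description": "A page of users"}
            },
        }
    }
}
```

A request body, when an operation has one, would sit alongside `parameters` under a `requestBody` key.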
The Crawler needs to be configured so that it knows which pieces of information to take from the web page and which field in the OpenAPI schema each value should go into. It identifies which parts of the web page to extract primarily through the use of CSS selectors and, when needed, with the help of the Crawler's Filter fields.
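The idea behind selector-driven extraction can be sketched with a short, standard-library-only example. The markup and class names below are hypothetical, and the class-matching parser is only a bare-bones stand-in for a real CSS selector engine; it is not how the plugin is implemented.

```python
from html.parser import HTMLParser

# Hypothetical doc page where every operation is marked up identically,
# so one selector-style rule matches all of them in document order.
PAGE = """
<div class="operation">
  <span class="method">GET</span>
  <span class="path">/users</span>
</div>
<div class="operation">
  <span class="method">POST</span>
  <span class="path">/users</span>
</div>
"""

class ClassTextExtractor(HTMLParser):
    """Collects the text of every element carrying a target class,
    roughly what a CSS class selector such as ``.method`` would match."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self._depth = 0  # > 0 while inside a matching element
        self.matches = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        if self._depth or self.target_class in classes:
            self._depth += 1
            if self._depth == 1:
                self.matches.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.matches[-1] += data.strip()

def select_class(html, cls):
    parser = ClassTextExtractor(cls)
    parser.feed(html)
    return parser.matches

# Because the markup is structurally identical, one rule per field
# yields exactly one value per operation.
methods = select_class(PAGE, "method")  # ['GET', 'POST']
paths = select_class(PAGE, "path")      # ['/users', '/users']
```

This is also why consistent HTML formatting matters: a single rule can only pick up every operation if each one is marked up the same way.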
When CSS selectors aren't enough...
Filter fields are typically used only when CSS selectors are not enough to identify and isolate the required data. This happens when, for example, a value needs further string processing because unnecessary characters are attached to it.
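As an illustration of the kind of post-processing a Filter field performs, suppose a selector isolates the right element, but qualifiers and punctuation ride along with the value. The raw strings and the cleanup rule below are assumptions made for the example, not the plugin's actual filter syntax.

```python
import re

# Hypothetical raw values as scraped: the parameter name is there,
# but a parenthesised qualifier and a trailing colon come with it.
raw_params = ["id (required):", "page_size (optional):", "sort:"]

def clean_param(raw):
    """Strip a parenthesised qualifier and a trailing colon,
    leaving only the bare parameter name."""
    return re.sub(r"\s*\(.*?\)\s*", "", raw).rstrip(":").strip()

cleaned = [clean_param(p) for p in raw_params]
# ['id', 'page_size', 'sort']
```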
The Crawler is specifically designed to parse and extract data from HTML documents. It is ideal for pages where every operation is documented in a similar fashion, meaning they are all structurally identical.
TORO prefers and recommends using the Crawler, as populating API data this way is quicker (especially when dealing with massive APIs) and less prone to human error. It does require, however, that the documentation web pages have consistent HTML formatting.
Learn the basics