Scrape

axiom.scrape(url, selector, pager, max_results, settings) extracts a list of structured records from a page, or from a sequence of pages if you supply a pager. It is the same step type as the No-Code Tool's "Get data from bot's current page" step, called from your code with selectors you choose at runtime.

Signature


const rows = await axiom.scrape(url, selector, pager, max_results, settings);

Parameter    Type                      Required  Description
url          string | string[] | null  Yes       URL (or array of URLs) to scrape. Pass null to scrape the page currently loaded in the session. Multiple URLs are scraped in sequence.
selector     object                    Yes       Selector definition for the records to extract. Use the No-Code Tool's selector tool (or a recorded selector from a previous run) to produce this.
pager        object | null             Yes       Selector for the "next page" link. Pass null for single-page scraping. With a pager, the scraper paginates until max_results is hit or the pager stops resolving.
max_results  number | null             Yes       Cap on the number of records returned. Must be a positive whole number (or null for unlimited).
settings     object                    Yes       Behavioural tuning (output format, minimum wait, …). Pass {} for defaults.
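
The max_results rule can be checked before the call is made. The following is a minimal sketch; isValidMaxResults is a hypothetical helper name, not part of the axiom API, and it only mirrors the constraint stated in the table (null, or a positive whole number).

```javascript
// Hypothetical helper, not part of the axiom API: mirrors the documented rule
// that max_results is either null (unlimited) or a positive whole number.
function isValidMaxResults(value) {
  return value === null || (Number.isInteger(value) && value > 0);
}

console.log(isValidMaxResults(100));  // true
console.log(isValidMaxResults(null)); // true
console.log(isValidMaxResults(0));    // false
console.log(isValidMaxResults(2.5));  // false
console.log(isValidMaxResults("10")); // false (a string, not a number)
```

Rejecting a bad value up front gives a clearer error than whatever the scrape call itself would report.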

Returns a 2D array of records (rows × fields).
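
Because the result is rows × fields, it is often convenient to turn each row into an object with named fields. The sketch below assumes that the columns of each row come back in a known, fixed order; that ordering, the helper name rowsToObjects, the field names and the sample data are all illustrative assumptions rather than documented behaviour.

```javascript
// Hypothetical helper: pair each row's values with caller-supplied field names.
// Assumes every row lists its fields in the same order as fieldNames.
function rowsToObjects(rows, fieldNames) {
  return rows.map((row) =>
    Object.fromEntries(
      // Missing cells become null so every object has the same keys.
      fieldNames.map((name, i) => [name, row[i] ?? null])
    )
  );
}

// Sample data standing in for the result of axiom.scrape(); not real output.
const sampleRows = [
  ["Blue mug", "8.00"],
  ["Red mug", "9.50"],
];

const records = rowsToObjects(sampleRows, ["title", "price"]);
console.log(records[0].title); // Blue mug
console.log(records[1].price); // 9.50
```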

Example


Scrape product cards from the category page currently loaded in the session, paginating until 100 results have been collected:

const products = await axiom.scrape(
  null,                                  // stay on the current page
  { hierarchy: ".product-card" },        // record selector
  { hierarchy: "a.pagination-next" },    // pager selector
  100,                                   // max results
  {}                                     // default settings
);

Notes


  • The selector and pager shapes match the selector tokens emitted by the No-Code Tool's selector tool. If you're authoring by hand, capturing them once via the No-Code Tool and copy-pasting is the lowest-friction approach.
  • For single-record extraction (page title, OG image, single field), use axiom.scrapeMetadata() — it's faster and simpler.
  • The scraper handles lazy-loaded content by waiting briefly after each pager click. Sites with slow infinite-scroll may need a higher minWait in settings.
  • Multiple URLs in url are scraped sequentially and the results concatenated; a failure on any one URL fails the whole call.
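
Since one failing URL fails a multi-URL call, one way to keep partial results is to call the scraper once per URL and collect errors separately. This is a sketch only: scrapeEach is a hypothetical helper, scrapeFn is assumed to take the same five arguments as axiom.scrape and to reject (throw) on failure, and note that max_results then applies to each URL rather than to the combined total.

```javascript
// Hypothetical helper, not part of the axiom API: scrape URLs one call at a
// time so that a failure on one URL does not discard rows from the others.
// scrapeFn is assumed to have the same signature as axiom.scrape.
async function scrapeEach(scrapeFn, urls, selector, pager, maxResults, settings) {
  let rows = [];
  const failures = [];
  for (const url of urls) {
    try {
      const result = await scrapeFn(url, selector, pager, maxResults, settings);
      rows = rows.concat(result); // keep rows in URL order
    } catch (err) {
      // Record the failure and carry on with the next URL.
      failures.push({ url, message: err instanceof Error ? err.message : String(err) });
    }
  }
  return { rows, failures };
}

// Possible usage inside a script where `axiom` is available (URLs are placeholders):
// const { rows, failures } = await scrapeEach(
//   (...args) => axiom.scrape(...args),
//   ["https://example.com/a", "https://example.com/b"],
//   { hierarchy: ".product-card" },
//   null,
//   100,
//   {}
// );
```

The trade-off is that URLs are still processed sequentially, as in the built-in behaviour, but the caller decides what to do with the failures list instead of losing the whole batch.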