How to scrape data with ChatGPT from any website


This template builds a web scraper that loops through links stored in a Google Sheet and opens each page in a browser. It scrapes the page content and passes it to ChatGPT, which extracts the specific values you define. After each loop, the scraper deletes the processed row from the Google Sheet so that the next link is ready to be processed. A ChatGPT API key is required.

Get started: create your Google Sheet

Create a new Google Sheet. If you are signed in to a Google account, you can do this by entering the shortcut sheet.new in your Chrome browser's address bar. Name your sheet something like 'ChatGPT Scrape', then set up two tabs: one called Links and one called Data.

Install the ChatGPT template

To install this ChatGPT template, click Install template. If you're a new user, you'll have to click Install Chrome extension and create a free axiom.ai account before you can edit the template.

Once installed, click start.

axiom.ai will guide you through the steps you need to configure in the app.

Configure your ChatGPT scraper in five easy steps

Please note: in step 2.2, Get data from bot's current page, you need to select a single block of content with the selector tool.

  • 1.0 Read data from a Google Sheet
    • Spreadsheet: Search for the Google Sheet you created. Once found, click to select.
    • Sheet name: Choose the sheet tab called "Links".
  • 2.0 Loop through data
    • 2.1 Go to page
      • Enter URL: Click 'Insert Data', select google-sheet-data, and choose the column with the links.
    • 2.2 Get data from bot's current page
      • Select: Point and click to select the data you wish to scrape using a single selector.
      • Max Results: Set to 1.
    • 2.3 Extract data with ChatGPT
      • ChatGPT API key: Enter your API key.
      • Data: Insert scrape-data.
      • Extract values: Enter values to extract (e.g. name, email, job title).
    • 2.4 Write data to a Google Sheet
      • Spreadsheet: Search for the Google Sheet you created. Once found, click to select.
      • Sheet name: Choose the Data tab you created.
      • Data: Select chatgpt-data.
      • Clear data before writing | Add to existing data: Set to "Add to existing data".
    • 2.5 Delete rows from a Google Sheet
      • Spreadsheet: Search for the Google Sheet you created. Once found, click to select.
      • Sheet name: Choose the Links tab you created.
      • First row: Set to 1.
      • Last row: Set to 1.
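
For readers who want to see the logic behind the steps above, the loop can be sketched in Python. This is an illustration only, not axiom.ai's implementation: the two lists stand in for the Links and Data tabs, and build_prompt, run_scraper, fetch_block and ask_model are hypothetical names. The page fetch and the ChatGPT call are passed in as functions so the sketch runs without a browser or an API key.

```python
from typing import Callable, Dict, List, Optional


def build_prompt(page_text: str, fields: List[str]) -> str:
    """Hypothetical prompt: ask only for the named values (step 2.3)."""
    return (
        "Extract the following values from the text and reply with JSON "
        f"using exactly these keys: {', '.join(fields)}.\n\n{page_text}"
    )


def run_scraper(
    links: List[str],                              # stands in for the Links tab
    data: List[List[str]],                         # stands in for the Data tab
    fields: List[str],                             # the "Extract values" setting
    fetch_block: Callable[[str], str],             # steps 2.1 and 2.2
    ask_model: Callable[[str], Dict[str, str]],    # step 2.3
    max_loops: Optional[int] = None,
) -> int:
    """Process links from the top of the list; return the number of loops run."""
    loops = 0
    while links and (max_loops is None or loops < max_loops):
        url = links[0]                              # always read row 1
        block = fetch_block(url)                    # scrape a single block of content
        values = ask_model(build_prompt(block, fields))
        data.append([values.get(f, "") for f in fields])  # 2.4: add to existing data
        del links[0]                                # 2.5: delete row 1
        loops += 1
    return loops


# Stand-in inputs for demonstration; a real run would load a page and call the API.
links = ["https://example.com/a", "https://example.com/b"]
data: List[List[str]] = []
done = run_scraper(
    links,
    data,
    ["name", "email"],
    fetch_block=lambda url: f"Contact page for {url}",
    ask_model=lambda prompt: {"name": "Ada", "email": "ada@example.com"},
)
print(done, links, data)
```

Because row 1 is deleted at the end of every loop, a run that is stopped part-way leaves only unprocessed links in the list, which is why the template can be stopped and restarted without scraping the same page twice.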

Testing and running your ChatGPT web scraper

We suggest running a test first, stopping the bot after a few loops, and reviewing the data.

Customize your template

Like all Axiom templates, you can use our no-code bot builder to customize any bot according to your requirements.

Troubleshooting

We recommend you watch the video to troubleshoot.

  • Want to set a number of loops?
    Set a Last Cell in the Read data from a Google Sheet step. For example, AE50 stops reading at row 50, so the bot scrapes 50 rows.
  • Not scraping content correctly?
    In Get data from bot's current page, select a single block of content using the selector tool.
  • Want to scrape the whole page?
    Use a custom selector like body in the Get data from bot's current page step.
  • Scrape running slowly?
    Toggle Configure scraper, then set No. of retry... to 1 in the same step.
  • Data not showing up in ChatGPT?
    Make sure Data is set to scrape-data.
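
To make the Last Cell tip concrete, here is a small Python sketch of how a cell reference such as AE50 breaks down into a column and a row. The function name parse_a1 is hypothetical, and the sketch assumes reading starts at row 1, in which case the row number of the Last Cell equals the number of rows read.

```python
import re
from typing import Tuple


def parse_a1(cell: str) -> Tuple[int, int]:
    """Split a cell reference like 'AE50' into (column number, row number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", cell.strip())
    if match is None:
        raise ValueError(f"not a valid cell reference: {cell!r}")
    letters, digits = match.groups()
    column = 0
    for ch in letters.upper():
        # Column letters work like base-26 digits where A=1 ... Z=26.
        column = column * 26 + (ord(ch) - ord("A") + 1)
    return column, int(digits)


column, last_row = parse_a1("AE50")
print(column, last_row)  # 31 50
```

The letters only set how many columns are read; the number of loops is controlled by the row part of the reference.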