Can a no-coder build a web scraper with Puppeteer?

I’ve built many web scrapers and browser automations, and I’m becoming something of a master at automating web apps. With Axiom.ai’s no-code tool, I can piece steps together, build automations quickly, and run them in the cloud with a single click. That gets me past the big hurdles.

But if I wanted to code a scraper, how would I get past those hurdles? With generative AI, it feels like I can now take the next step and code a scraper, with AI guiding me through it. And with Axiom’s new pro-code tool, I’ll even be able to run automations directly from my own stack. I’m ready to take a crack at this!

[Image: Axiom.ai's new pro-code tool web app]

# What tools will I use?

First up, Cursor, an AI-driven coding assistant. I like using a coding assistant because I prefer building in steps instead of writing one giant prompt. From working with the dev team at Axiom, I know I’ll need the Puppeteer library installed. I also know enough command line to type npm i puppeteer. But Cursor can even handle all that setup for me.

Next, how do I run the bots? Luckily, we just launched our pro-code tool, which runs automations in the cloud from your own stack.

# What am I going to build?

The very first web scraper I made was on the BBC site. It searched for Harry Kane and scraped the results. We still use it for testing to this day.

So I’ll recreate that scraper, but this time with code. Harry may have gone to Bayern Munich, but I’m still here at Axiom, still a Spurs fan.

[Image: Axiom.ai making a web scraper to extract Harry Kane data]

# My method

I’ve never coded a scraper — that’s true. The closest I’ve come is running a bot in Axiom’s sandbox. This time, I’ll run it from Cursor in my own stack.

The plan is simple: start with Axiom’s example code, work with Cursor to build it out, run some tests, and then add code to export the data into a CSV.

# Coding the web scraper

I start off by setting up a folder with a file.

  1. Create a folder on my computer, then open it in Cursor.
  2. Open a new terminal in Cursor.
  3. In the terminal, run touch harry-kane-web-scraper.js to create the file.
  4. Click the file to open it.
  5. Copy and paste the example code into the file.

I paste in Axiom’s example. The endpoint is where my scraper will run. The hidden key is the API key. Viewport sets the window size. I also recognize the Puppeteer goto command — I’ve used the “goto page” step with the no-code tool. So some of this feels familiar.

const browser = await puppeteer.connect({
    browserWSEndpoint: "wss://cdp-lb.axiom.ai/?token=[HIDDEN_KEY]"
});
const page = await browser.newPage()
await page.setViewport({ width: 1960, height: 1080 });
await page.goto("https://axiom.ai")
await new Promise(resolve => setTimeout(resolve, 5000))
await page.close()
await browser.close()
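
As written, the example uses await at the top level and assumes puppeteer is already in scope, so it needs a little wrapping before it runs as a plain Node script. Here is a minimal sketch of one way to do that. The helper name runWithBrowser is my own invention, not part of Axiom's example, and the Puppeteer library is passed in as an argument so the helper can be tried out without a real browser.

```javascript
// Hypothetical helper (not from Axiom's example): connect to a remote browser,
// run a task against a fresh page, and always close the browser afterwards.
async function runWithBrowser(puppeteerLib, endpoint, task) {
    const browser = await puppeteerLib.connect({ browserWSEndpoint: endpoint });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1960, height: 1080 });
        return await task(page);
    } finally {
        // Runs even if the task throws, so no session is left hanging.
        await browser.close();
    }
}

// Possible usage, assuming puppeteer is installed:
// const puppeteer = require('puppeteer');
// runWithBrowser(puppeteer, 'wss://cdp-lb.axiom.ai/?token=[HIDDEN_KEY]',
//     page => page.goto('https://axiom.ai')).catch(console.error);
```

The try/finally is the main design choice here: if a step fails halfway through, the browser still gets closed.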

Next, I make the script load the BBC search page, type “Harry Kane,” and press return.

  1. Replace the URL with https://www.bbc.co.uk/search?d=HOMEPAGE_PS.
  2. Ask Cursor: “expand my script to input Harry Kane into the search field, then press return. Add selector placeholders.”

My adjusted code looks familiar. Typing text, pressing enter — those are the same steps I’d do with no-code.

const browser = await puppeteer.connect({
    browserWSEndpoint: "wss://cdp-lb.axiom.ai/?token=[HIDDEN_KEY]"
});
const page = await browser.newPage()
await page.setViewport({ width: 1960, height: 1080 });
await page.goto("https://www.bbc.co.uk/search?d=HOMEPAGE_PS")

await page.waitForSelector('[PLACEHOLDER_SEARCH_SELECTOR]', { timeout: 10000 });
await page.type('[PLACEHOLDER_SEARCH_SELECTOR]', 'Harry Kane');
await page.keyboard.press('Enter');

await new Promise(resolve => setTimeout(resolve, 3000))
await page.close()
await browser.close()
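
The fixed three-second pause at the end works, but it either wastes time on a fast page or cuts off a slow one. A hedged alternative is to wait for a results element to show up instead. In this sketch, waitForResults is a hypothetical helper name and the selector is whatever you pass in; only page.waitForSelector is Puppeteer's own.

```javascript
// Hypothetical helper: wait until at least one element matching `selector` exists.
// Resolves to true if it appeared, false if the timeout elapsed first.
async function waitForResults(page, selector, timeoutMs = 10000) {
    try {
        await page.waitForSelector(selector, { timeout: timeoutMs });
        return true;
    } catch (err) {
        // Assumption: Puppeteer signals a timeout with an error named "TimeoutError".
        if (err && err.name === 'TimeoutError') return false;
        throw err; // anything else is a real failure and should surface
    }
}
```

Returning false rather than throwing on a timeout lets the script decide what to do next, for example log "no results" and exit cleanly.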

# Adding a selector

Now I need the actual selector.

  1. Use Chrome DevTools to inspect the search input and find a selector the code can use to locate it. I land on input[placeholder="Search the BBC"].
  2. Insert it into the code, replacing both placeholders (the waitForSelector call and the type call). For example:
await page.type('input[placeholder="Search the BBC"]', 'Harry Kane');
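
Before trusting a selector, it can help to check how many elements it actually matches. This is a small sketch using Puppeteer's page.$$eval, which runs a function over every match. The helper names countMatches and assertSingleMatch are hypothetical, made up for this example.

```javascript
// Hypothetical helper: how many elements on the page match this selector?
async function countMatches(page, selector) {
    return page.$$eval(selector, elements => elements.length);
}

// Hypothetical helper: fail loudly unless the selector matches exactly one element.
async function assertSingleMatch(page, selector) {
    const count = await countMatches(page, selector);
    if (count !== 1) {
        throw new Error(`Expected 1 match for ${selector}, found ${count}`);
    }
}

// Possible usage inside the script, before typing:
// await assertSingleMatch(page, 'input[placeholder="Search the BBC"]');
```

For an input field, exactly one match is what I want. For a list of results, the same count is a quick way to see whether a selector is too loose.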

# Quick test run

I tell Cursor to run the script. It adjusts the code to require Puppeteer and checks whether it’s installed, which is handy. Then the run fails with a 403 error. Oops, I forgot the API key. After adding that, the script runs fine.

const browser = await puppeteer.connect({
    browserWSEndpoint: "wss://cdp-lb.axiom.ai/?token=[HIDDEN_KEY]"
});
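
Rather than pasting the key into the file, one option is to read it from an environment variable and fail early when it is missing. This is only a sketch: the variable name AXIOM_API_TOKEN and the helper buildEndpoint are my own choices, not something Axiom defines.

```javascript
// Hypothetical helper: build the WebSocket endpoint from an environment variable.
// AXIOM_API_TOKEN is a name chosen for this sketch, not an official setting.
function buildEndpoint(env = process.env) {
    const token = env.AXIOM_API_TOKEN;
    if (!token) {
        // Fail before connecting; in my run, the missing key came back as a 403.
        throw new Error('AXIOM_API_TOKEN is not set');
    }
    return `wss://cdp-lb.axiom.ai/?token=${token}`;
}

// Possible usage:
// const browser = await puppeteer.connect({ browserWSEndpoint: buildEndpoint() });
```

This keeps the key out of the script, so the file can be shared or committed without exposing it.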

# Scraping and exporting the data

Next, I tell Cursor what data I want to scrape and that I want it in a CSV. Cursor updates the script.

const puppeteer = require('puppeteer');
const fs = require('fs');

async function scrapeHarryKane() {
    const browser = await puppeteer.connect({
        browserWSEndpoint: "wss://cdp-lb.axiom.ai/?token=[HIDDEN_KEY]"
    });
    const page = await browser.newPage()
    await page.setViewport({ width: 1960, height: 1080 });
    await page.goto("https://www.bbc.co.uk/search?d=HOMEPAGE_PS")

    // Wait for the search field to be available
    await page.waitForSelector('input[placeholder="Search the BBC"]', { timeout: 10000 });

    // Type "Harry Kane" into the search field
    await page.type('input[placeholder="Search the BBC"]', 'Harry Kane');

    // Press Enter to submit the search
    await page.keyboard.press('Enter');

    // Wait for search results to load
    await new Promise(
        resolve => {
            setTimeout(() => {resolve()}, 3000)
        }
    )

    // Scrape the search results
    console.log('Scraping search results...');
    const results = await page.evaluate(() => {
        const links = document.querySelectorAll('[role="list"] a');
        const data = [];

        links.forEach(link => {
            const title = link.textContent.trim();
            const url = link.href;

            if (title && url) {
                data.push({
                    title: title,
                    url: url
                });
            }
        });

        return data;
    });

    console.log(`Found ${results.length} search results`);

    // Convert to CSV format
    const csvHeader = 'Title,URL\n';
    const csvRows = results.map(result =>
        `"${result.title.replace(/"/g, '""')}","${result.url}"`
    ).join('\n');
    const csvContent = csvHeader + csvRows;

    // Write to CSV file
    const filename = 'harry-kane-search-results.csv';
    fs.writeFileSync(filename, csvContent, 'utf8');
    console.log(`Results saved to ${filename}`);

    // Log first few results to console
    console.log('\nFirst 5 results:');
    results.slice(0, 5).forEach((result, index) => {
        console.log(`${index + 1}. ${result.title}`);
        console.log(`   ${result.url}\n`);
    });

    await page.close()
    await browser.close()
}

// Run the function
scrapeHarryKane().catch(console.error);
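
The CSV step in the script escapes quotes in the title but writes the URL as-is. A small helper that treats every field the same way is one way to tidy that up. This is a sketch with hypothetical names (csvField, toCsv); it quotes every field and doubles any embedded quotes, which is the usual CSV convention.

```javascript
// Hypothetical helper: quote one CSV field and double any embedded double quotes.
function csvField(value) {
    return `"${String(value ?? '').replace(/"/g, '""')}"`;
}

// Hypothetical helper: turn an array of objects into CSV text.
// `columns` lists the object keys to export, in order; they also become the header row.
function toCsv(rows, columns) {
    const header = columns.map(csvField).join(',');
    const lines = rows.map(row => columns.map(col => csvField(row[col])).join(','));
    return [header, ...lines].join('\n') + '\n';
}

// Possible usage in the script:
// fs.writeFileSync(filename, toCsv(results, ['title', 'url']), 'utf8');
```

Because every field is quoted, a comma or quote inside a title or URL no longer breaks the columns.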

# Round up

Did I get a scraper working? Yes. Did it download a CSV? Yes. Was the data right? Not really — my selector was too loose, so it grabbed a lot of other links. But selectors are always tricky when scraping the web, and I can fix that. So yes, 100%, I just coded a scraper. That is so cool, it made me smile.
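
One way I could tighten things up without touching the selector is to filter the scraped links afterwards. The sketch below keeps only links on bbc.co.uk whose path starts with a known prefix, and drops duplicates. The function name tightenResults and the prefixes /news/ and /sport/ are guesses for illustration, not verified rules about BBC URLs.

```javascript
// Hypothetical post-filter: keep links that look like BBC articles and drop duplicates.
// The default path prefixes are assumptions made for this example.
function tightenResults(results, allowedPrefixes = ['/news/', '/sport/']) {
    const seen = new Set();
    return results.filter(({ url }) => {
        let parsed;
        try {
            parsed = new URL(url);
        } catch {
            return false; // skip anything that is not an absolute URL
        }
        const onBbc = parsed.hostname === 'bbc.co.uk' || parsed.hostname.endsWith('.bbc.co.uk');
        const isArticle = allowedPrefixes.some(prefix => parsed.pathname.startsWith(prefix));
        if (!onBbc || !isArticle || seen.has(parsed.href)) return false;
        seen.add(parsed.href);
        return true;
    });
}

// Possible usage in the script, before writing the CSV:
// const cleaned = tightenResults(results);
```

A tighter selector is still the better fix, but a filter like this is easy for me to read and adjust while I learn.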

[Image: making a web scraper with Cursor]

Did the LLM write most of it? Yes. Did I still feel cool running it? Absolutely. I learned more about Puppeteer than I thought I would, and now I know I can keep going.

Do I think no-coders can use AI to generate scripts and scrapers? Yes. It’s a new way of working. It was straightforward; even connecting to the Axiom pro-code endpoint was simple.

Will I give up no-code? No. It’s still faster for prototyping, building simple automations, and running them. But I want to learn more, so I’ll keep going and see what a no-coder can do with code.

Am I going to take our CTO’s job? No. But next stop: data entry into a web form — and I really want to see if it’s easy to read data from a Google Sheet without a no-code step!

Alex Barlow

Alex spent 14 years creating web apps, often automating repetitive tasks, before co-founding Axiom.ai. He’s hands-on with users and enjoys learning from them. He creates intricate automation the no-code way, and empowered by generative AI, he's extending his skill set to include code. Outside of work, he loves exploring the Scottish Highlands with his daughter and making sandcastles on Firemore Beach.