How to Use Browser Automation with Claude Code for Web Scraping and Form Filling
Claude Code can control browsers using Playwright to fill forms, scrape sites, and automate web tasks. Learn how to set it up and run parallel browser agents.
Why Browser Automation Belongs in Your Claude Code Workflow
Repetitive browser tasks — copying data from websites, filling the same forms over and over, clicking through multi-page workflows — are exactly the kind of work that should be automated. They’re tedious, error-prone, and time-consuming in a way that adds nothing to the actual work you’re trying to do.
Claude Code, Anthropic’s terminal-based agentic coding tool, makes browser automation with Playwright straightforward enough that you don’t need to be a test automation engineer to pull it off. You describe what you want in plain English. Claude writes the code, runs it, checks the results, and fixes problems as they come up.
This guide covers everything you need to get Claude Code working with Playwright for real browser automation tasks: setting up the environment, scraping JavaScript-heavy pages, filling complex forms, handling authentication, and running parallel browser agents for large-scale work.
What Claude Code Actually Does (and Why It Works for This)
Claude Code is a command-line tool you run in your terminal. You talk to it in natural language, and it takes action: writing files, running shell commands, reading output, searching the web, and spawning sub-agents for parallel work.
It’s not a code generator that hands you a script to run yourself. It’s closer to a pair programmer who also has terminal access — it writes the code, executes it, reads what happened, and keeps going until the task is done or it hits a problem it needs your input on.
The tool that makes browser automation possible is bash access. Claude can write a Playwright script, save it to disk, run it with Node.js, read the console output and error messages, take stock of any screenshots generated, and then fix whatever broke. This feedback loop is what separates it from one-shot code generation.
The Tools Claude Code Has Access To
When you start a Claude Code session, it operates with several built-in capabilities:
- Bash — Runs any shell command, including Node.js scripts and package installs
- Read/Write — Creates, reads, and modifies files in your project directory
- WebFetch — Fetches web page content for research (not the same as browser control)
- Sub-agents — Spawns additional Claude instances to run tasks in parallel
The WebFetch tool is worth distinguishing from Playwright. WebFetch retrieves raw HTML from a URL — it doesn’t execute JavaScript, handle cookies, or interact with page elements. For static sites, it works fine. For modern web applications, JavaScript-rendered content, forms, or anything requiring login, you need Playwright. Claude Code uses bash to run Playwright scripts, which is how it gets full browser control.
Why Playwright Over the Alternatives
Playwright is Microsoft’s browser automation library and a strong default for new projects. It supports Chromium, Firefox, and WebKit from a single API, runs in headless or headed mode, and handles modern web complexity well.
Compared to Selenium, Playwright has better auto-wait behavior (it waits for elements to be ready before interacting, reducing flaky scripts), a cleaner async API, and significantly better handling of JavaScript-heavy pages. Compared to Puppeteer, Playwright adds multi-browser support and has more active development.
Claude Code works with either library, but when you don’t specify, it will usually reach for Playwright.
Setting Up Playwright with Claude Code
You need a few things in place before starting. Claude Code will handle the actual installation, but the prerequisites are on you.
Prerequisites
- Node.js 18 or higher — Playwright requires a modern Node.js runtime. Check yours with node --version. If you’re below 18, update via nodejs.org or with nvm.
- Claude Code installed — Install globally with npm install -g @anthropic-ai/claude-code, then authenticate by running claude and following the prompts.
- A project directory — Claude Code operates within your working directory. Create a dedicated folder for your automation work.
- Disk space — Playwright downloads full browser binaries. Chromium alone is around 300MB; installing all three browsers takes closer to 900MB.
If you’re planning to run these scripts on a Linux server or CI environment, you’ll need system-level dependencies that Playwright requires for browser rendering. Run npx playwright install-deps after installation, or use the official Docker images that come with these pre-installed.
Starting a Session and Installing Playwright
Open your project folder and start Claude Code:
mkdir browser-automation
cd browser-automation
claude
Then describe what you need:
“Set up a Playwright project in this directory using Node.js and JavaScript. Install the necessary npm packages and download the Chromium browser binary.”
Claude will run npm init -y to create a package.json, install playwright via npm, and run npx playwright install chromium to download the browser. If you need multiple browsers for cross-browser testing, it can install all three — just say so.
If you want to use Playwright Test (the testing framework, not just the library) rather than bare Playwright, specify that:
“Set this up with @playwright/test rather than the core Playwright library. I want to use the test runner.”
The distinction matters for how scripts are structured. playwright gives you a library you invoke directly. @playwright/test gives you a test framework with describe/test blocks and built-in parallelism, which is better suited for structured test suites.
Verifying the Install
Before moving to real tasks, have Claude confirm everything works:
“Write a short verification script that opens Chromium, navigates to example.com, takes a full-page screenshot, and saves it as verify.png. Run it and tell me if it worked.”
Claude will write and execute something like:
const { chromium } = require('playwright');
(async () => {
const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();
await page.goto('https://example.com');
await page.screenshot({ path: 'verify.png', fullPage: true });
await browser.close();
console.log('Done. Screenshot saved to verify.png.');
})();
If verify.png appears in your directory, you’re ready to go.
Headed vs. Headless Mode
Headless mode runs the browser without a visible window. It’s faster, uses less memory, and is necessary for server environments. Headed mode opens a real browser window you can watch — useful for debugging scripts when something isn’t working as expected.
Switch between them by changing the launch options:
// Headless (default, for production)
const browser = await chromium.launch({ headless: true });
// Headed (for debugging)
const browser = await chromium.launch({ headless: false, slowMo: 50 });
The slowMo option adds a delay (in milliseconds) between each action, which makes the browser easier to follow visually. When debugging with Claude Code, ask it to temporarily switch to headed mode — you can watch the automation play out and catch visual problems immediately.
Scraping Websites with Claude Code and Playwright
Web scraping is one of the most common reasons to reach for browser automation. Claude Code handles it well because it can look at a page, understand its structure, write appropriate selectors, and iterate when those selectors don’t work.
Telling Claude What You Want
You don’t need to inspect the DOM yourself. Describe the task in terms of what data you want, not how to get it:
“Scrape the job listings from [URL]. I need the job title, company name, location, and the URL of each listing. Save everything to a JSON file called jobs.json.”
Claude will fetch the page (or open it in a browser), examine the HTML structure, identify selectors, write a scraping script, run it, and check the output. If some fields come back empty or null, it’ll investigate why and fix the selectors.
This is meaningfully different from using a static code generator. Claude checks actual output against what you asked for and loops until it’s right.
Basic Page Scraping
A typical Claude-generated scraper for a listings page:
const { chromium } = require('playwright');
const fs = require('fs');
(async () => {
const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();
await page.goto('https://example-jobs.com/listings', {
waitUntil: 'networkidle'
});
const listings = await page.evaluate(() => {
return Array.from(document.querySelectorAll('.job-card')).map(card => ({
title: card.querySelector('.job-title')?.textContent.trim(),
company: card.querySelector('.company-name')?.textContent.trim(),
location: card.querySelector('.location')?.textContent.trim(),
url: card.querySelector('a.job-link')?.href
}));
});
fs.writeFileSync('jobs.json', JSON.stringify(listings, null, 2));
console.log(`Scraped ${listings.length} listings`);
await browser.close();
})();
waitUntil: 'networkidle' tells Playwright to consider navigation complete only once the network has been idle (no connections for at least 500ms). This is important for sites that load content asynchronously — without it, you might scrape an empty container while the content is still fetching.
Handling JavaScript-Rendered Content
This is where Playwright beats every HTTP-based scraper. Libraries like Python’s requests or urllib only see the raw HTML the server sends. Modern sites built with React, Vue, Angular, or Svelte often deliver a skeleton HTML file — the actual content is injected by JavaScript after the page loads.
Playwright runs a real browser, so JavaScript executes and content renders before you extract data.
For sites with loading spinners or skeleton screens:
// Wait for loading indicator to disappear
await page.waitForSelector('.loading-spinner', { state: 'hidden', timeout: 15000 });
// Or wait for a specific element to appear
await page.waitForSelector('.product-grid .product-card', { timeout: 15000 });
// Then scrape
const products = await page.$$eval('.product-card', cards =>
cards.map(card => ({
name: card.querySelector('h3')?.textContent.trim(),
price: card.querySelector('[data-price]')?.getAttribute('data-price')
}))
);
If you’re not sure what to wait for, tell Claude that and let it look at the page:
“The content loads asynchronously. Take a screenshot after the page loads and tell me what’s visible, then figure out what to wait for before scraping.”
Handling Pagination
Real scraping tasks almost always involve multiple pages. The two common patterns are click-based pagination (a “Next” button) and URL-based pagination (?page=2, ?page=3).
For click-based pagination:
const allItems = [];
while (true) {
// Scrape current page
const pageItems = await page.$$eval('.item', items =>
items.map(item => ({
name: item.querySelector('h2')?.textContent.trim(),
price: item.querySelector('.price')?.textContent.trim()
}))
);
allItems.push(...pageItems);
console.log(`Page scraped. Total items: ${allItems.length}`);
// Look for next page button
const nextButton = await page.$('a[aria-label="Next page"]:not([disabled])');
if (!nextButton) break;
await nextButton.click();
await page.waitForLoadState('networkidle');
}
For URL-based pagination (often faster since you skip the click/navigate cycle):
let pageNum = 1;
const allItems = [];
while (true) {
await page.goto(`https://example.com/products?page=${pageNum}`);
const items = await page.$$eval('.product', els =>
els.map(el => ({ name: el.querySelector('h3')?.textContent.trim() }))
);
if (items.length === 0) break;
allItems.push(...items);
pageNum++;
// Polite delay
await page.waitForTimeout(1000);
}
Working with Login Walls
Many useful pages sit behind authentication. Claude Code can automate the login flow — but pass credentials via environment variables rather than hardcoding them in scripts.
Set your credentials in the shell before running Claude:
export SITE_EMAIL="you@example.com"
export SITE_PASSWORD="yourpassword"
Then tell Claude:
“The site requires login. Use environment variables SITE_EMAIL and SITE_PASSWORD. Log in first, then scrape the dashboard.”
Claude will write:
await page.goto('https://example.com/login');
await page.fill('input[name="email"]', process.env.SITE_EMAIL);
await page.fill('input[name="password"]', process.env.SITE_PASSWORD);
await page.click('button[type="submit"]');
await page.waitForURL('**/dashboard**');
For long-running scripts, save the session state after logging in so subsequent runs don’t have to re-authenticate:
// After successful login, save state
await context.storageState({ path: 'session.json' });
// In future runs, restore state instead of logging in
const context = await browser.newContext({
storageState: 'session.json'
});
Exporting Data in Different Formats
Ask Claude to output data however your workflow needs it:
- JSON — Default for structured data. Easy to process downstream.
- CSV — Good for spreadsheet import. Use the csv-writer npm package or write a simple formatter.
- SQLite — Good for large datasets or when you need to query the results. Use better-sqlite3.
- Markdown table — Useful for quick human-readable summaries.
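The “simple formatter” route for CSV is only a few lines. A minimal sketch, with illustrative field names, that quotes any value containing a comma, quote, or newline so the output stays valid:

```javascript
// Minimal CSV formatter sketch: quote fields containing commas,
// quotes, or newlines so values don't corrupt the output.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const escape = (value) => {
    const s = value == null ? '' : String(value);
    // Quote the field if it contains a comma, quote, or newline
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = [headers.join(',')];
  for (const row of rows) {
    lines.push(headers.map(h => escape(row[h])).join(','));
  }
  return lines.join('\n');
}

const csv = toCsv([
  { title: 'Engineer, Senior', company: 'Acme "AI"', location: 'Remote' },
  { title: 'Designer', company: 'Beta', location: null }
]);
console.log(csv);
```

For anything beyond basic exports (streaming large files, custom delimiters), the csv-writer package handles the edge cases for you.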
You can also ask Claude to clean data during export — stripping currency symbols from price strings, parsing relative dates to absolute ones, removing duplicate entries based on a key field, or normalizing inconsistently formatted text.
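As a sketch of what such a cleaning pass might look like, here are a few pure helpers. The field names (url, title, price) and the sample records are hypothetical; a real run would apply this to the scraped array before writing it out:

```javascript
// Cleaning pass sketch: normalize scraped fields, then drop duplicates.
function parsePrice(raw) {
  // "$1,299.00" -> 1299; returns null if no digits found
  const digits = String(raw ?? '').replace(/[^0-9.]/g, '');
  return digits === '' ? null : Number(digits);
}

function normalizeText(raw) {
  // Collapse runs of whitespace and trim the ends
  return String(raw ?? '').replace(/\s+/g, ' ').trim();
}

function dedupeBy(items, key) {
  const seen = new Set();
  return items.filter(item => {
    if (seen.has(item[key])) return false;
    seen.add(item[key]);
    return true;
  });
}

const cleaned = dedupeBy(
  [
    { url: '/a', title: '  Widget\n Pro ', price: '$1,299.00' },
    { url: '/a', title: 'Widget Pro', price: '$1,299.00' }, // duplicate URL
    { url: '/b', title: 'Gadget', price: 'USD 49' }
  ].map(item => ({
    url: item.url,
    title: normalizeText(item.title),
    price: parsePrice(item.price)
  })),
  'url'
);
console.log(cleaned);
```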
Automating Form Filling with Claude Code
Form filling automation is the flip side of scraping: instead of reading data from pages, you’re writing data into them. The use cases are broad — submitting contact forms, registering accounts, entering survey responses, processing batches of data entry tasks.
Basic Form Interactions
For a simple form, describe what you’re filling in:
“Fill out the contact form at [URL] with: Name: Sarah Chen, Email: sarah@company.com, Subject: Partnership inquiry, Message: We’d like to discuss a potential partnership. Submit the form.”
Claude will navigate to the page, identify the form fields, and write something like:
await page.goto('https://example.com/contact');
await page.fill('#name', 'Sarah Chen');
await page.fill('#email', 'sarah@company.com');
await page.fill('#subject', 'Partnership inquiry');
await page.fill('#message', "We'd like to discuss a potential partnership.");
await page.click('button[type="submit"]');
// Verify submission succeeded
await page.waitForSelector('.success-message', { timeout: 10000 });
const confirmation = await page.$eval('.success-message', el => el.textContent);
console.log(`Success: ${confirmation}`);
Playwright’s fill() method clears existing content before typing. Use type() if you need to append to existing text. Use press() for keyboard shortcuts or when you need to simulate key presses like Tab or Enter.
Handling Non-Text Input Types
Real forms have more than text fields. Each input type needs different handling:
// Standard HTML select dropdown
await page.selectOption('select#country', 'US');
// Custom styled dropdown (not a real <select>)
await page.click('.dropdown-trigger');
await page.waitForSelector('.dropdown-menu', { state: 'visible' });
await page.click('.dropdown-menu li:has-text("United States")');
// Checkbox (use check() rather than click() to avoid toggling)
await page.check('input#agree-to-terms');
await page.uncheck('input#marketing-emails');
// Radio button
await page.click('input[type="radio"][value="monthly"]');
// Date picker (try fill first, then fall back to clicking calendar)
await page.fill('input[type="date"]', '2025-03-15');
// File upload
await page.setInputFiles('input[type="file"]', './documents/resume.pdf');
// Rich text editor (commonly built on contenteditable)
await page.click('.ql-editor'); // Click into the editor
await page.keyboard.type('Your content here');
When the form has unusual UI components — custom sliders, drag-and-drop elements, multi-select chips — describe them to Claude and let it figure out the right Playwright approach.
Filling Forms from a Data Source
The power of form automation scales with data. If you have a spreadsheet or database of records to submit, you can automate the entire batch.
“I have applicants.csv with columns: first_name, last_name, email, phone, years_experience. Fill out the job application form at [URL] once for each row.”
Claude will write a loop that processes the CSV:
const { chromium } = require('playwright');
const fs = require('fs');
const { parse } = require('csv-parse/sync');
const records = parse(fs.readFileSync('applicants.csv', 'utf-8'), {
columns: true,
skip_empty_lines: true
});
(async () => {
const browser = await chromium.launch({ headless: true });
const results = [];
for (const [index, record] of records.entries()) {
console.log(`Processing ${index + 1}/${records.length}: ${record.email}`);
const page = await browser.newPage();
try {
await page.goto('https://example.com/apply');
await page.fill('#first-name', record.first_name);
await page.fill('#last-name', record.last_name);
await page.fill('#email', record.email);
await page.fill('#phone', record.phone);
await page.selectOption('#experience', record.years_experience);
await page.click('button[type="submit"]');
await page.waitForSelector('.confirmation-number');
const confirmation = await page.$eval('.confirmation-number', el => el.textContent);
results.push({ email: record.email, status: 'success', confirmation });
} catch (error) {
results.push({ email: record.email, status: 'failed', error: error.message });
console.error(`Failed for ${record.email}: ${error.message}`);
}
await page.close();
// Rate limiting between submissions
await new Promise(r => setTimeout(r, 2000));
}
fs.writeFileSync('submission-results.json', JSON.stringify(results, null, 2));
await browser.close();
const successful = results.filter(r => r.status === 'success').length;
console.log(`Done. ${successful}/${records.length} submissions successful.`);
})();
Note the try/catch around each submission. You want to catch individual failures without stopping the entire batch.
Multi-Step Forms and Wizards
Checkout flows, onboarding sequences, and application forms often span multiple pages. Claude handles these by navigating each step in sequence:
“This application form has four steps. Step 1: personal info. Step 2: work history. Step 3: references. Step 4: review and submit. Fill in each step with the data I’ll give you.”
Claude will write a script with separate handling for each step, including checks that the right step is visible before filling fields, and navigation between steps:
// Step 1: Personal info
await page.fill('#first-name', data.firstName);
await page.fill('#last-name', data.lastName);
await page.fill('#email', data.email);
await page.click('button:has-text("Continue")');
// Wait for step 2 to appear
await page.waitForSelector('#work-history-section');
// Step 2: Work history
await page.fill('#current-employer', data.employer);
await page.fill('#job-title', data.title);
await page.click('button:has-text("Continue")');
For conditional forms — where your answer to one question changes what appears next — describe the logic:
“If the user selects ‘Self-employed’ as their employment status, a different set of fields appears. Handle both branches.”
What About CAPTCHAs?
CAPTCHAs block automated access by design. Playwright alone cannot solve them. You have a few options:
- Third-party solving services — APIs like 2captcha or Anti-Captcha solve image-based CAPTCHAs by routing them to human solvers. Claude can integrate with these services in your script. They cost money and add latency, but work for most basic CAPTCHAs.
- Avoiding the CAPTCHA — Some sites only show CAPTCHAs to sessions that look like bots. Using realistic browser fingerprints, human-like timing, and real browser profiles often avoids triggering them in the first place.
- Manual intervention — For low-volume tasks, you can run in headed mode, have the script pause at the CAPTCHA step, solve it manually, then resume.
hCaptcha and reCAPTCHA v3 (which is invisible and score-based) are significantly harder to deal with than classic image CAPTCHAs. If a site uses these aggressively, automated form filling may not be practical.
Common Form Filling Problems and Fixes
A few issues come up repeatedly:
Fields don’t accept programmatic input. Some React/Vue controlled components require triggering change events explicitly. If fill() isn’t working, try: await page.evaluate(el => { el.value = 'text'; el.dispatchEvent(new Event('input', { bubbles: true })); }, await page.$('#field')).
Form validation doesn’t trigger. Some validators run on blur (when focus leaves a field), so a programmatic fill() can leave the form treating the field as untouched. After filling a field, trigger a blur by tabbing away: await page.press('#field', 'Tab').
Submission seems to work but confirmation doesn’t appear. Add explicit error checking: look for both success and error messages. A silent failure (like a server-side validation error displayed in a toast notification) can look like success if you’re only watching for the submit button click.
Timeouts on slow networks. Increase Playwright’s default timeout: page.setDefaultTimeout(30000). For submit actions that trigger email verification or slow server processing, increase further.
Running Parallel Browser Agents
For large-scale browser automation — scraping thousands of pages, submitting hundreds of forms, monitoring multiple sites simultaneously — sequential processing is too slow. Claude Code supports parallel execution through sub-agents.
How Sub-Agents Work in Claude Code
The Task tool in Claude Code lets it spawn additional Claude instances to work on subtasks concurrently. Each sub-agent is a separate process with its own context. Claude Code’s main instance acts as an orchestrator — it splits work, assigns batches to sub-agents, and collects results when they finish.
This is useful beyond just speed. Sub-agents can work on genuinely different tasks at the same time: one scraping Site A while another fills forms on Site B, with a third generating a report from previously scraped data.
For parallel browser automation specifically:
“I have a list of 500 company URLs in companies.txt. Scrape the name, description, and contact email from each one. Use 5 parallel agents to speed this up.”
Claude will split the list into 5 batches of 100 URLs each, assign each batch to a sub-agent, and wait for all five to complete before merging the results.
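The split itself is straightforward. A sketch of round-robin partitioning, which keeps batches balanced even when the total doesn’t divide evenly:

```javascript
// Split a list into n roughly equal batches (the kind of partitioning
// an orchestrator does before assigning work to sub-agents).
function splitIntoBatches(items, n) {
  const batches = Array.from({ length: n }, () => []);
  items.forEach((item, i) => batches[i % n].push(item));
  return batches;
}

const urls = Array.from({ length: 500 }, (_, i) => `https://example.com/company/${i}`);
const batches = splitIntoBatches(urls, 5);
console.log(batches.map(b => b.length)); // five batches of 100 URLs each
```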
Structuring Work for Parallelism
For sub-agents to work well, your scripts should be stateless — given a list of inputs, they produce a list of outputs. No shared mutable state between agents.
A reusable batch-processing script that Claude can invoke with different input/output arguments:
// scrape-batch.js
// node scrape-batch.js input-urls.txt output-results.json
const { chromium } = require('playwright');
const fs = require('fs');
const inputFile = process.argv[2];
const outputFile = process.argv[3];
const urls = fs.readFileSync(inputFile, 'utf-8').trim().split('\n').filter(Boolean);
(async () => {
const browser = await chromium.launch({ headless: true });
const results = [];
for (const url of urls) {
try {
const page = await browser.newPage();
await page.goto(url, { timeout: 20000, waitUntil: 'domcontentloaded' });
const data = await page.evaluate(() => ({
title: document.title,
description: document.querySelector('meta[name="description"]')?.content ?? null,
email: document.body.innerText.match(/[\w.+-]+@[\w-]+\.[\w.]+/)?.[0] ?? null
}));
results.push({ url, ...data, status: 'ok' });
await page.close();
} catch (error) {
results.push({ url, error: error.message, status: 'failed' });
}
await new Promise(r => setTimeout(r, 500));
}
fs.writeFileSync(outputFile, JSON.stringify(results, null, 2));
await browser.close();
console.log(`Processed ${results.length} URLs -> ${outputFile}`);
})();
Claude’s orchestrator splits the input, runs batches in parallel using bash’s & operator, waits for completion with wait, then merges output files.
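That pattern can be sketched in a few lines of shell. This self-contained version substitutes a trivial stand-in command for node scrape-batch.js so it runs anywhere; the structure (split, background jobs, wait, merge) is what carries over:

```shell
#!/bin/sh
# Sketch of the split / parallel-run / wait / merge pattern.
set -e
workdir=$(mktemp -d)

# 1. Build an input list and split it into batches of 3 lines each
seq 1 9 | sed 's|^|https://example.com/page-|' > "$workdir/urls.txt"
split -l 3 "$workdir/urls.txt" "$workdir/batch-"

# 2. Run one worker per batch in the background
for batch in "$workdir"/batch-*; do
  # stand-in for: node scrape-batch.js "$batch" "$batch.out" &
  (wc -l < "$batch" | tr -d ' ' > "$batch.out") &
done

# 3. Wait for all background workers, then merge their outputs
wait
cat "$workdir"/batch-*.out > "$workdir/merged.txt"
cat "$workdir/merged.txt"
```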
Using Playwright’s Native Concurrency
Within a single script (without Claude sub-agents), Playwright supports multiple browser contexts running concurrently in one process. Each context is isolated — separate cookies, localStorage, and sessions — but shares the same browser binary.
const { chromium } = require('playwright');
const urls = [
'https://example.com/page1',
'https://example.com/page2',
'https://example.com/page3',
'https://example.com/page4',
'https://example.com/page5'
];
(async () => {
const browser = await chromium.launch({ headless: true });
// Process 5 URLs concurrently
const results = await Promise.all(
urls.map(async (url) => {
const context = await browser.newContext();
const page = await context.newPage();
await page.goto(url, { waitUntil: 'domcontentloaded' });
const title = await page.title();
await context.close();
return { url, title };
})
);
console.log(results);
await browser.close();
})();
This is lighter-weight than Claude sub-agents and works well for moderate concurrency. Each browser context uses roughly 20–50MB of memory, so a typical machine can run dozens simultaneously without issue. For higher concurrency or tasks requiring genuine AI reasoning per URL (not just data extraction), Claude sub-agents are better.
Managing Rate Limits
Running parallel browsers against the same site will get you blocked. Standard mitigation strategies:
Add delays between requests. Even 1–2 seconds per request dramatically reduces the signature of automated traffic. Build this into your batch scripts.
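A fixed delay is itself a fingerprint; adding jitter around a base interval reads more like human pacing. A minimal sketch:

```javascript
// Jittered delay sketch: wait baseMs plus a random amount up to jitterMs.
function jitteredDelay(baseMs, jitterMs) {
  const ms = baseMs + Math.random() * jitterMs;
  return new Promise(resolve => setTimeout(resolve, ms));
}

(async () => {
  const start = Date.now();
  await jitteredDelay(1000, 500); // waits somewhere between 1.0 and 1.5 seconds
  console.log(`waited ${Date.now() - start}ms`);
})();
```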
Limit concurrency per domain. If you’re scraping multiple pages from one site, don’t hit it with 20 simultaneous requests. Limit to 2–3 concurrent requests per domain and distribute load over time.
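A per-domain cap takes only a small helper. This sketch runs async tasks with at most limit in flight at once; a timed stand-in task replaces the real page fetch so the peak concurrency is observable:

```javascript
// Concurrency limiter sketch: `limit` workers pull tasks off a shared
// index, so no more than `limit` tasks are ever in flight together.
async function runWithLimit(tasks, limit) {
  const results = [];
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }
  await Promise.all(Array.from({ length: limit }, worker));
  return results;
}

// Stand-in tasks that track how many run simultaneously
let inFlight = 0;
let peak = 0;
const tasks = Array.from({ length: 10 }, (_, i) => async () => {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise(r => setTimeout(r, 20)); // simulated page fetch
  inFlight--;
  return i;
});

runWithLimit(tasks, 3).then(results => {
  console.log({ results, peak }); // peak never exceeds 3
});
```

In a real scraper, each task would open a page against one domain, so the limit becomes a per-domain cap.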
Rotate user agents. Set realistic user agent strings on each browser context. Playwright lets you configure this in the newContext() options:
const context = await browser.newContext({
userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...'
});
Block unnecessary resources. Images, fonts, and analytics scripts aren’t needed for scraping. Blocking them reduces bandwidth and request volume:
await page.route('**/*.{png,jpg,jpeg,gif,svg,ico,woff,woff2,ttf}', route => route.abort());
await page.route('**/{analytics,tracking,ads}/**', route => route.abort());
Advanced Techniques Worth Knowing
Intercepting API Responses
Many sites load their data via JSON APIs, even if you’re viewing them as regular web pages. If you can intercept the API call, you get structured data without any HTML parsing.
const [response] = await Promise.all([
page.waitForResponse(resp =>
resp.url().includes('/api/products') && resp.status() === 200
),
page.goto('https://example.com/products')
]);
const data = await response.json();
console.log(data.products);
Open your browser’s Network tab on any site to see what API calls it makes. If there’s a clean JSON endpoint returning the data you need, this approach is faster and more stable than DOM scraping.
Screenshot-Based Debugging
When something breaks, screenshots are often more useful than error messages. Have Claude generate screenshots at key points:
// Take a screenshot at any point for debugging
await page.screenshot({ path: `debug-${Date.now()}.png`, fullPage: true });
Ask Claude to review screenshots as part of its debugging process:
“The scraper is failing on step 3. Add a screenshot before and after step 3, run it again, and look at both screenshots to figure out what’s wrong.”
Claude will examine the image output and identify whether the expected element is present, whether a CAPTCHA appeared, whether the page redirected unexpectedly, and so on.
Persistent Authentication Sessions
Logging in on every script run is slow and increases the chance of triggering bot detection. Save your login state after the first successful authentication:
// First run: log in and save state
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://example.com/login');
await page.fill('#email', process.env.EMAIL);
await page.fill('#password', process.env.PASSWORD);
await page.click('[type="submit"]');
await page.waitForURL('**/dashboard');
await context.storageState({ path: 'auth-state.json' });
// Subsequent runs: restore saved state
const context = await browser.newContext({
storageState: 'auth-state.json'
});
The state file contains cookies and localStorage. It’s equivalent to being already logged in. Session expiry will eventually require re-running the login flow.
Extending Your Automation with MindStudio
Claude Code and Playwright handle the browser automation side well. But browser automation rarely exists in isolation — you usually need to do something with what you’ve collected, or trigger automation as part of a larger workflow.
MindStudio’s Agent Skills Plugin is an npm SDK that Claude Code agents can call directly. It exposes 120+ typed capabilities as simple method calls, and handles the infrastructure (rate limiting, retries, auth) automatically. This means after your scraper collects data, it can immediately push results downstream without writing custom integrations.
For example, after a scraping run finishes:
const MindStudio = require('@mindstudio-ai/agent');
const agent = new MindStudio();
// Email the scraped report
await agent.sendEmail({
to: 'team@company.com',
subject: `Daily Competitor Price Report — ${new Date().toLocaleDateString()}`,
body: formatPriceReport(scrapedProducts)
});
// Or push to a Google Sheet
await agent.runWorkflow({
workflowId: 'update-price-tracker-sheet',
input: { products: scrapedProducts }
});
This is particularly useful if your browser automation feeds into business tools — CRMs, project management platforms, reporting systems — where you’d otherwise have to build and maintain custom integrations for each one.
You can also trigger Claude Code browser automation from a MindStudio workflow via webhook. A new lead hits your CRM → MindStudio calls a webhook → a Claude Code agent opens a browser and researches the lead’s company website → results come back into MindStudio and get added to the CRM record. Each part does what it’s good at.
If you want to build browser automation workflows without writing the Playwright code yourself, MindStudio’s no-code visual builder is worth looking at. You can build agents that integrate with 1,000+ tools, use 200+ AI models, and run on schedules or triggers — all without leaving the browser. You can get started free at mindstudio.ai.
Frequently Asked Questions
Does Claude Code have a built-in browser it can control directly?
Claude Code includes a WebFetch tool that retrieves HTML from URLs, but this isn’t the same as browser control. WebFetch doesn’t execute JavaScript, handle cookies, or interact with page elements — it just fetches raw HTML. For actual browser control (clicking, form filling, scraping JavaScript-rendered content), Claude Code writes and executes Playwright scripts via its bash tool. Anthropic’s computer use feature, available in some configurations, allows direct visual browser control, but that’s a separate capability from standard Claude Code.
What’s the difference between Playwright and Puppeteer, and which should I use?
Both automate Chromium-based browsers, and both work with Claude Code. Playwright adds multi-browser support (Firefox and WebKit alongside Chromium), better auto-wait behavior that makes scripts less flaky, and a more consistent API across languages. Puppeteer is maintained by Google and is a fine choice if you’re already using it. For new projects, Playwright is generally the better default.
Can Claude Code scrape sites that block automated browsers?
Basic detection (checking for common headless browser tells like specific browser properties or user agent strings) can often be bypassed by using realistic browser configurations. More sophisticated bot protection — like Cloudflare’s bot management or reCAPTCHA v3 — is harder to deal with. Libraries like playwright-extra with stealth plugins can help with some detection systems. For sites with aggressive anti-bot measures, programmatic access may not be reliable regardless of the library you use.
How do I handle scraping large volumes of pages without getting blocked?
The key measures are: adding delays between requests (1–2 seconds minimum), limiting concurrent requests to any single domain, rotating user agents across sessions, blocking unnecessary resource types (images, fonts) to reduce request signatures, and using residential proxies if you need to operate at scale. Claude Code can implement all of these — describe the scale you’re working at and ask it to add appropriate rate limiting.
Is web scraping legal?
The legal picture varies by jurisdiction and use case. In the US, scraping publicly available information is generally legal, though the hiQ v. LinkedIn case and similar litigation have created some uncertainty around ToS violations. The clearest risk factors are: scraping behind authentication without permission, violating a site’s explicitly stated Terms of Service, using scraped data to harm the site operator’s business, and scraping personal data in jurisdictions with strict privacy laws (like GDPR in Europe). For your own sites or sites you have explicit permission to access, there’s no issue. For third-party sites, check their ToS and look for an official API first.
How much memory do parallel Playwright browsers use?
A Chromium browser instance with a few open pages typically uses 100–250MB of RAM. Browser contexts within a single browser instance are cheaper, roughly 20–50MB each. On a machine with 8GB of RAM, you can realistically run 20–30 concurrent Chromium pages. For Claude Code sub-agents, each agent is an additional process with its own memory overhead, so factor that in. In practice, most tasks fall in the range of 3–10 parallel agents with one browser each.
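The browser-vs-context tradeoff is easy to put in numbers. A back-of-envelope sketch using the upper ends of the rough ranges above (these are ballpark figures, not measurements):

```javascript
// Rough per-unit costs, taken from the ranges quoted above (upper bounds)
const MB_PER_BROWSER = 250; // one Chromium instance with its first context
const MB_PER_EXTRA_CONTEXT = 50; // each additional context in that instance

function estimateMb(browsers, contextsPerBrowser) {
  return browsers * (MB_PER_BROWSER + (contextsPerBrowser - 1) * MB_PER_EXTRA_CONTEXT);
}

console.log(estimateMb(1, 10)); // 10 contexts in one browser: 700 (MB)
console.log(estimateMb(10, 1)); // 10 separate browsers: 2500 (MB)
```

Same parallelism, roughly 3–4x the memory, which is why Playwright's context API is usually the right tool for fan-out within a single agent.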
Can I save scraped data directly to a database?
Yes. Claude Code can install any npm package you need. Common choices: better-sqlite3 for local SQLite, pg for PostgreSQL, mongodb for MongoDB, or any other database driver. Describe where you want the data to go and Claude will set up the connection and write the insertion logic. For cloud databases, you’ll need to provide connection credentials via environment variables.
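A sketch of the local SQLite route, assuming `better-sqlite3` has been installed with npm; the `products` table and its columns are example names, not a required schema:

```javascript
// Persist scraped rows to SQLite. better-sqlite3's transaction() wrapper makes
// the batch insert atomic: either every row lands or none do.
function saveProducts(dbPath, rows) {
  const Database = require('better-sqlite3'); // lazy require; `npm i better-sqlite3`
  const db = new Database(dbPath);
  db.exec(
    'CREATE TABLE IF NOT EXISTS products (url TEXT PRIMARY KEY, title TEXT, price REAL)'
  );
  // Named parameters (@url etc.) map directly from each row object's keys
  const insert = db.prepare(
    'INSERT OR REPLACE INTO products (url, title, price) VALUES (@url, @title, @price)'
  );
  const insertMany = db.transaction((items) => {
    for (const item of items) insert.run(item);
  });
  insertMany(rows);
  db.close();
}
```

The `INSERT OR REPLACE` makes re-runs idempotent, which is handy when a scraping job crashes halfway and you restart it from the top.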
Key Takeaways
Browser automation with Claude Code and Playwright covers a wide range of practical tasks without requiring deep expertise in either browser automation or web scraping.
Here’s what to carry from this guide:
- Claude Code is an execution environment, not just a code generator. It writes Playwright scripts, runs them, reads the output, and iterates until the task works. The feedback loop is what makes it useful.
- Playwright handles everything a real browser does — JavaScript rendering, authentication, multi-step forms, network interception, and parallel browser contexts. Static HTTP scrapers can’t do any of this.
- Form filling at scale is straightforward: combine Playwright’s interaction methods with a CSV or database as input, add error handling per row, and let it run.
- Parallel execution is available through Claude Code’s sub-agent system (multiple Claude instances working concurrently) or Playwright’s native browser context API (multiple pages in one process).
- Rate limiting and responsible scraping matter. Add delays, limit concurrency, and only automate sites you have permission to access.
- Downstream integration is often the missing piece. Connecting automation output to business tools — through something like MindStudio’s Agent Skills Plugin — completes the workflow without custom integration work.
If you’re building workflows that go beyond browser automation — triggering follow-up actions, routing data into business systems, or building the whole thing without writing Playwright code — MindStudio is worth exploring. You can try it free at mindstudio.ai.