Claude Code with Playwright CLI: Browser Automation via Shell Commands

Drive Playwright from Claude Code using shell commands — no MCP server required. Setup, QA testing loops, scraping patterns, and session storage explained.

MindStudio Team

Why Playwright CLI Changes What’s Possible with Claude Code

Most browser automation setups require a custom server, an MCP configuration, and a fair amount of glue code before anything useful happens. Playwright CLI skips that. It exposes browser control through standard command-line calls, which means Claude Code can drive it directly — no server process, no special protocol, just shell commands that open browsers, click things, fill forms, and return structured output.

The result is a tight loop. Claude Code reads a task, issues Playwright CLI commands, inspects the output, adjusts, and continues. That’s the core of what makes Claude Code and Playwright CLI a practical combination for automating browser tasks like QA testing, web scraping, and maintaining authenticated sessions across workflows.

This guide walks through the full setup and shows what you can actually build with it.


What Playwright CLI Is (and What It Isn’t)

Playwright is Microsoft’s open-source browser automation library. Most developers know it as a Node.js API for writing test scripts in TypeScript or JavaScript. But it also ships with a CLI that you can call from any shell environment.

Used together with short Playwright scripts run from the same shell, the CLI lets you:

  • Launch Chromium, Firefox, or WebKit browsers (headless or visible)
  • Navigate to URLs and interact with page elements
  • Extract text, attributes, and structured content from pages
  • Take screenshots and record videos
  • Run existing test files and return results
  • Use --save-storage and --load-storage flags to persist cookies and session state between runs

This last point matters a lot. Authenticated sessions — where you need to log in once and carry that session forward — are one of the harder problems in browser automation. Playwright CLI handles it cleanly.

The key distinction: Playwright CLI is not the same as Playwright MCP, which is a model context protocol server that exposes browser tools to AI models via a different interface. If you’re weighing those options, it’s worth reading about Claude Code MCP server token overhead before deciding. CLI tends to be leaner in practice.


Setting Up Playwright CLI with Claude Code

Prerequisites

You’ll need:

  • Node.js 18+ installed
  • Claude Code installed and authenticated (via npm install -g @anthropic-ai/claude-code)
  • A terminal and basic familiarity with shell commands

Step 1: Install Playwright

npm init -y
npm install -D @playwright/test
npx playwright install

The playwright install step downloads browser binaries. By default it installs Chromium, Firefox, and WebKit. If you only need Chromium (the most common choice for automation), run:

npx playwright install chromium

Step 2: Verify the CLI Works

Run a quick sanity check:

npx playwright --version

Then try a basic navigation command to confirm browsers launch correctly:

npx playwright screenshot --browser chromium https://example.com screenshot.png

If a screenshot appears in your directory, the setup is working.

Step 3: Configure Claude Code to Use Playwright

Claude Code can invoke shell commands using its built-in Bash tool. You don’t need to configure anything special: as long as Playwright is installed in your project directory, Claude Code will find it when you ask it to run npx playwright commands.

The most important setup step is giving Claude Code a clear, explicit CLAUDE.md file that describes your project structure, what Playwright is installed for, and what kinds of tasks you want it to handle. Claude Code reads this file at the start of every session.

A minimal CLAUDE.md might look like:

# Project: Browser Automation

## Tools Available
- Playwright CLI via `npx playwright` (Chromium installed)
- Session storage saved to `./session-state/`

## Conventions
- Always check for existing session files before logging in
- Save screenshots to `./screenshots/` with descriptive names
- Return structured JSON where possible using page.evaluate()

Step 4: Test Claude Code Driving Playwright

Start Claude Code in your project directory:

claude

Then give it a test prompt:

“Navigate to https://news.ycombinator.com using Playwright, scrape the top 10 story titles and their URLs, and save the output as a JSON file called hn-top10.json.”

Watch it generate and execute the commands. If it works, you have a functioning setup.


Use Case 1: Automated QA Testing

This is where Playwright was originally designed to shine, and pairing it with Claude Code adds a layer of intelligence that static test scripts don’t have.

What Claude Code Adds to QA

A standard Playwright test script checks specific selectors and expected values. If the UI changes, tests break and someone has to update them. Claude Code can:

  • Read failing test output and diagnose why it failed
  • Suggest selector updates when elements have moved or been renamed
  • Write new test cases from a plain-language description of a feature
  • Re-run tests after making fixes and confirm they pass

For a deeper look at this loop, the guide on using AI for automated QA testing covers the build-test-fix cycle in detail.

Example: Running Tests and Fixing Failures

Tell Claude Code:

“Run the existing Playwright tests in ./tests/. If anything fails, read the error output, identify the cause, and fix the test. Re-run until all tests pass.”

Claude Code will:

  1. Execute npx playwright test
  2. Read stdout/stderr for failure details
  3. Open the failing test file
  4. Make targeted edits (selector changes, timing adjustments, assertion updates)
  5. Re-run the tests
  6. Repeat until green

This is a genuinely useful loop. It’s not perfect — Claude Code will sometimes misdiagnose a failure, especially if the issue is in the application code rather than the test — but for selector drift and minor test maintenance, it handles a lot without human intervention.
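
One way to make step 2 less noisy is to ask the test runner for JSON and reduce it to a short list of failures before Claude Code reads it. The sketch below assumes the JSON reporter's nested `suites` / `specs` / `tests` / `results` layout; check the field names against the output of your Playwright version. The function names are hypothetical.

```javascript
const { spawnSync } = require('node:child_process');

// Walk the nested suites of a JSON report and list the specs that failed.
// Field names (suites, specs, ok, tests, results, error.message) are assumed
// from the JSON reporter's output; verify them for your version.
function collectFailures(report) {
  const failures = [];
  const visit = (suite) => {
    for (const spec of suite.specs ?? []) {
      if (spec.ok) continue;
      const messages = (spec.tests ?? [])
        .flatMap((t) => t.results ?? [])
        .map((r) => r.error?.message)
        .filter(Boolean);
      failures.push({ file: spec.file ?? suite.file, title: spec.title, messages });
    }
    (suite.suites ?? []).forEach(visit);
  };
  (report.suites ?? []).forEach(visit);
  return failures;
}

// Run the suite once and return only the failures.
// On Windows, spawning npx may additionally need { shell: true }.
function runTests() {
  const run = spawnSync('npx', ['playwright', 'test', '--reporter=json'], {
    encoding: 'utf8',
    maxBuffer: 64 * 1024 * 1024, // reports can be larger than the 1 MB default
  });
  return collectFailures(JSON.parse(run.stdout));
}
```

Calling `runTests()` from a small wrapper script and printing the result gives Claude Code a compact summary (file, test title, error messages) to work from.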

Writing Tests from Scratch

You can also describe a feature and have Claude Code write the test:

“Write a Playwright test that: logs in with test@example.com / password123, navigates to the dashboard, confirms the user’s name appears in the header, and verifies that the ‘Create Report’ button is visible.”

It will generate the test file, run it, and debug any issues. If you’re building tests for a new feature before it ships, this saves writing the boilerplate by hand, though the generated assertions still deserve a review.


Use Case 2: Web Scraping

Playwright CLI is well-suited for scraping pages that require JavaScript rendering — anything that an HTTP request alone can’t handle because the content loads dynamically.

Claude Code can run a full scraping session: navigating pages, handling pagination, extracting structured data, and writing it to a file or database.

Basic Scraping Pattern

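Here’s a simple script Claude Code might generate for scraping a product listing page. The Playwright CLI has no command for running inline JavaScript, so the script goes in a file (scrape-products.js is a name chosen for illustration) and runs with node:

// scrape-products.js
// @playwright/test (installed in Step 1) re-exports the browser launchers.
const { chromium } = require('@playwright/test');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('https://example-store.com/products');
  // Extract only the needed fields inside the page.
  const products = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('.product-card')).map(el => ({
      name: el.querySelector('.product-name')?.textContent?.trim(),
      price: el.querySelector('.product-price')?.textContent?.trim(),
      url: el.querySelector('a')?.href
    }));
  });
  // Print JSON on stdout so the caller can parse it.
  console.log(JSON.stringify(products));
  await browser.close();
})();

node scrape-products.js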

Claude Code captures the stdout, parses the JSON, and does whatever you asked with the data — write it to a CSV, filter it, send it somewhere.
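
As an example of the "write it to a CSV" step, here is a minimal sketch of a converter for the parsed JSON. The function names and the column list are illustrative assumptions; the quoting follows the common CSV convention of doubling embedded quotes.

```javascript
// Quote a single CSV field when it contains a comma, quote, or line break.
function csvField(value) {
  const text = value == null ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Convert an array of flat objects to CSV, using the given column order.
function toCsv(rows, columns) {
  const header = columns.map(csvField).join(',');
  const lines = rows.map((row) => columns.map((c) => csvField(row[c])).join(','));
  return [header, ...lines].join('\n') + '\n';
}
```

With the scraped array in `products`, `toCsv(products, ['name', 'price', 'url'])` returns a string that can be written to disk with `fs.writeFileSync`.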

Handling Pagination

For paginated listings, Claude Code can write a loop that increments page numbers or clicks “Next” until no more pages exist. This is where having a structured approach to building scraping skills for AI agents pays off — particularly for stop conditions and incremental runs that don’t re-scrape pages you’ve already processed.
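
A pagination loop with explicit stop conditions might be sketched as follows. It assumes the site exposes a link whose accessible name contains "Next"; `extract` is a caller-supplied function, and `done` is a hypothetical set of URLs already processed in earlier runs.

```javascript
// Visit listing pages until there is no "Next" link or maxPages is reached.
// `page` is a Playwright Page; `extract(page)` returns an array of items for
// the page currently loaded. URLs already in `done` are not extracted again.
async function scrapeAllPages(page, startUrl, extract, { maxPages = 50, done = new Set() } = {}) {
  const items = [];
  let url = startUrl;
  for (let n = 0; url && n < maxPages; n++) {
    await page.goto(url);
    if (!done.has(url)) {
      items.push(...(await extract(page)));
      done.add(url);
    }
    // Follow the link's href directly instead of clicking, so each
    // navigation is an explicit goto with a known URL.
    const next = page.getByRole('link', { name: 'Next' }).first();
    const href = (await next.count()) > 0 ? await next.getAttribute('href') : null;
    url = href ? new URL(href, page.url()).href : null;
  }
  return items;
}
```

The `maxPages` cap guards against sites whose last page still links to itself. Persisting `done` to a file between runs is what makes incremental scraping possible.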

Token Efficiency

One practical issue with scraping inside an AI coding session: if you dump raw HTML into Claude Code’s context, you’ll burn tokens fast. It’s better to have Claude Code write a Playwright script that extracts only the fields you need and returns clean JSON. The script does the extraction work; Claude Code only sees the result.

For more on this, optimizing web scraping skills with token-saving techniques covers six concrete approaches.


Use Case 3: Authenticated Browser Sessions

This is one of the more underrated capabilities of the Playwright CLI + Claude Code combination. Many useful tasks — monitoring a dashboard, submitting forms behind a login wall, extracting data from internal tools — require an active authenticated session.

How Playwright Session Storage Works

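Playwright can save browser state (cookies and localStorage; sessionStorage is not included) to a JSON file. The CLI has no standalone save-storage command. Instead, pass --save-storage to open (or codegen), log in, and the state is written when you close the browser window:

npx playwright open --save-storage=auth.json https://app.example.com/login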

You can then load that state at the start of any subsequent run:

npx playwright open --load-storage auth.json https://app.example.com/dashboard

The Workflow

  1. First run (manual): Claude Code launches a browser in visible mode, navigates to a login page, and waits. You log in manually. It then saves the session to auth.json.

  2. Subsequent runs (automated): Claude Code loads auth.json and starts from an authenticated state, skipping the login step entirely.

  3. Session refresh: When the session expires, Claude Code detects the redirect to the login page, flags it, and prompts for a manual re-auth if needed — or handles it automatically if credentials are safely stored.

This pattern works well for internal dashboards, SaaS tools, and any site that doesn’t aggressively rotate session tokens.
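
In script form, the same workflow can be sketched with Playwright’s `storageState` option. The `/login` path, the `auth.json` default, and the helper names are assumptions for illustration.

```javascript
const fs = require('node:fs');

// True when the browser ended up on a login page instead of the target.
// The '/login' path is an assumption; adjust it to the site's real route.
function isLoginRedirect(currentUrl, loginPath = '/login') {
  return new URL(currentUrl).pathname.startsWith(loginPath);
}

// Open `targetUrl` with saved state if a state file exists, and report
// whether a fresh login is needed. `browser` is a Playwright Browser,
// for example the result of chromium.launch().
async function openWithSession(browser, targetUrl, statePath = 'auth.json') {
  const storageState = fs.existsSync(statePath) ? statePath : undefined;
  const context = await browser.newContext({ storageState });
  const page = await context.newPage();
  await page.goto(targetUrl);
  return { context, page, needsLogin: isLoginRedirect(page.url()) };
}

// After a manual login in a headed browser, persist the session for later runs.
async function saveSession(context, statePath = 'auth.json') {
  await context.storageState({ path: statePath });
}
```

When `needsLogin` comes back true, Claude Code can stop and ask for a manual re-auth rather than continuing against a login page.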

For platforms like LinkedIn and Instagram that actively detect automation, the session persistence approach is part of a broader strategy — bypassing browser automation blocks on LinkedIn and Instagram goes into more depth on what actually works there.


Structuring Multi-Step Automation Workflows

Single-task automation is straightforward. The harder challenge is building workflows where multiple browser tasks run in sequence — or in parallel.

Sequential Workflows

A typical sequential workflow might:

  1. Scrape a list of company URLs from a directory
  2. Visit each URL and extract contact information
  3. Log into a CRM and create records for each new contact
  4. Take a screenshot of each created record as confirmation

Claude Code handles this by executing one task, passing the output to the next step, and continuing. The key is writing clear instructions with explicit handoff points between steps.
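
One simple way to make those handoff points explicit is to have every step write its output to a named JSON file. The runner below is a sketch under that assumption; the step names, the `pipeline-out` directory, and the skip-if-present rule are illustrative choices, not a fixed convention.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Run steps in order. Each step receives the previous step's output, and its
// result is written to <dir>/<name>.json, which is the explicit handoff point.
// A step whose file already exists is skipped, so a failed run can resume.
// Step results must be JSON-serializable.
async function runPipeline(steps, dir = 'pipeline-out') {
  fs.mkdirSync(dir, { recursive: true });
  let input = null;
  for (const { name, run } of steps) {
    const file = path.join(dir, `${name}.json`);
    if (fs.existsSync(file)) {
      input = JSON.parse(fs.readFileSync(file, 'utf8'));
      continue;
    }
    input = (await run(input)) ?? null;
    fs.writeFileSync(file, JSON.stringify(input, null, 2));
  }
  return input;
}
```

For the workflow above, the steps might be named `company-urls`, `contacts`, `crm-records`, and `screenshots`, each wrapping one Playwright script.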

Parallel Execution

For higher-volume tasks, you can run multiple browser instances simultaneously. Running parallel browser agents with Claude Code covers how to structure this — spawning multiple Claude Code instances, each responsible for a slice of the work, and merging results afterward.

Playwright supports parallel execution natively through its test runner (--workers flag). For scraping and automation tasks outside of testing, you’d manage parallelism at the agent level.
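
At the agent level, the bookkeeping usually comes down to splitting the input and merging the output. A minimal sketch, with hypothetical helper names and `url` assumed as the de-duplication key:

```javascript
// Split a list into up to `workers` slices (round-robin), one per agent.
function sliceWork(items, workers) {
  const slices = Array.from({ length: workers }, () => []);
  items.forEach((item, i) => slices[i % workers].push(item));
  return slices.filter((s) => s.length > 0);
}

// Merge per-worker result arrays, keeping one row per key value.
// When two rows share a key, the later one wins.
function mergeResults(resultSets, key = 'url') {
  const byKey = new Map();
  for (const row of resultSets.flat()) byKey.set(row[key], row);
  return [...byKey.values()];
}
```

Each slice can be written to its own input file and handed to a separate Claude Code instance; `mergeResults` then combines the output files once all instances finish.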

Scheduled Automation

For tasks that need to run on a schedule — daily price monitoring, weekly report generation, recurring form submissions — you can pair Claude Code’s headless mode with a scheduler. Building a scheduled browser automation agent with Claude walks through that architecture specifically.

Claude Code’s headless mode (the -p / --print flag) is the right tool here: it runs without an interactive terminal, takes the prompt as an argument or from stdin, prints the result, and exits. That makes it composable with cron, GitHub Actions, or any job scheduler.


Common Errors and How to Handle Them

Selector Not Found

The most common failure. Elements have different selectors than expected, or the page rendered before the target element loaded.

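Fix: Wait for the element before interacting with it. Playwright locators auto-wait before actions such as click() and fill(), and locator.waitFor() (or the older page.waitForSelector()) adds an explicit wait when you only need to read an element. Tell Claude Code to add explicit waits whenever it’s clicking or reading an element that might not be immediately present.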
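
A small pair of helpers shows what an explicit wait can look like in a script. The helper names and the 10-second timeout are arbitrary choices for the example.

```javascript
// Wait for the first matching element to become visible, then click it.
// `page` is a Playwright Page.
async function clickWhenVisible(page, selector, timeout = 10_000) {
  const target = page.locator(selector).first();
  await target.waitFor({ state: 'visible', timeout });
  await target.click();
}

// Same idea for reading text. Returns null instead of throwing when the
// element never appears, so the caller can decide how to handle a miss.
async function textOrNull(page, selector, timeout = 10_000) {
  const target = page.locator(selector).first();
  try {
    await target.waitFor({ state: 'visible', timeout });
    return (await target.innerText()).trim();
  } catch {
    return null;
  }
}
```

Returning null from `textOrNull` hides the underlying error, which suits optional fields in a scrape; for required elements, letting the timeout error propagate gives Claude Code more to diagnose.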

Timeout Errors

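Navigation took longer than Playwright’s default navigation timeout (30 seconds).

Fix: Raise the limit in the script with page.setDefaultNavigationTimeout(60000), or pass a timeout to the individual call, as in page.goto(url, { timeout: 60000 }). For test runs, npx playwright test --timeout 60000 raises the per-test timeout, which is a separate setting.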

Anti-Bot Detection

Some sites use Cloudflare, reCAPTCHA, or behavioral fingerprinting to block automated browsers.

Fix: Playwright itself has no stealth mode. Third-party packages such as playwright-extra with its stealth plugin modify browser fingerprints to look more like a real user’s. They are not guaranteed to work on every site, but they can help on sites with lighter detection.

Session Expiry Mid-Run

Authenticated sessions time out partway through a long scrape or workflow.

Fix: Checkpoint your progress. Write partial results to disk after each batch. Claude Code can detect session expiry (usually a redirect to /login) and pause rather than silently failing.
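
A checkpointing loop along these lines might look like the sketch below. `SessionExpired` is a hypothetical error class that your own page handler would throw when it sees the login redirect; the file name and batch size are arbitrary defaults.

```javascript
const fs = require('node:fs');

// Thrown by the caller's handler when it detects a redirect to the login page.
class SessionExpired extends Error {}

// Process `items` in batches, writing all results so far to `file` after each
// completed batch. On SessionExpired the run pauses; calling the function
// again with the same arguments resumes after the last saved batch.
async function runWithCheckpoint(items, handle, file = 'checkpoint.json', batchSize = 20) {
  const done = fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, 'utf8')) : [];
  for (let i = done.length; i < items.length; i += batchSize) {
    const batch = [];
    try {
      for (const item of items.slice(i, i + batchSize)) batch.push(await handle(item));
    } catch (err) {
      if (!(err instanceof SessionExpired)) throw err;
      return { results: done, paused: true }; // earlier batches are already on disk
    }
    done.push(...batch);
    fs.writeFileSync(file, JSON.stringify(done));
  }
  return { results: done, paused: false };
}
```

Results from a partially finished batch are discarded on pause and redone on resume, which keeps the file consistent at the cost of repeating at most one batch.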

Out-of-Date Selectors

UI changes break selectors in existing scripts.

Fix: Use Playwright’s built-in accessibility roles and text-based selectors where possible — getByRole('button', { name: 'Submit' }) is more durable than .submit-btn-v2. Claude Code can audit existing scripts and suggest more resilient selectors.


Where Remy Fits In

If your automation work eventually points toward building a full application — a scraping dashboard, a QA reporting tool, a monitoring interface — that’s where Remy becomes relevant.

The Playwright + Claude Code combination is excellent for running tasks in a terminal context. But if you want a web interface for scheduling those tasks, viewing results, managing session files, or triggering workflows manually, you’d typically need to build something. That means a backend, a database, auth, a frontend — the full stack.

Remy compiles that from a spec. You describe the application in annotated markdown — what it does, what data it stores, what the UI should show — and Remy builds the backend methods, SQL database, authentication system, and frontend from that spec. It’s not generating a prototype. It’s a deployable full-stack app.

So if you’ve built a useful Playwright automation workflow and you want to wrap it in something your team can actually use without opening a terminal, Remy is the practical next step. You can try it at mindstudio.ai/remy.


FAQ

What’s the difference between Playwright CLI and Playwright MCP for Claude Code?

Playwright MCP is a Model Context Protocol server that exposes browser tools directly to Claude as named functions it can call. Playwright CLI is just the command-line interface — Claude Code invokes it as shell commands. MCP gives Claude more structured access to browser state, but adds overhead in setup and token cost. CLI is simpler and more flexible for most scripting use cases. If you’re building complex multi-step agents that need persistent browser state and tight feedback loops, MCP has advantages. For general automation tasks, CLI is often enough.

Can Claude Code write Playwright scripts from scratch without examples?

Yes. Claude Code has solid knowledge of the Playwright API and can generate working scripts for common tasks — navigation, form filling, element extraction, screenshot capture, and test assertions. For more complex sites with unusual patterns, you may need to provide context about the page structure, but it rarely needs hand-holding on the Playwright API itself.

How do I handle sites that block Playwright automation?

A few approaches: use Playwright’s stealth plugins to reduce fingerprinting signals, set realistic viewport sizes and user agents, add delays between actions to mimic human timing, and use session storage to avoid repeated logins. Some sites block headless browsers specifically — running in headed mode (non-headless) can help. For sites with aggressive detection, strategies for bypassing browser automation blocks covers more advanced techniques.

Is Playwright CLI suitable for production automation?

It depends on what “production” means for your use case. For internal tools, scheduled tasks, and pipelines that run a few times a day, yes — Playwright CLI is stable and reliable. For high-volume, real-time scraping at scale, you’d typically want a dedicated infrastructure setup. The CLI itself is production-quality; the question is really about the surrounding architecture.

How do I pass data between Playwright steps in a Claude Code workflow?

Write intermediate results to files (JSON or CSV work well) and have each step read from the previous step’s output file. This is simpler than trying to pass data through environment variables or in-memory state across separate CLI invocations. Claude Code handles file I/O naturally — it can write, read, and transform files as part of any workflow.

Can I use Playwright CLI with other tools beyond Claude Code?

Yes. Playwright CLI is just a Node.js-based command-line tool — it works with any system that can run shell commands. That includes n8n, Make, GitHub Actions, cron jobs, and any other orchestration layer. Claude Code is particularly useful because it can write, debug, and adapt the scripts on the fly, but the scripts themselves are portable.


Key Takeaways

  • Playwright CLI connects to Claude Code through standard shell commands — no MCP server or special protocol required.
  • The core use cases are QA testing (write, run, fix loop), web scraping (especially JavaScript-rendered pages), and authenticated browser sessions using Playwright’s storage persistence.
  • Session state files (auth.json) let Claude Code carry authenticated sessions across multiple runs without re-logging in each time.
  • For parallel or scheduled execution, combine Claude Code’s headless mode with your existing job scheduler.
  • Token efficiency matters in scraping workflows — have Playwright extract structured data before it reaches Claude Code’s context.
  • If you’re building a full application around your automation workflows, Remy can generate the full-stack backend, database, and frontend from a spec — no assembly required.

Presented by MindStudio
