How to Build an AI Video Production Workflow with Claude Code and HeyGen MCP

From Script to Screen in Minutes

Video production used to require a production team, a studio, and a calendar full of scheduling conflicts. Today, the same output — a polished, on-brand video with a realistic avatar, cloned voice, and professional editing — can be assembled by an AI agent running a single workflow.

That’s exactly what becomes possible when you combine Claude Code with HeyGen’s Model Context Protocol (MCP) server. Claude handles the reasoning: writing scripts, structuring content, making editorial decisions. HeyGen handles the rendering: avatar generation, voice synthesis, and video output. The MCP layer connects them, letting Claude call HeyGen’s capabilities directly — no manual handoffs, no copy-pasting between tabs.

This guide walks through the full setup of an AI video production workflow using Claude Code and HeyGen MCP — from initial configuration to a running pipeline that takes a topic and outputs a finished video.

What Claude Code and HeyGen MCP Actually Do

Before getting into steps, it’s worth being precise about what each component brings.

Claude Code

Claude Code is Anthropic’s agentic coding tool. It runs Claude directly in your terminal, giving it access to your filesystem, the ability to run shell commands, and the capacity to act on multi-step instructions autonomously. It’s not just a code assistant — it can plan, execute, revise, and chain together complex tasks.

For video production, Claude Code is useful because it can write and refine scripts, generate structured JSON payloads for APIs, handle conditional logic (e.g., “if the video is for LinkedIn, shorten the script to 60 seconds”), and orchestrate the entire production sequence without a human in the loop.

HeyGen MCP

HeyGen is a video generation platform that lets you create AI-generated videos using digital avatars and synthetic voices. Their MCP server exposes HeyGen’s capabilities — avatar selection, voice cloning, video rendering, and more — as tools that MCP-compatible AI clients can call.

The Model Context Protocol is an open standard developed by Anthropic that lets AI models connect to external tools and data sources through a standardized interface. When HeyGen runs an MCP server, Claude can call it the same way it calls any other tool: by name, with typed inputs, and with a response it can reason about.

Together, the two systems give you a fully agentic video pipeline. Claude reasons and writes; HeyGen renders and produces.

What You Need Before Starting

This workflow assumes a few things are already in place. Here’s a quick checklist:

Claude Code installed — available via Anthropic’s CLI tools
A HeyGen account with API access enabled (Pro plan or above)
HeyGen’s MCP server — available via their developer documentation
Node.js (v18+) installed for running the MCP server locally
A cloned voice or selected avatar set up inside HeyGen’s dashboard
Basic familiarity with terminal commands — you don’t need to be a developer, but you’ll be running commands and editing config files

If you haven’t created a HeyGen avatar or cloned a voice yet, do that first. Both are done through HeyGen’s web interface and take about 10–15 minutes. The avatar and voice IDs you generate there will be used throughout this workflow.

Setting Up the HeyGen MCP Server

HeyGen’s MCP server acts as the bridge between Claude Code and HeyGen’s video API. When it’s running, Claude can call tools like create_video, get_video_status, and list_avatars as if they were built-in capabilities.

Install and Configure

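Start by installing HeyGen’s MCP server package. The package name below is the one used in this guide; confirm the current install command in HeyGen’s developer documentation. In your terminal: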

npm install -g @heygen/mcp-server

Then create a configuration file (heygen-mcp-config.json) in your project directory:

{
  "apiKey": "YOUR_HEYGEN_API_KEY",
  "defaultAvatarId": "YOUR_AVATAR_ID",
  "defaultVoiceId": "YOUR_VOICE_ID"
}

Replace the placeholder values with your actual HeyGen credentials. Your API key is available in HeyGen’s developer settings. Avatar and voice IDs can be found in the respective sections of your HeyGen dashboard.
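
If you would rather not keep the API key in a file you edit by hand, one option is to generate the config from environment variables. The sketch below assumes the three field names shown above; the environment variable names (HEYGEN_API_KEY, HEYGEN_AVATAR_ID, HEYGEN_VOICE_ID) are hypothetical and chosen only for illustration.

```python
import json
import os
from pathlib import Path

# Config field -> environment variable (variable names are hypothetical).
REQUIRED = {
    "apiKey": "HEYGEN_API_KEY",
    "defaultAvatarId": "HEYGEN_AVATAR_ID",
    "defaultVoiceId": "HEYGEN_VOICE_ID",
}


def build_config(env: dict) -> dict:
    """Map environment variables onto the config fields shown above."""
    missing = [var for var in REQUIRED.values() if not env.get(var)]
    if missing:
        raise ValueError(f"missing environment variables: {', '.join(missing)}")
    return {field: env[var] for field, var in REQUIRED.items()}


def write_config(path: str = "heygen-mcp-config.json") -> None:
    """Write the config file from the current process environment."""
    config = build_config(dict(os.environ))
    Path(path).write_text(json.dumps(config, indent=2) + "\n")
```

Calling write_config() before starting the server keeps the key out of anything you might commit by accident.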

Register the MCP Server with Claude Code

Claude Code needs to know about your MCP server before it can use it. For a project-scoped setup, open (or create) a .mcp.json file in your project root and add an entry like the one below. Alternatively, the claude mcp add command registers a server from the terminal.

{
  "mcpServers": {
    "heygen": {
      "command": "heygen-mcp-server",
      "args": ["--config", "/path/to/heygen-mcp-config.json"]
    }
  }
}

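Restart Claude Code after saving. When it loads, run /mcp in interactive mode: the heygen server should be listed as connected, with create_video, get_video_status, and related tools available. Claude Code typically shows MCP tools under names of the form mcp__heygen__create_video; the shorter heygen.create_video form is used in this guide's prompts for readability.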

Writing Scripts with Claude

With the connection established, the first stage of the production workflow is script generation. This is where Claude earns its place — not just filling in a template, but producing structured, usable content tailored to the video format.

Defining the Script Brief

The most reliable way to get consistent output from Claude Code is to give it a structured prompt that defines the parameters upfront. Create a brief.md file in your project:

# Video Brief

**Topic:** [Your topic here]
**Platform:** [YouTube / LinkedIn / TikTok / Internal training]
**Duration:** [60 seconds / 2 minutes / 5 minutes]
**Tone:** [Professional / Conversational / Educational]
**Call to action:** [What should the viewer do at the end?]
**Key points to cover:**
- Point 1
- Point 2
- Point 3

You can populate this manually for each video, or automate it by feeding in data from a spreadsheet, CMS, or webhook — more on that later.
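
As a rough illustration of the automated route, the sketch below fills the brief template from a single record, such as one spreadsheet row. The dictionary keys (topic, platform, duration, tone, call_to_action, key_points) are hypothetical names picked to mirror the template fields.

```python
BRIEF_TEMPLATE = """# Video Brief

**Topic:** {topic}
**Platform:** {platform}
**Duration:** {duration}
**Tone:** {tone}
**Call to action:** {call_to_action}
**Key points to cover:**
{key_points}
"""


def render_brief(row: dict) -> str:
    """Fill the brief template from one record (key names are assumptions)."""
    points = "\n".join(f"- {point}" for point in row["key_points"])
    return BRIEF_TEMPLATE.format(
        topic=row["topic"],
        platform=row["platform"],
        duration=row["duration"],
        tone=row["tone"],
        call_to_action=row["call_to_action"],
        key_points=points,
    )
```

Writing the returned string to brief.md gives Claude Code the same input it would get from a hand-edited file.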

Running the Script Generation Step

With Claude Code running in your project directory, use a prompt like this:

Read brief.md and write a video script for a [duration] video. 
Format the output as a JSON file called script.json with the following structure:
{
  "title": "...",
  "segments": [
    { "id": 1, "text": "...", "duration_estimate": "..." }
  ],
  "total_estimated_duration": "..."
}

Keep each segment under 300 characters for optimal avatar rendering. 
Use natural, spoken language — not written prose.

Claude Code will read your brief, generate the script, and write it directly to script.json. If the output doesn’t match the brief, you can iterate in the same session: “Shorten segment 3,” “Make the tone more casual,” “Add a statistic to the intro.”
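
A quick structural check can catch malformed output before you spend render credits. This is a minimal sketch that assumes the script.json layout from the prompt above and its 300-character target; the function name and messages are illustrative.

```python
def validate_script(script: dict, max_chars: int = 300) -> list:
    """Return a list of problems found in a script.json-style dict (empty if none)."""
    problems = []
    title = script.get("title")
    if not isinstance(title, str) or not title.strip():
        problems.append("missing title")
    segments = script.get("segments")
    if not isinstance(segments, list) or not segments:
        return problems + ["segments must be a non-empty list"]
    for position, segment in enumerate(segments, start=1):
        label = segment.get("id", position)
        text = segment.get("text", "")
        if not text.strip():
            problems.append(f"segment {label}: empty text")
        elif len(text) > max_chars:
            problems.append(f"segment {label}: {len(text)} chars exceeds {max_chars}")
    return problems
```

An empty list means the file is safe to pass on; anything else can be pasted back into the session as a correction request.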

Reviewing and Approving the Script

Before moving to rendering, review script.json quickly. Claude occasionally generates segments that are too long for a single avatar clip, or uses punctuation patterns that throw off text-to-speech rhythm. Common fixes:

Break segments longer than 350 characters into two
Replace semicolons with periods
Spell out numbers (“3” → “three”)
Avoid unusual proper nouns unless your voice clone handles them well

Once the script looks good, you’re ready to render.
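
Two of the fixes above are mechanical enough to script. The sketch below replaces semicolons with sentence breaks and spells out standalone single digits; it deliberately leaves multi-digit numbers, decimals, and tokens like "Q3" alone, so treat it as a starting point, not a full text normalizer.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def tidy_for_speech(text: str) -> str:
    """Turn semicolons into sentence breaks and spell out lone single digits."""
    # "a; b" becomes "a. B" so the voice pauses as it would at a full stop.
    text = re.sub(r";\s*(\w)", lambda m: ". " + m.group(1).upper(), text)
    # Replace a digit only when it stands alone: "Q3", "10", "3.5" and "$5" are skipped.
    return re.sub(
        r"(?<![\w.,$])\d(?!\w|[.,]\d)",
        lambda m: DIGIT_WORDS[int(m.group())],
        text,
    )
```

Larger numbers and proper nouns still need a human (or Claude) to decide how they should be read aloud.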

Rendering Video with HeyGen via MCP

Now Claude calls HeyGen’s MCP tools to turn the script into a video. This is where the automation gets concrete.

Building the Video Creation Logic

Tell Claude Code to read the script and submit each segment to HeyGen:

Read script.json. For each segment, call heygen.create_video with:
- avatar_id: [your avatar ID]
- voice_id: [your voice ID]  
- input_text: the segment's text
- background: "white" (or your preferred background)

Store each video ID returned in a file called render_queue.json, 
mapped to the segment ID.

Claude will loop through each segment, make the API calls, and collect the video IDs. Depending on the number of segments and HeyGen’s rendering queue, this takes 2–10 minutes per segment.
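
For readers who want to see the shape of render_queue.json, here is a minimal sketch of the submission loop. The create_video argument is a hypothetical stand-in for the MCP tool call: it is assumed to take a segment's text and return the video ID HeyGen assigns.

```python
import json
from pathlib import Path
from typing import Callable


def submit_segments(
    script: dict,
    create_video: Callable[[str], str],
    queue_path: str = "render_queue.json",
) -> dict:
    """Submit each segment's text and record segment ID -> video ID."""
    queue = {}
    for segment in script["segments"]:
        queue[str(segment["id"])] = create_video(segment["text"])
        # Rewrite the file after every submission so an interrupted run
        # still leaves a record of the jobs already queued.
        Path(queue_path).write_text(json.dumps(queue, indent=2))
    return queue
```

Saving after each submission means a dropped connection midway does not force you to resubmit segments that are already rendering.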

Polling for Render Completion

HeyGen’s video generation is asynchronous — you submit a job, get back a video ID, and then poll the status endpoint until it’s done. Claude Code handles this gracefully:

Read render_queue.json. For each video ID, call heygen.get_video_status 
every 30 seconds until the status is "completed". 
When complete, save the download URL to completed_videos.json, 
mapped to the segment ID.

Claude will run the polling loop, wait for completions, and update completed_videos.json incrementally. You can monitor the terminal output as each segment finishes.
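
The polling logic can be sketched as follows. The get_status argument is a hypothetical stand-in for the status tool, assumed to return a dict with a "status" key and, once finished, a "video_url" key; the "failed" status and the 30-minute timeout are also assumptions, so adjust them to whatever HeyGen actually returns.

```python
import time
from typing import Callable


def wait_for_renders(
    queue: dict,
    get_status: Callable[[str], dict],
    interval: float = 30.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll every pending video until it completes; return segment ID -> URL."""
    pending = dict(queue)
    completed = {}
    waited = 0.0
    while pending:
        for segment_id, video_id in list(pending.items()):
            result = get_status(video_id)
            if result["status"] == "completed":
                completed[segment_id] = result["video_url"]
                del pending[segment_id]
            elif result["status"] == "failed":
                raise RuntimeError(f"render failed for segment {segment_id}")
        if pending:
            if waited >= timeout:
                raise TimeoutError(f"still rendering: {sorted(pending)}")
            sleep(interval)
            waited += interval
    return completed
```

Checking all pending jobs in one pass and then sleeping once keeps the request count low compared with polling each video on its own timer.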

Downloading the Clips

Once all segments are marked complete:

Read completed_videos.json. For each entry, download the video file 
from the URL and save it to the clips/ directory in the project, 
named by segment ID (e.g., segment_01.mp4, segment_02.mp4).

At this point, you have all your rendered clips in a local directory, ready to assemble.

Assembling the Final Video

With individual clips rendered and downloaded, the last step is merging them into a single video. This can be done with FFmpeg, which Claude Code can call as a shell command.

Generating the FFmpeg Merge Command

Using FFmpeg, merge all .mp4 files in the project's clips/ directory 
in order of segment ID into a single file called final_output.mp4. 
Generate and run the appropriate FFmpeg concat command.

Claude will construct a concat file, run the merge, and save the output. A basic merge looks like:

ffmpeg -f concat -safe 0 -i clips_list.txt -c copy final_output.mp4
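
The clips_list.txt file that command reads is a plain list of file entries, one per clip. A minimal sketch of generating it, assuming zero-padded names such as segment_01.mp4 so that alphabetical order matches segment order:

```python
from pathlib import Path


def write_concat_list(clips_dir: str = "clips", list_path: str = "clips_list.txt") -> list:
    """Write an FFmpeg concat list for every .mp4 in clips_dir, sorted by name."""
    clips = sorted(Path(clips_dir).glob("*.mp4"))
    # Absolute paths avoid ambiguity about where the list file lives;
    # they are the reason the command above passes -safe 0.
    # Paths containing a single quote would need extra escaping.
    lines = [f"file '{clip.resolve().as_posix()}'" for clip in clips]
    Path(list_path).write_text("\n".join(lines) + "\n")
    return [clip.name for clip in clips]
```

Stream copying with -c copy only works cleanly when every clip shares the same codec, resolution, and frame rate, which is normally the case when all segments come from the same avatar and settings.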

Adding Optional Post-Processing

Claude Code can also handle additional production steps if you need them:

Subtitles — Generate an SRT file from the script segments and burn it in with FFmpeg
Intro/outro — Prepend or append a pre-rendered branded clip
Background music — Mix in an audio track at a lower volume level
Thumbnail — Use a frame extraction command to pull a still from the video

Each of these is an additional step you can add to the workflow, either in the same Claude Code session or as a separate script that runs afterward.
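
To make the subtitle step concrete, here is a small sketch that builds SRT text from the script segments. It assumes you already know each clip's length in seconds (for example, measured from the rendered files) and that clips play back to back with no gaps.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues: list) -> str:
    """Build SRT text from (caption, duration in seconds) pairs played in sequence."""
    blocks = []
    start = 0.0
    for number, (text, duration) in enumerate(cues, start=1):
        end = start + duration
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
        start = end
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

One cue per segment is coarse; for tighter captions you could split each segment's text into shorter lines and divide its duration proportionally.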

Automating the Full Pipeline

Running this workflow interactively in Claude Code is useful for one-off videos. For recurring production — say, a weekly product update video, a daily news briefing, or batch content for social media — you want the pipeline to run without manual input.

Saving the Workflow as a Shell Script

Claude Code can write a shell script that runs the entire pipeline end-to-end:

Write a bash script called run_pipeline.sh that:
1. Reads brief.md
2. Generates script.json using Claude
3. Renders all video segments via HeyGen MCP
4. Polls until all renders are complete
5. Merges clips into final_output.mp4
6. Sends a Slack notification when done (optional)

You can then schedule this script with a cron job, trigger it from a CI/CD pipeline, or connect it to an external event (a new row in a spreadsheet, a form submission, a webhook from your CMS).

Parameterizing the Brief

Instead of manually editing brief.md every time, you can make it dynamic. Pass arguments to the shell script:

./run_pipeline.sh --topic "Q3 Product Updates" --platform "LinkedIn" --duration "90 seconds"

The script parses these arguments, populates the brief, and hands it to Claude Code to run the pipeline. With this in place, you can generate a video from a single command, with no interface and no manual steps.
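
The same flags can be parsed in a few lines of Python if you prefer a helper over hand-written bash option handling. This is a sketch, not the script Claude Code would necessarily generate; the default values are assumptions.

```python
import argparse
from typing import Optional


def parse_brief_args(argv: Optional[list] = None) -> argparse.Namespace:
    """Parse the --topic, --platform and --duration flags shown above."""
    parser = argparse.ArgumentParser(description="Populate a video brief from flags.")
    parser.add_argument("--topic", required=True)
    parser.add_argument("--platform", default="YouTube")      # assumed default
    parser.add_argument("--duration", default="60 seconds")   # assumed default
    return parser.parse_args(argv)


def brief_header(args: argparse.Namespace) -> str:
    """Render the first lines of brief.md from the parsed flags."""
    return (
        "# Video Brief\n\n"
        f"**Topic:** {args.topic}\n"
        f"**Platform:** {args.platform}\n"
        f"**Duration:** {args.duration}\n"
    )
```

A shell wrapper could call this helper, write the result to brief.md, and then start the rest of the pipeline.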

Where MindStudio Fits Into This

Claude Code and HeyGen MCP give you a powerful command-line pipeline. But if you want to expose this workflow to non-technical teammates, trigger it from other business tools, or build a custom interface around it — that’s where MindStudio becomes useful.

MindStudio’s AI Media Workbench is built specifically for AI-driven video and image production. It gives you access to all major video generation models in one place, plus 24+ post-production tools — subtitle generation, clip merging, face swap, upscaling, background removal — without any local setup or FFmpeg commands.

More relevantly for this use case: MindStudio lets you build the entire video production workflow as a visual, automated agent. You can chain together steps — “write script → generate video segments → merge clips → notify team” — using a drag-and-drop interface, with Claude, HeyGen, and other tools connected as steps in a single workflow. Team members can trigger it from a simple web form, a Slack message, or a scheduled run, without ever touching a terminal.

The platform also includes the Agent Skills Plugin, an npm SDK that lets Claude Code and other agentic frameworks call MindStudio’s 120+ capabilities — including video tools, notifications, and workflow triggers — as simple method calls. So if you want to keep your Claude Code setup but augment it with MindStudio’s media tooling and infrastructure, you can do that without rebuilding from scratch.

You can try MindStudio free at mindstudio.ai.

Common Mistakes and How to Fix Them

Segments Too Long for Clean Rendering

HeyGen’s avatar renderer works best with shorter text inputs — generally under 350 characters per segment. Longer inputs can cause pacing issues or unnatural pauses. Fix this by adding a validation step after script generation: have Claude check each segment length and split anything over the threshold.
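
The splitting step can also be done deterministically. The sketch below breaks a segment only at sentence boundaries and uses the 350-character threshold mentioned above; a single sentence longer than the limit is left intact for manual editing.

```python
import re


def split_segment(text: str, max_chars: int = 350) -> list:
    """Split text into chunks of at most max_chars, breaking only between sentences.

    A single sentence longer than max_chars is returned as its own chunk, unsplit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

After splitting, segment IDs need renumbering so the downstream files (render_queue.json and the clip names) stay in order.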

MCP Server Connection Drops Mid-Run

The HeyGen MCP server occasionally loses its connection during long render jobs. Add a reconnection check before each polling loop, and consider splitting large video projects (10+ segments) into batches of 3–5.
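
Batching is simple to express in code. A minimal sketch, with the batch size of 5 taken from the upper end of the range suggested above:

```python
def batches(items: list, size: int = 5) -> list:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[start:start + size] for start in range(0, len(items), size)]
```

Each batch can then be submitted and polled to completion before the next one starts, which limits how much work a dropped connection can interrupt.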

Inconsistent Voice Timing Across Clips

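When clips are merged, slight timing inconsistencies can make the final video feel choppy. Padding each clip's audio with about 0.2 seconds of silence using FFmpeg’s apad filter usually fixes this. Because filters cannot be combined with -c copy, this step re-encodes the audio instead of stream-copying it.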

Rate Limiting on HeyGen’s API

HeyGen enforces rate limits on video creation requests. If you’re generating many segments simultaneously, add a 5–10 second delay between API calls. Claude Code can handle this with a simple sleep command between submissions.

Frequently Asked Questions

What is HeyGen MCP and how does it connect to Claude?

HeyGen MCP is an implementation of Anthropic’s Model Context Protocol that exposes HeyGen’s video generation API as callable tools. When registered with Claude Code, it allows Claude to create avatar videos, check render status, and retrieve completed files directly — no manual API calls required. Claude treats HeyGen’s capabilities the same way it treats any other tool in its toolset.

Do I need coding experience to build this workflow?

Some comfort with the terminal is helpful, but you don’t need to be a developer. Claude Code itself can write most of the scripts and commands you need — your job is to give it clear instructions and review the output. The trickiest parts are initial setup (installing Node.js, registering the MCP server) rather than ongoing use.

How long does it take to generate a finished video?

A 60–90 second video with 5–8 segments typically takes 15–30 minutes end-to-end, depending on HeyGen’s rendering queue. Script generation and the final FFmpeg merge are fast (under a minute each). The bulk of time is HeyGen’s asynchronous rendering, which runs in the background while Claude polls for completion.

Can I use my own voice and likeness with HeyGen?

Yes. HeyGen supports voice cloning from a short audio sample (typically 2–5 minutes of clean recordings) and custom avatar creation from video footage. Both features require a HeyGen Pro account or above. Once created, your cloned voice and avatar IDs can be set as defaults in the MCP configuration, so every video uses your voice and appearance automatically.

How much does this workflow cost to run?

Costs depend on usage. Claude Code is billed per token through Anthropic’s API. HeyGen charges per video minute generated. A 90-second video with 8 segments might cost $0.50–$2.00 in Claude API costs and $2–$5 in HeyGen credits, depending on your plans. Running this at scale — say, 20 videos per week — is still significantly cheaper than traditional video production.
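
To see what those per-video ranges imply at the 20-videos-per-week volume mentioned above, the arithmetic can be sketched as below. The figures are the illustrative estimates from this answer, not quoted prices, so substitute your own plan's rates.

```python
def weekly_cost(
    videos_per_week: int,
    claude_range: tuple = (0.50, 2.00),   # per-video estimate from the text
    heygen_range: tuple = (2.00, 5.00),   # per-video estimate from the text
) -> tuple:
    """Return the (low, high) weekly cost in dollars for a given video count."""
    low = videos_per_week * (claude_range[0] + heygen_range[0])
    high = videos_per_week * (claude_range[1] + heygen_range[1])
    return low, high
```

With the default ranges, weekly_cost(20) gives a low estimate of 50.0 and a high estimate of 140.0 dollars per week.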

Can this workflow be connected to a content calendar or CMS?

Yes. The cleanest approach is to expose your video brief data (topic, platform, duration, key points) from whatever system holds your content calendar — Notion, Airtable, Google Sheets — via a webhook or API. Your shell script reads from that source instead of a static brief.md file, triggering automatically when new entries are added. MindStudio’s workflow builder makes this kind of multi-tool connection particularly straightforward if you want a no-code approach.

Key Takeaways

Claude Code and HeyGen MCP connect via Anthropic’s Model Context Protocol, letting Claude call HeyGen’s video generation API as native tools — no manual handoffs required.
The core workflow has four stages: script generation, segment rendering, render polling, and clip merging. Each stage can be run interactively or fully automated.
HeyGen works best with short, spoken-language text segments (under 350 characters). Structuring your script output with this in mind produces cleaner, more natural-sounding video.
The full pipeline can be saved as a shell script and triggered from a cron job, webhook, or external event — making recurring video production nearly hands-free.
MindStudio’s AI Media Workbench offers a visual, no-code alternative for teams who want the same video automation without terminal-based setup, with 24+ post-production tools and integrations with business tools already built in.

If you want to build this workflow without touching a terminal — or extend it with scheduling, notifications, and team access — MindStudio is worth exploring. You can start free and have a working video automation agent running in under an hour.