Claude's PowerPoint Add-In Can't Actually See Your Slides — And 4 Other Limitations to Know Before You Use It
Claude processes PowerPoint files as markdown internally, which means it can hallucinate chart values and miss graphics. Here are all five key limitations.
The Claude by Anthropic PowerPoint add-in cannot see your slides. Not in any visual sense. When you load a presentation, the add-in converts the file to markdown internally before Claude ever touches it — and that conversion is where things get interesting, and occasionally wrong.
This is the central limitation you need to understand before building anything serious with it. The others — a 30MB file cap, no iPad or Android support, chart value hallucinations, and session memory that resets every time you close PowerPoint — all flow from the same underlying architecture. None of them are dealbreakers on their own. Together, they define a tool with a specific, bounded use case.
Here’s what each limitation actually means in practice.
The Markdown Conversion Problem
When you upload a .pptx file to the Claude add-in, Claude doesn’t receive a rendered image of each slide. It receives a markdown representation of the file’s XML structure. Text comes through cleanly. Layout relationships come through approximately. Graphics, images, and complex visual elements? Those can get dropped entirely.
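To see why text survives the conversion while graphics don't, it helps to remember what a slide actually is: a .pptx is a ZIP of XML parts, where text lives in `<a:t>` runs but a picture is only a relationship reference to a media part, with no pixel data in the slide itself. A stdlib-only sketch (the slide XML below is a simplified hand-written sample, far terser than real PowerPoint output):

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for ppt/slides/slide1.xml: one text run, one picture.
SLIDE_XML = """<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
       xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
       xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <p:cSld><p:spTree>
    <p:sp><p:txBody><a:p><a:r><a:t>Q3 Revenue: $4.2M</a:t></a:r></a:p></p:txBody></p:sp>
    <p:pic><p:blipFill><a:blip r:embed="rId2"/></p:blipFill></p:pic>
  </p:spTree></p:cSld>
</p:sld>"""

A = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
P = "{http://schemas.openxmlformats.org/presentationml/2006/main}"

def extract_text(slide_xml: str) -> list[str]:
    """Collect every <a:t> text run -- roughly what a markdown conversion keeps."""
    root = ET.fromstring(slide_xml)
    return [t.text for t in root.iter(f"{A}t")]

def count_pictures(slide_xml: str) -> int:
    """Pictures appear only as opaque rId references; the pixels live elsewhere."""
    root = ET.fromstring(slide_xml)
    return len(list(root.iter(f"{P}pic")))

print(extract_text(SLIDE_XML))    # the text comes through intact
print(count_pictures(SLIDE_XML))  # the image is just a reference
```

Any conversion built on this structure gets `"Q3 Revenue: $4.2M"` for free, but the picture reduces to "an image exists here," which is exactly the asymmetry the add-in exhibits.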
The official framing from Anthropic is that “graphics analysis is limited” — the add-in “processes PowerPoint to markdown internally, so it can lose some information contained in graphics or images on existing slides.” That’s accurate but understates the practical impact.
If your deck has a diagram built from grouped shapes, a custom infographic, or an embedded image with text overlaid on it, Claude may have no idea what’s in it. You can ask Claude to “update the chart on slide four” and it will try — but it’s working from a text description of the chart’s structure, not from what the chart actually looks like.
This is architecturally similar to how Claude handles documents in other contexts. The model is extremely good at reasoning over text. It’s not a vision model in this workflow. Understanding that distinction saves you from a lot of confused debugging.
Chart Hallucinations Are the Most Visible Symptom
The markdown conversion issue has one particularly visible failure mode: chart values.
When Claude generates or edits a chart, it produces native PowerPoint elements — not images. The values are real, editable numbers in the chart’s data table. That’s the good news. The bad news is that Claude can hallucinate those values, especially when it’s trying to infer what’s already in an existing chart from the markdown representation.
In testing with the add-in (using Opus 4.6, the more capable of the two available models), a prompt like “Add a pie chart to show your values on slide three” produced a chart with plausible-looking percentages — but also triggered a hallucinated complaint about text overlap that didn’t exist visually. Claude was reasoning about the slide’s layout from its markdown model of the slide, not from what a human would see.
The practical fix is straightforward: treat every chart Claude generates as a first draft. The values are editable native elements, not locked images, so you can open the chart’s data table and correct anything that’s wrong. But you do have to check. Don’t assume the numbers are accurate just because they look reasonable.
This is especially important if you’re using the PDF-to-deck workflow — uploading a sales report and asking Claude to “turn this into a five-page slide deck showing where the company is going to go.” Claude can pull numbers from the PDF accurately, but if it’s also trying to infer or extrapolate, the chart values deserve scrutiny.
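A lightweight way to operationalize "treat every chart as a first draft" is to diff the chart's data table against your source numbers before you present. A hypothetical sketch (the dictionaries are assumptions standing in for the chart's data table and your source report; the add-in exposes no such API):

```python
def verify_chart(chart_values: dict[str, float],
                 source_values: dict[str, float],
                 tolerance: float = 0.01) -> list[str]:
    """Return a list of discrepancies between a generated chart and source data."""
    problems = []
    for label, expected in source_values.items():
        if label not in chart_values:
            problems.append(f"missing series: {label}")
        elif abs(chart_values[label] - expected) > tolerance:
            problems.append(f"{label}: chart says {chart_values[label]}, source says {expected}")
    # Series that appear in the chart but not in the source are suspect too.
    for label in chart_values.keys() - source_values.keys():
        problems.append(f"unexpected series: {label}")
    return problems

# The source report says Q2 was 1.5; the generated chart drifted to 1.9.
source = {"Q1": 1.2, "Q2": 1.5}
chart = {"Q1": 1.2, "Q2": 1.9}
print(verify_chart(chart, source))
```

An empty list means the numbers match; anything else is a value to correct in the chart's data table before the deck goes out.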
The 30MB File Cap and What It Cuts Off
The add-in has a hard 30MB file cap on PowerPoint files. For most decks, this isn’t a problem. A typical 20-slide corporate presentation with text and basic shapes is well under 10MB.
Where it bites you: high-resolution embedded images, video thumbnails, or presentations that have been through multiple export/import cycles and accumulated bloat. If you’re working with a deck that’s been built up over months — slides added from different sources, images pasted in at full resolution — you may hit the cap before you realize it.
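Because a .pptx is a ZIP archive, you can find what's eating your 30MB without opening PowerPoint: list the archive's members by uncompressed size and the bloated media surfaces immediately. A stdlib sketch (the demo builds a stand-in archive in memory so it runs anywhere; point `largest_members` at real file bytes in practice):

```python
import io
import zipfile

def largest_members(pptx_bytes: bytes, top: int = 5) -> list[tuple[str, int]]:
    """Return the biggest files inside a .pptx, by uncompressed size."""
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as zf:
        sizes = [(info.filename, info.file_size) for info in zf.infolist()]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)[:top]

# Build a stand-in archive: a tiny XML part and one bloated "image".
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("ppt/slides/slide1.xml", b"<p:sld/>")
    zf.writestr("ppt/media/image1.png", b"\x00" * 2_000_000)  # 2MB of filler

for name, size in largest_members(buf.getvalue()):
    print(f"{name}: {size / 1_000_000:.1f} MB")
```

Full-resolution pasted images almost always top this list; recompressing or downscaling them is usually enough to get back under the cap.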
The workaround is to work in sections. Build the deck in chunks, then merge them afterward. This is also good practice for a different reason: the add-in’s generation limits are tied to complexity, not just file size. A request to generate a 30-slide deck in one shot will take longer, consume more credits, and be more likely to produce inconsistencies than generating it in three 10-slide passes.
The two available models — Opus 4.6 for complex generation and Sonnet 4.6 for quick edits — have different credit costs. If you’re on a Claude Pro plan ($20/month, the minimum required tier), you’ll want to use Sonnet 4.6 for anything that’s just reformatting or copy adjustments, and reserve Opus 4.6 for full deck generation from source documents.
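One way to keep credit spend predictable is to decide the model up front by task type rather than defaulting to the bigger one. A trivial sketch (the task categories and routing are just the article's advice encoded as a lookup; the add-in exposes nothing like this programmatically):

```python
# Cheap edits go to Sonnet; heavy generation from source documents goes to Opus.
QUICK_TASKS = {"reformat", "copy_edit", "translate_slide", "speaker_notes"}
HEAVY_TASKS = {"generate_deck", "pdf_to_deck", "url_to_deck"}

def pick_model(task: str) -> str:
    if task in QUICK_TASKS:
        return "Sonnet 4.6"
    if task in HEAVY_TASKS:
        return "Opus 4.6"
    raise ValueError(f"unknown task: {task}")

print(pick_model("copy_edit"))
print(pick_model("pdf_to_deck"))
```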
Session Memory Resets Every Time You Close PowerPoint
This one catches people off guard. Chat history does not persist between PowerPoint sessions. Every time you close and reopen PowerPoint, the add-in starts fresh. Claude has no memory of what you built in the last session, what instructions you gave, or what decisions were made.
The practical consequence: if you’re iterating on a deck across multiple work sessions, you need to re-establish context each time. This is less painful than it sounds if you use the persistent instructions setting — the settings panel lets you write standing instructions that Claude always follows, like “always use Helvetica Neue,” “always use our brand color #1A2B3C,” or “add speaker notes to every slide.” Those persist across sessions even when the chat history doesn’t.
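Since chat history never survives a restart, it also helps to keep your working context in a file and paste it as the first prompt of each new session. A hypothetical helper (the brand fields and decision log are illustrative examples, not an add-in feature):

```python
def session_primer(brand: dict[str, str], decisions: list[str]) -> str:
    """Compose a first prompt that restores context after a session reset."""
    lines = ["Context for this session (chat history was reset):"]
    for key, value in brand.items():
        lines.append(f"- {key}: {value}")
    for decision in decisions:
        lines.append(f"- Prior decision: {decision}")
    return "\n".join(lines)

primer = session_primer(
    brand={"font": "Helvetica Neue", "primary color": "#1A2B3C"},
    decisions=["Slide 4 chart uses FY24 actuals, not projections"],
)
print(primer)
```

The brand rules belong in the persistent instructions setting, since those survive on their own; the decision log is what actually gets lost at close, so that's the part worth writing down.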
The pro tip from the add-in’s documentation is worth repeating: always load your template before you start prompting. Claude reads the slide master layouts, fonts, and color schemes from whatever is loaded, and uses that as its reference. First-pass accuracy with a pre-loaded template is around 90%. Without one, Claude invents a visual style, and you spend the rest of the session correcting it.
The session reset also means the Ask before edits mode is worth keeping on, especially early. It’s the toggle in the bottom-left of the sidebar — the alternative is Accept all edits, which lets Claude modify your deck without confirmation. For a new session where Claude has no memory of your previous work, Ask before edits gives you a checkpoint before anything gets overwritten.
No Mobile Support
The add-in does not work on iPad or Android. Desktop and web only — meaning PowerPoint for Windows, PowerPoint for Mac, or PowerPoint on the web at office.com.
This is a Microsoft add-in architecture constraint, not an Anthropic decision. The Office JavaScript API that add-ins use has limited support on mobile Office apps. It’s unlikely to change soon.
If your workflow involves reviewing or editing decks on a tablet, you’ll need to do the Claude-assisted work on desktop and then switch to mobile for review. The generated content is standard PowerPoint — nothing about the output is add-in-specific, so the resulting .pptx opens normally on any device.
What the Markdown Architecture Gets Right
It’s worth being precise about what the markdown conversion approach actually enables, because the limitations above exist alongside some genuine capabilities.
Because Claude is working from a structured text representation of the file, it can do things that a vision-based approach would struggle with. Translation works well — “translate this entire deck into French” produces a complete translated deck because Claude is operating on the text content directly, not trying to OCR text from slide images. Speaker notes generation (“add speaker notes in English to each slide”) works for the same reason.
The web search capability is the most unexpected feature. You can paste a URL into the prompt, and the add-in will fetch the site, extract its color scheme and content, and generate a pitch deck that matches the site’s visual identity. The color matching isn’t pixel-perfect — fonts won’t be identical — but the overall palette comes through. This works because Claude is reading the site’s HTML and CSS, not screenshotting it.
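The color extraction described above needs nothing more exotic than pattern matching over the site's stylesheets: collect hex codes and rank them by frequency. A simplified sketch on a hardcoded stylesheet (the add-in's actual method isn't documented; this just shows why reading CSS yields a palette where a screenshot wouldn't):

```python
import re
from collections import Counter

# Stand-in stylesheet; in practice this would be fetched from the site.
CSS = """
body { background: #FAFAF7; color: #1A2B3C; }
h1   { color: #1A2B3C; }
a    { color: #D97706; }
.btn { background: #1A2B3C; border: 1px solid #D97706; }
"""

def palette(css: str, top: int = 3) -> list[str]:
    """Rank six-digit hex colors by how often the stylesheet uses them."""
    hexes = re.findall(r"#[0-9A-Fa-f]{6}\b", css)
    return [color for color, _ in Counter(h.upper() for h in hexes).most_common(top)]

print(palette(CSS))  # dominant colors first
```

The most frequent colors are usually the brand palette, which is consistent with the observation that the overall palette comes through even when fonts don't.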
The Underlying Tradeoff
The markdown-based architecture is a deliberate tradeoff. It makes the add-in fast, keeps the output fully editable (no locked images, no static exports), and lets Claude apply its text reasoning strengths directly to the content. The cost is visual fidelity — Claude cannot see what your slides look like, only what they contain.
For decks built primarily from text and data, this tradeoff is usually fine. For decks that rely heavily on custom graphics, complex layouts, or visual information embedded in images, it’s a real constraint.
The hallucination risk on charts is the most important thing to internalize. Claude's models, Opus 4.6 included, are not making things up maliciously. They're filling in gaps in their markdown representation of the slide with plausible-sounding values. The fix is verification, not avoidance.
This is also a useful reminder about how AI-generated content works more broadly: what a model can do in any given session depends on what its internal representation of your data actually captures, and understanding that representation is more useful than treating the tool as a black box that either works or doesn't.
The add-in is genuinely useful for the workflows it’s designed for: generating decks from PDFs, Excel files, and URLs; translating existing decks; adding speaker notes at scale; and iterating on copy and structure. Just go in knowing that Claude is reading your slides, not seeing them — and verify anything numerical before you present it.
The same principle applies to any AI tool that processes structured documents. When Claude Code’s memory architecture was examined after the source leak, the same pattern emerged: the model works from a structured text representation of its context, and the gaps in that representation are where errors concentrate. Understanding the representation is the thing.
The PowerPoint add-in is in beta. The markdown conversion limitation is the kind of thing that could improve with better multimodal integration. For now, treat it as a capable text-and-data tool that happens to output slides — not as a tool that understands slides visually.