
GPT-5.5 Instant Memory Now Shows Which Saved Facts It Used — And Lets You Correct Them Inline

GPT-5.5 Instant's updated memory shows exactly which saved facts it pulled, with an inline correction menu. Here's what changed and how to use it.

MindStudio Team


GPT-5.5 Instant, OpenAI’s new default model across all ChatGPT plans, quietly shipped a memory transparency feature that changes how you can audit and correct what the model thinks it knows about you. Under responses that draw on saved memories, you now see an inline Sources section listing exactly which stored facts were pulled. Click the three-dot menu next to any source and you get a Make a correction option — right there, in the chat, without hunting through settings.

That’s the specific thing worth paying attention to. Not the model upgrade in general, not the benchmark claims — this particular change to how memory surfaces and how you interact with it.


What the Memory Update Actually Changed

Before GPT-5.5 Instant, memory in ChatGPT was a black box with a vault. You could go to Settings → Personalization → Memories and see a list of stored facts, but you had no visibility into which of those facts influenced any given response. The model might have pulled three memories, or thirty, or none — you couldn’t tell.

The new behavior is different. When GPT-5.5 Instant uses saved memories to shape a response, it shows you a Sources section inline, beneath the answer. Each source is a specific saved memory — a discrete entry from your memory vault — and you can see exactly which ones contributed.


The three-dot menu on each source gives you two options: view all saved memories (which takes you to the full vault), or make a correction. The correction flow lets you edit or remove that specific memory entry without leaving the conversation.

This is a small UI change with a meaningful practical effect. You can now close the feedback loop in real time instead of doing a separate audit pass.
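To make the interaction concrete, here is a hypothetical TypeScript sketch of the data shapes the UI implies. OpenAI hasn't published a schema for this, so every name below is an assumption:

```typescript
// Hypothetical shapes only; OpenAI has not published a schema for this UI.
interface MemorySource {
  id: string;   // stable identifier for the vault entry
  text: string; // the stored fact, e.g. "Prefers high mountain Taiwanese teas"
}

interface AssistantResponse {
  content: string;               // the model's answer
  memorySources: MemorySource[]; // entries that shaped this response; empty when none were used
}

// The three-dot menu maps to two actions on a single source:
type SourceAction =
  | { kind: "view_all_memories" }                             // open the full vault in settings
  | { kind: "make_correction"; id: string; newText: string }; // edit one entry in place
```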


Why This Matters More Than It Looks

Memory in ChatGPT is supposed to make the model more personalized. The pitch is that it remembers your preferences, your context, your past interactions, and uses that to give you better answers. The problem has always been that “better” was unverifiable.

If the model told you “since you prefer high mountain Taiwanese teas over sugary boba, here are my recommendations,” you knew it was using memory — but you didn’t know which memory entry triggered that, or whether the entry was accurate. If you’d mentioned something offhand in a conversation six months ago and the model had stored it incorrectly, you’d have no way to catch it unless you manually audited every entry in the vault.

The inline source citations fix this. Now when the model says “since you already hit Asha Tea House pretty often,” you can see the specific stored memory that claim came from. If it’s wrong — maybe you mentioned that place once, not “pretty often” — you can correct it immediately.

This matters especially for anyone using ChatGPT in professional contexts. The model’s memory vault can accumulate stale or incorrect entries over time. Without transparency into which entries are being used, you’re flying blind. With source citations, you can at least catch errors when they surface in responses.

For builders thinking about persistent memory in their own AI applications, this is a useful design pattern to study. The OpenBrain personal AI memory database approach takes a different angle — giving you a Supabase database you own and control, connected to any AI via MCP — but the underlying problem is the same: memory needs to be auditable, not just persistent.
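As a rough illustration of that difference, here's what an auditable, user-owned memory layer might look like with supabase-js. The table name and columns are assumptions for the sketch, not OpenBrain's actual schema:

```typescript
import { createClient } from "@supabase/supabase-js";

// Assumed table: memories(id, content, source_conversation, updated_at)
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

// Audit: list every stored fact, newest first, so nothing hides in a black box.
const { data: memories, error } = await supabase
  .from("memories")
  .select("id, content, source_conversation, updated_at")
  .order("updated_at", { ascending: false });
if (error) throw error;
console.table(memories);

// Correct: overwrite one entry in place, the equivalent of "Make a correction".
await supabase
  .from("memories")
  .update({ content: "Visited Asha Tea House once", updated_at: new Date().toISOString() })
  .eq("id", "mem_123");
```

Because you own the database, the audit and correction paths are just queries; no vendor UI has to expose them for you.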


The Context That Makes This More Interesting

GPT-5.5 Instant is the successor to GPT-5.3 Instant. It's available to all ChatGPT users, including the free tier, and it's now the default model. The model selector moved from the top-left of the interface to an inline position in the chat, where you click the “thinking” label to switch between instant, thinking, and pro modes.

The memory transparency feature shipped alongside this model update, which makes it easy to attribute to the model itself. But it's really a product change to the memory system that happens to coincide with the 5.5 rollout.

The model also shows up inside Microsoft 365 Copilot, which is worth knowing if you’re in an enterprise environment where ChatGPT proper isn’t the primary interface.

One thing that didn’t change: the instant model still doesn’t meaningfully improve on tasks involving websites, visuals, or games. OpenAI’s own documentation is clear that extended thinking models are still needed for those use cases. The 5.5 Instant improvements are concentrated in everyday text tasks — conciseness, personalization, accuracy in specific domains.


OpenAI claims GPT-5.5 Instant reduces hallucinations by over 50%. Some studies cited in their documentation show hallucination rates dropping from around 20% to roughly 3%, depending on the model and domain. The model specifically targets accuracy in medical, legal, and financial contexts — domains where a confidently stated wrong number causes real harm.

Here’s the connection that’s easy to miss: hallucinations and bad memory entries are the same failure mode from different angles. A hallucination is the model inventing a fact it doesn’t have. A bad memory entry is the model confidently using a fact that was stored incorrectly. Both result in the model presenting false information as true.

The inline correction feature addresses the memory side of this. If the model has a stored entry that says you work in finance when you actually work in healthcare, every response that draws on that entry will be subtly miscalibrated. Now you can catch and fix that when it surfaces.

This is also why the memory transparency feature pairs well with the hallucination reduction work. Reducing hallucinations from the model’s base behavior is one lever. Giving users a way to correct the persistent context the model draws on is another. They’re complementary.

For anyone building AI applications where persistent user context matters — think customer-facing agents, personalized assistants, anything that needs to remember user preferences across sessions — the Claude Code three-layer memory architecture that surfaced in the source leak is worth reading alongside this. The design problems are similar even if the implementations differ.


How to Actually Use the Correction Flow

The workflow is straightforward once you know it exists.

Start a chat with memories enabled. Ask something that would draw on stored personal context — “what should I read next,” “what kind of work do I do,” “what are my preferences for X.” If the model uses memories, you’ll see a Sources section below the response.

Each source entry shows the specific memory text. Click the three-dot menu on any entry. You get two options: View all saved memories (opens the full vault in settings) or Make a correction.

The correction flow lets you edit the memory text directly. You can fix inaccurate details, remove outdated entries, or clarify ambiguous ones. The change takes effect immediately — the next response in the same conversation should reflect the corrected memory.

A few things to keep in mind. Not every response will show sources — only responses where the model actively drew on stored memories. If you ask something generic that doesn’t require personal context, you won’t see the sources section. This is expected behavior, not a bug.

Also, the correction flow edits the stored memory entry, not the response. The response you already got doesn’t change. If you want a corrected response, you’ll need to re-run the prompt after making the correction.


What This Means If You’re Building With Memory


If you’re building AI applications that use persistent memory — whether in ChatGPT’s native memory system or in a custom implementation — the design principle here is worth internalizing.

Memory transparency isn’t just a nice-to-have UX feature. It’s a correctness mechanism. Without it, users have no way to audit what the model believes about them, and errors compound silently over time. With it, errors surface at the moment they cause a visible problem, which is exactly when users are motivated to fix them.

The three-dot correction menu is a minimal implementation of this principle. It’s not sophisticated — it’s just an edit field attached to a specific memory entry. But it closes the feedback loop in a way that the previous “go to settings and scroll through your memory vault” approach didn’t.

For builders thinking about how to implement this in their own systems, the key design decisions are: how do you surface which memories influenced a response, and how do you make corrections low-friction enough that users actually make them? ChatGPT’s current implementation answers both questions with a single inline UI element.
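Here's a minimal sketch of those two decisions in application code. Every name in it is illustrative, not anything ChatGPT actually exposes:

```typescript
// Illustrative memory store; these names are ours, not ChatGPT's API.
interface MemoryEntry {
  id: string;
  text: string;
}

interface MemoryStore {
  update(id: string, text: string): Promise<void>;
  remove(id: string): Promise<void>;
}

// Decision 1: surface which memories influenced a response.
// Keep entry IDs attached to the response so the UI can render them inline.
function renderSources(sources: MemoryEntry[]): string {
  if (sources.length === 0) return ""; // generic answers show nothing, by design
  return ["Sources:", ...sources.map((s) => `  [${s.id}] ${s.text}`)].join("\n");
}

// Decision 2: make corrections low-friction.
// One call, scoped to one entry, invocable from the message itself.
async function correctMemory(store: MemoryStore, id: string, newText: string | null): Promise<void> {
  if (newText === null) {
    await store.remove(id);          // outdated entry: delete it
  } else {
    await store.update(id, newText); // inaccurate entry: fix it in place
  }
}
```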

Platforms like MindStudio handle the broader orchestration challenge here — 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows — but the memory transparency problem is one that every persistent-context application needs to solve at the product layer, regardless of which models or infrastructure you’re using.


The Model Selector Change Is Worth Knowing About

One practical note that’s easy to miss in the coverage of GPT-5.5 Instant: the model selector moved.

It used to live in the top-left of the ChatGPT interface. It’s now inline in the chat, attached to the “thinking” label. Click that label and you get a dropdown with instant, thinking, and pro options (pro availability depends on your plan). You can also hit Configure to access previous model versions.

This matters because if you’re used to switching models frequently — or if you’ve built workflows that assume a specific model is selected — the UI change might catch you off guard. The default is now GPT-5.5 Instant with auto-switching to thinking mode when the model determines it needs more reasoning time.

For most everyday tasks, the instant model is the right default. The side-by-side comparison OpenAI published shows GPT-5.3 Instant failing to find a valid solution to a math problem (concluding “no real solution” after initially saying the work looked correct), while GPT-5.5 Instant works through the same problem and arrives at x ≥ 1 as a valid solution. That’s a meaningful accuracy improvement for a default model.


What to Watch

The memory transparency feature is a first step, not a complete solution. A few things worth watching as this develops.

First, the correction flow currently edits individual memory entries. There’s no bulk correction interface, no way to see a diff of what changed, and no history of corrections. For users with large memory vaults, this is still a fairly manual process.


Second, the sources section only appears when the model actively uses memories. There’s no way to see “what memories does the model have available for this conversation” before it responds. A pre-response memory audit view would be more useful for catching problems proactively.

Third, the feature is currently limited to ChatGPT’s native memory system. If you’re using custom system prompts to inject context, or if you’re building on the API with your own memory layer, none of this applies directly. The API doesn’t expose the same memory transparency features that the consumer product now has.
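If you want similar behavior on the API, you currently have to build it yourself. One hedged sketch, using the openai npm package: inject your vault entries with IDs into the system prompt and ask the model to report which IDs it used. The prompt protocol and the model identifier here are our own assumptions, not an OpenAI feature:

```typescript
import OpenAI from "openai";

// DIY memory transparency for API builders: inject vault entries with IDs,
// then ask the model to report which IDs it actually used.
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const memories = [
  { id: "mem_01", text: "Works in healthcare, not finance" },
  { id: "mem_02", text: "Prefers concise answers" },
];

const memoryBlock = memories.map((m) => `[${m.id}] ${m.text}`).join("\n");

const response = await client.chat.completions.create({
  model: "gpt-5.5-instant", // assumed API identifier; check the models list
  messages: [
    {
      role: "system",
      content:
        `Saved user memories:\n${memoryBlock}\n\n` +
        `After your answer, add a line "Used: <ids>" listing only the memory ` +
        `IDs that influenced it, or "Used: none".`,
    },
    { role: "user", content: "What should I read next?" },
  ],
});

console.log(response.choices[0].message.content);
```

Parsing the trailing "Used:" line gives you the same source list ChatGPT now renders natively, which you can then wire to your own correction UI.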

For anyone building applications where memory accuracy is critical — and that’s a lot of applications — the GPT-5.4 vs Claude Opus 4.6 comparison is worth reading for context on how these models handle long-context tasks differently, which is adjacent to how they handle persistent memory. And if you’re thinking about building memory systems that give users genuine ownership and control, Andrej Karpathy’s LLM wiki approach to personal knowledge bases is a useful reference point for what “auditable memory” looks like when you own the underlying data.

The inline correction menu is a small thing. But it’s the right direction, and it’s the kind of feature that compounds — every correction makes the model’s context more accurate, which makes every subsequent response better. That feedback loop is what makes memory useful rather than just persistent.

If you’re building something where Remy compiles your annotated spec into a full-stack TypeScript application, the memory transparency pattern here is worth encoding directly into your spec — not as a UI afterthought, but as a first-class requirement for any feature that stores and retrieves user context.

Presented by MindStudio
