Nano Banana Is Already Live in Google AI Studio — Here's What It Can (and Can't) Do
Nano Banana landed in Google AI Studio before IO. It generates custom image assets and has a redesigned edit tool — but no native transparency support yet.
Nano Banana Is Live in Google AI Studio Before IO Even Happens
Google didn’t wait for its annual developer conference to ship Nano Banana. The image generation feature is already integrated into Google AI Studio right now — generating custom image assets as your app builds, with a redesigned edit tool that gives you visual control over individual components. You can annotate specific UI elements and update them directly. That’s the short version.
The longer version involves some real limitations worth understanding before you build anything around it.
What Nano Banana Actually Does in AI Studio
When you’re building an app in Google AI Studio, Nano Banana runs alongside the generation process. As the app scaffolds out, it creates image assets to go with it — not as a separate step you trigger manually, but as part of the same flow.
The redesigned edit tool is the more interesting piece. Instead of regenerating the whole app when you want to change something visual, you can select specific components and annotate them. Want to change a button style, swap a header image, or adjust a UI section? You point at it, describe what you want, and the edit tool handles that component in isolation.
This is a meaningful workflow improvement. Whole-app regeneration is slow and often breaks things you weren’t trying to change. Component-level editing is closer to how a human designer would work — touch the thing you want to change, leave everything else alone.
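The edit tool itself is a UI feature inside AI Studio, but the same point-and-describe pattern is available programmatically through the Gemini API: send the current asset plus a scoped instruction, get an edited image back. A minimal sketch with the google-genai Python SDK — the model id and file names are assumptions, and the id AI Studio actually routes to may differ:

```python
from io import BytesIO

from google import genai
from PIL import Image

client = genai.Client()  # reads the API key from the environment

# Send the current asset plus a component-scoped instruction — the
# textual equivalent of the edit tool's "point at it and describe it" step.
source = Image.open("hero_header.png")  # placeholder file name
response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # assumed Nano Banana model id
    contents=[
        source,
        "Restyle only the call-to-action button: rounded corners, indigo fill. "
        "Leave the rest of the image unchanged.",
    ],
)

# The response interleaves text and image parts; save the first image part.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save("hero_header_edited.png")
        break
```

The scoping lives entirely in the instruction ("only the call-to-action button… leave the rest unchanged"), which is what keeps the edit isolated instead of triggering a whole-image regeneration.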
The comparison that keeps coming up is OpenAI’s Codex, which has a similar image generation capability built into its agentic app-building flow. Codex can generate image assets autonomously during a /goal run — in one publicly demoed session it produced all the game assets for a top-down shooter. Both tools are solving the same problem: apps need images, and stopping to generate them manually breaks the flow.
Where Nano Banana Falls Short
Here’s the limitation that matters most if you’re building anything with UI assets: Nano Banana has no native transparency support.
Codex’s image generation does support transparent assets natively. Nano Banana doesn’t. That’s not a minor footnote — transparent PNGs are the standard format for UI components, icons, logos, and anything that needs to sit on top of a background without a white box around it. If you’re generating app icons, button graphics, or overlay elements, you’ll need to post-process the outputs to remove backgrounds, which adds a step and introduces quality loss depending on the tool you use.
This is the kind of limitation that’s easy to miss in a demo and annoying to discover in production. A generated image of a button that looks great on a white background looks broken the moment you put it on a colored or gradient background.
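Until native transparency lands, the workaround is a background-removal pass after generation. A minimal sketch, assuming a clean solid-white background — the function and file names are placeholders, and dedicated matting tools like rembg handle the harder cases this naive threshold gets wrong:

```python
from PIL import Image

def knock_out_white(src: str, dst: str, threshold: int = 245) -> None:
    """Make near-white pixels transparent. Naive: assumes a clean solid
    background and will also eat white pixels inside the artwork."""
    img = Image.open(src).convert("RGBA")
    img.putdata([
        (r, g, b, 0) if min(r, g, b) >= threshold else (r, g, b, a)
        for (r, g, b, a) in img.getdata()
    ])
    img.save(dst)  # PNG preserves the alpha channel

knock_out_white("generated_button.png", "generated_button_transparent.png")
```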
Google’s own tooling elsewhere in the ecosystem handles transparency fine — this appears to be specific to the Nano Banana model series. Whether it gets added before or after Google IO is unknown, but it’s a gap worth tracking.
If you’re building apps where visual polish matters and you need transparent assets, you have a few options right now: generate assets externally with a tool that supports transparency, use Codex’s image gen for that specific step, or design your app to use solid-background images where transparency isn’t required. None of these are ideal, but they’re workable.
Why This Shipped Before IO
As of early May 2026, Google IO is only a few weeks out. Shipping the Nano Banana integration now, ahead of the conference, is an interesting choice.
One read: it’s a developer preview. Get builders using it, collect real feedback, and show up to IO with a feature that already has usage data behind it. Another read: the bigger announcements are coming at IO, and this is a warm-up act — something concrete to point to while the more significant stuff stays under wraps.
The leak landscape around IO supports the second interpretation. There are reports of a new Omni model in testing, hinted at through a “video UI powered by Omni” reference — the “Omni” naming suggests native audio input and output capabilities, not just video. There are also reports of a new video model beyond the current Veo systems, with better quality and control, possibly surfacing as native video output inside Gemini itself.
On the model side, Gemini 3.2 and 3.5 are reportedly in testing with a focus on speed and efficiency. Gemini Ultra is apparently evolving in a different direction — toward a memory-heavy, long-context system designed for multi-step workflows rather than raw speed. A memory feature codenamed “Team Food” is part of this, aimed at improving how Gemini uses past conversations and long-term context across sessions.
There’s also a codename “Spark Robin” floating around, which observers are reading as a signal of a strong push on image and video capabilities broadly. If so, Nano Banana’s early arrival makes more sense as one piece of a larger visual AI story Google is building toward.
If even half of this lands at IO, Nano Banana’s current limitations might look very different in a month.
The DeepMind Paper That Might Be Signaling Veo 4
Buried in the same week’s news: a Google DeepMind paper on diffusion models that’s getting attention from people watching the video generation space.
The paper addresses a fundamental trade-off in diffusion-based image and video generation: the tension between how much information the latent representation carries and how learnable that latent is for the generative model. Pack more information into the latent and more detail survives reconstruction, but the latent distribution gets harder for the diffusion model to learn. Strip information out and the model’s job gets easier, but fidelity is lost on the way back out.
The paper’s framing is that it provides “a map to navigate the trade-off systematically” — not a solution that eliminates the trade-off, but a principled way to understand and control where you land on it. For video generation specifically, this matters a lot. Veo 2 and Veo 3 have been impressive but still show the artifacts you’d expect from models that haven’t fully solved this problem.
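One way to picture it — an illustrative rate–distortion framing, not the paper’s actual formulation — is as a single objective trading reconstruction error against latent information:

$$
\min_{\text{enc},\,\text{dec}} \; \mathbb{E}\big[\lVert x - \hat{x} \rVert^2\big] \;+\; \beta \, I(x; z)
$$

Here $x$ is the input frame, $z$ the latent, $\hat{x}$ the reconstruction, and $\beta$ picks the operating point: a small $\beta$ keeps more information in $z$ and preserves detail, while a large $\beta$ hands the diffusion model an easier latent at the cost of fidelity. A “map” of the trade-off, in this framing, means knowing what you get at each $\beta$ before you spend a training run finding out.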
The speculation is that this paper is laying theoretical groundwork for Veo 4. That’s not confirmed, but the timing — published weeks before IO, from DeepMind, focused on exactly the problem that limits current video generation quality — is suggestive.
This connects to the broader image and video push that Spark Robin and the native video output leaks are pointing toward. Google appears to be working on multiple fronts simultaneously: the model-level quality improvements (the DeepMind paper), the generation infrastructure (Nano Banana in AI Studio), and the user-facing product integration (native video output in Gemini). Whether these converge into something coherent at IO is the open question.
The Arena Codenames Worth Watching
One more piece of context that’s easy to overlook: new model codenames are appearing in arena blind tests. Ajax, Hercules, Hector, and Orpheus are all showing up. Arena tests are where models get evaluated without users knowing which model they’re talking to — so these codenames represent real models in active testing.
The interesting wrinkle: at least one observer in the community has suggested that Ajax might be an Apple model, not a Google one. If that’s right, it means the arena is picking up models from multiple companies simultaneously, and not everything appearing there is Google IO-related.
This matters for how you interpret the pre-IO signal. Some of what’s showing up in leaks and arena tests is probably Google IO material. Some of it might be from other companies entirely. The codenames are real, but their provenance isn’t always clear.
For builders trying to plan around what’s coming, the honest answer is: watch IO, not the leaks. The leaks give you a rough shape of the direction, but the actual capabilities, pricing, and API availability won’t be clear until the announcements land.
What to Actually Do Right Now
If you’re building in Google AI Studio today, here’s the practical picture:
Nano Banana is live and usable for generating image assets during app creation. The redesigned edit tool for component-level visual control is also live. These are real features you can use right now, not previews.
The transparency limitation is real. If your app needs transparent image assets — icons, overlays, logos, anything that sits on a non-white background — plan for a post-processing step or source those assets elsewhere. This isn’t a dealbreaker, but it’s a workflow addition you should account for.
For teams building more complex AI-powered applications that chain models and tools together, the component-level image editing in AI Studio is one piece of a larger puzzle. Platforms like MindStudio handle the orchestration layer — 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows — which becomes relevant when your app needs image generation, a language model, and external data sources all working together.
Watch the IO announcements specifically for: native video output in Gemini, the Omni model’s actual capabilities (especially audio I/O), and whether the Team Food memory feature ships in a form that’s usable in production workflows. Those three would meaningfully change what’s possible in AI Studio and Gemini-based apps.
The Broader Context for App Builders
The pattern across both Google and OpenAI right now is the same: image generation is moving from a standalone tool you call separately into something embedded in the app-building flow itself. Codex generates assets during a /goal run. Nano Banana generates them as the app scaffolds. The friction of “build the app, then go find images for it” is being removed.
This is a meaningful shift for anyone building AI-native apps. The asset pipeline — which used to be a separate design workflow — is becoming part of the development workflow. That changes how you think about prototyping. A rough spec can now produce a rough app with rough-but-functional visual assets in one pass.
If you’re thinking about how this connects to full-stack app development more broadly, tools like Remy take a similar “compile from intent” approach at the code level: you write an annotated markdown spec, and it compiles into a complete TypeScript backend, SQLite database, auth, and deployment. The image generation layer and the code generation layer are converging on the same idea — the artifact you hand-write should be the spec, not the output.
For Google AI Studio specifically, the Nano Banana integration fits naturally alongside other recent Google tooling. If you’ve been using Google Stitch to build design systems and export to AI Studio, Nano Banana adds an image generation layer on top of that workflow — though the transparency gap means Stitch-generated design tokens and Nano Banana-generated assets may not always play together cleanly without post-processing.
Similarly, if you’re running Gemma 4 locally with Ollama for development work, the cloud-side AI Studio features like Nano Banana represent the hosted complement — faster iteration in the browser, with the trade-off that you’re working within Google’s infrastructure rather than your own.
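If you do go the local route, the Ollama side is a plain HTTP call. A minimal sketch in Python — the model tag is illustrative, so substitute whatever Gemma build you’ve actually pulled:

```python
import requests

# Minimal sketch: query a locally served Gemma model through Ollama's
# REST API. The tag below is illustrative — use the tag for the Gemma
# build you've pulled (e.g. via `ollama pull <tag>`).
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma3",  # substitute your local Gemma tag
        "prompt": "Write alt text for a dashboard hero image.",
        "stream": False,  # one JSON object instead of a token stream
    },
    timeout=120,
)
print(response.json()["response"])
```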
The Gemma 4 model family’s Apache 2.0 licensing is also worth keeping in mind as you evaluate how much of your stack you want to lock into Google’s hosted services versus running open-weight models you control. Nano Banana is a hosted feature — there’s no local equivalent yet.
The Honest Assessment
Nano Banana in AI Studio is useful today, with a real limitation. The redesigned edit tool is the more interesting piece — component-level visual control is a better workflow than whole-app regeneration. The transparency gap is annoying but workable.
The more interesting question is what IO brings. If the Omni model, the new video model, and the Team Food memory feature all land in usable form, the current Nano Banana integration will look like a small piece of a much larger system. If IO is mostly announcements with long waitlists and limited access, then what’s live in AI Studio right now is what you’re working with for a while.
Either way, the transparency limitation needs to be fixed. That’s the one thing that would make Nano Banana genuinely production-ready for UI asset generation rather than just useful for prototyping.