
How to Use Imagen 2 for E-Commerce Product Photography: A Practical Guide

Imagen 2 can place a single product image into unlimited scenes with consistent detail. Here's how to use it for e-commerce product photography.

MindStudio Team

What Imagen 2 Actually Does for Product Photography

E-commerce teams spend thousands of dollars per month on product photography. Renting studios, hiring photographers, sourcing props, booking models — it adds up fast, and it still takes days or weeks to get usable images. Imagen 2, Google’s image generation model, changes the math considerably.

The core capability that makes Imagen 2 useful for e-commerce is scene generation from a single product image. You shoot your product once — clean, well-lit, against a white or neutral background — and Imagen 2 can place that product into dozens of different scenes, environments, and contexts while preserving the product’s visual integrity. One photo session. Unlimited variations.

This guide covers how to actually use Imagen 2 for product photography: what it does well, how to get consistent results, prompt strategies that work, and where the tool has real limits you should know about before you commit time to it.


Understanding Imagen 2’s Core Capabilities

Imagen 2 is a text-to-image and image-editing model developed by Google DeepMind. It’s part of Google’s broader family of generative AI models and powers features across Google products, including Vertex AI and Google Gemini.

For product photography specifically, the relevant features are:

  • Inpainting and outpainting — the ability to replace parts of an image (like a background) or extend the image canvas
  • Image editing via text prompts — describing changes and having the model apply them
  • Product placement in scenes — taking a product and generating a contextually appropriate background or environment around it
  • High-fidelity detail preservation — maintaining brand colors, logos, textures, and product proportions

Imagen 2 is accessible through Google’s Vertex AI platform, through the Gemini API, and through several third-party tools and no-code platforms that have integrated it. Access via direct API is straightforward for developers; access via platforms like MindStudio is easier for non-technical users.

How It Compares to Traditional Product Photography Workflows

In a standard e-commerce photography workflow:

  1. Product arrives at studio
  2. Photographer sets up lighting for each scene
  3. Multiple shots per setup, multiple setups per product
  4. Photo editor processes RAW files
  5. Background removal and retouching
  6. Export and upload

That process can take 2–5 days per product batch and costs anywhere from $50 to several hundred dollars per final image depending on the studio and product complexity.

With Imagen 2, the workflow looks more like:

  1. Shoot product once on neutral background (or use existing product images)
  2. Remove background programmatically (or manually)
  3. Generate scene descriptions via prompt
  4. Review and select outputs
  5. Minor retouching if needed

The time and cost difference is significant. The quality gap still exists in some cases — especially for products where material texture, reflections, and fine details matter — but it’s narrowing.

What Makes Imagen 2 Different from Earlier Models

Earlier diffusion models like Stable Diffusion and the first generation of DALL-E struggled with product consistency across generations. Ask the model for “a red ceramic mug on a kitchen counter” twice, and you’d get two different mugs: similar in concept, but with inconsistent details such as handle shape, proportions, and color saturation.

Imagen 2 has better coherence between the input product image and the generated scene. It can be given a reference image and a scene description, and it does a better job of placing the original product (rather than generating a similar-looking product) into the new environment.

This consistency is what makes it practical for e-commerce at scale. You can run one product through fifty different scene prompts and get fifty images where the product looks like the same product, just in different contexts.


Setting Up Your Product Images for Imagen 2

The quality of your input image has a direct impact on output quality. Imagen 2 can work with a range of source images, but certain preparation steps significantly improve results.

Shoot for Extraction, Not Final Presentation

When photographing products specifically for AI-assisted scene generation, you’re optimizing for clean extraction rather than final aesthetics. That means:

  • Use a high-contrast background — pure white (#ffffff) or a neutral gray makes background removal cleaner. Avoid off-white or cream backgrounds that blend with light-colored products.
  • Even, diffused lighting — avoid harsh shadows that fall on the background. Side shadows can get picked up as part of the product during extraction.
  • Multiple angles — shoot front, 3/4, side, and top-down. Different scenes will call for different product orientations.
  • Shoot at high resolution — minimum 2000px on the shortest side. Imagen 2 outputs at high quality; starting with a low-res input limits how the final image renders.
  • Keep the product centered with breathing room — leave margin around the product. This gives the model room to blend edges into the new background.

Background Removal

Clean background removal is the most important pre-processing step. A sloppy extraction — fringe artifacts, missing parts of the product, halos — will show up in every generated scene.

Tools that work well for this:

  • Adobe Photoshop’s Remove Background — good for complex products with fine edges (jewelry, plants, transparent items)
  • remove.bg — fast, API-accessible, good for most standard products
  • Canva’s background remover — adequate for simple products
  • GIMP with Select by Color — free option for simple backgrounds

For products with complex geometry, transparent materials, or fine details like hair or plant leaves, manual masking or at least manual cleanup of automated removal is worth the time.

Prepare a Clean PNG with Transparency

Export your extracted product as a PNG with transparent background. This is the file you’ll feed into Imagen 2’s editing workflow.

Some platforms take a flat JPG and handle background removal internally. But starting with a clean transparent PNG gives you more control and usually produces better blending at product edges.


Crafting Prompts That Work

Prompt quality determines output quality. There’s a meaningful difference between vague prompts and well-structured scene descriptions when it comes to e-commerce product photography.

The Basic Prompt Structure

For placing a product in a scene, a useful prompt structure is:

[Scene/environment] + [lighting conditions] + [surface/props] + [mood/style] + [technical photography descriptors]

Example for a skincare product:

“Marble bathroom countertop, soft natural window light from the left, small eucalyptus branches nearby, clean and minimal aesthetic, product photography, 85mm lens, shallow depth of field”

Example for a coffee mug:

“Rustic wooden table in a warm café, morning sunlight streaming through a window, a book open in the background, cozy lifestyle photography, warm tones”

Example for a sneaker:

“Urban concrete surface, overcast outdoor light, slight motion blur in background suggesting city movement, editorial sports photography”
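The five-part structure above can be wrapped in a small helper, which also helps enforce consistent prompt language across a catalog (more on that later in this guide). This is a sketch, not any official API; the function and parameter names are illustrative:

```python
def build_scene_prompt(scene, lighting, surface, mood, technical, negative=None):
    """Assemble a scene prompt from the five-part structure:
    [scene/environment] + [lighting] + [surface/props] + [mood/style]
    + [technical photography descriptors].
    Empty parts are skipped. Returns (prompt, negative_prompt)."""
    parts = [scene, lighting, surface, mood, technical]
    prompt = ", ".join(p.strip() for p in parts if p and p.strip())
    return prompt, negative or ""

# Reproduces the skincare example above:
prompt, neg = build_scene_prompt(
    scene="Marble bathroom countertop",
    lighting="soft natural window light from the left",
    surface="small eucalyptus branches nearby",
    mood="clean and minimal aesthetic",
    technical="product photography, 85mm lens, shallow depth of field",
    negative="text, watermarks, hands, other products",
)
```

Storing the five parts separately (in a spreadsheet or config file) rather than as finished prompt strings makes it easy to swap the lighting or surface for a whole scene category at once.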

What to Include in Your Prompts

Environment specifics — don’t just say “outdoor scene.” Say “forest floor covered in autumn leaves” or “rooftop terrace with city lights at dusk.” Specificity gives the model more to work with.

Lighting descriptions — lighting is one of the most impactful things you can specify. “Soft golden hour light” produces a very different result than “harsh midday sunlight” or “cool studio lighting with a ring light.”

Surface material — specify what the product is sitting on: “weathered oak table,” “brushed aluminum surface,” “folded linen napkin,” “wet stone.”

Negative prompts — Imagen 2’s Vertex AI interface supports negative prompts. Use these to exclude things you don’t want: “no text, no watermarks, no other products, no hands, no distortion.”

Photography style terms — phrases like “product photography,” “editorial,” “lifestyle photography,” “flat lay,” “hero shot” signal the composition and treatment you’re after.

Prompts to Avoid

Some prompt patterns consistently produce poor results:

  • Too abstract — “Make it look premium” doesn’t give the model anything to work with. Describe what “premium” means visually.
  • Conflicting cues — “Rustic outdoor scene with ultra-modern minimalist aesthetics” sends conflicting signals.
  • Overly complex scenes with many specific elements — asking for twelve props, a particular window, a dog in the background, and an exact wall color tends to produce a cluttered image where elements compete for attention.
  • Describing the product itself in the scene prompt — if you’re providing a reference image, don’t re-describe the product. Focus only on the environment. Describing the product again can cause the model to generate a slightly different version.

Iterating on Prompts

Plan for iteration. Even well-crafted prompts rarely produce a perfect image on the first generation. A practical approach:

  1. Generate 4–8 images from a single prompt
  2. Identify what’s working (good lighting, right surface, correct mood)
  3. Identify what’s not working (wrong scale, clashing tones, edge artifacts)
  4. Refine the prompt and regenerate
  5. Repeat 2–3 times until you have a strong output

Keep notes on prompts that produced good results for specific product categories. Over time, you’ll build a library of reliable starting prompts for your product types.


Step-by-Step: Running a Product Photography Session with Imagen 2

Here’s a concrete walkthrough of using Imagen 2 for a product — in this case, a glass water bottle.

Step 1: Prepare Your Source Image

  • Start with a hero product photo: glass water bottle centered on white background, natural overhead/side lighting
  • Run background removal via remove.bg or Photoshop
  • Clean up any fringe artifacts manually
  • Export as PNG with transparent background at 2000x2000px minimum

Step 2: Access Imagen 2

You can access Imagen 2 through several channels:

Via Google Vertex AI:

  • Go to console.cloud.google.com
  • Enable the Vertex AI API for your project
  • Navigate to Vertex AI > Model Garden > Imagen
  • Use the “Edit Image” or “Generate” endpoints depending on your use case

Via Gemini API:

  • Access through Google AI Studio (aistudio.google.com)
  • Use the imagegeneration@006 model or the latest Imagen 2 version
  • API calls accept an image upload + text prompt

Via no-code platforms:

  • MindStudio and similar platforms have Imagen 2 integrated directly
  • No API keys or cloud project setup required
  • Easier to batch-process multiple products and scenes

Step 3: Set Up the Image Editing Task

In Vertex AI:

  • Select “Edit Image” (not “Generate”)
  • Upload your transparent-background PNG
  • Set the editing mode to “Product Image Generation” or use the inpainting/outpainting functionality
  • Input your scene prompt
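Doing the same setup programmatically means collecting the editing parameters in one place. The helper below is a sketch: its parameter names only loosely mirror the Vertex AI options, and the SDK call is left as a comment because exact model names and method signatures vary across `google-cloud-aiplatform` versions — check the current documentation before relying on them.

```python
def build_edit_request(prompt, negative_prompt="", sample_count=4,
                       edit_mode="product-image"):
    """Collect the parameters for one Imagen edit call. Names here are
    illustrative, not the SDK's exact keyword arguments."""
    if not 1 <= sample_count <= 8:
        raise ValueError("generate between 1 and 8 samples per call")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "number_of_images": sample_count,
        "edit_mode": edit_mode,
    }

# A call through the Vertex AI Python SDK looks roughly like this
# (sketch only -- verify against current google-cloud-aiplatform docs):
#
#   import vertexai
#   from vertexai.preview.vision_models import Image, ImageGenerationModel
#
#   vertexai.init(project="your-project", location="us-central1")
#   model = ImageGenerationModel.from_pretrained("imagegeneration@006")
#   images = model.edit_image(
#       base_image=Image.load_from_file("bottle_transparent.png"),
#       prompt="Rustic wooden table in a warm cafe, morning sunlight, "
#              "cozy lifestyle photography",
#       number_of_images=4,
#   )
#   images[0].save("bottle_cafe_1.png")
```

Keeping the request construction separate from the API call makes it easy to log every prompt/parameter combination you run, which pays off when you start building a prompt library.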

Step 4: Run Initial Generations

Run 4–8 variations at once. In Vertex AI, you can set the sample count parameter to generate multiple outputs from one prompt. This saves iteration time.

Review outputs for:

  • Product integrity (does it still look like your product?)
  • Edge blending (do the product edges blend naturally into the scene?)
  • Lighting consistency (does the product’s lighting match the scene’s light source?)
  • Scene quality (does the background look realistic and match your brand aesthetic?)

Step 5: Refine and Select

Shortlist 2–3 candidates per scene type. For borderline outputs:

  • Minor edge artifacts: fix in Photoshop or Canva
  • Color inconsistency in the scene: add color grading in post
  • Scale issues (product too large or small relative to scene): crop/resize to reframe

Step 6: Quality Check Against Brand Standards

Before any image goes to your product listing or marketing materials:

  • Check color accuracy against your brand’s color palette
  • Verify product proportions look natural (distortion is a common Imagen 2 issue with irregular shapes)
  • Confirm no text or logo on the product has been altered or distorted
  • Check that the image resolution and aspect ratio meet platform requirements (Amazon, Shopify, Instagram all have different specs)
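The resolution and aspect-ratio check is easy to automate at the end of a batch run. The minimum sizes in this sketch are illustrative placeholders, not official platform requirements — confirm the current specs in each platform's own image guidelines before using them as gates:

```python
# Illustrative values only; check each platform's current image guidelines.
PLATFORM_SPECS = {
    "amazon_main": {"min_px": 1600, "aspect": (1, 1)},
    "shopify": {"min_px": 2048, "aspect": (1, 1)},
    "instagram_feed": {"min_px": 1080, "aspect": (1, 1)},
}

def meets_spec(width, height, platform, tolerance=0.01):
    """Check a generated image's pixel dimensions and aspect ratio
    against a target platform before upload."""
    spec = PLATFORM_SPECS[platform]
    if min(width, height) < spec["min_px"]:
        return False
    target = spec["aspect"][0] / spec["aspect"][1]
    return abs(width / height - target) <= tolerance
```

Images that fail only on resolution can usually be rescued with an upscaler; images that fail on aspect ratio need a recrop, which is worth doing before any color grading.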

Use Cases Across E-Commerce Categories

Imagen 2 performs differently across product categories. Here’s a practical breakdown of where it excels and where it struggles.

Apparel and Footwear

Works well for:

  • Flat lay photography where the garment is laid out and surroundings are generated
  • Footwear placed on various surfaces (concrete, grass, studio floor)
  • Accessories placed in lifestyle contexts

Challenges:

  • Fabric texture and drape are hard to preserve perfectly when placing clothing in new environments
  • Complex garment shapes (ruffles, complex folds) often get subtly distorted
  • Color matching for textiles can drift — a dusty blue hoodie might render as slightly different shades across generations

Practical tip: For clothing, focus on flat lay and product-only shots for Imagen 2 generated scenes. For on-model shots, you’re likely better off with traditional photography or dedicated virtual try-on tools.

Beauty and Skincare

This is one of Imagen 2’s strongest categories for e-commerce. Bottles, tubes, compacts, and dropper bottles have defined shapes that hold up well across generations.

Works well for:

  • Minimal marble or bathroom counter scenes
  • Lifestyle flat lays with botanical props
  • Seasonal scenes (holiday, summer, etc.)
  • Stack/arrangement shots with multiple products

Challenges:

  • Transparent or partially transparent packaging (clear serums, glass bottles) can lose transparency and look opaque in some generations
  • Reflective metallic packaging can look inconsistent

Practical tip: Shoot products in slightly angled 3/4 view rather than straight-on. This gives Imagen 2 better edge data to work with and produces more natural-looking placements.

Food and Beverage

Works well for:

  • Packaged goods placed on kitchen counters, wooden tables, or café contexts
  • Canned or bottled beverages in lifestyle settings
  • Branded cups and containers in café scenes

Challenges:

  • Actual food (not packaged) is notoriously difficult — Imagen 2 tends to generate food that looks slightly artificial or styled in a way that reads as “AI-generated”
  • Liquids in transparent containers can behave unpredictably

Practical tip: For packaged products, this category works very well. For images where the actual food is the hero, consider using AI generation for backgrounds only and keeping food photography traditional.

Home Goods and Furniture

Works well for:

  • Small to medium items: candles, ceramics, kitchen tools, decorative objects
  • Room setting shots for small items

Challenges:

  • Large furniture is difficult because scale relationships are hard to get right
  • Anything requiring perspective matching (a sofa against a specific wall angle) often produces spatially inconsistent results

Practical tip: Use Imagen 2 for accent and detail shots of furniture pieces (close-up of texture, small accessories nearby) rather than full room staged shots. For full room staging, specialized AI tools like Virtual Staging AI are built specifically for that use case.

Electronics

Works well for:

  • Lifestyle placements (laptop on desk, earbuds near a workout bag)
  • Hero shots with minimal background manipulation

Challenges:

  • Small text, buttons, and screen content on devices often get distorted or regenerated incorrectly
  • Reflective screens can render with inconsistent reflections

Practical tip: Mask out any screens before using Imagen 2. Generate the scene around the device, then composite the actual screen content back in using photo editing software.
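The composite step in that tip is a simple masked paste. The toy version below operates on nested lists of RGB tuples so the logic is visible; in practice you would do the same thing in an editor or with Pillow's `Image.paste(screen, box, mask)` using an alpha mask over the screen area:

```python
def composite_region(scene, screen, mask):
    """Paste original screen pixels back over a generated scene wherever
    the mask is truthy. All three inputs are row-major lists with
    identical dimensions; scene and screen hold (r, g, b) tuples, mask
    holds 0/1 values."""
    out = []
    for scene_row, screen_row, mask_row in zip(scene, screen, mask):
        out.append([screen_px if m else scene_px
                    for scene_px, screen_px, m in zip(scene_row, screen_row, mask_row)])
    return out
```

The important detail is that the mask comes from your original product shot (the known screen region), so the pasted pixels are guaranteed to be the real UI rather than a regenerated approximation.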


Maintaining Brand Consistency at Scale

One of the practical problems with using generative AI for product photography is maintaining consistent look and feel across a catalog. If every product image looks like it was generated from a different aesthetic, your product pages look incoherent.

Define a Scene Library Before You Start

Before generating images at scale, define 5–10 scene templates for your brand. For each scene, document:

  • Full prompt text
  • Any negative prompts
  • Target aspect ratio and resolution
  • The aesthetic category (e.g., “minimal studio,” “lifestyle outdoors,” “seasonal warm”)
  • Sample reference images to show the target look

Test each scene template across 3–5 different products before committing to it. Some scenes that look great for one product category won’t work for another.

Use Consistent Prompt Language

Slight variations in how you describe the same scene can produce noticeably different outputs. Standardize your prompt language:

  • Decide on consistent terms for lighting (e.g., always use “soft natural light from the left” rather than switching between “natural lighting” and “window light”)
  • Keep surface descriptions consistent within scene categories
  • Lock in your photography style terms and use them in every prompt for that scene type

Post-Processing for Brand Cohesion

Even with consistent prompting, AI-generated images will have subtle variations in color temperature, contrast, and saturation. Applying a consistent Lightroom preset or color grade to all product images before publishing helps unify the look.

Create a simple preset that:

  • Aligns white balance to your brand’s typical image temperature
  • Applies consistent contrast and shadow treatment
  • Gently harmonizes color saturation to match your brand palette

This 30-second post-processing step on each image makes a significant difference in how unified your catalog looks.


Automating Product Photography Workflows with MindStudio

Running individual images through Imagen 2 manually works fine when you have ten products. When you have a hundred — or when you’re adding new SKUs every week — manual processing doesn’t scale.

MindStudio’s AI Media Workbench includes direct access to Imagen 2 (and other image generation models) as part of a broader visual production environment. You can chain together the steps that currently require manual coordination:

  1. Ingest product images from a folder, Airtable, Shopify, or Google Drive
  2. Remove backgrounds automatically as a workflow step
  3. Generate scene variations using Imagen 2 with pre-configured prompt templates
  4. Run quality checks or sort outputs by confidence score
  5. Push final images to your product catalog, Slack for review, or an asset management system

For a business that regularly needs to generate product imagery — a brand with seasonal campaigns, a marketplace with many seller SKUs, an agency working across multiple clients — this kind of workflow removes most of the manual overhead.

The workflow builder in MindStudio is visual and no-code. You connect steps by dragging and dropping, configure prompts and parameters per step, and set triggers (manual run, scheduled, webhook, or when a new file appears in a connected folder).

You can also use MindStudio to run different scene prompt variations across a product at once, then output a structured review where someone approves or rejects each variation before it moves to the next stage. This keeps humans in the loop without requiring them to manually run each generation.

If you’re already doing AI-assisted content work across your business — copy generation, social media, product descriptions — centralizing image generation in the same platform simplifies your stack. MindStudio handles 200+ AI models in one place, so your product photography workflow can sit alongside your copywriting and SEO workflows without needing separate tools and accounts.

You can try MindStudio free at mindstudio.ai.


Common Problems and How to Fix Them

Imagen 2 is capable, but it’s not perfect. Here are the failure modes you’ll run into most often and what to do about them.

Product Edges Don’t Blend Into the Scene

Symptom: There’s a visible halo or hard edge around the product where it meets the generated background. The product looks “pasted in.”

Causes:

  • Imprecise background removal (fringe pixels from the original background are still present)
  • High contrast between the product edge and the generated scene
  • The model didn’t blend lighting consistently between product and scene

Fix:

  • Improve background removal quality before input
  • In the prompt, specify that the product should cast a natural shadow: “product casting a soft shadow on the surface”
  • Use the feathering or edge-softening option if your platform supports it
  • In post-processing, use Photoshop’s blend mode tools to soften the product edge

Product Details Are Distorted or Changed

Symptom: The label text is wrong, a logo has changed, the product shape looks subtly different, or color has drifted.

Causes:

  • The model is generating a “similar” product rather than placing the actual product
  • High-frequency detail (fine text, complex patterns) is being smoothed or reinterpreted

Fix:

  • Use the “reference image strength” or similar parameter to increase fidelity to the input image
  • Mask text and logo areas before generation, then composite them back after
  • For color drift, do color correction in post-processing rather than trying to solve it in the prompt

Lighting Doesn’t Match

Symptom: The product looks like it was lit from a different direction or with different quality of light than the generated background.

Causes:

  • The source product image has strong directional lighting that doesn’t match the generated scene’s light direction
  • Studio lighting in the product image (soft, even, shadowless) doesn’t look natural in lifestyle scenes

Fix:

  • Specify light direction explicitly in prompts: “soft light from the upper left”
  • Shoot your source product images with natural (not studio) lighting when possible, especially if you plan to use lifestyle scene generation
  • In post-processing, add a subtle shadow or light overlay on the product to match the scene’s lighting

Generated Scene Looks Artificial

Symptom: The background looks like an AI image — slightly too perfect, surfaces that aren’t quite real-looking, inconsistent depth of field.

Causes:

  • Over-specified prompts that produce artificial perfection
  • The model defaulting to a “stock photo” aesthetic
  • Low variation in seed/generation parameters

Fix:

  • Add slight imperfections to your scene prompts: “slightly worn wooden table,” “subtle dust on the surface,” “a few natural marks”
  • Include depth of field cues: “foreground blur, background slightly out of focus”
  • Generate more variations and select the most realistic-looking one rather than taking the first output

Scale and Proportion Are Off

Symptom: The product looks too large or too small relative to the scene. A coffee mug looks the size of a dinner plate.

Causes:

  • The model doesn’t have perfect spatial reasoning about product sizes
  • Unusual camera angles or perspectives in the source image

Fix:

  • Include scale references in prompts: “a small 12oz glass bottle” or describe the surface it’s on with size context
  • Crop and reframe the output rather than trying to regenerate — sometimes a slightly different crop makes the proportions read correctly
  • Avoid dramatic perspective angles in source images; straight-on or gentle 3/4 angles give the model better reference data

Frequently Asked Questions

Is Imagen 2 the same as Imagen 3?

Imagen 2 and Imagen 3 are sequential versions of Google DeepMind’s image generation model. Imagen 3, announced in 2024, offers improved photorealism, better prompt adherence, and reduced artifacts compared to Imagen 2. As of 2025, Imagen 3 is available through Google’s Gemini API and Vertex AI, and is the version you’ll access when using Google’s current image generation capabilities. The techniques and workflow in this guide apply to both versions — Imagen 3 simply produces better results with the same approach.

Can Imagen 2 generate product photos with people or models wearing the product?

Imagen 2 can generate images of people, but it’s subject to Google’s safety policies, which include restrictions on generating specific real people. For lifestyle images that include models interacting with or wearing a product, the results can vary in quality and consistency. Dedicated virtual try-on tools (like those built on Google’s own Virtual Try-On model) are better suited for apparel on-model shots. For general lifestyle scenes where a person is holding or using the product in the background, Imagen 2 performs reasonably well.

How much does it cost to use Imagen 2 for product photography?

Pricing depends on how you access the model. Through Vertex AI, image generation is priced per image — as of 2025, pricing is in the range of $0.02–$0.04 per image depending on resolution and the specific operation (generation vs. editing). Through platforms that include Imagen 2 access, pricing is typically wrapped into a platform subscription. At scale — generating hundreds of product images per month — AI-generated product photography costs a fraction of traditional studio photography.

What image formats and resolutions does Imagen 2 support?

Imagen 2 supports JPEG and PNG input formats. For image editing tasks (background replacement, inpainting), input images should be at least 512x512px, though 1024x1024 or higher produces better quality outputs. The model outputs at up to 1024x1024 in standard configurations, with higher resolution available through upscaling tools. For e-commerce platforms that require 2000x2000 or larger images, running outputs through an AI upscaler (like Magnific, Topaz Gigapixel, or Canva’s upscale) is a common workflow step.

Can I use Imagen 2-generated images commercially?

According to Google’s terms for Vertex AI and the Gemini API, customers own the outputs they generate using the API, and commercial use is permitted. You should review the current terms of service for the specific access method you’re using, as these can change. Images generated through third-party platforms that use the Imagen 2 API are typically governed by both Google’s and the platform’s terms.

How do I maintain product color accuracy across multiple Imagen 2 generations?

Color accuracy is one of the more challenging aspects of AI-generated product photography. The most reliable approach is a combination of two things: first, run a color correction step in post-processing to align each generated image to a known reference color value for the product; second, use consistent prompt language that avoids color-shifting scene descriptions (avoid “warm golden light” for products where accurate blues or purples matter — that light will shift those colors visually). Some teams also establish a color grading preset specifically designed to restore product colors to accurate values after generation.
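The first half of that approach — aligning a measured swatch to a known reference value — can be as simple as a per-channel gain. The sketch below works on a flat list of RGB tuples and is deliberately naive (production pipelines use color-managed tools, and a pure gain can shift the scene along with the product), but it illustrates the correction:

```python
def correct_channels(pixels, measured, reference):
    """Scale each RGB channel so a measured product swatch lands on its
    known reference value, applying the same per-channel gains to every
    pixel. `measured` is the swatch's average color in the generated
    image; `reference` is the product's true color."""
    # Guard against divide-by-zero on fully black measured channels.
    gains = [ref / max(meas, 1) for ref, meas in zip(reference, measured)]
    return [tuple(min(255, round(c * g)) for c, g in zip(px, gains))
            for px in pixels]
```

In a real workflow you would sample the swatch only from the product region (using the extraction mask from pre-processing) and apply the gains to that region alone, leaving the generated scene untouched.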


Key Takeaways

  • Imagen 2 is most valuable for scene generation around an existing product image — not full product creation from scratch. Start with a clean, well-shot product image and let the model handle the environment.

  • Input quality determines output quality. Invest time in clean background removal and proper source image preparation before running any generation.

  • Prompts should be specific about environment, lighting, surface, and photography style — and should avoid over-describing the product itself, which causes the model to regenerate rather than place it.

  • Consistency at scale requires discipline: standardize your scene library, use consistent prompt language, and apply post-processing color grading to unify your catalog’s look.

  • Imagen 2 works best for packaged goods, beauty products, and small home goods — and has more challenges with apparel, food, and large furniture.

  • Automating the workflow with a platform like MindStudio removes the manual overhead once you’re working with more than a handful of products. You can build an end-to-end pipeline from image ingestion to final delivery without writing any code.

For teams ready to move beyond manual image generation and build a scalable AI-powered product photography operation, MindStudio is worth exploring. You can build your first automated workflow for free and see how it fits your catalog process before committing.