
Bach Model vs. LTX 2.3 + IC Loras: Which Gives You Better Character Consistency?

Bach targets facial consistency out of the box. LTX 2.3 with IC and ID Loras does it in ComfyUI. Here's which approach actually holds up.

MindStudio Team

Two Ways to Chase Character Consistency in AI Video — and Only One Is Actually Accessible

Bach (Video Rebirth) versus LTX 2.3 with IC Loras, ID Loras, and a ComfyUI prompt relay workflow: these are the two most interesting answers right now to the same stubborn problem in AI video. That problem is keeping a character’s face consistent across shots without spending a fortune on Seedance 2.0 or hand-stitching every cut in post.

You’ve probably felt this. You generate a great first shot, then the second shot has a slightly different nose. The third shot is a different person entirely. Every model promises consistency. Almost none deliver it reliably. So when two genuinely different approaches to solving this land in the same week, it’s worth pulling them apart carefully rather than just watching the demo reels.

The comparison here is not Bach versus LTX 2.3 as raw models. It’s Bach’s out-of-the-box character consistency pipeline versus the IC Loras + ID Loras + prompt relay ComfyUI workflow that’s been circulating on r/stablediffusion — specifically the one posted by user brief_8831, available as a JSON drag-and-drop on Civitai. Two philosophies. Two very different onboarding experiences. Meaningfully different results depending on what you’re actually trying to do.


What “Character Consistency” Actually Means in Practice

Before comparing tools, it’s worth being precise about what you’re measuring. “Character consistency” gets used loosely, and the looseness lets a lot of mediocre tools hide behind it.

There are at least three distinct things people mean when they say it:

Facial identity lock. Does the character’s face stay recognizably the same person across multiple shots? Not just the same hair color — the same bone structure, the same eyes, the same proportions.

Style and costume continuity. Do the clothes, the color palette, the visual style hold across cuts? This is a separate problem from facial identity and often breaks independently.

Temporal coherence within a single clip. Does the face drift mid-shot? Does the character’s tattoo appear and disappear? Does the camera movement introduce artifacts that break the illusion?

Bach is primarily targeting the first of these. The IC Loras + ID Loras + prompt relay workflow is trying to solve all three simultaneously, which is why it’s more complex and why the ComfyUI setup is genuinely intimidating.

The distinction matters because if you only need facial identity across shots and you’re comfortable with some style drift, Bach might be enough. If you need a coherent visual world — same character, same look, same feel, shot after shot — you’re in ComfyUI territory whether you like it or not.


The Dimensions That Actually Separate These Approaches

Accessibility. Bach is a web platform. You log in, upload an image, write a prompt, generate. The IC Loras workflow is a ComfyUI JSON that you drag into your node editor, then spend time resolving missing nodes, downloading the right model weights, and debugging whatever errors your specific setup throws. These are not comparable experiences.

Cost structure. Bach charges $12/month for 800 credits, $40 for 3,000, or $108 for 120,000. LTX 2.3 can be run locally for free if you have the hardware, or via API at rates substantially lower than Seedance 2.0, which is rumored to be a 200-billion-parameter model; that scale would explain both why it produces the best results and why it costs what it does.

Output resolution and duration. Bach currently offers 720p or 1080p at 6 seconds per clip on the free tier. The LTX 2.3 workflow doesn’t have those hard limits, though your local VRAM and patience are the practical ceiling.

Consistency mechanism. Bach is doing this at the model level — it’s trained or fine-tuned to prioritize facial and character consistency. The IC Loras approach is doing it at inference time through in-context conditioning: you’re feeding the model reference frames and identity information via the Lora structure, then using prompt relay to lock style across the timeline. These are architecturally different bets.

Ceiling. Bach’s ceiling is Bach’s model. The LTX 2.3 workflow’s ceiling is whatever LTX 2.3 can do, plus however well the IC and ID Loras are trained, plus how carefully you’ve constructed your ComfyUI graph. Higher ceiling, much higher floor of required skill.


Bach: What It Actually Does and Where It Breaks

Bach from Video Rebirth is a new model — it literally just dropped — so any assessment here is a preview, not a verdict. The pitch is simple: consistent characters, no complicated setup, web interface, done.


The early tests are instructive. Anne Hathaway, run through Bach, comes out looking like Anne Hathaway almost all the way through. That’s actually a meaningful result. Most models drift significantly on celebrity likenesses because the training data is inconsistent. Bach holds the face.

But Lucy Liu tells a different story. The frontal shots hold. The side profile breaks. The crying scene recovers. What you’re seeing is that Bach’s consistency mechanism is strongest when the face is in a favorable orientation — roughly frontal, reasonably lit — and degrades on profiles and extreme angles. That’s not a Bach-specific failure; it’s the hard version of the problem that no model has fully solved. But it does mean you’ll still be cutting around bad shots even with a model that’s explicitly designed to avoid them.

The montage feature is interesting in theory — it looks like it’s designed to string consistent-character clips together — but the free tier ran out of credits before a full test was possible. The style presets (real estate, movie trailer, social media) suggest Bach is targeting content creators who want a fast pipeline, not researchers or power users.

The 6-second duration limit on the free tier is a real constraint. Whether that extends on paid tiers isn’t confirmed, but $12/month for 800 credits at 6 seconds per clip is a limited production budget. If you’re making anything longer than a short social clip, you’re doing math on credit costs pretty quickly.
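That math is easy to sketch. The tier prices below come straight from the figures above; the per-clip credit cost is not published anywhere in this piece, so `credits_per_clip` is a purely hypothetical placeholder.

```python
# Back-of-the-envelope cost-per-credit math for Bach's published tiers.
# Prices are the ones quoted in this article; credits_per_clip is an
# ASSUMPTION -- Bach has not published what a 6-second clip consumes.

tiers = {
    "starter": (12, 800),       # ($/month, credits)
    "mid":     (40, 3_000),
    "top":     (108, 120_000),
}

for name, (price, credits) in tiers.items():
    per_credit = price / credits
    print(f"{name}: ${per_credit:.4f} per credit")

# If a 6-second clip cost, say, 10 credits (hypothetical), the starter
# tier would buy 80 clips -- eight minutes of raw footage per month.
credits_per_clip = 10  # hypothetical
clips_on_starter = 800 // credits_per_clip
print(f"starter tier at {credits_per_clip} credits/clip: {clips_on_starter} clips")
```

The interesting number is the spread: the top tier's cost per credit is a small fraction of the starter tier's, which is typical of credit-based pricing and matters once you move past occasional use.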

One thing Bach doesn’t have to worry about: the LTX 2.3 limitation where shots shorter than roughly 2 seconds cause lip-sync problems. That’s a known issue with LTX that requires a workaround — either half-timing short clips or adding a 2-second tail and cutting it in post. Bach sidesteps this entirely because it’s a different model with different architecture.
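The tail workaround is simple to plan for. This is a minimal sketch of the generate-long-then-trim logic, assuming the rough 2-second threshold cited above; the function name and return shape are illustrative, not part of any LTX tooling.

```python
# Sketch of the "2-second tail" workaround for LTX's short-shot lip-sync
# issue: when the target shot is under the stability threshold, generate
# longer than needed, then trim back to the target length in post.
# The 2.0-second threshold is the rough figure cited in this article.

MIN_STABLE_SECONDS = 2.0
TAIL_SECONDS = 2.0

def plan_generation(target_seconds: float) -> dict:
    """Return how long to generate and where to cut for a target shot length."""
    if target_seconds >= MIN_STABLE_SECONDS:
        return {"generate": target_seconds, "trim_at": None}
    # Too short for stable lip sync: render with a tail, cut it afterwards.
    return {"generate": target_seconds + TAIL_SECONDS, "trim_at": target_seconds}

print(plan_generation(1.5))  # {'generate': 3.5, 'trim_at': 1.5}
print(plan_generation(4.0))  # {'generate': 4.0, 'trim_at': None}
```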


LTX 2.3 + IC Loras + ID Loras: The Power User Path

The brief_8831 workflow on r/stablediffusion is the most complete public implementation of what LTX 2.3 can do for character consistency right now. It combines three things: IC Loras (in-context Loras, which condition the model on reference frames), ID Loras (identity Loras, which transfer facial and voice identity), and a prompt relay custom node that locks style and timing across the generation timeline.

The results in the demo are genuinely impressive. The Terminator-style GPT-6 comedy bit that circulated from this workflow shows consistent character rendering across multiple shots with coherent style — the kind of thing that would have required significant post-production work to achieve even six months ago.

But here’s the honest part: if you have ComfyUI anxiety, this workflow will not cure it. It will make it worse. The JSON is complex. The node graph is large. When you drag it in, you will get errors. You will need to resolve missing custom nodes, download the right model checkpoints, and probably spend an afternoon getting it to run before you generate a single frame. That’s not a criticism of the workflow — it’s a description of the reality of ComfyUI power-user territory.

The workflow is available on Civitai as a JSON file. Drag it into ComfyUI and it populates. That’s the easy part. Everything after that requires familiarity with the ecosystem.
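One way to shorten the missing-nodes debugging loop is to inspect the JSON before you drag it in. This is a hedged helper, not part of the brief_8831 workflow: it lists every node type in a parsed workflow dict so you can see up front which custom nodes you will need to install. It assumes the two common ComfyUI layouts (the UI export with a `"nodes"` list, and the API format keyed by node id); the sample node names are illustrative.

```python
# Triage a downloaded ComfyUI workflow before loading it: list every node
# type so missing custom nodes are visible up front. Handles the UI export
# format ("nodes": [{"type": ...}]) and the API format ({"<id>":
# {"class_type": ...}}). Node names in the sample are illustrative only.

def list_node_types(workflow: dict) -> set[str]:
    if "nodes" in workflow:  # UI export format
        return {n["type"] for n in workflow["nodes"]}
    return {n["class_type"] for n in workflow.values()
            if isinstance(n, dict) and "class_type" in n}

# Tiny inline dict standing in for json.load(open("workflow.json")).
sample = {"nodes": [{"type": "KSampler"}, {"type": "LTXVLoader"},
                    {"type": "PromptRelay"}]}
print(sorted(list_node_types(sample)))
```

Anything in that set that isn't a stock ComfyUI node is something you'll be hunting down in ComfyUI-Manager or on GitHub before the graph runs.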


The LTX 2.3 video-to-video controls — pose, depth, and edge modes — add another layer of capability that Bach doesn’t have. Pose mode applies your input video’s movement to a static image. Depth mode preserves camera movement for motion shots. Edge mode preserves the structural edges of the input. These aren’t part of the IC Loras workflow specifically, but they’re available in LTX Studio right now and will presumably hit the open-source release soon. The combination of video-to-video controls plus IC Loras conditioning is where LTX 2.3’s ceiling actually lives.

The Starship Troopers test from the source material is a good illustration of both the capability and the workarounds required. To get consistent facial identity, the clip had to be run backwards — because the original footage starts with establishing shots (a character’s back, someone running toward camera) rather than a clear face. The video-to-video process needs a strong facial reference at the start of the clip. Running it backwards puts the face-forward shot first, generates consistently, then you flip it back. It works, but it’s not obvious, and it’s the kind of thing you only know if you’ve spent time with the model’s failure modes.
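The reverse-then-flip trick above can be scripted with ffmpeg's standard `reverse` and `areverse` filters. This sketch just builds the argv lists rather than shelling out; the file names are placeholders, and step 2 is whatever your video-to-video workflow is.

```python
# Reverse a clip with ffmpeg so a clear face-forward frame comes first,
# run video-to-video on the reversed clip, then reverse the output back.
# -vf reverse / -af areverse are standard ffmpeg filters; note the reverse
# filter buffers the whole clip in memory, so keep clips short.

def reverse_cmd(src: str, dst: str, has_audio: bool = True) -> list[str]:
    """Build the ffmpeg argv to reverse a clip (used before and after generation)."""
    cmd = ["ffmpeg", "-i", src, "-vf", "reverse"]
    if has_audio:
        cmd += ["-af", "areverse"]
    return cmd + [dst]

# Step 1: put the face-forward shot first.
print(" ".join(reverse_cmd("source.mp4", "reversed.mp4")))
# Step 2: run reversed.mp4 through the video-to-video workflow -> generated.mp4
# Step 3: flip the generated output back to normal playback order.
print(" ".join(reverse_cmd("generated.mp4", "final.mp4")))
```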

The cost advantage of LTX 2.3 is real. Running it locally costs you electricity and VRAM. API calls are cheap compared to Seedance 2.0. For anyone building a production pipeline around character-consistent video, that cost differential compounds fast. If you’re curious about how production pipelines get built around AI-generated video assets, the animated 3D websites with Claude Code and AI video generation post covers how these components get assembled into something deployable.


The Workflow Complexity Tax

There’s a real cost to the LTX 2.3 path that doesn’t show up in the credit pricing: your time.

Getting the IC Loras + ID Loras + prompt relay workflow running is not a one-hour project if you’re not already deep in ComfyUI. It’s a weekend project, possibly longer. And once it’s running, you’re still managing a complex graph where small changes can break things in non-obvious ways.

Bach’s entire value proposition is that it eliminates this tax. You pay in dollars instead of hours. Whether that trade is worth it depends entirely on your situation.

If you’re a solo creator making social content, Bach’s $12/month entry point and web interface are probably the right call. You’ll get 80% of the consistency benefit with 10% of the setup friction.

If you’re building a production pipeline — say, a character-driven series, a training dataset, or an automated content system — the LTX 2.3 workflow’s flexibility and lower per-generation cost justify the upfront investment. Platforms like MindStudio handle this kind of orchestration across 200+ models and 1,000+ integrations, which matters when you’re chaining video generation with other AI steps in a larger workflow rather than treating each generation as a one-off.

The open-source angle also matters for the LTX 2.3 path in a way it doesn’t for Bach. LTX 2.3 is open-source. The IC Loras and ID Loras are community-developed. The prompt relay node is a community contribution. You own the stack. Bach is a hosted service. If Video Rebirth changes pricing, changes the model, or shuts down, your pipeline breaks. With LTX 2.3, you have the weights.


Verdict: Which One for Which Situation

Use Bach if: You need character consistency for social-length content (under 6 seconds per clip), you don’t want to touch ComfyUI, and you’re comfortable with a hosted service. The $12/month entry point is reasonable for occasional use. Expect to cut around profile shots and extreme angles — the consistency holds best on frontal, well-lit faces.

Use the LTX 2.3 + IC Loras workflow if: You’re building anything longer than a few clips, you need style consistency in addition to facial identity, you want to run locally or keep API costs low, and you’re willing to spend the time getting ComfyUI configured correctly. The ceiling is substantially higher. The floor requires real work to reach.

Use neither if: You need the absolute best results right now and cost isn’t the constraint. Seedance 2.0 is still the best video model available, and its rumored 200B parameter scale explains why. The IC Loras workflow running on LTX 2.3 is impressive for what it costs. It is not Seedance 2.0.

One thing worth watching: the IC Loras + ID Loras approach will eventually filter down to platforms that abstract away the ComfyUI complexity. That’s how this ecosystem works — a workflow that requires a weekend to configure today becomes a one-click feature in six months. When that happens, the LTX 2.3 approach wins on every dimension except the ones where Bach has specifically trained for consistency. The question is whether Bach’s model-level consistency advantage holds up once the workflow complexity tax disappears from the LTX side.

For anyone building AI video workflows and thinking about the surrounding tooling (prompt engineering, model selection, output routing), the how to generate an AI video from an image tutorial covers the basics of getting a video generation pipeline running, and the GPT-5.4 vs Claude Opus 4.6 comparison is useful context for choosing a language model to drive prompt generation in an automated pipeline. And if you want a spec-driven system around any of this, something that takes character descriptions and outputs structured generation parameters, Remy compiles annotated markdown specs into full-stack TypeScript applications: one way to turn a video generation workflow into a deployable product rather than a local script.

The character consistency problem in AI video is not solved. Bach makes a real attempt at solving it at the model level. The IC Loras workflow makes a real attempt at solving it at inference time. Both are partial answers. Both are worth knowing about. Pick the one that matches your actual constraints, not the one with the better demo reel.

Presented by MindStudio
