How to Audit Your Job for AI Risk in 10 Days: The TCLD Framework Explained

One Sitting, 10 Days, Four Buckets — Know Exactly Where You Stand

Your calendar is full. Your manager is happy. The work is getting done. And you have a quiet, nagging suspicion that less and less of it actually needs you.

That suspicion is worth taking seriously. Not because the apocalypse is coming (the macro data doesn’t support that story), but because the pattern of how AI changes knowledge work is already visible if you know where to look. OpenAI and University of Pennsylvania researchers estimate that roughly 80% of US workers could have at least 10% of their tasks affected by language models. Anthropic’s Economic Index found that 49% of jobs have already had at least a quarter of their tasks performed using Claude. These aren’t distant forecasts. The first is an estimate of exposure to models that already exist; the second is a measurement of usage that has already happened.

The TCLD audit framework — tag every calendar item over 10 business days into Theater, Commodity, On-the-Line, or Durable buckets — is the fastest way to get an honest picture of your own exposure. One sitting. No consultants. No survey. Just your actual last two weeks, sorted into four categories that tell you which parts of your job are on thin ice and which parts are worth building around.

Here’s the full method.

What You’ll Know When You’re Done

The output of this audit is not a score. It’s a ratio.

Theater plus Commodity is the fraction of your current week where your personal claim on the work is weakest. That doesn’t mean the work disappears tomorrow. It means AI can already do a reasonable version of it, and when the next budget freeze or reorg forces the question — “why is this role bundled this way?” — those are the hours that won’t have a good answer.

The travel agent pattern is instructive here. Expedia didn’t erase that profession overnight. The routine booking layer eroded first. Nothing looked broken from the outside. The visible break came later, during downturns, when the industry was forced to admit what had already changed. A lot of knowledge work is sitting in exactly that lag right now. If you’re curious how AI agents are already reshaping personal productivity workflows, 6 AI Agents for Personal Productivity is a useful companion read.

The audit gives you the chance to see the lag before the organization sees it for you. That’s the whole point.

What You Need Before You Start

No special tools required. You need access to:

Your calendar for the last 10 business days
Your sent email folder
Your Slack DMs (or Teams, or whatever your org uses)
Any docs, tickets, code commits, spreadsheets, or memos where your actual work lives

That’s it. The audit runs on the raw material of your real week, not a cleaned-up version of it.

One honest prerequisite: you need to be willing to tag things accurately. The audit is private — no one sees it but you — but it only works if you don’t round up. The temptation is to tag things as more durable than they are because your professional identity is often built around the durable parts. Resist that. The whole value of the exercise is in the cold count.

If you want AI assistance with the tagging, it’s possible but requires chunking. A tool like Codex with computer use can help process calendar items or email threads, but you’ll need to feed it your definitions clearly and work in pieces — a two-week span across multiple UIs is hundreds of objects, too many for a single pass. The more useful move is to do the first pass yourself and use AI to help you spot patterns in what you’ve tagged. For a deeper look at how agents and workflows can be structured to assist with exactly this kind of pattern recognition, the WAT Framework breakdown is worth reading before you start.
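The chunking described above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended pipeline: the batch size of 25, the `DEFINITIONS` wording (paraphrased from the four buckets in this article), and the function names `chunk` and `build_prompt` are all assumptions made for the example. It only prepares prompt text; sending it to any particular tool is left out.

```python
from typing import Iterator, List

# Hypothetical one-line definitions, paraphrased from this article's four buckets.
DEFINITIONS = {
    "T": "Theater: performed by the organization, produces no examined value.",
    "C": "Commodity: real value, but does not need you specifically.",
    "L": "On the Line: part routine, part judgment you struggle to articulate.",
    "D": "Durable: output depends on something you cannot fully describe in advance.",
}


def chunk(items: List[str], size: int = 25) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_prompt(batch: List[str]) -> str:
    """Combine the bucket definitions and one batch into a single prompt string."""
    rules = "\n".join(f"{tag} = {text}" for tag, text in DEFINITIONS.items())
    numbered = "\n".join(f"{i}. {item}" for i, item in enumerate(batch, start=1))
    return f"Tag each item as T, C, L, or D.\n\n{rules}\n\nItems:\n{numbered}"


# Placeholder data standing in for two weeks of calendar items.
items = [f"calendar item {n}" for n in range(1, 61)]
prompts = [build_prompt(batch) for batch in chunk(items, size=25)]
print(len(prompts))  # 3 batches: 25 + 25 + 10
```

Repeating the definitions in every batch is deliberate: each pass then stands alone, so a tool that forgets earlier context still tags against the same rules.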

The Four-Bucket Method, Step by Step

Step 1: Open everything at once

Open your calendar. Open sent email. Open Slack. Open whatever medium contains your actual work output. Have them all visible or tabbed.

You’re going to go line by line, item by item. The unit of analysis is not the project, not the role, not the week. It’s the individual item: one meeting, one memo, one decision, one conversation, one document. Tag each one separately.

Now you have your raw material in front of you.

Step 2: Tag T — Theater

T is work that exists because the organization performs it, not because it produces examined value.

The test: if this work disappeared and the main consequence was that the organization had to admit it had been performing rather than producing, tag it T.

The status meeting where nothing changed. The deck nobody read carefully. The alignment call that produced no alignment but let everyone say alignment was attempted. The recurring update that gets sent because someone asked for it 18 months ago. The review process that once solved a real problem and has outlived it.

Theater is the hardest category to use honestly because tagging something T means admitting you spent professional time on something that didn’t need to happen. It’s uncomfortable. Do it anyway.

Microsoft researchers looked at 200,000 Bing Copilot conversations and found the most common work people bring to AI is gathering information and writing. Theater was already operating below the threshold of real human attention — if no one was reading the deck closely, a model can write the deck. AI doesn’t need to make theater great. It only needs to make it adequate, because adequate is what theater already was.

Now you have your T items tagged.

Step 3: Tag C — Commodity

C is real work that produces real value. It just doesn’t need you specifically.

Summarizing, routing, applying known rules to known situations, coordinating on things already decided, writing a status report someone does read but anyone could have written, turning a meeting into next steps, producing the first draft of a document where the shape is already well-known.

The test: could you write a spec and have someone else in your org produce an output that’s roughly as useful? If yes, it’s probably C.

This one stings differently than T. Commodity work often took years to get good at. You learned how to summarize messy context, how to route decisions, how to write the update that calms people down. That was real skill. The problem is that a skill can be real and still become less scarce. Markets don’t protect a skill because it once took a decade to build. They protect whatever is scarce now.

Now you have your C items tagged.

Step 4: Tag L — On the Line

L is the uncomfortable middle. Work that doesn’t cleanly fit into commodity or durable.

Pattern recognition where the patterns are structured. Relationship management that depends on history you carry. Editorial calibration in an established format. Routine synthesis across familiar inputs. Work that used to feel hard and now feels a little too repeatable.

The signal for L: a strong junior person could do 70% of it, the last 30% feels like yours, but if someone asked you to explain exactly what judgment you applied, you’d struggle to articulate it.

Don’t overthink the L tags. If you’re agonizing over an item, tag it L and keep moving. The point isn’t precision — it’s seeing where the line is moving. Some L work will drift toward C. Some will become durable. Most of your career anxiety is probably sitting right here, because L is where your professional identity and the moving frontier of AI capability are currently colliding.

Now you have your L items tagged.

Step 5: Tag D — Durable

D is work where the output depends on something you cannot fully describe in advance.

You changed the question more than you answered it. You read what was going on in a room. You saw that the stated problem was not the real problem. Your presence in the conversation visibly changed the outcome in a way that goes beyond competence or speed.

The key distinction is question-holding versus question-answering. Most organizations reward question-answering: someone asks for a plan, you make the plan. That’s commodifiable — the question is already a given, the frame is set, the output can be judged against the prompt. Durable work often starts before that. It starts when someone asks a question and the right move is to say “I think we’re asking the wrong question.” That’s a different skill, and it’s uncomfortable, because the system wants resolution.

D is not just hard work. Some hard work is commodity work. Some durable work looks almost invisible from the outside — the bad hire that didn’t get made, the product detour that didn’t consume six months, the customer escalation that didn’t become a crisis. Performance systems are terrible at crediting avoided damage. But avoided damage is often where senior judgment lives.

Tag D quickly on first instinct. If you’re unsure, it’s probably L.

Now you have your D items tagged.

Step 6: Count and read the ratio

Add up your T, C, L, and D counts.

(T + C) divided by your total item count = your thin ice number. This is the fraction of your week where your personal claim on the work is weakest.

D divided by the total = the fraction worth building your career around.

L divided by the total = the work in motion. Watch which direction it drifts over the next few months.
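The count is simple enough to do on paper, but a short script makes the arithmetic explicit. The sketch below assumes you have recorded one tag per item as a single letter; the function name `read_ratio` and the sample numbers are made up for illustration and are not results from any real audit. It counts items, not hours; if you would rather weight by time, sum durations per tag instead.

```python
from collections import Counter
from typing import Dict, Iterable

VALID_TAGS = ("T", "C", "L", "D")


def read_ratio(tags: Iterable[str]) -> Dict[str, object]:
    """Count tags and return each bucket's share of all tagged items."""
    counts = Counter(tags)
    unknown = set(counts) - set(VALID_TAGS)
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no items tagged")
    return {
        "counts": {tag: counts[tag] for tag in VALID_TAGS},
        "thin_ice": (counts["T"] + counts["C"]) / total,  # Theater + Commodity
        "in_motion": counts["L"] / total,                 # On the Line
        "durable": counts["D"] / total,                   # Durable
    }


# Invented example: 20 items from a two-week span.
sample = ["T"] * 6 + ["C"] * 9 + ["L"] * 3 + ["D"] * 2
result = read_ratio(sample)
print(f"thin ice: {result['thin_ice']:.0%}")   # thin ice: 75%
print(f"in motion: {result['in_motion']:.0%}") # in motion: 15%
print(f"durable: {result['durable']:.0%}")     # durable: 10%
```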

Based on how this audit tends to go: your T number will be bigger than you want it to be. Your C number will also be bigger than you want. Your D number will be smaller than you expected, not because you have no durable work, but because your week isn’t organized around it.

That’s the finding. Now you have an honest picture.

The Real Failure Modes

Undercounting theater. The most common mistake. People confuse “this was professionally expected” with “this created value.” Those are not the same thing. Something can be professionally expected and still be theater. If you find yourself tagging almost nothing as T, you’re probably rounding up.

Overcounting durable work. Your professional identity is often built around the D category. The audit asks a colder question: how many hours of the last two weeks did you actually spend on that? Not how much of your self-image depends on it. How many hours? For most people, the answer is less than they thought.

Treating the audit as a verdict. It’s not. It’s a starting point. A high T+C ratio doesn’t mean you’re bad at your job. It means your week isn’t organized around the work that compounds to you rather than to the organization or to nothing.

Doing it once and stopping. The audit is most useful as a periodic check, not a one-time event. The L bucket is the one to watch — items that feel like judgment today may feel like commodity in six months as the tools get better.

Paralysis after the count. The audit is designed to produce action, not anxiety. If you finish and feel stuck, the next move is simple: pick one T item and stop doing it. Watch what happens. Usually nothing happens. That’s the point.
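For the periodic check, one way to watch the L bucket is to compare two audits item by item. The sketch below assumes you keep each audit as a mapping from a recurring item's name to its tag; the function name `l_drift` and the sample items are hypothetical. Items that appear in only one audit are skipped.

```python
from typing import Dict, List


def l_drift(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Group items tagged L in the previous audit by their tag in the current one."""
    moved: Dict[str, List[str]] = {"T": [], "C": [], "L": [], "D": []}
    for item, old_tag in previous.items():
        new_tag = current.get(item)  # None if the item no longer recurs
        if old_tag == "L" and new_tag in moved:
            moved[new_tag].append(item)
    return moved


# Invented example data for two audits a few months apart.
spring = {"vendor review": "L", "weekly synthesis": "L", "roadmap call": "D"}
autumn = {"vendor review": "C", "weekly synthesis": "L", "roadmap call": "D"}

drift = l_drift(spring, autumn)
print(drift["C"])  # ['vendor review'], the L work that drifted toward commodity
```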

What to Do With the Results

Six moves, roughly in order.

Stop performing theater you can stop. Start with the theater that exists by inertia — the recurring report nobody reads, the check-in that made sense two years ago. Cancel it. Send a short version. Watch what happens. Most of the time, nothing happens.

Don’t pour recovered time into more commodity work. This is the trap. AI helps you write the update faster, so you write more updates. You become twice as productive at the part of your job whose value is collapsing. The better move is to put recovered time into harder cases — projects where the framing is unclear, not just the execution.

Build a private track record of durability calls. At the end of every week, write down one call you made where the outcome depended on judgment you can’t fully reduce to rules. After a year, you have roughly 50 entries. After three years, you have a portfolio of judgment. When someone asks why the work should come to you instead of a cheaper process, you have evidence.

Use that record to gradually refuse commodity work. You can’t simply declare “I don’t do routine work anymore.” First become visibly valuable on your non-routine work. Then use that value to renegotiate the routine load. This usually happens through project selection before it happens through formal authority.

Make durable work partially legible. Durable work has to be visible enough that the system values it, but not so specified that the system can run it without you. Talk about outcomes: “I was concerned we were solving the wrong problem, and I got us to have the conversation. We changed the plan.” That’s a visible claim. It doesn’t turn your judgment into a recipe. The mechanism stays in your head. The outcome gets credited.

Move if the role has no durable path. If most of your week is T and C, and the L work is drifting toward C with no realistic path to build meaningful durable skills inside the role, the answer may not be better time management. Some roles are theater-heavy because the organization is theater-heavy. Some roles were designed for an earlier era. If you’re evaluating a new role, don’t read the job description — look at the people actually doing it. Ask what they spent time on last week. Ask what calls they made that couldn’t have been made by a process.

Where to Take This Further

The audit tells you where you are. The harder question is what to build toward.

Durable work compounds differently than commodity work. Commodity work compounds to the organization — if you build a better routing system or summary pattern, the institution gets smarter and the tool absorbs the pattern. Durable work compounds to you. The pattern recognition you build from holding ambiguous questions for years doesn’t live entirely in a document or a prompt. It lives in your calibration, in the scar tissue of having been wrong before.

There’s a power law at work here. A handful of hours across a career are extraordinarily impactful — the moment you made the right call in a crisis, the time you saw that the strategy sounded good but wouldn’t survive contact with reality. The rest is support. The audit helps you see how much of your current week is support versus the work that actually compounds.

If you want to build agents that help with the commodity layer — automating the summarizing, routing, and first-draft work that the audit reveals — MindStudio gives you a no-code path to chain models and workflows together, with 200+ AI models and 1,000+ integrations available without writing orchestration code. The commodity work doesn’t disappear; it gets compressed, and the question is what you do with the recovered time.

The Ezra Klein framing is worth sitting with here. His argument in the NYT essay “Why the AI Job Apocalypse Probably Won’t Happen” isn’t that nothing changes — it’s that targeted displacement is harder to handle than mass displacement, because mass unemployment forces restructuring while targeted hollowing-out gets ignored. The travel agent pattern plays out quietly. The audit is how you stop being quiet about it with yourself.

One more thing worth building: if you’re thinking about how to systematize the durable work you do identify — turning judgment calls into structured specs that others can act on — Remy takes a useful approach to that problem. You write a spec in annotated markdown, and it compiles into a complete full-stack TypeScript application: backend, database, auth, and deployment included. The spec becomes the source of truth. It’s a clarifying model for thinking about what can genuinely be specified versus what can’t — which is exactly the question the TCLD audit is asking about your own work.

For a related angle on how AI coding tools are reshaping the spec-to-output pipeline, the Claude Code source code leak breakdown surfaces several features directly relevant to automating the commodity layer of technical work.

The first audit is a small cost with compounding returns. You run it, you see the ratio, and you can’t unsee it; later runs show you which way the ratio is drifting. Then you make small moves: cancel one theater item, choose one harder project, write three private lines at the end of the week. Small moves compound when they’re aimed in the right direction.

The question after the count is simple: what part of your week are you going to stop defending, and what part are you going to start feeding?

That’s the whole game. Run the audit and you’ll know which is which.