
What Is the Agent Discovery Problem? Why AI Agents Need an App Store to Find Each Other

As every business deploys AI agents, agent discovery becomes a massive unsolved problem. Learn what an agent-native app store would look like.

MindStudio Team

The Proliferation Problem No One Saw Coming

AI agents are multiplying fast. Enterprise teams are building them for customer support, sales outreach, data analysis, contract review, HR onboarding, and dozens of other workflows. And that’s just inside one company.

The multi-agent discovery problem — how one AI agent finds, evaluates, and communicates with another — is quickly becoming one of the most pressing unsolved challenges in enterprise AI. It sounds abstract, but it’s already breaking real workflows. As agent deployments scale, the inability to discover and coordinate across agents creates bottlenecks that cancel out much of the efficiency AI was supposed to deliver.

This article explains what agent discovery actually means, why it’s hard, what an agent-native app store might look like, and where the field is heading.


What Agent Discovery Actually Means

In human organizations, discovery is easy. You ask a colleague, check a directory, or search an intranet. You find the person or tool you need, confirm it can help, and get to work.

Agents don’t have that luxury — at least not yet.

Agent discovery refers to the process by which one AI agent identifies that another agent (or tool, or service) exists, determines what it can do, figures out how to communicate with it, and decides whether to use it. It’s the “yellow pages” problem for AI systems.

The three layers of discovery

Discovery isn’t one problem. It’s at least three:

  1. Existence discovery — Does an agent that can do X exist somewhere I can reach it?
  2. Capability discovery — What exactly can it do? What inputs does it expect? What does it return? What are its limitations?
  3. Trust and quality discovery — Is this agent reliable? Does it produce accurate outputs? Is it safe to invoke in this context?
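The three layers above can be modeled as a single discovery record. The sketch below is hypothetical — no standard schema for this exists yet, and all field names, agent names, and URLs are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DiscoveryRecord:
    """Hypothetical record covering the three layers of agent discovery."""
    # Layer 1: existence — where the agent lives and how to reach it
    agent_id: str
    endpoint: str
    # Layer 2: capability — what it does, expects, and returns
    description: str
    input_schema: dict
    output_schema: dict
    # Layer 3: trust and quality — whether it is safe and reliable to call
    publisher: str
    verified: bool = False
    success_rate: Optional[float] = None

record = DiscoveryRecord(
    agent_id="contract-reviewer-v2",
    endpoint="https://agents.example.com/contract-reviewer",
    description="Extracts obligations and risk clauses from contracts",
    input_schema={"document": "string"},
    output_schema={"clauses": "list[object]"},
    publisher="legal-ops-team",
    verified=True,
    success_rate=0.97,
)
```

Note that most systems today populate only the first two layers; the trust fields are exactly what current infrastructure leaves blank.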

Current AI infrastructure handles almost none of this systematically. Most multi-agent setups are hand-wired — a developer explicitly tells Agent A about Agent B, hardcodes the API endpoint, and documents the schema manually. That works for small systems. It breaks down fast when you have dozens or hundreds of agents across a large organization.


Why This Is Harder Than It Sounds

You might think: can’t agents just call APIs the same way humans use web services? In theory, yes. In practice, several things make agent-to-agent discovery significantly harder.

Agents aren’t static services

Traditional APIs are stable. They have versioned documentation, published schemas, and predictable behavior. Agents are different. They reason, improvise, and produce variable outputs. Two calls with identical inputs might return different results. That’s a feature when you want flexible reasoning — it’s a problem when another agent needs to depend on predictable behavior.

Capability descriptions are hard to formalize

When you describe what a human does, you use natural language with implied context. “She handles contract negotiations” tells a colleague a lot without spelling everything out. Agents need formal capability descriptions that are machine-readable, but also expressive enough to capture nuance.

Current approaches — like OpenAI’s function calling schemas or Anthropic’s tool use definitions — are a start, but they describe individual function signatures, not agent-level capabilities, reliability profiles, or appropriate use cases.
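The gap is easy to see side by side. The first dict below follows the general shape of a function-calling tool definition (a name, a description, and a JSON Schema for parameters); the second adds the agent-level fields discovery would need. The extra fields are illustrative, not part of any current standard:

```python
# A function-calling-style schema: it describes one signature, nothing more.
tool_schema = {
    "name": "extract_clauses",
    "description": "Extract risk clauses from a contract",
    "parameters": {
        "type": "object",
        "properties": {"document": {"type": "string"}},
        "required": ["document"],
    },
}

# What agent-level discovery additionally needs (hypothetical fields):
# reliability data and appropriate-use context.
agent_profile = {
    **tool_schema,
    "appropriate_for": ["legal contract review"],
    "not_appropriate_for": ["medical or financial documents"],
    "reliability": {"success_rate": 0.96, "p95_latency_ms": 4200},
}
```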

Trust is an unsolved problem

If an agent you’ve never used before offers to process your customer data, should you let it? In human organizations, trust is built through reputation, organizational hierarchy, and compliance processes. Agents don’t have established mechanisms for any of this.

A rogue or compromised agent could theoretically inject malicious outputs into a multi-agent pipeline. Without trust verification, discovery itself becomes an attack surface.

The incentive problem

Who maintains the catalog? In human organizations, directories get stale because nobody wants to update them. Agent registries face the same problem — whoever deploys an agent has to describe it accurately, keep the description current, and flag deprecations. That’s overhead that gets skipped when teams are moving fast.


How Multi-Agent Systems Currently Handle This

Most production multi-agent systems today use one of a few approaches, none of which fully solves discovery.

Hardcoded orchestration

The most common pattern: a developer writes an orchestrator agent that explicitly knows about every sub-agent it can call. The orchestrator’s system prompt includes descriptions of available agents, their endpoints, and when to use each one.

This works for closed, well-defined systems. It doesn’t scale. Every time you add a new agent, you update the orchestrator manually. Every time an agent changes its behavior, you update the description manually. It’s the equivalent of maintaining a company directory in a Word document.
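A minimal sketch of this pattern makes the maintenance burden concrete. The agent names, endpoints, and routing keywords below are hypothetical — the point is that every addition or change means editing this code by hand:

```python
# Hardcoded orchestration: the router only knows about agents a
# developer wired in manually.
SUB_AGENTS = {
    "summarizer": {
        "endpoint": "https://internal.example.com/agents/summarizer",
        "use_when": "the task asks for a summary of a document",
    },
    "translator": {
        "endpoint": "https://internal.example.com/agents/translator",
        "use_when": "the task asks to translate text",
    },
}

def route(task: str) -> str:
    """Naive keyword routing — every new agent means editing this function."""
    if "summar" in task.lower():
        return SUB_AGENTS["summarizer"]["endpoint"]
    if "translat" in task.lower():
        return SUB_AGENTS["translator"]["endpoint"]
    raise LookupError("No known agent for this task — add one by hand.")
```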

Shared tool registries

Some frameworks — LangChain, CrewAI, and others — provide tool registries where agents can register capabilities and other agents can query them. This is a step forward, but registries within a framework don’t communicate across frameworks, and they still rely on accurate manual descriptions.

Model Context Protocol (MCP)

Anthropic’s Model Context Protocol is the most significant recent development in this space. MCP is an open standard that lets AI models connect to external tools and data sources through a standardized interface. An MCP server exposes capabilities in a structured, discoverable format. Any MCP-compatible client — Claude, Cursor, or another agent — can find and call those capabilities without custom integration work.

MCP solves part of the discovery problem: it standardizes how capabilities are described and invoked. But it doesn’t yet address trust, quality signals, or cross-organizational discovery at scale. It’s more like a power outlet standard than a full app store.
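To make the "power outlet" point concrete, here is a simplified, schematic version of what an MCP server advertises — each tool carries a name, a description, and a JSON Schema for its inputs — and how a client might filter that listing. This is an illustration of the shape of the data, not a full MCP client, and the tool contents are invented:

```python
# Schematic shape of an MCP tools listing: name, description, and a
# JSON Schema for inputs per tool. (Illustrative, not a real server.)
mcp_tools_list = {
    "tools": [
        {
            "name": "search_tickets",
            "description": "Search the support ticket backlog",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }
    ]
}

def discover(tools_list: dict, task_keyword: str) -> list:
    """Return tool names whose descriptions mention a keyword."""
    return [
        t["name"]
        for t in tools_list["tools"]
        if task_keyword.lower() in t["description"].lower()
    ]
```

Notice what is present (how to describe and invoke) and what is absent (who built this, how reliable it is) — that absence is the remaining discovery gap.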

Agent-to-agent APIs

Some enterprise teams are building proprietary agent-to-agent APIs — essentially internal service buses where agents can query a registry and invoke other agents through a common interface. This is architecturally sound but expensive to build and maintain, and it keeps the system closed to external agents.


What an Agent-Native App Store Would Actually Look Like

The “app store” framing is useful because it captures what’s missing: a centralized (or federated) place where agents can be listed, discovered, evaluated, and invoked.

But an agent app store isn’t just a list of API endpoints. It needs capabilities that traditional app stores don’t.

Structured capability manifests

Every listed agent would publish a capability manifest — a machine-readable description of what it does, what inputs it accepts, what outputs it produces, what its latency profile looks like, and what its failure modes are.

This goes beyond a simple API schema. It’s closer to a nutrition label for agents: what’s in here, how should you use it, and what should you watch out for.
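As a sketch of that "nutrition label," a manifest might look like the following. There is no standard format yet; every field name and value here is hypothetical:

```python
# A hypothetical capability manifest — illustrative fields only.
manifest = {
    "agent": "invoice-extractor",
    "version": "2.1.0",
    "capability": "Extract line items and totals from invoice PDFs",
    "inputs": {"pdf_url": "string"},
    "outputs": {"line_items": "list[object]", "total": "number"},
    "latency": {"p50_ms": 1800, "p95_ms": 6500},
    "failure_modes": [
        "scanned images below 150 dpi",
        "handwritten invoices",
    ],
}
```

The `failure_modes` and `latency` entries are the parts a plain API schema never carries — and the parts a calling agent most needs before committing to a dependency.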

Semantic search, not keyword lookup

Humans searching an app store type keywords. Agents querying an agent store would describe a task and need semantic matching — “I need something that can extract structured data from unstructured legal text” should return relevant agents even if none of them use those exact words in their title.

This requires embedding-based search over capability descriptions, with enough granularity to distinguish between agents that do similar things in different domains.
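The mechanics reduce to nearest-neighbor search over vectors. In the toy sketch below, tiny hand-made vectors stand in for real learned embeddings, and the catalog entries are invented — a production system would embed the capability descriptions with an embedding model:

```python
import math

# Toy semantic matching: hand-made 3-dimensional vectors stand in for
# real embeddings of each agent's capability description.
CATALOG = {
    "legal-text-extractor": [0.9, 0.1, 0.0],
    "image-captioner": [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(query_vec):
    """Return the catalog agent whose embedding is closest to the query."""
    return max(CATALOG, key=lambda name: cosine(query_vec, CATALOG[name]))
```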

Quality and reliability signals

Reviews for agents can’t work the same way as app store reviews. What matters is measurable reliability: success rate, latency distribution, error types, and context-specific performance.

An agent-native app store would need automated quality signals — essentially continuous benchmarking — so that an orchestrator agent can make informed choices between two agents offering similar capabilities.
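An orchestrator's choice between two similar agents could then reduce to a scoring function over those signals. The weighting below is arbitrary and the candidate metrics are invented — the point is that the decision becomes data-driven rather than guesswork:

```python
# Illustrative scoring over benchmark signals: reward success rate,
# penalize slow tail latency. Weights are arbitrary.
def score(metrics: dict) -> float:
    """Higher is better."""
    return metrics["success_rate"] - 0.00001 * metrics["p95_latency_ms"]

candidates = {
    "extractor-a": {"success_rate": 0.97, "p95_latency_ms": 9000},
    "extractor-b": {"success_rate": 0.94, "p95_latency_ms": 2000},
}

best = max(candidates, key=lambda name: score(candidates[name]))
```

With these particular weights, the slightly less accurate but much faster agent wins — a different latency weight would flip the choice, which is exactly why the signals need to be published rather than baked into one global ranking.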

Trust and provenance verification

Before invoking an agent, a calling system needs to know: who built this? Is it verified? What data does it access? What are its security properties?

This is roughly analogous to app store vetting processes, but adapted for the specific risks of AI agents — prompt injection, data leakage, hallucinated outputs being passed downstream.

Versioning and compatibility signals

Agents evolve. A calling agent needs to know whether the agent it’s depending on has changed in ways that might affect its outputs. Version pinning, deprecation notices, and migration paths are standard in software development — they’d need to be formalized in an agent store context.
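Borrowing semantic-versioning conventions from package managers, a caller could pin an agent dependency and check compatibility before each invocation. A minimal sketch, assuming caret-style semantics (same major version, no downgrades):

```python
# Caret-style version pinning for an agent dependency:
# compatible if the available version shares the pinned major
# version and is not older than the pin.
def compatible(pinned: str, available: str) -> bool:
    p = tuple(int(x) for x in pinned.split("."))
    a = tuple(int(x) for x in available.split("."))
    return a[0] == p[0] and a >= p
```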

Invocation standards

Discovery only matters if you can act on it. An agent store would need to pair discovery with a standard invocation protocol — a way to call any listed agent regardless of which framework built it or where it’s hosted. MCP is the leading candidate here, but the ecosystem hasn’t converged yet.


The Federated vs. Centralized Debate

Should there be one agent store or many? This mirrors debates that have played out in web services, package management, and AI model hosting.

The case for centralization: A single authoritative registry makes discovery simple. You query one place, get comprehensive results, and have a single trust authority. This is the Apple App Store model — high friction to list, high trust for users.

The case for federation: Enterprises won’t put proprietary internal agents in a public registry. Organizations need private catalogs with controlled access. A federated model — multiple registries that can query each other with appropriate permissions — preserves privacy while enabling cross-organizational discovery when desired. This is closer to how email or DNS works.

The likely outcome is a hybrid: public registries for general-purpose agents (productivity tools, data enrichment services, communication utilities) and private organizational registries for internal agents, with standard protocols for cross-registry queries when authorized.
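The hybrid model sketches out naturally in code: consult the private registry first, and fall through to public registries only when authorized. The registry contents below are hypothetical:

```python
# Federated lookup sketch: private listings take precedence; public
# registries are consulted only when the caller is authorized.
PRIVATE_REGISTRY = {"contract-reviewer": "internal"}
PUBLIC_REGISTRY = {"data-enricher": "public", "translator": "public"}

def federated_lookup(name: str, allow_public: bool = True):
    """Return (registry, listing) for an agent name, or None."""
    if name in PRIVATE_REGISTRY:
        return ("private", PRIVATE_REGISTRY[name])
    if allow_public and name in PUBLIC_REGISTRY:
        return ("public", PUBLIC_REGISTRY[name])
    return None
```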


How MindStudio Fits Into This Picture

MindStudio is directly relevant here because it already solves part of this problem — specifically the “make your agents discoverable and callable by other AI systems” layer.

When you build an agent in MindStudio, you can expose it as an agentic MCP server. That means any MCP-compatible system — Claude Desktop, Cursor, a custom LangChain agent, or another MindStudio workflow — can discover and invoke your agent through a standard interface. You’re not writing custom integration code. You’re publishing a capability that other agents can find and call.

This is exactly the pattern that a future agent discovery ecosystem needs at scale. MindStudio acts as the hosting and exposure layer: you build the agent visually (the average build takes 15 minutes to an hour), configure what it does, and publish it as a callable capability. The infrastructure — authentication, rate limiting, retries — is handled for you.

For teams building multi-agent workflows, this matters practically. Instead of hardcoding every agent-to-agent dependency, you can compose workflows from agents exposed through standard interfaces, making it easier to swap, upgrade, or extend individual capabilities without rebuilding everything around them.

The Agent Skills Plugin extends this in the other direction: if you’re building agents in Claude Code, LangChain, or CrewAI, you can use the @mindstudio-ai/agent npm SDK to call 120+ typed MindStudio capabilities — agent.searchGoogle(), agent.generateImage(), agent.runWorkflow() — as simple method calls. Your custom agent gains a set of pre-built, reliable capabilities without needing to build or discover them from scratch.

You can try MindStudio free at mindstudio.ai.


The Organizational Challenge Is Bigger Than the Technical One

Here’s something that doesn’t get said enough: the agent discovery problem is as much an organizational problem as a technical one.

Even if you had a perfect technical protocol for agent discovery tomorrow, most enterprises would still struggle because:

  • No one owns the catalog. Agent inventories fall through the cracks between IT, data science, and individual business units.
  • Agents are built ad hoc. Most enterprise agents today are built by individual teams solving immediate problems, not as part of a coherent agent architecture.
  • Governance is missing. Before an agent can be discoverable, someone has to approve it, document it, and take responsibility for its behavior. Most organizations don’t have that process.

The teams making the most progress on agent discovery aren’t waiting for a universal standard. They’re building internal agent catalogs now — even simple ones — and establishing conventions for capability documentation. The technical standards will mature; the organizational muscle needs to develop in parallel.


Where This Is Heading

Several forces are pushing toward a more structured agent discovery ecosystem in the near term.

Protocol convergence: MCP has significant momentum behind it, and major AI labs are building MCP support into their products. As it becomes the default way to expose agent capabilities, the discovery problem gets more tractable — at least within MCP-compatible systems.

Agent marketplaces: Several companies are building or planning agent marketplaces — curated directories of pre-built agents for specific tasks. These are early versions of what a full agent app store might look like. They solve discovery for general-purpose agents but don’t yet address the harder problems of trust verification or organizational-scale orchestration.

AI-native operating systems: Some researchers and builders envision an “agent OS” layer that manages agent discovery, orchestration, and resource allocation the way an operating system manages processes and memory. This is further out, but the architectural thinking is useful.

Standardized agent identity: Just as web services developed OAuth for identity and authorization, agent systems will likely develop standardized agent identity protocols — a way for one agent to verify the identity and permissions of another before accepting its outputs.

The honest answer is that we’re still in early innings. The infrastructure for large-scale agent discovery doesn’t exist yet in mature form. But the outlines of what it needs to look like are becoming clearer, and the teams building toward it now will have a significant advantage when it does.


Frequently Asked Questions

What is the agent discovery problem?

The agent discovery problem refers to the challenge of how one AI agent finds, evaluates, and communicates with another agent. It has three layers: discovering that a relevant agent exists, understanding what it can do, and verifying that it’s reliable and trustworthy enough to use. As AI agent deployments scale across enterprises, the inability to solve this problem creates coordination bottlenecks that limit the value of multi-agent systems.

Why can’t AI agents just use regular APIs for discovery?

They can use APIs, but regular APIs don’t solve the full problem. Traditional API documentation is written for human developers, not for agents that need to dynamically evaluate capabilities at runtime. Agents also need semantic matching (finding services based on task descriptions, not exact names), quality signals (reliability and accuracy data), and trust verification — none of which are part of standard API infrastructure.

What is Model Context Protocol (MCP) and does it solve agent discovery?

MCP is an open standard developed by Anthropic that lets AI models connect to external tools and data sources through a standardized interface. It standardizes how capabilities are described and invoked, which addresses part of the discovery problem. However, MCP doesn’t yet address cross-organizational discovery at scale, quality and reliability signals, or trust verification between agents from different sources. Think of it as necessary infrastructure, not a complete solution.

What would an agent-native app store need to include?

A complete agent discovery system would need: structured capability manifests (machine-readable descriptions of what each agent does), semantic search over those descriptions, quality and reliability signals based on actual performance data, trust and provenance verification, version management, and a standard invocation protocol. Most current solutions address one or two of these, not all of them.

How should enterprises prepare for agent discovery today?

The most practical steps now are organizational: build an internal agent inventory (even a simple spreadsheet), establish conventions for documenting agent capabilities, assign ownership for maintaining the catalog, and adopt standardized capability description formats where possible. Technically, adopting MCP-compatible tooling positions you well for when broader standards mature. Building agents on platforms that support standard exposure protocols also reduces future integration work.

Is agent discovery only relevant for large enterprises?

No, but the urgency scales with complexity. A small team with three internal agents can manage discovery manually. A mid-size company with 50 agents across multiple departments starts to feel the pain. Large enterprises with hundreds of agents — or organizations that need to coordinate agents across company boundaries — face an acute version of this problem. The foundational concepts apply at any scale, but the investment in formal discovery infrastructure makes more sense as the number of agents grows.


Key Takeaways

  • The agent discovery problem — how agents find, evaluate, and communicate with each other — is emerging as a critical bottleneck in enterprise AI deployments.
  • Discovery has three layers: existence, capability, and trust. Current infrastructure addresses none of them completely.
  • Hardcoded orchestration works for small systems but doesn’t scale. MCP is the most promising protocol-level step forward but doesn’t solve the full problem.
  • A true agent-native app store would need structured capability manifests, semantic search, reliability signals, trust verification, and standard invocation protocols.
  • The organizational challenge — owning the catalog, establishing governance, documenting agents systematically — is as important as the technical one.
  • MindStudio lets you expose agents as MCP servers today, making your agents discoverable and callable by any compatible system without custom integration work.

The teams that build structured agent catalogs now — even imperfect ones — will be far better positioned when the discovery ecosystem matures. Start with what you can control: document your agents, standardize your capability descriptions, and adopt tooling that supports open protocols.

Presented by MindStudio
