
What Is OpenShell? Nvidia's Open-Source Security Runtime for AI Agents

OpenShell is Nvidia's YAML-based security runtime that sandboxes AI agents, controls data routing, and makes enterprise agent deployment safe.

MindStudio Team

The Growing Security Gap in Enterprise AI Agent Deployment

Most enterprise security frameworks weren’t built for AI agents.

Traditional security models — firewalls, role-based access control, network segmentation — work well for deterministic systems. An application either has permission to access a database or it doesn’t. A user can run a report or they can’t. The rules are static, and the behavior is predictable.

AI agents break these assumptions. An agent is designed to reason dynamically, select actions based on context, and take steps that weren’t explicitly programmed. A customer support agent might legitimately access a CRM and a billing system in one interaction — then, if manipulated through a prompt injection attack, attempt to forward sensitive data somewhere unintended in the next.

This is the security gap that OpenShell, Nvidia’s open-source security runtime, is built to close. OpenShell provides a structured execution environment for AI agents: sandboxing what they can do, controlling how data moves through their workflows, and generating a complete audit trail for every action they take.

For enterprise teams deploying AI agents in production, understanding OpenShell’s approach to agent security has become increasingly relevant.

What OpenShell Is

OpenShell is an open-source security runtime developed by Nvidia specifically for AI agent deployments. It sits between your AI agents and the systems they interact with, enforcing policies at execution time.

It doesn’t modify how an AI model thinks or what it generates. Instead, it intercepts agent actions as they happen — checking each action against a set of security policies before permitting or blocking it.

Those policies are written in YAML, a simple, human-readable configuration format that’s already widely used in DevOps tooling. This design choice has practical implications: security policies become infrastructure-as-code, subject to version control, peer review, and automated testing.

The Basic Mental Model

Think of OpenShell as a security checkpoint that every agent action has to clear. Before an agent can:

  • Call an external API
  • Read from or write to a database
  • Route output to a downstream system
  • Send data to a third-party service

…it first passes through OpenShell’s policy evaluation. The policy — defined in YAML — specifies what’s permitted. Anything not explicitly allowed is blocked by default.

This deny-by-default model is a foundational principle in security engineering. It’s the same logic behind firewall allowlists and least-privilege access control. OpenShell applies it consistently to AI agent behavior.
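
As a concrete illustration, a minimal policy could look like the sketch below. The schema shown here is an assumption for illustration — the field names (`agent`, `allow`, `tools`) are not taken from OpenShell's documentation, which should be consulted for the actual format.

```yaml
# Hypothetical policy sketch. Field names are illustrative,
# not OpenShell's documented schema.
agent: support-bot
allow:
  tools:
    - crm.lookup_customer
    - billing.get_invoice
# Everything not listed above (every other tool, API,
# or destination) is denied by default.
```

The important property is what the file does not say: there is no deny list to maintain, because anything outside the allowlist is blocked automatically.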

Why YAML-Based Policies Matter

Enterprise AI agent security needs to satisfy several constraints at once:

  • Understandable by non-engineers — compliance officers and auditors need to read and verify what an agent is permitted to do, without parsing code
  • Version-controlled — policy changes require a tracked history, a review process, and a rollback path
  • Compatible with existing DevOps workflows — rather than requiring teams to adopt an entirely new toolchain

YAML satisfies all three. Because OpenShell policies live in plain text files, they can be stored in Git, reviewed through pull requests, and deployed through the same CI/CD pipelines that manage the rest of an organization’s infrastructure. Security governance for AI agents becomes part of standard engineering workflow — not a separate, manual process running in parallel.

How OpenShell Works at Runtime

OpenShell operates as a runtime enforcement layer — active while agents are running, evaluating each action before it executes.

Here’s the basic execution flow for a secured agent:

  1. A trigger — a user input, a scheduled job, a webhook — activates the agent
  2. The agent uses the underlying AI model to reason about what actions to take
  3. The agent proposes an action (calling a tool, reading data, routing an output)
  4. OpenShell evaluates the proposed action against the active YAML policy
  5. Permitted actions execute; blocked actions are rejected and logged
  6. This evaluation repeats for each subsequent action the agent takes
  7. A structured audit log captures the full history of actions and policy decisions

The evaluation adds minimal latency — typically milliseconds per action — which is negligible compared to the time taken by LLM inference or external API calls.

Why Separation of Concerns Matters

One of OpenShell’s most practically useful properties is that security policy and agent logic remain fully decoupled.

In many agent frameworks, security logic ends up embedded in prompts (“never share financial data with unauthorized users”) or in agent code (“check permissions before calling this API”). Both approaches are fragile. Prompt instructions can be overridden by adversarial inputs. Code changes can accidentally remove security checks. When security is intertwined with functionality, auditing either becomes difficult.

OpenShell externalizes the policy layer entirely. Agent code handles reasoning and action selection. YAML policies handle enforcement. These can be maintained by separate teams — a platform security team managing policies, a product team managing agent logic — with neither needing to modify the other’s work.

The Core Security Features

Agent Sandboxing

Sandboxing creates an explicit boundary around what an agent can interact with. Inside the sandbox, defined operations proceed normally. Outside it, everything is blocked.

OpenShell implements sandboxing through its YAML policy configuration. A sandboxed agent has an explicit allowlist covering:

  • Which tools and APIs it can call
  • Which data sources it can read from and write to
  • Which users or systems can trigger it
  • What categories of output it’s permitted to produce

The security value of sandboxing becomes clear in adversarial contexts. Prompt injection attacks — malicious instructions embedded in content the agent processes, like emails, documents, or web pages — are among the most common attack vectors against AI agents. A properly configured sandbox limits what an attacker can accomplish even if they successfully inject instructions, because the agent can only take actions the sandbox permits.

Sandboxing also limits blast radius. If an agent malfunctions or is manipulated, the damage is constrained by the sandbox definition. Incidents become easier to detect, contain, and remediate.
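
Put together, a sandbox covering those four allowlist categories might be sketched as a single policy document. The structure below is illustrative — the keys (`sandbox`, `data_sources`, `triggers`, `outputs`) are assumptions, not OpenShell's actual format.

```yaml
# Illustrative sandbox sketch; all keys are hypothetical.
agent: support-bot
sandbox:
  tools:                       # which tools and APIs it can call
    - crm.lookup_customer
    - ticketing.create_ticket
  data_sources:                # what it can read and write
    read:  [crm.customers, kb.articles]
    write: [ticketing.tickets]
  triggers:                    # who or what can invoke it
    - source: helpdesk-webhook
  outputs:                     # what it may produce
    - category: ticket_update
    - category: customer_reply
# Any action outside these lists is blocked and logged.
```

Even if a prompt injection convinces this agent to "email the customer database to an external address," there is no email tool in its allowlist — the attempted action fails at the policy check.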

Data Routing Controls

Data routing controls govern how information moves through agent workflows. This is one of the more underappreciated aspects of enterprise AI security.

A typical enterprise agent might pull customer records from a CRM, access financial data from an ERP system, query an internal knowledge base, and route outputs to a ticketing system, an email thread, and a Slack channel. Each connection is a potential compliance risk. GDPR, HIPAA, SOX, and similar regulations often specify exactly where certain categories of data can go — and what must happen to it before it gets there.

OpenShell makes those requirements technical constraints, not just documented policies. You can configure:

  • That PII must be masked before passing to any third-party API
  • That financial data can only be routed to specific approved internal systems
  • That outputs containing certain data categories require human review before delivery
  • That specific data types can never leave the organization’s infrastructure perimeter

These rules are enforced at runtime. If an agent tries to route sensitive data to an unlisted destination, OpenShell blocks it and logs the attempt — regardless of what the agent’s model or code logic intended.
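
The four example rules above could be expressed in a routing policy along these lines. Again, this is an illustrative sketch — the `routing`, `match`, and `transform` keys are assumptions, not OpenShell's documented schema.

```yaml
# Hypothetical routing rules; schema is illustrative.
routing:
  - match: { data_class: pii }
    to: any_third_party_api
    action: allow
    transform: mask              # mask PII before it leaves
  - match: { data_class: financial }
    to: [erp-internal, ledger-internal]
    action: allow                # approved internal systems only
  - match: { data_class: [pii, financial] }
    to: external_delivery
    action: require_human_review
  - match: { data_class: restricted }
    to: outside_perimeter
    action: deny                 # never leaves the infrastructure
```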

Audit Logging

For regulated industries, being able to demonstrate what an AI agent did — and under what policies it was operating — is often a compliance requirement, not an optional feature.

OpenShell generates structured logs for every agent action, capturing:

  • What action was attempted
  • What policy was evaluated
  • What decision was made (permitted or blocked)
  • What data was involved
  • Timestamps for each event

These logs can be exported to existing observability platforms — Splunk, Datadog, Grafana, OpenSearch — integrating AI agent monitoring into the same dashboards and alerting pipelines already used for other infrastructure. When a compliance team needs to reconstruct what an agent did with customer data on a specific date, the answer is a structured query, not a manual investigation.
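
A single audit record carrying those fields might look like the following. The structure is illustrative — these field names are assumed for the example, not taken from OpenShell's actual log format.

```yaml
# Illustrative audit record; field names are hypothetical.
- timestamp: "2025-06-03T14:21:07Z"
  agent: support-bot
  action: route_output
  target: slack.external-channel
  policy: support-bot-policy@v12
  data_classes: [pii]
  decision: blocked
  reason: destination not in routing allowlist
```

Because records like this are structured rather than free-text, they can be indexed and queried directly by a SIEM, which is what makes the "structured query, not manual investigation" workflow possible.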

Why Nvidia Released OpenShell as Open Source

A security runtime that’s a black box creates its own category of risk. If you’re relying on a tool to govern what your AI agents can do, you need to be able to verify that it behaves as documented.

Open source addresses this directly. Organizations can inspect the codebase, identify edge cases, and confirm behavior independently. Security researchers can probe it publicly. The community can contribute fixes and improvements. This kind of distributed scrutiny often catches issues that closed internal audits miss.

Open source also removes vendor dependency. Organizations can self-host OpenShell, integrate it into their existing infrastructure stack, and maintain control over the runtime without being subject to a vendor’s commercial roadmap or pricing decisions.

Nvidia has applied this philosophy elsewhere in the AI ecosystem. NeMo Guardrails, Nvidia’s open-source toolkit for conversational AI safety, has benefited substantially from community contribution since its release. OpenShell follows a similar model, targeting the agent execution layer that NeMo Guardrails doesn’t address.

OpenShell in Multi-Agent Systems

Security complexity scales significantly in multi-agent architectures. When multiple agents coordinate to complete tasks — one researching, another drafting, another executing actions — the security surface area expands in ways that are hard to reason about manually.

Multi-agent systems introduce specific security concerns:

  • Agent-to-agent communication: Messages from one agent to another could contain adversarial content, just like messages from external sources
  • Privilege escalation: A low-privilege agent could potentially manipulate a high-privilege agent into performing actions on its behalf
  • Shared state: Multiple agents writing to the same systems create potential for data conflicts or unauthorized state changes
  • Attribution: When something goes wrong in a multi-agent workflow, identifying which agent caused the issue requires detailed per-agent logging

OpenShell addresses these concerns through per-agent policy enforcement. In a multi-agent workflow, each agent has its own YAML policy scoped to its specific role. Inter-agent communication is treated as a data routing event, subject to the same enforcement rules as agent-to-external-system communication. An agent receiving messages from another agent still operates within its own sandbox — privilege doesn’t transfer through message passing. Audit logs capture actions at the individual agent level, making attribution tractable even in complex pipelines.

Under this composable model, adding a new agent to a workflow adds a governed component rather than silently expanding the security surface of the whole system.
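
Per-agent scoping might look like separate policy sections per role, with inter-agent messages treated as routing events. The sketch below is hypothetical — the `agents` structure and key names are assumptions for illustration, not OpenShell's documented schema.

```yaml
# Hypothetical per-agent policies for a research -> draft -> execute pipeline.
agents:
  researcher:
    sandbox:
      tools: [web.search, kb.query]
    routing:
      - to: drafter              # may only hand results to the drafter
        action: allow
  drafter:
    sandbox:
      tools: [docs.create_draft]
    routing:
      - to: executor
        action: require_human_review
# A message from researcher to drafter is evaluated as a routing
# event against both policies; privilege never transfers with it.
```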

Deploying OpenShell in Practice

Understanding OpenShell’s capabilities is one thing. Getting it running in an enterprise environment involves some practical groundwork.

Audit before you configure. Before writing YAML policies, document what each agent actually does: which systems it connects to, what data it handles, and what outputs it produces. Policies written without this understanding will either be too restrictive (breaking legitimate workflows) or too permissive (providing little security value).

Start with observation, then enforce. Running in a logging-only mode first — capturing what actions agents take without blocking them — gives you empirical data on actual agent behavior. Policies built from real behavior data are more accurate than those written from first principles.
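
Some enforcement tools express this as a mode switch in the policy itself; a sketch of the idea, where the `mode` key is an assumption rather than a documented OpenShell option:

```yaml
agent: support-bot
mode: monitor     # log would-be violations without blocking them
# ...later, once the policy reflects observed behavior:
# mode: enforce
```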

Treat policies as code. Store YAML policy files in the same repositories as the rest of your infrastructure. Use pull request reviews for policy changes. Add automated validation tests. Deploy policy updates through CI/CD. This ensures changes are tracked, reviewed, and reversible.

Integrate logs with existing monitoring. OpenShell’s audit logs are most useful when they flow into your existing SIEM or observability platform. Configuring this export ensures AI agent events surface alongside other infrastructure signals — you don’t want agent security monitored in an isolated silo.

Plan for policy maintenance. As agents evolve and business requirements change, policies drift out of sync with actual needs. Build periodic reviews into your governance process — quarterly audits comparing agent behavior against declared permissions catch drift before it becomes a gap.

How MindStudio Fits the Enterprise Agent Picture

One question enterprises face when building AI agents is where the development layer ends and the security layer begins.

MindStudio is a no-code platform for building and deploying AI agents, used by teams at Microsoft, Adobe, TikTok, and others. It lets teams create agents that connect to real business systems — Salesforce, HubSpot, Google Workspace, Slack, and 1,000+ other tools — without requiring engineering resources for every new workflow.

The structural design of MindStudio agents is relevant here. Rather than allowing open-ended agents that can call arbitrary tools, MindStudio workflows are explicitly defined step by step — each stage has defined inputs, outputs, and tool connections. This explicit structure means agent behavior is observable and constrained by design, which aligns naturally with the runtime enforcement approach that OpenShell formalizes.

For enterprise deployments, combining the two layers is practical:

  • Build and iterate on agent logic using MindStudio’s no-code workflow builder — the average build takes 15 minutes to an hour
  • Enforce runtime security policies with a tool like OpenShell at the execution layer
  • Use MindStudio’s workflow structure and OpenShell’s audit logs together for complete observability

MindStudio also supports webhook and API endpoint triggers that expose agents to external systems, enabling integration with security orchestration infrastructure at the deployment level. Agents built on MindStudio can be called through controlled API endpoints, giving teams full authority over when and how agents execute.

You can start building on MindStudio for free at mindstudio.ai.

Frequently Asked Questions

What is OpenShell by Nvidia?

OpenShell is an open-source security runtime developed by Nvidia for AI agent deployments. It uses YAML configuration files to define and enforce security policies at execution time — controlling what actions agents can take, how data flows through agent workflows, and generating structured audit logs for compliance purposes. It operates at the execution layer independently of the underlying AI model, meaning security policy can be updated without touching agent code.

How does OpenShell sandbox AI agents?

OpenShell creates a policy-defined boundary around each agent — an explicit allowlist of tools, APIs, data sources, and output destinations the agent is permitted to access. Anything outside that allowlist is blocked by default. This deny-by-default model limits what an attacker can accomplish through prompt injection and constrains the blast radius if an agent malfunctions or is otherwise compromised.

Is OpenShell really free and open source?

Yes. OpenShell is released as an open-source project. Organizations can inspect the codebase, self-host the runtime, customize it for their specific environments, and deploy it without licensing fees or vendor dependency. For security tooling specifically, the ability to audit the tool itself is often as important as its documented feature set — open source makes that possible.

How is OpenShell different from NeMo Guardrails?

NeMo Guardrails controls what AI models say — it operates at the conversational layer, steering language models away from harmful or off-topic responses using a policy language called Colang. OpenShell controls what AI agents do — it operates at the execution layer, governing which actions agents are permitted to take. They address different parts of the AI safety problem and can be used together in a comprehensive security architecture.

What types of organizations should use OpenShell?

OpenShell is most relevant for organizations deploying AI agents in regulated or security-sensitive environments: financial services (SOX, PCI-DSS), healthcare (HIPAA), legal, and government. It’s also valuable for any enterprise running multi-agent workflows where the security surface area is complex enough that manual auditing isn’t practical. If you have agents touching sensitive customer data or operating in regulated contexts, runtime policy enforcement is worth taking seriously.

Does OpenShell work with any AI model or agent framework?

OpenShell is designed to be model-agnostic and framework-agnostic. Because it operates at the execution layer — intercepting and evaluating agent actions — it doesn’t depend on how the underlying agent is built or which model it uses. This makes it applicable across a range of agent frameworks, whether you’re running LangChain, CrewAI, custom Python agents, or proprietary orchestration systems.

Key Takeaways

  • OpenShell is Nvidia’s open-source security runtime for AI agents — it enforces security policy at the execution layer without modifying agent code or model behavior.
  • YAML-based configuration makes security policies human-readable, version-controllable, and auditable by compliance and security teams, not just developers.
  • Core capabilities include deny-by-default agent sandboxing, data routing controls that can technically enforce regulatory requirements, and structured audit logging for compliance.
  • Open source means organizations can inspect, self-host, and customize the runtime — critical for security tooling where trusting a closed-source vendor isn’t sufficient.
  • Multi-agent architectures benefit from OpenShell’s per-agent policy model, which governs each component individually rather than trying to reason about security at the system level.
  • For teams building enterprise AI agents, MindStudio provides a structured, no-code building environment where every workflow is explicitly defined — a practical starting point that complements runtime security tools like OpenShell.