Abstract

Organizations invest heavily in building domain environments — the tools, data pipelines, knowledge bases, access controls, and operational procedures that make a specific domain functional. But once built, distributing the capability of these environments to users who work outside them is an unsolved problem. Current approaches — platform UIs, REST APIs, tool servers — each force a tradeoff between capability and accessibility.

We present Environment as a Service (EaaS), an architecture that distributes the capability of domain environments by embedding an AI agent inside the environment and exposing it as the service interface. The caller — typically another AI system acting on behalf of a user — communicates with the service agent through natural language. The agent, which has mediated access to the environment's tools, data, knowledge, and security policies, processes the request and returns structured results.

This single design decision — agent as the interface — simultaneously achieves three properties that are impossible to deliver together through tool-level interfaces:

  1. Capability abstraction: The caller does not need to understand the environment's tool stack, data model, or operational procedures.
  2. Semantic security: The security boundary is enforced by an agent that evaluates intent, not just permissions — and the caller never touches domain credentials.
  3. Knowledge amplification: Every interaction enriches the environment's accumulated knowledge, benefiting all future callers regardless of their entry point.

We further argue that EaaS is not merely a better architecture but a necessary one. In a world where AI agent environments are inherently insecure — vulnerable to supply chain attacks, prompt injection, and credential exfiltration — any architecture that distributes tools or credentials to the caller's environment distributes attack surface. EaaS is the only pattern that makes an environment's capability accessible without distributing the underlying tools, credentials, or attack surface to the caller, because the only thing that crosses the boundary is language.

We ground EaaS in a production domain intelligence system for data analytics, but the architecture is domain-agnostic. The pattern applies wherever a rich operational environment exists and external users need access to its capability without adopting its interface.


1. The Distribution Problem

When a team builds a domain environment, they assemble a remarkable concentration of capability: databases and query engines, visualization tools, experiment platforms, monitoring dashboards, ETL pipelines, accumulated standard operating procedures, and — increasingly — a body of learned institutional knowledge. The environment is powerful precisely because everything is connected: the query engine knows which tables are reliable, the SOPs know the correct workflow for each task type, the knowledge base remembers edge cases discovered over months of operation.

This capability is locked inside the environment.

A data scientist working in a Jupyter notebook cannot access it. An engineer in their IDE cannot access it. A product manager using their preferred AI assistant cannot access it. To use the environment's capability, they must leave their tools, enter the environment's interface, learn its conventions, and work within its constraints.

This creates an organizational paradox: the more powerful the environment becomes, the more friction it creates for users who could benefit from it but work elsewhere. The environment's capability is both its greatest asset and its distribution bottleneck.

The problem is not technical in the narrow sense — it is architectural. How do you distribute the capability of an environment without requiring users to be inside it?

2. Current Approaches and Their Limitations

Three approaches exist today. Each makes a different tradeoff, and none resolves the fundamental tension.

2.1 Platform UI

The most common approach: build a web application as the environment's front end. Users log in, interact through the UI, and access the full capability of the environment.

This works well for dedicated users who spend their working hours inside the platform. But it fails for occasional users, users who prefer different tools, and users whose workflows span multiple environments. A data scientist analyzing data in a notebook does not want to context-switch to a separate web UI to look up a metric definition, then switch back to continue analysis.

The platform UI is also the highest-investment approach: every new capability requires building new UI surfaces, and the UI itself becomes a maintenance burden.

2.2 APIs and SDKs

The engineering approach: expose the environment's capabilities as REST or GraphQL endpoints, and optionally wrap them in language-specific SDKs. Users call the API from their own tools.

This works for well-defined, atomic operations — create a report, run a query, fetch a metric. But domain environments are valuable precisely because they support complex, multi-step workflows that require judgment: choosing the right data source, applying the correct metric definition, validating results against known edge cases, following SOPs for specific task types.

APIs expose actions. They do not expose judgment. A caller who does not understand the environment's data model, metric definitions, and operational conventions will use the API incorrectly — not because the API is bad, but because domain expertise cannot be serialized into endpoint documentation.

2.3 Tool Servers and MCP

The AI-native approach: expose the environment's tools through a protocol like MCP (Model Context Protocol) that allows AI systems to discover and invoke them. The caller's AI agent iterates through the available tools, understands their schemas, and orchestrates them to accomplish tasks.

This is the closest existing approach to what EaaS achieves, and it is instructive to understand precisely where it falls short.

The orchestration burden falls on the caller. When an AI system discovers a tool server with twenty tools — search reports, run query, get card metadata, create draft, manage pipeline — it must decide which tools to use, in what order, with what parameters. This requires domain knowledge that the caller's AI does not have. It does not know that "search the index first, then fetch metadata, then run the report" is the correct workflow. It does not know that one data source is more reliable than another for retention queries. It does not know the SOP for report optimization.

Security is at the tool level, not the intent level. Tool servers enforce access control per tool: can this token call this endpoint? This cannot distinguish between "run a query to check last week's retention" and "run a query to export all user email addresses." Both are calls to the same query tool with different SQL.

Knowledge does not travel. Even if the environment has accumulated months of institutional knowledge — metric definitions, data source reliability notes, edge case warnings — this knowledge is not accessible through tool interfaces. The caller's AI operates without it.

In short, MCP and tool servers expose mechanism without context. They give the caller tools but not the knowledge of how to use them, not the judgment of when to use them, and not the security awareness of what should not be done with them. (A deeper structural analysis of why this gap cannot be closed within MCP's architecture appears in Section 9.)

3. Environment as a Service

We propose a different approach. Instead of exposing the environment's tools, we expose the environment itself — through an AI agent that lives inside it.

3.1 The Environment as the Unit of Service

The key conceptual shift is in what constitutes the service boundary.

In APIs, the unit of service is a function: one endpoint, one operation, one response.

In tool servers, the unit of service is a tool: one capability, discoverable and invocable.

In EaaS, the unit of service is the environment: the complete operational context — tools, data, knowledge, procedures, security policies — mediated by an agent that understands all of it.

The caller does not interact with individual tools. The caller interacts with the environment through its agent. The distinction matters in the same way that hiring a domain expert differs from subscribing to a toolkit. The expert brings judgment, context, and accumulated experience. The toolkit brings only mechanism.

3.2 Agent as the Interface

The defining design decision of EaaS is: the interface between the caller and the environment is an AI agent.

This agent is not a thin wrapper around tools. It is a full reasoning system that:

  1. Understands natural language requests and the intent behind them.
  2. Consults the environment's accumulated knowledge before acting.
  3. Selects and orchestrates tools according to the environment's operational procedures.
  4. Enforces the environment's security policies on every request.
  5. Returns structured results rather than raw tool output.

The critical insight is that this is not merely a convenience layer. Agent-as-interface changes the fundamental properties of the service.

With tool interfaces, the service's quality is bounded by the caller's domain knowledge. A caller who does not understand the tools will use them poorly, regardless of how good the tools are.

With an agent interface, the service's quality is bounded by the environment's domain knowledge — which, as we argue in Section 5, compounds over time. The caller only needs to describe intent. The agent supplies the expertise.

3.3 The Two-AI Delegation Pattern

EaaS naturally gives rise to a new computational pattern: two AI systems collaborating through natural language, each expert in its own context.


User ←→ Caller AI ←→ EaaS Agent
         (user context)    (domain context)

The Caller AI — running in the user's environment (notebook, IDE, terminal, or another AI assistant) — understands the user's intent, the local working context, and how to present results where the user is working.

The EaaS Agent — running inside the domain environment — understands the environment's tools, its data model and metric definitions, its operational procedures, and its accumulated knowledge.

Neither AI needs to understand the other's domain. The Caller AI does not need to know how Metabase works. The EaaS Agent does not need to know what notebook the user is running. Natural language is the protocol between them, and it is sufficient because each system operates at the level of intent, not mechanism.

This pattern is fundamentally different from function calling, API composition, or tool orchestration. Those patterns require the caller to understand the callee's interface at a structural level. The Two-AI Delegation Pattern requires understanding only at the semantic level — what, not how.

Consider a concrete example. A data scientist in a Jupyter notebook asks their local AI assistant: "What was our 7-day retention last week, and how does it compare to the previous month?"

Without EaaS, the local AI would need to: discover which tool to query, understand the retention metric definition, know which table to use, write correct SQL, handle the comparison logic, and validate the result. It has none of this knowledge.

With EaaS, the local AI sends a single natural language request to the EaaS Agent. The EaaS Agent consults the environment's knowledge base for the retention metric definition, selects the appropriate saved report or writes a validated query, runs it, compares against historical baselines, and returns a structured result with the numbers, a comparison, and any caveats. The local AI formats this into the notebook.

Each AI did what it was best at. Neither needed to understand the other's environment.
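The delegation in this example can be sketched in a few lines. The class names, the stub knowledge base, and the single-metric lookup below are illustrative, not a prescribed implementation; the point is that only language crosses the boundary between the two systems.

```python
class EaaSAgent:
    """Lives inside the domain environment; owns knowledge and tools."""

    def __init__(self):
        # Stub knowledge base standing in for accumulated domain knowledge.
        self.knowledge = {
            "7-day retention": "share of users active on day 7 after signup",
        }

    def ask(self, request: str) -> dict:
        # A real agent would consult knowledge, select tools, run the
        # workflow, and validate results; here we only resolve the metric.
        metric = next((m for m in self.knowledge if m in request), None)
        return {
            "narrative": f"Computed {metric} using the canonical definition.",
            "definition": self.knowledge.get(metric),
            "caveats": [],
        }


class CallerAI:
    """Runs in the user's environment; knows the user, not the domain."""

    def __init__(self, service: EaaSAgent):
        self.service = service

    def handle(self, user_question: str) -> str:
        # The only thing that crosses the boundary is language.
        result = self.service.ask(user_question)
        return result["narrative"]


caller = CallerAI(EaaSAgent())
answer = caller.handle("What was our 7-day retention last week?")
```

Neither class imports anything from the other's world: the CallerAI holds no SQL, no schemas, no credentials, and the EaaSAgent never sees the notebook.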

4. The Semantic Security Boundary

EaaS introduces a security model that is qualitatively different from what tool-level interfaces can achieve. We call it the semantic security boundary: a security perimeter where access decisions are made based on the meaning of a request, not just its mechanical form.

4.1 From Action-Level to Intent-Level Security

Traditional security in APIs and tool servers operates at the action level: can this token call this endpoint? Does this key have permission for this operation? Is the request within rate limits?

This is necessary but insufficient for domain environments. The same tool — a SQL query endpoint — can be used to "check last week's retention rate" or to "export all user PII." Both are syntactically identical: a query tool invocation with different SQL. Action-level security cannot distinguish them.

EaaS security operates at the intent level. The service agent evaluates the meaning of each request before executing it: is this request appropriate for the caller's role? Is the scope of data proportionate to the stated purpose? Is it consistent with the caller's history of legitimate use?

This is possible only because the interface is an AI agent that understands natural language. A tool endpoint receives structured parameters and cannot reason about intent. An agent receives a natural language request and can evaluate its purpose, its proportionality, and its consistency with the caller's history.

4.2 Credential Isolation

In EaaS, the caller's AI never touches domain credentials. Database passwords, API keys, service tokens, connection strings — all remain inside the environment boundary. The caller holds only a short-lived identity token — which is a credential in the authentication sense, but carries no domain-level privilege: it cannot query a database, call an internal API, or access any service directly. The caller authenticates as a person; the service agent maps that identity to appropriate internal credentials.

This eliminates an entire class of security risks that plague tool-level interfaces: credentials leaked through environment variables or config files, keys exfiltrated by compromised dependencies, and tokens accidentally committed to repositories or logs.

In a tool server model, distributing capability requires distributing credentials — or building elaborate proxy authentication layers that are themselves attack surfaces. In EaaS, the credential boundary and the service boundary are the same thing.

4.3 Identity-Bound Short-Lived Access

The EaaS authentication model is designed around three principles:

  1. Every request is bound to a real human identity. Not to a shared API key, not to a service account, but to a specific person. This enables fine-grained access control and complete audit trails.
  2. Access tokens are short-lived. A typical token lifetime of eight hours — one working session — means that compromised tokens expire before they can cause sustained damage. Users authenticate through a browser-based OAuth flow at the start of each working day.
  3. The environment controls the identity-to-permission mapping. When a request arrives with a user identity, the service agent determines what that user can access internally. This mapping is maintained inside the environment and is invisible to the caller.

This model is deliberately similar to how gcloud auth login or gh auth login work: authenticate as yourself through a browser, receive a short-lived token, and use it until it expires. Users already understand this pattern. What is new is that the token grants access not to raw APIs but to an intelligent agent that mediates every interaction.

4.4 Layered Defense Within the Environment

The semantic security boundary is not a single checkpoint. Inside the environment, security operates in layers:

  1. Entry interception: Request-level validation, rate limiting, and anomaly detection. Requests that are syntactically malformed, abnormally large, or arriving at unusual patterns are flagged before reaching the agent.
  2. Intent evaluation: The service agent — or a dedicated security agent — evaluates whether the request is appropriate for the caller's identity and role. This is the semantic layer: understanding what the request means, not just what it asks for.
  3. Execution confinement: The agent executes using credentials and permissions scoped to the authenticated user. Even if the agent's reasoning were compromised, it cannot exceed the user's authorized access level.
  4. Output filtering: Before results are returned, a filtering pass ensures that sensitive information — credentials, internal service addresses, PII beyond the user's clearance — is not included in the response.
  5. Periodic audit: A background process reviews the accumulated request and response history for patterns that individual-request evaluation might miss: gradual data exfiltration, systematic probing, or usage drift.

This defense-in-depth is natural in EaaS because the service agent controls the entire request lifecycle. In a tool server model, each tool is responsible for its own security, and cross-tool security concerns (like detecting a multi-step exfiltration pattern) fall through the cracks.
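The five layers can be sketched as a request pipeline. Every function body here is a stub (in particular, real intent evaluation is performed by an agent, not a keyword check), but the control flow shows how each layer can reject or transform the request before the next one sees it:

```python
AUDIT_LOG: list[tuple[str, str]] = []   # reviewed periodically (layer 5)


def entry_interception(text: str) -> bool:
    # Layer 1 (stub): syntactic validation; rate limiting and anomaly
    # detection would also live here.
    return isinstance(text, str) and 0 < len(text) < 10_000


def intent_appropriate(text: str, role: str) -> bool:
    # Layer 2 (stub): a real system delegates this to the agent itself,
    # which reasons about purpose and proportionality.
    bulk_export = "export all" in text.lower()
    return not bulk_export or role == "admin"


def execute_scoped(text: str, user: str) -> dict:
    # Layer 3 (stub): runs with credentials scoped to the authenticated user.
    return {"narrative": f"result for {user}", "internal_host": "db.internal"}


def filter_output(result: dict) -> dict:
    # Layer 4: sensitive fields never cross the boundary.
    sensitive = {"internal_host", "credentials"}
    return {k: v for k, v in result.items() if k not in sensitive}


def handle(text: str, user: str, role: str) -> dict:
    if not entry_interception(text):
        return {"error": "rejected at entry"}
    if not intent_appropriate(text, role):
        return {"error": "rejected by intent evaluation"}
    AUDIT_LOG.append((user, text))      # layer 5 material
    return filter_output(execute_scoped(text, user))
```

Because one function owns the lifecycle, a cross-request pattern (say, many small queries from one user) is visible in AUDIT_LOG, which is exactly what per-tool security cannot see.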

5. The Knowledge Amplification Effect

Domain environments do not only contain tools and data. Over time, they can accumulate knowledge — institutional understanding of how things work, what the right approaches are, where the pitfalls lie. A separate line of work, the Self-Evolving Knowledge Engine (SEKE), has explored how this accumulation can be made autonomous: instead of relying on human curation, the agent learns domain knowledge from its own work, organizes it into a semantic tree that evolves in structure, refines it through adversarial review, and operates within a governance framework of human constitutional decisions. The core thesis is that General Intelligence + Capabilities + Domain Knowledge = Domain Intelligence — and that a system with a knowledge engine can bootstrap itself into domain expertise without custom training or hand-crafted knowledge bases. (For a full treatment, see From Agent to Domain Intelligence: A Self-Evolving Knowledge Engine.)

When the environment behind EaaS includes such a knowledge accumulation system, EaaS creates a powerful amplification effect.

5.1 More Entry Points, Same Knowledge Pool

Without EaaS, only users of the environment's native interface contribute to knowledge accumulation. Every question answered, every edge case discovered, every workflow optimized feeds back into the knowledge system — but only through one channel.

With EaaS, every external caller also contributes: a question from a notebook, a request from an IDE, an invocation from a CLI — every channel becomes a source of learning material.

The knowledge system does not distinguish between requests from the native UI and requests from EaaS. All interactions produce work traces. All work traces feed the knowledge evolution loop. More entry points mean more interactions, more diversity of questions, and faster knowledge accumulation.

5.2 The Flywheel

This creates a self-reinforcing cycle:


More entry points (EaaS)
→ More interactions from diverse contexts
→ Richer material for knowledge capture
→ Better knowledge base
→ Better service quality
→ More users adopt EaaS
→ More entry points

This is a positive feedback loop, but — importantly — the knowledge system itself operates through negative feedback: errors produce incorrect results, reality pushes back (users challenge answers, queries return numbers that don't match expectations), and the knowledge gets corrected. The positive loop is in adoption and diversity of input; the negative loop is in knowledge quality. Together, they produce a system that grows in both reach and accuracy.

5.3 Cross-Context Knowledge Transfer

EaaS enables a form of knowledge transfer that is impossible with isolated tool access. When a user working in the platform UI discovers a workflow optimization, that knowledge enters the tree. When a notebook user later encounters a similar task through EaaS, the agent retrieves and applies that knowledge — even though the two users have never interacted and use completely different tools.

The knowledge is not locked in any single user's environment. It lives in the service's environment, accessible to every caller through the agent's mediation. This is fundamentally different from documentation (which users must find and read) or API changelogs (which describe mechanism, not judgment). The knowledge is applied automatically by the agent when relevant — invisible to the caller but shaping the quality of every response.

6. Reference Architecture

While the specific implementation varies by domain, EaaS has a consistent architectural structure.

6.1 Components

The Environment contains the domain's tools (query engines, dashboards, pipelines), its data, its accumulated knowledge base and SOPs, its security policies, and the service agent itself.

The EaaS API Layer provides the request endpoint, entry interception (validation, rate limiting, logging), and streaming of progress and results back to the caller.

The Auth Layer provides the browser-based OAuth flow, issues short-lived identity-bound tokens, and maintains the identity-to-permission mapping inside the environment.

Client-side integration can take multiple forms: a CLI, a language SDK, or an MCP wrapper.

Note the MCP wrapper: this is MCP at the client boundary, wrapping a single ask tool that calls the EaaS agent. This is fundamentally different from exposing the environment's tools as MCP. The caller's AI sees one tool ("ask the domain environment"), not twenty tools it doesn't understand.

6.2 Request Flow


1. Caller AI constructs a natural language request
2. Client SDK sends request + identity token to EaaS API
3. Entry interception: validate token, check rate limits, log request
4. Service agent receives request in environment context
5. Agent reads relevant knowledge, selects tools, executes workflow
6. Output filtering: remove sensitive information
7. Structured result returned to caller
8. Knowledge capture: work trace feeds the knowledge system

Steps 4-5 are where the environment's capability lives. The agent's effectiveness here depends on the richness of the environment — tools, knowledge, and policies — not on the caller's sophistication.
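A compressed sketch of this flow, with the security layers collapsed into single calls (all names are illustrative). The detail worth noticing is step 8: the same lifecycle that serves the caller also records a work trace for the knowledge system.

```python
WORK_TRACES: list[dict] = []            # step 8: feeds the knowledge system


def authenticate(token: str) -> str:
    # Step 3 (stub): map the short-lived token back to a human identity.
    assert token.startswith("token-"), "invalid token"
    return token.removeprefix("token-")


def consult_knowledge(text: str) -> str:
    # Step 5 (stub): knowledge is read before any tool is selected.
    if "retention" in text:
        return "apply canonical 7-day retention definition"
    return "no specific guidance; proceed with defaults"


def handle_request(text: str, token: str) -> dict:
    user = authenticate(token)                          # steps 2-3
    plan = consult_knowledge(text)                      # step 5
    result = {"narrative": f"{plan} (run for {user})"}  # steps 5-7 collapsed
    WORK_TRACES.append({"user": user, "request": text, "plan": plan})
    return result
```

The work trace is appended on the service side, invisibly to the caller, which is why every entry point enriches the same knowledge pool.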

7. Design Decisions and Tradeoffs

7.1 Synchronous vs. Asynchronous

Some domain operations complete in seconds (index lookup, cached metric retrieval). Others take minutes (complex multi-step analysis, large SQL queries). EaaS must support both.

The recommended approach is a unified streaming interface. Short requests complete quickly within the stream. Long requests stream progress updates (thinking, executing, validating) before delivering the final result. The caller can choose to block until completion or process the stream incrementally.

An alternative is dual-mode: synchronous for quick operations, asynchronous (submit-then-poll) for long ones. This adds complexity to the client and requires the caller's AI to decide which mode to use — a decision that should be made by the service, not the caller.
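The unified streaming interface can be sketched as a generator of events, with blocking consumption layered on top (event names and payloads are illustrative):

```python
from typing import Iterator


def serve(request: str) -> Iterator[dict]:
    # Short requests would yield only the final event; long requests
    # stream progress first. The service decides, not the caller.
    yield {"event": "thinking", "detail": "selecting data source"}
    yield {"event": "executing", "detail": "running validated query"}
    yield {"event": "result", "data": {"retention_7d": 0.41}}


def ask_blocking(request: str) -> dict:
    # A caller that prefers to block simply drains the stream.
    final = {}
    for event in serve(request):
        final = event
    return final["data"]
```

The same interface serves both styles: an interactive caller renders each progress event, while a batch caller calls ask_blocking and waits for the result.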

7.2 Stateless vs. Lightweight Stateful

Pure stateless operation — every request is independent — is the simplest model and works for most single-turn queries. But multi-turn interactions are common: "What was retention last week?" followed by "Break that down by platform."

The recommended approach is lightweight stateful: each response includes a conversation identifier that the caller can optionally send with subsequent requests to continue the context. The service manages the state internally. If the caller omits the identifier, the request is treated as independent.

This avoids burdening the caller with context management while enabling multi-turn interactions when needed.

7.3 Result Format

The service agent returns structured blocks rather than raw text. Typical block types include narrative text, data tables, and caveats or warnings.

This structured format serves both AI and human consumers. The caller's AI can parse and transform the data; a human viewing the result can read the narrative and inspect the tables.
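An illustrative payload, assuming block types such as narrative, table, and caveat; the concrete set of types is a per-deployment choice, not fixed by the architecture.

```python
result = {
    "blocks": [
        {"type": "narrative",
         "text": "7-day retention was 41% last week, up 2pp vs. last month."},
        {"type": "table",
         "columns": ["week", "retention_7d"],
         "rows": [["W-5", 0.39], ["W-1", 0.41]]},
        {"type": "caveat",
         "text": "Older reports may use the legacy retention definition."},
    ]
}


def texts(res: dict, block_type: str) -> list[str]:
    # The caller's AI parses blocks mechanically; a human reads the text.
    return [b["text"] for b in res["blocks"] if b["type"] == block_type]
```

A notebook assistant might load the table block into a DataFrame while surfacing the caveat verbatim; both consumers work from the same response.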

7.4 The Single-Tool Client Pattern

On the client side, EaaS is best exposed as a single tool — regardless of the integration mechanism (CLI, SDK, or MCP wrapper). The caller's AI sees one capability: "ask the domain environment." Not "search reports," "run query," "get metadata" — just "ask."

This is deliberate. The moment you expose multiple tools on the client side, you push orchestration responsibility back to the caller's AI — which is exactly what EaaS exists to avoid. One tool, natural language input, structured output. All orchestration stays inside the environment.
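The single-tool pattern can be sketched as one tool schema plus a dispatcher that refuses everything else. The schema shape loosely follows common function-calling conventions; the field names are illustrative.

```python
ASK_TOOL = {
    "name": "ask_domain_environment",
    "description": "Send a natural language request to the data analytics "
                   "environment and receive a structured result.",
    "parameters": {
        "type": "object",
        "properties": {"request": {"type": "string"}},
        "required": ["request"],
    },
}


def dispatch(tool_name: str, arguments: dict) -> dict:
    # The client only forwards; all orchestration stays in the environment.
    if tool_name != ASK_TOOL["name"]:
        raise ValueError("only one tool exists")
    return {"status": "forwarded", "request": arguments["request"]}
```

Because the caller's AI sees exactly one tool, there is no workflow for it to get wrong: the only decision it can make is what to ask.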

8. Relationship to Domain Intelligence

EaaS extends the Domain Intelligence thesis articulated in SEKE.

The SEKE thesis holds that:

General Intelligence + Capabilities + Domain Knowledge = Domain Intelligence

SEKE addresses how Domain Knowledge is created: through autonomous learning from real work, organized into an evolving semantic tree, refined through adversarial review, and governed by human constitutional decisions.

EaaS addresses how Domain Intelligence is distributed: through an agent-mediated interface that makes the environment's full capability — including its accumulated knowledge — accessible to users who work outside the environment.

Together, they form a complete cycle:


SEKE: Build Domain Intelligence (knowledge accumulation)
EaaS: Distribute Domain Intelligence (agent-mediated access)
      ↓
More users, more contexts, more interactions
      ↓
Richer material for SEKE knowledge capture
      ↓
Deeper Domain Intelligence
      ↓
Better EaaS service quality

This cycle has a property worth noting: SEKE and EaaS are mutually reinforcing. SEKE makes EaaS more valuable by deepening the knowledge the agent can draw on. EaaS makes SEKE more effective by multiplying the interactions that feed knowledge capture. Neither alone achieves what both together produce.

The competitive implication is that Domain Intelligence, once bootstrapped through SEKE and distributed through EaaS, creates a compounding advantage at both layers simultaneously. A competitor would need to replicate not just the knowledge base (which requires months of real-world operation) but also the distribution network (which creates the input diversity that drives knowledge quality).

9. Why Not Just MCP? A Deeper Analysis

The comparison with MCP deserves careful treatment, because MCP is the most natural alternative and the most commonly proposed solution to the distribution problem.

MCP's architecture is: expose tools → let the caller's AI orchestrate → return results per tool.

EaaS's architecture is: expose an agent → let it orchestrate internally → return results per request.

The difference is not incremental. It is a difference in where intelligence sits relative to the service boundary.

In MCP, intelligence (orchestration, judgment, domain knowledge) must exist on the caller's side. The service provides mechanism. The caller provides meaning.

In EaaS, intelligence exists on the service side. The service provides both mechanism and meaning. The caller provides only intent.

This has a cascading consequence: MCP's quality ceiling is the caller's domain knowledge; EaaS's quality ceiling is the environment's domain knowledge. Since the environment's knowledge compounds over time (through SEKE or similar systems) while the caller's knowledge is typically static, EaaS's quality ceiling rises continuously while MCP's remains fixed.

There is also a structural impossibility in MCP that is worth noting: MCP cannot expose knowledge. MCP exposes tools with schemas and descriptions. But knowledge — "this data source is unreliable on Mondays because of a known ETL delay," "this metric was redefined last quarter and old reports use the legacy definition," "this query pattern causes timeouts on tables larger than 10M rows" — cannot be represented as tool schemas. It can only be represented as context that an agent reasons about while working.

EaaS is the only architecture where this knowledge is both available and applied.

A note on hybrid designs. An attentive reader will observe that nothing prevents an MCP server from exposing a single high-level ask tool backed by an agent. This is true — and this paper's own reference architecture (Section 6) proposes exactly this as a client-side integration pattern. But this observation does not undermine the argument; it confirms it. An MCP server that wraps a single agent-backed ask tool, with knowledge retrieval, semantic security, and credential isolation on the server side, is not MCP in the sense critiqued above. It is EaaS, delivered through MCP as a transport protocol. The critique in this section targets the common pattern of exposing many domain-specific tools through MCP and relying on the caller's AI to orchestrate them — not the use of MCP as a wire format for a fundamentally different service architecture.

10. The Security Imperative: Why EaaS Is Not Optional

The arguments so far have presented EaaS as a better architecture — more capable, more elegant, more amenable to knowledge accumulation. But there is a stronger claim: in a world where agent environments are fundamentally insecure, EaaS is the only safe way to distribute an environment's capability.

10.1 The Insecurity of Agent Environments

Today's AI agents — Codex, Claude Code, Cursor, Copilot, and their successors — operate in the user's local environment with broad capabilities. They can install packages, access the network, read and write the filesystem, execute arbitrary code, and interact with external services. This is not a design flaw; it is a requirement. Agents need these capabilities to be useful.

But these same capabilities make the agent environment inherently insecure. The threat model is not hypothetical:

Supply chain attacks. When an agent runs pip install or npm install, it trusts the package repository. A single compromised package — injected through typosquatting, maintainer account takeover, or dependency confusion — gains code execution in the agent's environment. This is not theoretical: supply chain attacks on package registries are well-documented and increasing in frequency. Agents amplify this risk because they install packages more aggressively and with less human scrutiny than manual development.

Malicious tool providers. An MCP server, a plugin, a skill, an extension — any third-party capability that an agent loads becomes part of the agent's trusted computing base. A malicious MCP server can inject instructions through prompt injection, exfiltrate data through tool responses, or manipulate the agent's behavior in ways that are invisible to the user.

Prompt injection through data. When an agent reads external content — web pages, documents, API responses, even database results — that content can contain adversarial instructions. If the agent interprets these instructions, the attacker gains indirect control over the agent's actions, including its use of tools and credentials.

Environment exfiltration. An agent that can read environment variables and access the network can exfiltrate credentials in a single operation. No sophisticated exploit is required — reading os.environ and sending an HTTP request is sufficient. Any compromised component in the agent's dependency chain can do this silently.

The uncomfortable conclusion is that we should assume any agent environment with network access and package installation capability will eventually be compromised. The attack surface is too large, the supply chain too deep, and the defenses too weak. Large-scale agent-native attacks — targeting the millions of developer machines running AI coding assistants — have not yet occurred at scale, but they are trivially easy to execute. The absence of widespread attacks reflects attacker priorities, not attacker inability.

10.2 Capability Distribution as Attack Surface Distribution

This has a direct consequence for how domain capabilities should be distributed.

When an organization distributes capability to agent environments — through APIs, SDKs, MCP servers, or tool integrations — it is distributing attack surface. Every credential shared with the user's environment is a credential that a compromised environment can exfiltrate. Every tool exposed to the user's agent is a tool that a compromised agent can abuse.

Consider the spectrum of distribution approaches:

Distribute credentials directly (API keys, database passwords). The most common and most dangerous approach. Once a credential reaches a compromised environment, the attacker has the same access as the legitimate user — and typically no rate limiting, no behavioral analysis, and no intent evaluation. A database password leaked from one developer's machine gives an attacker direct access to production data.

Distribute capability through proxied access (tool servers, MCP). Better: the caller does not hold raw credentials. But the tools themselves become the attack surface. A compromised agent with access to a query tool can execute arbitrary SQL through that tool. A compromised agent with access to a deployment tool can push malicious code. The tools provide capability without judgment — and a compromised caller has no judgment.

Distribute capability through fine-grained permissions (scoped tokens, per-tool ACLs). Better still, but insufficient. Fine-grained permissions limit what each tool can do, but they cannot limit what the tools are used for. A read-only query token scoped to specific tables still allows a compromised agent to exfiltrate all data from those tables. The permissions are correct at the action level; the usage is malicious at the intent level.

None of these approaches solves the fundamental problem: if the caller's environment is compromised, any capability distributed to it will be used against you. This is true whether that capability is a credential, a tool, or a finely scoped permission.

10.3 EaaS as the Only Safe Architecture

EaaS resolves this by keeping capability inside the environment — no tools, no service credentials, no query interfaces are distributed to the caller.

In EaaS, the caller's environment contains exactly two things related to the domain service:

  1. A function that sends natural language text to an endpoint
  2. A short-lived identity token

No credentials. No tools. No query interfaces. No deployment capabilities. No MCP servers. Nothing that a compromised environment could use to directly access domain systems.
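
The caller's entire domain footprint can be sketched in a few lines. This is an illustrative sketch, not a prescribed API: the endpoint URL, function names, and request shape are assumptions; only the two ingredients (natural language in, short-lived token attached) come from the text above.

```python
import json
import urllib.request

# Hypothetical service endpoint; the real URL is deployment-specific.
EAAS_ENDPOINT = "https://eaas.example.internal/v1/request"

def build_request(request_text: str, identity_token: str) -> urllib.request.Request:
    """Construct the HTTP request: natural language in the body, the
    short-lived identity token in the Authorization header. Nothing
    else about the domain exists in the caller's environment."""
    body = json.dumps({"request": request_text}).encode("utf-8")
    return urllib.request.Request(
        EAAS_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {identity_token}",  # short-lived token
        },
    )

def ask_domain_service(request_text: str, identity_token: str) -> dict:
    """Send a natural language request; receive structured results."""
    with urllib.request.urlopen(build_request(request_text, identity_token)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Note what is absent: no database drivers, no SDKs, no credential files, no tool registrations. A compromised caller environment can steal only this function and a token that expires.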

Even if the caller's environment is fully compromised — every package backdoored, every MCP server malicious, the agent itself hijacked — the attacker can only:

  1. Send natural language requests to the service endpoint
  2. Present the short-lived identity token until it expires

The attack surface collapses from "everything the user's agent can access" to "what can a short-lived identity token accomplish through a natural language interface that evaluates intent." This is not a marginal improvement. It is a categorical reduction in risk.

10.4 Infrastructure-Level Sandboxing of the Service Agent

A natural objection arises: if agent environments are inherently insecure, isn't the EaaS agent itself — which is also an AI agent — equally vulnerable? The answer is no, and the reason is architectural, not aspirational.

The EaaS agent runs in centrally-managed infrastructure, not on a user's machine. This single fact enables a class of sandboxing measures that are impossible in user environments.

Credential-free execution through proxy injection. In a user environment, the agent needs credentials to access services — database passwords, API keys, service tokens — and those credentials must exist somewhere the agent can reach them: environment variables, config files, secret stores. In EaaS, the agent process never possesses credentials at all. All network requests from the agent are routed through a reverse proxy that inspects the request, determines the target service, and injects the appropriate credentials at the proxy layer before forwarding. The agent sends a request to "the dashboarding API"; the proxy adds the authentication header. The agent sends a query to "the database"; the proxy adds the connection credentials. The agent's process memory, environment variables, and filesystem contain zero credential material.

This means that even if the service agent is fully compromised — through adversarial prompting, data-borne prompt injection, or a hypothetical reasoning exploit — the attacker gains access to an agent that structurally cannot leak credentials, because it never had them.
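
The injection step can be sketched as a single function on the proxy side. Everything here is illustrative: the logical service names, header shapes, and credential store are assumptions, not a prescribed design. The essential property is that the store lives only in the proxy's memory, out of the agent's reach.

```python
# Credential store held by the proxy. The agent process never sees it:
# it is not in the agent's environment variables, config, or filesystem.
_CREDENTIALS = {
    "dashboarding-api": {"Authorization": "Bearer svc-dash-example"},
    "warehouse":        {"X-DB-Token": "wh-example"},
}

def inject_credentials(target: str, agent_headers: dict) -> dict:
    """Return the headers the proxy forwards: agent headers plus creds.

    The agent names a logical target ("the dashboarding API"); the
    proxy resolves it and attaches real credentials. Unknown targets
    are rejected, so the proxy cannot be used as a generic tunnel.
    """
    if target not in _CREDENTIALS:
        raise PermissionError(f"no route to target: {target}")
    forwarded = dict(agent_headers)         # agent-supplied, credential-free
    forwarded.update(_CREDENTIALS[target])  # injected at the proxy layer
    return forwarded
```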

Capability confinement through an external executor. In a user environment, the agent has broad system access: it can run shell commands, access the network directly, read arbitrary files. Restricting this is impractical because the user needs the agent to do general-purpose work. In EaaS, the agent's capabilities are mediated through a fixed Unix socket interface to an external Executor process. The agent cannot execute shell commands, make direct network calls, or access the filesystem outside its workspace. Instead, it sends structured operation requests through the socket, and the Executor — a hardened, non-AI process with a finite and auditable set of permitted operations — decides whether to execute them.

The permitted operation set is small and explicit: run a specific tool, read a specific knowledge file, write a result artifact. The Executor rejects anything outside this set. The agent cannot install packages, cannot open arbitrary network connections, cannot read /etc/passwd or ~/.ssh/. Its capability envelope is defined by the Executor's allowlist, not by the operating system's permissions.
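
The Executor's decision step can be sketched as an allowlist check, assuming operation requests arrive over the socket as small structured dicts. The operation names and request shape are assumptions for illustration; the point is that anything outside a finite, explicit set is rejected before it can touch the system.

```python
# The agent's entire capability envelope: a finite, auditable set of
# operation names. Shell commands, network calls, and arbitrary file
# reads simply have no name here to invoke.
PERMITTED_OPS = {
    "run_tool",        # run a specific, registered tool
    "read_knowledge",  # read a specific knowledge file
    "write_artifact",  # write a result artifact to the workspace
}

def authorize(op_request: dict) -> bool:
    """Decide whether an operation request from the agent may execute.

    The Executor, not the operating system, defines what the agent
    can do; everything outside the allowlist is refused.
    """
    if op_request.get("op") not in PERMITTED_OPS:
        return False
    # Each permitted op would be further constrained to registered
    # arguments; only the structural check is sketched here.
    return isinstance(op_request.get("args"), dict)
```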

Why this is impractical in user environments. A precise statement of the asymmetry matters here. It is not that user-environment agents cannot be sandboxed in theory — the same proxy injection and executor confinement techniques could, in principle, be applied to any agent. It is that the cost of doing so at the user level creates an organizational risk that rivals the security risk it is meant to mitigate.

The cost operates at two levels:

Building the sandbox is expensive. Every developer's machine would need a credential-injecting proxy, an external executor with a curated operation allowlist, and the infrastructure to manage, update, and monitor these components. This is not a one-time setup: the sandbox must evolve as AI tools evolve, as new MCP servers are adopted, as workflows change. The engineering investment is proportional to the number of users times the diversity of their environments — a scaling factor that makes enterprise-wide deployment prohibitively expensive.

Enforcing the sandbox is even more expensive. A sandbox that users can bypass is not a sandbox. Enforcement means restricting what AI tools employees can run, how those tools access the network, and what packages they can install. In practice, this means limiting the very capabilities that make AI agents useful for general-purpose work. An organization that enforces strict sandboxing on its developers' AI agents is, in effect, throttling its own AI transformation.

This creates a dilemma that is organizational, not technical:

The risk of preventing AI transformation by imposing heavy sandboxing may equal or exceed the risk of a security incident. Both are serious organizational risks, just on different timescales: a data breach causes immediate damage; a stalled AI transformation erodes competitiveness gradually.

This is the fundamental asymmetry: the EaaS agent can be sandboxed at low cost because its purpose is narrow and the infrastructure is centralized (build once, serve all users), while user-environment agents can only be sandboxed at a cost that scales per-user and restricts the very capabilities that make AI adoption valuable. EaaS exploits this asymmetry by moving domain work into an environment where extreme sandboxing is economical, and leaving the user's agent free to do general work without domain credentials or capabilities — and without any sandbox at all.

The result is a defense-in-depth that operates at three independent levels:

  1. The caller never has credentials or capabilities (Section 10.3)
  2. The service agent never has credentials either — proxy injection and executor confinement ensure this architecturally
  3. The natural language interface evaluates intent before any operation reaches the executor (Section 10.5)

An attacker would need to compromise all three levels simultaneously: obtain a valid identity token, craft a natural language request that passes intent evaluation, and somehow circumvent the executor's operation allowlist. Each level is independently defensible, and the combination is qualitatively stronger than any single-layer security model.
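
The three levels read as a chain of independent checks on the service side. The functions below are hypothetical stand-ins (the intent check in particular is a toy keyword heuristic standing in for the agent's semantic evaluation); the structure of the composition, not the logic of any one check, is the point.

```python
import time

def token_valid(token: dict, now: float) -> bool:
    # Level 1: a short-lived identity token is all the caller holds.
    return token.get("expires_at", 0) > now

def intent_acceptable(request_text: str) -> bool:
    # Level 2: toy stand-in for the agent's semantic intent evaluation.
    return "exfiltrate" not in request_text.lower()

def op_permitted(op: str) -> bool:
    # Level 3: the Executor's operation allowlist.
    return op in {"run_tool", "read_knowledge", "write_artifact"}

def handle(token: dict, request_text: str, proposed_op: str) -> bool:
    """All three independent levels must pass; any one failing stops
    the request before it reaches domain systems."""
    now = time.time()
    return (token_valid(token, now)
            and intent_acceptable(request_text)
            and op_permitted(proposed_op))
```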

10.5 Natural Language as a Security Boundary

An unexpected consequence of the agent-as-interface design is that natural language itself functions as a security boundary.

When the interface to a system is an API, the attacker's language is the API's language: SQL, HTTP, structured function calls. These are precise, machine-interpretable, and exploitable — SQL injection, parameter tampering, request forgery are all attacks that exploit the precision of structured interfaces.

When the interface is natural language mediated by a reasoning agent, the attacker's language is human language — which the service agent interprets, evaluates, and may refuse. The traditional categories of injection attack do not apply: there is no SQL syntax to inject, no parameters to tamper with, and no structured request to forge. The only thing that arrives is a request whose intent the agent evaluates before acting.

This does not mean natural language interfaces are immune to all attacks. Adversarial prompting can attempt to manipulate the service agent's behavior. But the defense is also at the semantic level: the agent can be instructed to recognize and refuse adversarial patterns, and — critically — the agent's security policies operate as system-level constraints that natural language requests cannot override.

The security boundary is not just a filter. It is a reasoning system that understands intent. This is a qualitatively different defense from access control lists, network policies, or credential scoping.

10.6 The Inevitability Argument

We can now state the security argument for EaaS in its strongest form:

  1. Agent environments are insecure and will become more so as agents gain more capabilities and the supply chain grows more complex.
  2. Any capability distributed to an insecure environment will eventually be exploited.
  3. Therefore, the only safe way to make capability accessible to agents is to keep it inside a secure environment and expose only a natural language interface.
  4. The service agent inside that secure environment can itself be sandboxed — stripped of credentials through proxy injection, confined to a finite operation set through an external executor — because its purpose is narrow enough to permit extreme restriction.
  5. This is exactly what EaaS does.

The conclusion is not that EaaS is a better way to distribute capability. It is that EaaS is the only way to make an environment's capability accessible without distributing its tools, credentials, or attack surface to the caller. A compromised client can still issue authorized natural language requests — but the damage is bounded by the semantic security layers, the executor's operation allowlist, and the token's short lifetime. This is a categorical reduction from "everything the user can access," not an elimination of all risk.

Organizations that do not adopt EaaS or an equivalent architecture will face a trilemma as agent-native attacks become widespread:

  1. Distribute capability to user environments and absorb the eventual security incident
  2. Impose heavy sandboxing on user agents and stall the AI transformation
  3. Withhold the capability from agents entirely and lock it down

Each path carries significant cost. The security incident is immediate and visible. The stalled AI transformation is slow and invisible — but compounds over time. The locked-down capability is a persistent drag on organizational effectiveness.

EaaS eliminates the trilemma by separating capability from access. Domain capabilities stay inside a centrally-managed, heavily-sandboxed environment — built once, amortized across all users. User environments remain completely free: no sandbox, no restrictions, no credentials, no domain tools. The bridge between them is natural language through an agent that evaluates intent. Nothing dangerous crosses the boundary. Nothing restrictive is imposed on the user side.

11. Limitations and Tradeoffs

EaaS is not without costs. Several tradeoffs deserve honest acknowledgment.

Latency. The Two-AI Delegation Pattern adds a round trip: the caller's AI sends a request, the EaaS agent reasons and executes, the result returns. For simple lookups that a direct API call could answer in milliseconds, EaaS introduces seconds or minutes of latency. This is acceptable for analytical and multi-step tasks — where the alternative is the caller's AI doing worse work, not faster work — but makes EaaS unsuitable as a replacement for low-latency data APIs serving real-time applications.

Agent reasoning errors. The EaaS agent is an AI system. It can misinterpret requests, select the wrong tool, write incorrect queries, or return flawed analyses. Unlike a deterministic API that either succeeds or fails cleanly, the agent can fail subtly — returning a plausible but wrong answer. This risk is mitigated by the environment's knowledge system (which encodes validation procedures and known edge cases) and by the audit layer (which can detect patterns of error over time), but it cannot be eliminated. Callers should treat EaaS results with the same critical judgment they would apply to a human colleague's analysis.

Cold start. An environment with no accumulated knowledge provides an EaaS service that is only as good as the agent's general reasoning plus whatever tools are available. The knowledge amplification effect (Section 5) requires time and usage to build up. Organizations deploying EaaS should expect a bootstrapping period during which service quality improves progressively.

Operational cost. Running an AI agent as a persistent service interface is more expensive than hosting a static API. Each request consumes inference compute, and complex multi-step requests consume significantly more. The cost structure favors high-value, low-frequency analytical tasks over high-frequency, low-value lookups. Organizations must weigh this against the cost of the alternatives: building and maintaining per-user tool integrations, managing distributed credentials, or foregoing capability distribution entirely.

Dependency on foundation model capability. The quality ceiling of EaaS is bounded by the reasoning capability of the foundation model powering the service agent. As foundation models improve, EaaS service quality improves with them — but the architecture is only as strong as the weakest link in the chain of agent reasoning, tool execution, and knowledge retrieval.

Centralization risk. EaaS concentrates capability, credentials, knowledge, and audit data in a single managed environment. This concentration is the source of its security advantages — but it also creates a high-value target. A compromise of the service environment affects all users, not just one. Insider abuse by operators with access to the environment is a risk that distributed architectures do not share. Multi-tenant isolation, service-side access controls, and operational security practices must be rigorous, and the blast radius of a service-side breach must be taken seriously. EaaS trades distributed, per-user risk for concentrated, infrastructure-level risk — a tradeoff that is favorable when the infrastructure is well-managed, but not one that should be taken for granted.

These are real constraints, not theoretical concerns. They define the operating envelope within which EaaS is the right choice: high-value domain tasks, environments with accumulated knowledge, and contexts where the alternative is not "a faster API" but "no access to the environment's capability at all."

12. Conclusion

Environment as a Service is a simple idea with deep consequences: put an AI agent at the boundary of a domain environment and let it be the interface — to humans and to other agents alike. When a human interacts with it, the agent is a domain expert that understands questions and returns answers. When another AI agent interacts with it, the agent is a domain service that accepts intent and returns structured results. The interface is the same; only the caller changes. This paper has focused on the latter — the agent-to-agent case — because it is the case that existing architectures handle worst and where EaaS's advantages are most decisive.

This single decision resolves the distribution problem — users access the environment's full capability from their own tools, without learning the environment's internals. It creates a new security model — semantic, intent-level security that is impossible with tool-level interfaces. It amplifies knowledge accumulation — every caller, from every context, feeds the same knowledge system. And it is the only architecture that distributes capability without distributing attack surface — a property that shifts from desirable to essential as agent environments become the primary targets of supply chain and prompt injection attacks.

The architecture is straightforward in concept, though non-trivial in implementation. It requires an environment worth distributing, an agent capable of operating within it, and an API that accepts natural language and returns structured results. The components exist today. What is new is the recognition that the right service abstraction for an environment's capability is not a tool, not an endpoint, not a protocol — but an agent.

EaaS completes the Domain Intelligence picture. Where SEKE addresses how domain knowledge is created and refined, EaaS addresses how domain intelligence is distributed and consumed. Together, they describe a system that builds its own expertise through work and makes that expertise available to anyone who can describe what they need — in natural language, from any tool, through a single interface.


The ideas in this paper emerged from building and operating a production domain intelligence system for data analytics. Environment as a Service is not a theoretical proposal — it is an architecture designed from the concrete experience of distributing capability to users across diverse tools and workflows.