Sandboxed Execution for Coding Agents (Trial)
Overview
Sandboxed execution isolates coding agents from the developer's primary machine, credentials, networks, and production systems while they install packages, call tools, or modify files. OpenAI describes the Codex sandbox as the boundary that lets an agent act autonomously without unrestricted access to the machine, defining what files it can modify and whether commands can use the network (Codex sandboxing).
The key distinction is that sandboxing and approvals are separate controls. A sandbox sets technical boundaries, while an approval policy decides when the agent must stop and ask before crossing those boundaries (Codex sandboxing). This pattern is especially important for coding agents because their normal workflow involves executing shell commands, running package managers, starting test harnesses, editing files, and sometimes interacting with external systems.
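The separation between the two controls can be sketched in a few lines of Python. This is a simplified model for illustration, not the logic of Codex or any other tool; the names `Action`, `Sandbox`, `Decision`, and `decide` are hypothetical, and the approval policy is reduced to a single assumed flag, `may_ask`.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    RUN = "run"      # inside the boundary, no prompt needed
    ASK = "ask"      # crosses the boundary, a human must approve
    BLOCK = "block"  # crosses the boundary, nobody may be asked


@dataclass(frozen=True)
class Action:
    """Hypothetical description of what a command needs."""
    writes_outside_workspace: bool = False
    needs_network: bool = False


@dataclass(frozen=True)
class Sandbox:
    """Technical boundary: what is possible without escalation."""
    allow_writes_outside_workspace: bool = False
    allow_network: bool = False

    def permits(self, action: Action) -> bool:
        if action.writes_outside_workspace and not self.allow_writes_outside_workspace:
            return False
        if action.needs_network and not self.allow_network:
            return False
        return True


def decide(action: Action, sandbox: Sandbox, may_ask: bool) -> Decision:
    """Apply the sandbox first, then let the approval policy handle the rest."""
    if sandbox.permits(action):
        return Decision.RUN
    # The approval policy only matters once the boundary would be crossed.
    return Decision.ASK if may_ask else Decision.BLOCK
```

Keeping the two inputs separate makes it possible to tighten the boundary without changing how often the agent interrupts the developer, and the reverse.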
Sandboxed execution for coding agents is classified as Trial because it should become a standard control, but implementations are still uneven. Teams should trial sandboxing for local coding agents, cloud agents, CI agents, and developer workstations, then standardize only after validating filesystem, network, secrets, approvals, logs, and escape hatches.
Adoption Signals
- Codex applies sandboxing automatically in its default permission mode for local commands in the app, IDE extension, and CLI, with platform-native enforcement across macOS, Linux, WSL2, and native Windows (Codex sandboxing).
- Codex defines common sandbox modes: `read-only`, `workspace-write`, and `danger-full-access`, plus approval policies such as `untrusted`, `on-request`, and `never` (Codex sandboxing).
- Codex states that spawned commands such as `git`, package managers, and test runners inherit the same sandbox boundaries, not only the agent's built-in file operations (Codex sandboxing).
- Claude Code documents a sandboxed Bash tool where users define which files and network domains commands can touch, and the operating system enforces the boundary for every Bash command and its child processes (Claude Code sandboxing).
- Claude Code supports macOS Seatbelt and Linux/WSL2 bubblewrap, with filesystem controls for allowed and denied reads/writes and network controls for allowed or denied domains (Claude Code sandboxing).
- Docker documents running OpenCode inside Docker Sandboxes with `sbx run opencode`, project-directory isolation, stored secrets, credential injection, and host user-level configuration isolation (Docker OpenCode sandbox).
- OpenCode provides a permission model with `allow`, `ask`, and `deny`, including controls for file reads, edits, bash commands, web access, subagents, skills, and external directories (OpenCode permissions).
- OWASP's Excessive Agency guidance reinforces the need to limit functions, permissions, and autonomy for LLM-based systems that can call tools or extensions, including avoiding open-ended shell-command tools where more granular tools are possible (OWASP LLM06 Excessive Agency).
Risks
- Full access modes can erase the boundary. Codex documents `danger-full-access` as removing filesystem and network boundaries, and full access combines that mode with `approval_policy = "never"` (Codex sandboxing).
- Network allowlists are hard to reason about. Claude Code warns that broad allowed domains can create data-exfiltration paths and that the built-in proxy does not inspect encrypted traffic, so stronger guarantees may require a custom TLS-inspecting proxy (Claude Code sandboxing).
- Credential files and inherited environment variables can leak. Claude Code notes that credentials such as `~/.aws/credentials` and `~/.ssh/` are readable by default unless denied, and sandboxed Bash commands inherit the parent process environment by default unless scrubbed (Claude Code sandboxing).
- Docker access can break isolation. Claude Code warns that allowing access to `/var/run/docker.sock` effectively grants access to the host system through the Docker socket (Claude Code sandboxing).
- Permission prompts are not the same as OS isolation. OpenCode's permission model can ask, allow, or deny agent actions, but permissions alone do not necessarily constrain what a spawned process can do at the operating-system level (OpenCode permissions).
- Tool compatibility can force exceptions. Claude Code notes that some tools such as Docker, Watchman, certain Go-based CLIs, or Windows binaries under WSL may need to run outside the sandbox, creating exception paths that must be reviewed (Claude Code sandboxing).
- Sandbox configuration can drift by developer. If sandboxing is optional or configured only locally, teams may end up with inconsistent safety boundaries; Claude Code supports managed settings to require sandboxing and prevent unsandboxed command fallback (Claude Code sandboxing).
- Sandboxing does not stop bad code from being committed. A sandbox can contain execution effects, but generated changes still need human review, tests, static analysis, dependency scanning, and branch protection before leaving the sandbox.
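The inherited-environment risk in the list above can be reduced in a wrapper when a tool's own scrubbing options are unavailable or unverified. The following is a minimal sketch under stated assumptions: `SAFE_ENV_VARS`, `scrubbed_env`, and `run_scrubbed` are hypothetical names, the allowlist is only an example, and the approach filters variables without providing any OS-level isolation.

```python
import os
import subprocess

# Hypothetical allowlist; adjust to what the build or test run actually needs.
SAFE_ENV_VARS = frozenset({"PATH", "HOME", "LANG", "LC_ALL", "TERM", "TMPDIR"})


def scrubbed_env(parent=None, allow=SAFE_ENV_VARS):
    """Return a new dict holding only allowlisted variables from the parent environment."""
    source = os.environ if parent is None else parent
    return {name: value for name, value in source.items() if name in allow}


def run_scrubbed(argv, cwd=None):
    """Run a command with the scrubbed environment instead of the inherited one."""
    return subprocess.run(
        argv,
        cwd=cwd,
        env=scrubbed_env(),  # tokens and cloud keys in the parent are not passed on
        capture_output=True,
        text=True,
        check=False,
    )
```

An allowlist is used rather than a denylist because secret-bearing variable names are open-ended, while the set of variables a build needs is usually small and known.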
Pros & Cons
Advantages
- Gives coding agents a bounded environment where they can install packages, edit files, run tests, and execute commands without unrestricted access to the developer's machine or production systems.
- Reduces approval fatigue by allowing safe actions inside a pre-approved filesystem and network boundary while escalating actions that cross that boundary.
- Enables reproducible and reviewable agent work through ephemeral environments, explicit secret injection, resettable state, artifact review, and clear promotion paths from sandbox to branch or pull request.
Disadvantages
- Sandboxing is not uniform across agents, operating systems, and deployment modes; each tool's filesystem, network, secret, and approval behavior must be verified.
- Broad filesystem writes, network allowlists, Docker socket access, inherited environment variables, or unsandboxed escape hatches can turn a sandbox into a weak boundary.
- Sandboxes reduce blast radius but do not replace code review, test validation, dependency scanning, prompt-injection defenses, or least-privilege downstream credentials.
Recommendation
Trial sandboxed execution as a baseline control for coding agents before allowing autonomous command execution. Start with ephemeral or resettable environments, workspace-scoped writes, no host credential access by default, network deny-by-default with narrow allowlists, explicit secret injection, resource limits, and approval gates for actions outside the sandbox.
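One way to make this baseline reviewable is to write it down as data and check each profile against it. The sketch below is tool-agnostic and hypothetical: `SandboxProfile`, its fields, `SENSITIVE_MARKERS`, and `audit` do not correspond to any product's configuration format, and the path checks are simple string heuristics.

```python
from dataclasses import dataclass, field

# Substrings that suggest a host credential or control socket is exposed.
SENSITIVE_MARKERS = (".ssh", ".aws", "docker.sock")


@dataclass
class SandboxProfile:
    """Hypothetical description of one agent sandbox profile."""
    writable_paths: list = field(default_factory=lambda: ["."])  # workspace only
    extra_readable_paths: list = field(default_factory=list)     # host paths beyond the workspace
    allowed_domains: list = field(default_factory=list)          # empty means deny all
    ephemeral: bool = True
    ask_outside_sandbox: bool = True


def audit(profile):
    """Return a list of human-readable deviations from the baseline."""
    findings = []
    for path in profile.writable_paths:
        if path.startswith(("/", "~", "..")):
            findings.append(f"write access outside workspace: {path}")
    for path in profile.writable_paths + profile.extra_readable_paths:
        if any(marker in path for marker in SENSITIVE_MARKERS):
            findings.append(f"sensitive host path exposed: {path}")
    for domain in profile.allowed_domains:
        if domain == "*" or domain.startswith("*."):
            findings.append(f"wildcard network allowlist entry: {domain}")
    if not profile.ephemeral:
        findings.append("environment is not ephemeral or resettable")
    if not profile.ask_outside_sandbox:
        findings.append("no approval gate for actions outside the sandbox")
    return findings
```

A default `SandboxProfile()` produces no findings, so any non-empty result marks a deliberate exception that should go through review.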
Validate the actual boundary, not the product claim. Test whether the agent can read SSH keys, cloud credentials, shell history, .env files, sibling directories, Docker sockets, browser profiles, internal network endpoints, package-manager lifecycle scripts, and production deployment credentials. Confirm that spawned commands inherit the same boundary and that logs show what was run.
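Part of this checklist can be run as a small probe from inside the sandbox, through the same command path the agent uses. The sketch below assumes a Unix-like host; `PROBE_PATHS`, `readable`, `can_reach`, and `probe` are hypothetical names, and the target list is an example that each team would extend.

```python
import os
import socket
from pathlib import Path

# Example targets taken from the checklist; extend per team and platform.
PROBE_PATHS = [
    "~/.ssh",
    "~/.aws/credentials",
    "~/.bash_history",
    "/var/run/docker.sock",
    "..",
    ".env",
]


def readable(path_text):
    """Return True if this process can list, read, or connect to the path."""
    path = Path(path_text).expanduser()
    try:
        if path.is_dir():
            os.listdir(path)
        elif path.is_socket():
            # A Unix socket cannot be opened as a file, so try connecting instead.
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(str(path))
        else:
            with open(path, "rb") as handle:
                handle.read(1)
        return True
    except OSError:
        return False


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds from here."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(paths=PROBE_PATHS):
    """Map each probe path to whether it was reachable from inside the sandbox."""
    return {path_text: readable(path_text) for path_text in paths}
```

Calling `probe()` from a command the agent spawns, rather than from the host shell, also checks that child processes inherit the boundary. A `True` for any credential path, or a successful `can_reach` against an internal endpoint, is a finding to resolve before standardizing.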
Promote results through reviewable artifacts. Agents should produce diffs, logs, test outputs, and summaries that humans can review before changes are pushed, merged, deployed, or granted broader access. Move from Trial to Adopt only when sandbox profiles, secret handling, network controls, audit logs, and exception workflows are reproducible across the team's agent tools.