OpenShell Backend¶
The OpenShell backend runs AI agents inside OpenShell sandboxes with network policy enforcement, filesystem isolation, and Landlock-based access control.
How It Works¶
The backend manages three components: a gateway (control plane),
a provider (credentials), and a sandbox (isolated execution
environment). Each time agentic-ci run --backend openshell is invoked, the backend:
- Starts the OpenShell gateway with TLS and mTLS auth
- Creates a GCP or Anthropic credential provider
- Creates a sandbox container from the specified image
- Applies a network policy and waits for it to activate
- Uploads an env script with agent configuration
- Executes the agent inside the sandbox
- Tears everything down on completion
OpenShell Commands¶
Below is the exact sequence of openshell CLI commands that agentic-ci
executes. All management commands go through the openshell client CLI,
which talks to the running openshell-gateway server over gRPC.
Gateway Setup¶
# Check if gateway is already running
openshell status
# Generate TLS certificates for sandbox JWT auth
openshell-gateway generate-certs \
--output-dir ~/.local/state/openshell/tls \
--server-san host.openshell.internal
# Start the gateway server (background process)
# Reads config from ~/.config/openshell/gateway.toml
openshell-gateway --db-url sqlite::memory: --log-level info
# Register the gateway with the CLI
openshell gateway add https://localhost:17670 --local --name ci
# Wait for the gateway to become healthy (retries)
openshell status
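The retried health check that ends the sequence above can be sketched as a small polling helper. This is a minimal sketch, not agentic-ci's actual implementation: the function name wait_until, the attempt count, and the delay are illustrative, and it assumes that openshell status exits non-zero while the gateway is not yet healthy.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example (assumes `openshell status` fails until the gateway is healthy):
#   wait_until 30 1 openshell status || echo "gateway not healthy" >&2
```

Bounding the number of attempts keeps a misconfigured gateway from hanging the CI job indefinitely.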
The gateway config (gateway.toml) is generated by agentic-ci:
[openshell]
version = 1
[openshell.gateway]
bind_address = "0.0.0.0:17670"
compute_drivers = ["podman"]
# Only added when OPENSHELL_SUPERVISOR_IMAGE is set
# TODO: to be replaced by an official image, or the default NVIDIA one
[openshell.drivers.podman]
supervisor_image = "quay.io/mprpic/openshell-supervisor:pr1763"
Provider Setup¶
The provider injects credentials into the sandbox. The setup differs by auth mode.
Vertex AI with User OAuth (local development)¶
openshell provider get ci-gcp # check if exists
openshell provider create \
--name ci-gcp \
--type google-cloud \
--from-gcloud-adc \
--config project_id=<PROJECT> \
--config region=global
Requires gcloud auth application-default login to have been run
first. The --from-gcloud-adc flag reads the user's OAuth refresh token
from ~/.config/gcloud/application_default_credentials.json and mints
an initial access token synchronously.
Vertex AI with Service Account (CI)¶
openshell provider get ci-gcp # check if exists
openshell provider create \
--name ci-gcp \
--type google-cloud \
--credential GCP_SA_ACCESS_TOKEN=placeholder \
--config project_id=<PROJECT> \
--config region=global \
--config service_account_email=<EMAIL>
# Configure JWT-based token refresh from the service account key
openshell provider refresh configure \
--credential-key GCP_SA_ACCESS_TOKEN \
--strategy google-service-account-jwt \
--material client_email=<EMAIL> \
--material private_key=<PRIVATE_KEY> \
--secret-material-key private_key \
ci-gcp
# Mint the initial token immediately (refresh worker runs on 60s interval)
openshell provider refresh rotate \
--credential-key GCP_SA_ACCESS_TOKEN \
ci-gcp
The three-step flow is needed because --from-gcloud-adc rejects service
account keys. The refresh rotate call triggers immediate token minting
instead of waiting for the 60-second background sweep.
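The ordering of the three steps matters, so it can help to see them wrapped in one function that stops at the first failure. The sketch below is hypothetical: the flags are copied from the commands above, but the function name, its argument order, and the decision to skip setup when the provider already exists are assumptions, as is openshell provider get exiting non-zero for a missing provider.

```shell
# Hypothetical wrapper: create -> refresh configure -> refresh rotate.
# Usage: setup_sa_provider <name> <project> <sa_email> <private_key>
setup_sa_provider() {
  local name="$1" project="$2" email="$3" private_key="$4"

  # Assumption: an existing provider is left untouched.
  if openshell provider get "$name" >/dev/null 2>&1; then
    return 0
  fi

  openshell provider create \
    --name "$name" \
    --type google-cloud \
    --credential GCP_SA_ACCESS_TOKEN=placeholder \
    --config project_id="$project" \
    --config region=global \
    --config service_account_email="$email" || return 1

  openshell provider refresh configure \
    --credential-key GCP_SA_ACCESS_TOKEN \
    --strategy google-service-account-jwt \
    --material client_email="$email" \
    --material private_key="$private_key" \
    --secret-material-key private_key \
    "$name" || return 1

  # Mint the first token now instead of waiting for the background sweep.
  openshell provider refresh rotate \
    --credential-key GCP_SA_ACCESS_TOKEN \
    "$name"
}
```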
API Key (direct Anthropic API)¶
openshell provider get ci-gcp # check if exists
openshell provider create \
--name ci-gcp \
--type anthropic \
--credential ANTHROPIC_API_KEY
Sandbox Lifecycle¶
openshell sandbox get ci # check if exists
# Create sandbox with the provider attached
openshell sandbox create \
--name ci \
--no-tty \
--provider ci-gcp \
--from <SANDBOX_IMAGE> \
-- true
# Apply network policy and wait for the supervisor to compile and load it.
# Built-in defaults are always included. If .agentic-ci/openshell-policy.yml
# exists in the workdir, its endpoints are merged in automatically.
openshell policy update --wait \
--binary /usr/local/bin/claude \
--binary /usr/bin/opencode \
--add-endpoint github.com:443:full \
--add-endpoint '*.github.com:443:full' \
--add-endpoint gitlab.com:443:full \
--add-endpoint pypi.org:443:read-only \
--add-endpoint files.pythonhosted.org:443:read-only \
--add-endpoint aiplatform.googleapis.com:443:read-write \
--add-endpoint '*.aiplatform.googleapis.com:443:read-write' \
--add-endpoint oauth2.googleapis.com:443:read-write \
--add-endpoint api.anthropic.com:443:read-write \
ci
# Upload env script with agent configuration
openshell sandbox upload --no-git-ignore ci <env-script-file>
openshell sandbox exec --name ci --no-tty -- \
bash -c "mv <filename> /tmp/.agentic-ci-env.sh"
# Run the agent
openshell sandbox exec --name ci --no-tty -- \
bash -c '. /tmp/.agentic-ci-env.sh && exec "$@"' -- \
claude --permission-mode bypassPermissions --model <MODEL> \
--output-format stream-json --verbose -p "<PROMPT>"
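The run step relies on a common shell idiom: the script given to bash -c sources an env file, then uses exec "$@" to replace itself with the command that follows the -- placeholder. The sketch below reproduces the idiom locally without OpenShell. The function name run_with_env, the AGENT_MODEL variable, and the printf stand-in for the agent are illustrative assumptions, and unlike the command above it passes the env file path as an argument rather than hard-coding it.

```shell
# Stand-in for the uploaded env script (hypothetical variable name).
env_file=$(mktemp)
printf 'export AGENT_MODEL=%s\n' 'example-model' > "$env_file"

# run_with_env <env-file> <command> [args...]
# Inside the single-quoted script, "$1" is the env file; after `shift`,
# "$@" is the command and its arguments. The "--" after the script only
# fills $0. Single quotes keep the calling shell from expanding "$@"
# before the inner shell sees it.
run_with_env() {
  bash -c '. "$1" && shift && exec "$@"' -- "$@"
}

# The child command inherits the variables exported by the env file.
run_with_env "$env_file" sh -c 'printf "model=%s\n" "$AGENT_MODEL"'
# prints: model=example-model
```

Using exec means no intermediate shell lingers between the sandbox and the agent process, and passing the command as separate arguments preserves each argument's quoting.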
Teardown¶
openshell sandbox get ci # check if exists
openshell sandbox delete ci
openshell gateway remove ci # deregister from CLI
# Gateway and podman service processes are killed by PID
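Teardown should run even when the agent fails, which in shell is typically done with an EXIT trap. The following is a sketch under assumptions rather than agentic-ci's own code: the teardown function and the GATEWAY_PID variable are hypothetical, and it assumes openshell sandbox get exits non-zero when the sandbox does not exist.

```shell
# Hypothetical teardown helper. GATEWAY_PID is assumed to hold the PID of
# the background openshell-gateway process started during setup.
teardown() {
  # Delete the sandbox only if it exists.
  if openshell sandbox get ci >/dev/null 2>&1; then
    openshell sandbox delete ci || true
  fi
  # Deregister the gateway from the CLI; ignore failures during cleanup.
  openshell gateway remove ci || true
  # Stop the gateway process by PID if one was recorded.
  if [ -n "${GATEWAY_PID:-}" ]; then
    kill "$GATEWAY_PID" 2>/dev/null || true
  fi
}
```

Registering it with trap teardown EXIT right after the gateway starts would cover both normal completion and early failures.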
Network Policy¶
Endpoints are applied via openshell policy update --wait after sandbox
creation. The --wait flag blocks until the supervisor confirms the
policy rules are compiled and active. This prevents a race condition
where the agent starts before the policy is ready.
Each endpoint must specify explicit binary paths (--binary
/usr/local/bin/claude). Using --binary "*" as a wildcard does not
work for CONNECT tunnel requests, which is how HTTPS clients establish
connections through the supervisor proxy.
The default endpoints cover:
| Endpoint | Access | Purpose |
|---|---|---|
| github.com:443 | full | GitHub API and git operations |
| *.github.com:443 | full | GitHub subdomains (raw, API, etc.) |
| gitlab.com:443 | full | GitLab API and git operations |
| pypi.org:443 | read-only | Python package index |
| files.pythonhosted.org:443 | read-only | Python package downloads |
| aiplatform.googleapis.com:443 | read-write | Vertex AI (global endpoint) |
| *.aiplatform.googleapis.com:443 | read-write | Vertex AI (regional endpoints) |
| oauth2.googleapis.com:443 | read-write | GCP token exchange |
| api.anthropic.com:443 | read-write | Anthropic API (API key auth) |
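Each endpoint is written as host:port:access. As an illustration of that format only, the sketch below splits a spec into its three fields using shell parameter expansion; the function name parse_endpoint and its validation rules (numeric port, no empty fields) are assumptions and do not describe how OpenShell itself parses the flag.

```shell
# Split an endpoint spec of the form host:port:access into its fields.
# Prints "host port access"; returns non-zero on malformed input.
parse_endpoint() {
  local spec="$1" rest host port access
  access="${spec##*:}"   # text after the last colon
  rest="${spec%:*}"      # everything before the last colon
  port="${rest##*:}"     # text after the remaining last colon
  host="${rest%:*}"      # everything before it
  # Reject a missing or non-numeric port.
  case "$port" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # Reject empty fields and specs with fewer than three parts.
  if [ -z "$host" ] || [ -z "$access" ] || [ "$host" = "$rest" ]; then
    return 1
  fi
  printf '%s %s %s\n' "$host" "$port" "$access"
}

parse_endpoint '*.github.com:443:full'
# prints: *.github.com 443 full
```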
Project-specific endpoints¶
Projects can declare additional endpoints in
.agentic-ci/openshell-policy.yml at the repository root. These are
merged with the built-in defaults (duplicates are ignored).
# .agentic-ci/openshell-policy.yml
endpoints:
- "redhat.atlassian.net:443:read-only"
- "*.example.com:443:full"
The --policy CLI flag takes precedence: if a path is passed with --policy
and that file exists, the repo-level file is ignored.
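The merge and precedence rules can be sketched in a few lines of shell. This is an illustrative sketch, not the agentic-ci implementation: the function names are hypothetical, the default list is shortened, the policy file is read with sed under the assumption that it uses the flat list format shown above (a real implementation would use a YAML parser), and falling back to the repo-level file when the --policy path does not exist is one reading of the rule.

```shell
# Shortened stand-in for the built-in defaults.
DEFAULT_ENDPOINTS='github.com:443:full
pypi.org:443:read-only'

# Print the endpoints listed in a policy file, one per line.
# Matches lines of the form:  - "host:port:access"
read_policy_endpoints() {
  sed -n 's/^[[:space:]]*-[[:space:]]*"\{0,1\}\([^"[:space:]]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1"
}

# Pick the policy file: an existing --policy path wins over the repo file.
# Usage: resolve_policy_file <flag_path_or_empty> <workdir>
resolve_policy_file() {
  local flag_path="${1:-}" workdir="${2:-.}"
  if [ -n "$flag_path" ] && [ -f "$flag_path" ]; then
    printf '%s\n' "$flag_path"
  else
    printf '%s\n' "$workdir/.agentic-ci/openshell-policy.yml"
  fi
}

# Defaults first, then file endpoints; duplicates dropped, order preserved.
merge_endpoints() {
  local policy_file="${1:-}"
  {
    printf '%s\n' "$DEFAULT_ENDPOINTS"
    if [ -f "$policy_file" ]; then
      read_policy_endpoints "$policy_file"
    fi
  } | awk 'NF && !seen[$0]++'
}
```

A call such as merge_endpoints "$(resolve_policy_file "" /path/to/repo)" prints the final endpoint list, which could then be turned into --add-endpoint flags.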
Supervisor Image¶
The sandbox supervisor runs inside each sandbox container and enforces policies. It is mounted as a read-only image volume by the gateway's podman driver.
The default supervisor image is
ghcr.io/nvidia/openshell/supervisor:latest. To override it, set
the OPENSHELL_SUPERVISOR_IMAGE environment variable before running
agentic-ci. This is written into the gateway's TOML config under
[openshell.drivers.podman] supervisor_image.
Currently, the google-cloud provider requires a supervisor built
from PR #1763, which
adds the GCE metadata emulator. A pre-built image is available at
quay.io/mprpic/openshell-supervisor:pr1763. Once PR #1763 merges and
supervisor:latest is rebuilt, the override will no longer be needed.
Known Issues and Workarounds¶
--binary "*" does not work for CONNECT requests¶
The wildcard * in openshell policy update --binary "*" fails to match
binaries making HTTPS CONNECT tunnel requests. Use explicit paths instead,
for example --binary /usr/local/bin/claude.
--from-gcloud-adc rejects service account keys¶
The google-cloud provider's --from-gcloud-adc flag only accepts user
OAuth credentials (from gcloud auth application-default login). Service
account JSON keys must be configured via the three-step create + refresh
configure + rotate flow described above.
Credential refresh worker does not mint initial tokens¶
After openshell provider refresh configure, the gateway's refresh
worker runs on a 60-second interval. Without an explicit openshell
provider refresh rotate, the agent may start before the first token is
minted. Always call rotate after configure for service accounts.
OPENSHELL_SUPERVISOR_IMAGE is not a gateway env var¶
The gateway binary does not read OPENSHELL_SUPERVISOR_IMAGE
from the environment. It reads supervisor_image from the
[openshell.drivers.podman] section of gateway.toml. The env var
is a convention used by agentic-ci (and OpenShell's own dev scripts)
to pass the image name into config generation.
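To make the env-var-to-config handoff concrete, here is a minimal sketch of a generator that emits the podman driver section only when OPENSHELL_SUPERVISOR_IMAGE is set. The function name render_gateway_toml is hypothetical and the real generator in agentic-ci may differ; the keys and values are the ones shown in the gateway config earlier on this page.

```shell
# Hypothetical generator for gateway.toml, written to stdout.
render_gateway_toml() {
  cat <<'EOF'
[openshell]
version = 1

[openshell.gateway]
bind_address = "0.0.0.0:17670"
compute_drivers = ["podman"]
EOF
  # Emit the driver section only when the variable is set and non-empty.
  # Note: the value is not escaped, so it must not contain double quotes.
  if [ -n "${OPENSHELL_SUPERVISOR_IMAGE:-}" ]; then
    printf '\n[openshell.drivers.podman]\nsupervisor_image = "%s"\n' \
      "$OPENSHELL_SUPERVISOR_IMAGE"
  fi
}
```

Redirecting the output to the path the gateway reads, for example render_gateway_toml > ~/.config/openshell/gateway.toml, would complete the handoff.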