Tool System: Sandbox Configuration
1. Behavior and Context
In the jazzmine tool-calling ecosystem, SandboxConfig is the master specification: the root object from which every sandbox image and container is provisioned.
- Decoupling: By defining constraints in pure Python dataclasses, the framework ensures that configurations can be validated and serialized without requiring a live Docker daemon.
- Infrastructure-as-Code: Developers define these configs at the start of their application. The SandboxBuilder and SandboxPool then use these specifications to physically provision the images and containers.
- Validation: The module validates hardware limits (memory and swap) for consistency before they reach the Docker engine, so an invalid configuration fails at construction time rather than at container start.
- DooD (Docker-out-of-Docker) Awareness: The configuration includes specialized logic for path translation when the agent controller itself is running inside a container, a common scenario in production environments.
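The decoupling point above can be illustrated with a minimal sketch. The classes below (DemoLimits, DemoSandboxConfig) are hypothetical stand-ins, not jazzmine's actual classes, and the to_json/from_json helpers are assumptions made for illustration. The sketch only shows that plain dataclasses can be serialized and restored without a Docker client.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Tuple


@dataclass(frozen=True)
class DemoLimits:
    """Hypothetical stand-in for ResourceLimits (illustration only)."""
    cpu_quota: float = 0.5
    memory_mb: int = 256


@dataclass(frozen=True)
class DemoSandboxConfig:
    """Hypothetical stand-in for SandboxConfig (illustration only)."""
    name: str
    python_version: str = "3.11"
    dependencies: Tuple[str, ...] = ()
    limits: DemoLimits = field(default_factory=DemoLimits)


def to_json(config: DemoSandboxConfig) -> str:
    # asdict() recurses into nested dataclasses; no Docker daemon is involved.
    return json.dumps(asdict(config), sort_keys=True)


def from_json(payload: str) -> DemoSandboxConfig:
    raw = json.loads(payload)
    limits = DemoLimits(**raw.pop("limits"))
    raw["dependencies"] = tuple(raw["dependencies"])  # JSON lists -> tuple
    return DemoSandboxConfig(limits=limits, **raw)
```

A round trip such as from_json(to_json(cfg)) == cfg can run in a unit test or CI job on a machine with no container runtime installed.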
2. Purpose
- Security Enforcement: Defining exactly which hosts a tool can talk to (Network Policy) and how much hardware it can consume (Resource Limits).
- Environment Standardization: Ensuring every tool execution happens in a consistent Python version with identical pre-installed dependencies.
- Data Provisioning: Managing read-only access to host files (Knowledge Bases, Models) via secure bind-mounts.
- Execution Strategy: Selecting between "Plan" mode (batch execution) and "Interactive" mode (step-by-step execution).
3. High-Level API & Comprehensive Example
This example creates a "High-Security Analytics Sandbox" that sets every available configuration field.
from jazzmine.core.tools import (
    SandboxConfig,
    ResourceLimits,
    NetworkPolicy,
    FileResource,
    ExecutionMode,
)

# 1. Define strict, validated Resource Limits
limits = ResourceLimits(
    cpu_quota=0.5,             # 50% of one CPU core
    memory_mb=512,             # 512MB RAM
    memory_swap_mb=1024,       # 1GB Total (RAM + Swap). Must be > memory_mb.
    pids_limit=32,             # Limit processes to prevent fork-bombs
    execution_timeout_sec=60,  # Kill script after 1 minute
    max_output_bytes=1048576,  # Limit output log to 1MB
)

# 2. Define an Egress Network Policy
# You can provide FQDNs or specific IP addresses for testing.
policy = NetworkPolicy(
    allowed_hosts=["api.stripe.com", "1.2.3.4"],
    allowed_ports={
        "api.stripe.com": [443],
        "1.2.3.4": [8080],  # Custom port for a testing server
    },
    default_port=443,
)

# 3. Define File Resources (Bind Mounts)
shared_docs = FileResource(
    host_path="/app/data/docs",              # Path as seen by the Agent
    container_path="/data/docs",             # Path as seen by the Tool
    read_only=True,
    docker_host_path="/srv/prod/agent_docs", # Actual physical path on the HOST
)

# 4. Assemble the complete Sandbox Configuration
analytics_config = SandboxConfig(
    name="secure_analytics",
    python_version="3.11",
    dependencies=["pandas", "scipy", "httpx"],
    resources=[shared_docs],
    network_policy=policy,
    resource_limits=limits,
    secrets=["STRIPE_KEY", "DB_PASSWORD"],
    execution_mode=ExecutionMode.INTERACTIVE,  # Step-by-step reasoning
    pool_size=5,                               # Pre-warm 5 containers
    scratch_size_mb=128,                       # 128MB RAM-disk at /workspace
)
5. Detailed Functionality
ExecutionMode (Enum)
Determines how the Agent interacts with the sandbox.
- PLAN: The agent writes a full script, sends it, and receives the final result.
- INTERACTIVE: The agent writes one step, sees the intermediate output, and then writes the next step.
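The difference between the two modes can be sketched as a difference in round trips. Everything below is hypothetical: DemoExecutionMode stands in for ExecutionMode, and run and its execute callback are illustrative names, not part of jazzmine. A real interactive agent would also choose each next step after reading the previous output, which this fixed step list simplifies away.

```python
from enum import Enum
from typing import Callable, List


class DemoExecutionMode(Enum):
    """Hypothetical stand-in for ExecutionMode (illustration only)."""
    PLAN = "plan"
    INTERACTIVE = "interactive"


def run(
    mode: DemoExecutionMode,
    steps: List[str],
    execute: Callable[[str], str],
) -> List[str]:
    """Return the outputs the agent gets to observe in each mode."""
    if mode is DemoExecutionMode.PLAN:
        # One round trip: the full script runs, only the final result comes back.
        return [execute("\n".join(steps))]
    # One round trip per step: every intermediate output is visible.
    return [execute(step) for step in steps]
```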
ResourceLimits
Wraps the hardware constraints enforced by the Docker cgroup layer.
| Parameter | Default | Description |
|---|---|---|
| cpu_quota | 0.5 | Fractional CPU cores allocated (0.5 = half of one core). |
| memory_mb | 256 | Maximum RAM in MB. Container is killed if exceeded. |
| memory_swap_mb | -1 | Combined RAM + swap limit in MB. -1 means unlimited swap. |
| pids_limit | 64 | Maximum number of concurrent processes or threads. |
| execution_timeout_sec | 30 | Wall-clock timeout for script execution, in seconds. |
| max_output_bytes | 1MB | Maximum size, in bytes, of the JSON event stream read from stdout. |
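As a rough sketch of how these fields could translate into cgroup settings, the helper below builds keyword arguments in the shape accepted by the Docker SDK for Python's containers.run (cpu_period, cpu_quota, mem_limit, memswap_limit, pids_limit). The function name docker_run_kwargs is hypothetical, and whether jazzmine passes exactly these arguments is an assumption.

```python
from typing import Any, Dict


def docker_run_kwargs(
    cpu_quota: float = 0.5,
    memory_mb: int = 256,
    memory_swap_mb: int = -1,
    pids_limit: int = 64,
) -> Dict[str, Any]:
    """Hypothetical mapping from limit fields to Docker SDK run() kwargs."""
    period_us = 100_000  # Docker's default CFS scheduler period (100 ms)
    return {
        "cpu_period": period_us,
        # CPU time allowed per period: 0.5 cores -> 50_000 of 100_000 microseconds
        "cpu_quota": int(cpu_quota * period_us),
        "mem_limit": f"{memory_mb}m",
        # Docker treats memswap_limit as RAM + swap; -1 means unlimited swap
        "memswap_limit": -1 if memory_swap_mb == -1 else f"{memory_swap_mb}m",
        "pids_limit": pids_limit,
    }
```

The execution_timeout_sec and max_output_bytes fields have no cgroup equivalent, so they are left out here; they would have to be enforced by whatever process supervises the container.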
Validation Logic: The class includes a __post_init__ check:
- memory_mb must be positive.
- If memory_swap_mb is set (not -1), it must be greater than memory_mb, as Docker interprets this value as the aggregate limit of RAM + Swap.
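The two rules above can be sketched as follows. DemoResourceLimits is a hypothetical stand-in that carries only the two memory fields, and the error messages are invented for illustration; the real class may word or order its checks differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoResourceLimits:
    """Hypothetical stand-in for ResourceLimits (memory fields only)."""
    memory_mb: int = 256
    memory_swap_mb: int = -1  # -1 means unlimited swap

    def __post_init__(self) -> None:
        if self.memory_mb <= 0:
            raise ValueError("memory_mb must be positive")
        # The swap value is the RAM + swap total, so it has to exceed RAM alone.
        if self.memory_swap_mb != -1 and self.memory_swap_mb <= self.memory_mb:
            raise ValueError("memory_swap_mb must be greater than memory_mb")
```

Because the check runs in __post_init__, an invalid combination fails as soon as the object is constructed, before any image or container is built.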
NetworkPolicy
Controls egress traffic via an internal proxy sidecar.
- allowed_hosts: Accepts Domain Names or IP Addresses.
- Security: The proxy sidecar will reject any IP literal that is not explicitly defined in this list, preventing "raw IP" bypasses of domain filters.
- allowlist_env(): Serializes the policy into a string format (host:port,host:port) consumed by the proxy server.
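A minimal sketch of the host:port serialization and an exact-match check on the proxy side is shown below. Both functions are hypothetical stand-ins written for illustration. The fallback to default_port for hosts with no entry in allowed_ports is an assumption, as is the exact matching rule used by the proxy.

```python
from typing import Dict, List


def allowlist_env(
    allowed_hosts: List[str],
    allowed_ports: Dict[str, List[int]],
    default_port: int = 443,
) -> str:
    """Hypothetical serializer producing 'host:port,host:port'."""
    pairs = []
    for host in allowed_hosts:
        # Assumption: hosts without explicit ports fall back to default_port.
        for port in allowed_ports.get(host, [default_port]):
            pairs.append(f"{host}:{port}")
    return ",".join(pairs)


def proxy_allows(host: str, port: int, allowlist: str) -> bool:
    """Exact match only, so an unlisted IP literal is rejected like any other host."""
    return f"{host}:{port}" in allowlist.split(",")
```

With the policy from the example in section 3, the serialized string would be api.stripe.com:443,1.2.3.4:8080, and a request to any other address or port would not match an entry.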
FileResource
Defines a secure bridge between the host filesystem and the sandbox.
- host_path: The path to the resource as seen by the agent process.
- container_path: The path where the resource appears inside the sandbox (usually /data/...).
- docker_host_path: Used for DooD (Docker-out-of-Docker). When the agent is itself a container, the Docker daemon needs the real host path to mount, not the path inside the agent's container.
SandboxConfig
The root specification object.
- pool_size: Determines how many "warm" containers are kept ready.
- scratch_size_mb: Size of the /workspace directory, which is mounted as a tmpfs (RAM-disk). Files here never touch the host disk and are wiped when the container is returned to the pool.
- with_tool_dependencies(): A helper that clones the config while adding new pip requirements from the tool level.
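One way such a clone-and-extend helper could look is sketched below using dataclasses.replace. DemoConfig is a hypothetical stand-in with only two fields, and the deduplication and ordering behavior are assumptions; the real helper may merge requirements differently.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Tuple


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical stand-in for SandboxConfig (two fields only)."""
    name: str
    dependencies: Tuple[str, ...] = ()

    def with_tool_dependencies(self, extra: Iterable[str]) -> "DemoConfig":
        merged = list(self.dependencies)
        for dep in extra:
            if dep not in merged:  # keep first occurrence, preserve order
                merged.append(dep)
        # replace() returns a new object; the original config is left untouched.
        return replace(self, dependencies=tuple(merged))
```

Returning a copy rather than mutating in place means a shared base config can be reused by several tools, each with its own dependency list.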
6. Error Handling
- ValueError (Configuration): Raised during ResourceLimits initialization if memory_mb is non-positive, or if memory_swap_mb is set (not -1) and is not greater than memory_mb.
- ValueError (Runtime): Raised by infrastructure components if they attempt to build a sandbox that has not been registered in the ToolRegistry.
- Network Isolation: If allowed_hosts is empty, is_isolated returns True. The system will skip the proxy setup, ensuring the container has no network interface at all.
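The isolation rule can be sketched as a derived property. DemoNetworkPolicy and network_mode_for are hypothetical, and "sandbox_proxy_net" is a made-up network name. Mapping isolation to Docker's built-in "none" network mode (which leaves the container with a loopback interface only) is an assumption about how the skip could be implemented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DemoNetworkPolicy:
    """Hypothetical stand-in for NetworkPolicy (allowed_hosts only)."""
    allowed_hosts: List[str] = field(default_factory=list)

    @property
    def is_isolated(self) -> bool:
        # An empty allowlist means the tool needs no egress at all.
        return not self.allowed_hosts


def network_mode_for(policy: DemoNetworkPolicy) -> str:
    # Assumption: isolated sandboxes skip the proxy and use Docker's "none" mode;
    # "sandbox_proxy_net" is an illustrative name for the proxied network.
    return "none" if policy.is_isolated else "sandbox_proxy_net"
```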
7. Remarks
- Path Translation Priority: In to_docker_mount, the framework prioritizes resolved_host_path (from the pool's prefix map), then docker_host_path, and finally host_path.
- Python Version: The python_version string selects the base image (e.g., python:3.11-slim). Ensure the version corresponds to a valid tag on Docker Hub.
- Resource Efficiency: Use scratch_size_mb judiciously. Because /workspace is a tmpfs, every file written there consumes host memory rather than disk space.
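The path translation priority described in the first remark can be sketched as a simple fallback chain. DemoFileResource is a hypothetical stand-in, and returning a source:target:mode bind string (the format Docker accepts for volume binds) is an assumption about what to_docker_mount produces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DemoFileResource:
    """Hypothetical stand-in for FileResource (illustration only)."""
    host_path: str
    container_path: str
    read_only: bool = True
    docker_host_path: Optional[str] = None
    resolved_host_path: Optional[str] = None  # e.g. filled in from a prefix map

    def to_docker_mount(self) -> str:
        # Priority: resolved_host_path, then docker_host_path, then host_path.
        source = self.resolved_host_path or self.docker_host_path or self.host_path
        mode = "ro" if self.read_only else "rw"
        return f"{source}:{self.container_path}:{mode}"
```

With the values from the example in section 3, the mount source would be /srv/prod/agent_docs, because docker_host_path is set and no resolved path overrides it.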