
Sandbox Config

The config module provides the blueprints for the execution environments in jazzmine. It consists of pure configuration dataclasses that describe how a Docker sandbox should be built and constrained. These configurations are "Docker-agnostic" at the definition level, meaning they describe the policy of an environment (resource limits, network access, file mounts) rather than the low-level implementation details of the Docker SDK.

Tool System: Sandbox Configuration

1. Behavior and Context

In the jazzmine tool-calling ecosystem, SandboxConfig is the "Master Specification."

  • Decoupling: By defining constraints in pure Python dataclasses, the framework ensures that configurations can be validated and serialized without requiring a live Docker daemon.
  • Infrastructure-as-Code: Developers define these configs at the start of their application. The SandboxBuilder and SandboxPool then use these specifications to physically provision the images and containers.
  • Validation: The module includes internal validation logic to ensure that memory and swap limits are mutually consistent before they reach the Docker engine.
  • DooD (Docker-out-of-Docker) Awareness: The configuration includes specialized logic for path translation when the agent controller itself is running inside a container, a common scenario in production environments.

2. Purpose

  • Security Enforcement: Defining exactly which hosts a tool can talk to (Network Policy) and how much hardware it can consume (Resource Limits).
  • Environment Standardization: Ensuring every tool execution happens in a consistent Python version with identical pre-installed dependencies.
  • Data Provisioning: Managing read-only access to host files (Knowledge Bases, Models) via secure bind-mounts.
  • Execution Strategy: Selecting between "Plan" mode (batch execution) and "Interactive" mode (step-by-step execution).

3. High-Level API & Comprehensive Example

This example builds a "High-Security Analytics Sandbox" that sets every configuration field described in this reference.

```python
from jazzmine.core.tools import (
    SandboxConfig,
    ResourceLimits,
    NetworkPolicy,
    FileResource,
    ExecutionMode,
)

# 1. Define strict, validated Resource Limits
limits = ResourceLimits(
    cpu_quota=0.5,             # 50% of one CPU core
    memory_mb=512,             # 512MB RAM
    memory_swap_mb=1024,       # 1GB Total (RAM + Swap). Must be > memory_mb.
    pids_limit=32,             # Limit processes to prevent fork-bombs
    execution_timeout_sec=60,  # Kill script after 1 minute
    max_output_bytes=1048576   # Limit output log to 1MB
)

# 2. Define an Egress Network Policy
# You can provide FQDNs or specific IP addresses for testing.
policy = NetworkPolicy(
    allowed_hosts=["api.stripe.com", "1.2.3.4"],
    allowed_ports={
        "api.stripe.com": [443],
        "1.2.3.4": [8080]  # Custom port for a testing server
    },
    default_port=443
)

# 3. Define File Resources (Bind Mounts)
shared_docs = FileResource(
    host_path="/app/data/docs",             # Path as seen by the Agent
    container_path="/data/docs",            # Path as seen by the Tool
    read_only=True,
    docker_host_path="/srv/prod/agent_docs" # Actual physical path on the HOST
)

# 4. Assemble the complete Sandbox Configuration
analytics_config = SandboxConfig(
    name="secure_analytics",
    python_version="3.11",
    dependencies=["pandas", "scipy", "httpx"],
    resources=[shared_docs],
    network_policy=policy,
    resource_limits=limits,
    secrets=["STRIPE_KEY", "DB_PASSWORD"],
    execution_mode=ExecutionMode.INTERACTIVE,  # Step-by-step reasoning
    pool_size=5,                               # Pre-warm 5 containers
    scratch_size_mb=128                        # 128MB RAM-disk at /workspace
)
```

5. Detailed Functionality

ExecutionMode (Enum)

Determines how the Agent interacts with the sandbox.

  • PLAN: The agent writes a full script, sends it, and receives the final result.
  • INTERACTIVE: The agent writes one step, sees the intermediate output, and then writes the next step.
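
The difference between the two modes can be sketched as a difference in round trips. The snippet below is only an illustration: the enum's string values, the `run_steps` driver, and the `fake_execute` stand-in are hypothetical and are not part of the jazzmine API.

```python
from enum import Enum
from typing import Callable, List


class ExecutionMode(Enum):
    # Values are illustrative; the real enum's values are not documented here.
    PLAN = "plan"
    INTERACTIVE = "interactive"


def run_steps(
    mode: ExecutionMode,
    steps: List[str],
    execute: Callable[[str], str],
) -> List[str]:
    """Hypothetical driver: returns the outputs the agent gets to observe."""
    if mode is ExecutionMode.PLAN:
        # One round trip: the full script is sent, only the final result comes back.
        return [execute("\n".join(steps))]
    # One round trip per step: each intermediate output is visible before the next step.
    return [execute(step) for step in steps]


def fake_execute(code: str) -> str:
    """Stand-in for a sandbox call; it only reports how many lines it received."""
    return f"ran {len(code.splitlines())} line(s)"


steps = ["x = 1", "print(x)"]
plan_outputs = run_steps(ExecutionMode.PLAN, steps, fake_execute)
interactive_outputs = run_steps(ExecutionMode.INTERACTIVE, steps, fake_execute)
```

In this sketch, PLAN yields a single observation for the whole script, while INTERACTIVE yields one observation per step.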

ResourceLimits

Wraps the hardware constraints enforced by the Docker cgroup layer.

| Parameter | Default | Description |
|---|---|---|
| cpu_quota | 0.5 | Fractional CPU cores allocated. |
| memory_mb | 256 | Maximum RAM allowed. Container is killed if exceeded. |
| memory_swap_mb | -1 | Limit for RAM+Swap. -1 means unlimited swap. |
| pids_limit | 64 | Max number of concurrent processes or threads. |
| execution_timeout_sec | 30 | Wall-clock timeout for script execution. |
| max_output_bytes | 1MB | Max size of the JSON event stream read from stdout. |

Validation Logic: The class includes a __post_init__ check:

  • memory_mb must be positive.
  • If memory_swap_mb is set (not -1), it must be greater than memory_mb, as Docker interprets this value as the aggregate limit of RAM + Swap.
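
The two checks above could be expressed roughly as follows. This is a minimal sketch, not the library's source: the class name `ResourceLimitsSketch` and the error messages are assumptions, while the field names and defaults follow the table above.

```python
from dataclasses import dataclass


@dataclass
class ResourceLimitsSketch:
    """Illustrative stand-in for ResourceLimits; not the jazzmine class itself."""

    cpu_quota: float = 0.5
    memory_mb: int = 256
    memory_swap_mb: int = -1  # -1 means unlimited swap
    pids_limit: int = 64
    execution_timeout_sec: int = 30
    max_output_bytes: int = 1024 * 1024

    def __post_init__(self) -> None:
        if self.memory_mb <= 0:
            raise ValueError("memory_mb must be positive")
        # memory_swap_mb is the combined RAM + swap limit, so it must exceed memory_mb.
        if self.memory_swap_mb != -1 and self.memory_swap_mb <= self.memory_mb:
            raise ValueError("memory_swap_mb must be greater than memory_mb")


valid = ResourceLimitsSketch(memory_mb=512, memory_swap_mb=1024)

try:
    ResourceLimitsSketch(memory_mb=512, memory_swap_mb=256)
    rejected = False
except ValueError:
    rejected = True
```

Because the check runs in `__post_init__`, an inconsistent configuration fails at construction time, before any Docker call is made.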

NetworkPolicy

Controls egress traffic via an internal proxy sidecar.

  • allowed_hosts: Accepts Domain Names or IP Addresses.
  • Security: The proxy sidecar will reject any IP literal that is not explicitly defined in this list, preventing "raw IP" bypasses of domain filters.
  • allowlist_env(): Serializes the policy into a string format (host:port,host:port) consumed by the proxy server.
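
As a rough illustration of the `host:port,host:port` format, the sketch below serializes a policy into a single string. It assumes that a host with several ports produces one entry per port and that hosts missing from `allowed_ports` fall back to `default_port`; neither detail is confirmed by this reference, and `NetworkPolicySketch` is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NetworkPolicySketch:
    """Illustrative stand-in for NetworkPolicy; not the jazzmine class itself."""

    allowed_hosts: List[str] = field(default_factory=list)
    allowed_ports: Dict[str, List[int]] = field(default_factory=dict)
    default_port: int = 443

    def allowlist_env(self) -> str:
        entries = []
        for host in self.allowed_hosts:
            # Assumption: hosts without explicit ports use default_port.
            for port in self.allowed_ports.get(host, [self.default_port]):
                entries.append(f"{host}:{port}")
        return ",".join(entries)


policy = NetworkPolicySketch(
    allowed_hosts=["api.stripe.com", "1.2.3.4", "example.org"],
    allowed_ports={"api.stripe.com": [443], "1.2.3.4": [8080]},
)
allowlist = policy.allowlist_env()
```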

FileResource

Defines a secure bridge between the host filesystem and the sandbox.

  • host_path: The path as seen by the agent process.
  • container_path: The path where the resource appears inside the sandbox (usually /data/...).
  • docker_host_path: Used for DooD (Docker-out-of-Docker). When the agent is itself a container, the Docker daemon needs the real host path to mount, not the path inside the agent's container.
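
The DooD translation can be sketched as a fallback chain, following the priority given in the Remarks (resolved_host_path, then docker_host_path, then host_path). The class below is a hypothetical stand-in: the `bind_spec` helper and its `source:target:mode` output are for illustration only and do not describe what to_docker_mount actually returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileResourceSketch:
    """Illustrative stand-in for FileResource; not the jazzmine class itself."""

    host_path: str
    container_path: str
    read_only: bool = True
    docker_host_path: Optional[str] = None
    resolved_host_path: Optional[str] = None  # assumed to be set by the pool

    def mount_source(self) -> str:
        # First non-empty candidate wins, in the documented priority order.
        return self.resolved_host_path or self.docker_host_path or self.host_path

    def bind_spec(self) -> str:
        # Hypothetical helper: renders a Docker-style "source:target:mode" string.
        mode = "ro" if self.read_only else "rw"
        return f"{self.mount_source()}:{self.container_path}:{mode}"


docs = FileResourceSketch(
    host_path="/app/data/docs",
    container_path="/data/docs",
    docker_host_path="/srv/prod/agent_docs",
)
spec = docs.bind_spec()
```

When the agent runs directly on the host, docker_host_path can stay unset and the chain falls through to host_path.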

SandboxConfig

The root specification object.

  • pool_size: Determines how many "warm" containers are kept ready.
  • scratch_size_mb: Size of the /workspace directory, which is mounted as a tmpfs (RAM-disk). Files here never touch the host disk and are wiped when the container is returned to the pool.
  • with_tool_dependencies(): A helper that clones the config while adding new pip requirements from the tool level.
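
One plausible way to implement a cloning helper such as with_tool_dependencies() is `dataclasses.replace`, which copies the config and swaps a single field. The sketch below assumes the method takes a list of package names and skips duplicates; the real signature and merge rules are not documented here, and `SandboxConfigSketch` carries only a subset of the fields.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class SandboxConfigSketch:
    """Illustrative stand-in for SandboxConfig with only a few fields."""

    name: str
    python_version: str
    dependencies: List[str] = field(default_factory=list)

    def with_tool_dependencies(self, extra: List[str]) -> "SandboxConfigSketch":
        merged = list(self.dependencies)  # copy so the original stays untouched
        for dep in extra:
            if dep not in merged:  # assumption: duplicates are dropped
                merged.append(dep)
        return replace(self, dependencies=merged)


base = SandboxConfigSketch("secure_analytics", "3.11", ["pandas", "httpx"])
extended = base.with_tool_dependencies(["httpx", "numpy"])
```

Returning a new object keeps the base configuration reusable for other tools that do not need the extra packages.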

6. Error Handling

  • ValueError (Configuration): Raised during ResourceLimits initialization if memory_mb is non-positive, or if memory_swap_mb is set (not -1) and is not greater than memory_mb.
  • ValueError (Runtime): Raised by infrastructure components if they attempt to build a sandbox that has not been registered in the ToolRegistry.
  • Network Isolation: If allowed_hosts is empty, is_isolated returns True. The system will skip the proxy setup, ensuring the container has no network interface at all.
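
The isolation rule can be sketched as a derived property that infrastructure code checks before deciding whether to start the proxy. In the snippet below, `IsolationSketch` and `network_plan` are hypothetical, and the returned labels are placeholders rather than jazzmine or Docker identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IsolationSketch:
    """Illustrative stand-in for the isolation check on NetworkPolicy."""

    allowed_hosts: List[str] = field(default_factory=list)

    @property
    def is_isolated(self) -> bool:
        # An empty allowlist means the sandbox gets no egress at all.
        return not self.allowed_hosts


def network_plan(policy: IsolationSketch) -> str:
    # Hypothetical decision point: skip the proxy entirely for isolated sandboxes.
    return "no-network" if policy.is_isolated else "proxy-sidecar"


offline_plan = network_plan(IsolationSketch())
online_plan = network_plan(IsolationSketch(allowed_hosts=["api.stripe.com"]))
```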

7. Remarks

  • Path Translation Priority: In to_docker_mount, the framework prioritizes resolved_host_path (from the pool's prefix map), then docker_host_path, and finally host_path.
  • Python Version: The python_version string selects the base image (e.g., python:3.11-slim). Ensure the version corresponds to a valid tag on Docker Hub.
  • Resource Efficiency: Use scratch_size_mb judiciously. Because /workspace is a tmpfs, every byte written there is backed by host memory (RAM, plus swap where the host has it enabled) rather than by regular disk storage.