Builder
Core reference

Builder: AgentBuilder

AgentBuilder is the fluent composition API for creating a fully wired Jazzmine agent.

1. Overview

It is responsible for:

  • validating configuration before I/O,
  • instantiating LLMs, stores, memory, tools, and optional HTTP server,
  • returning a runtime pair: (agent, teardown).

All fluent methods return self for chaining. build() is the only async public method.

2. Public API Coverage Checklist

This document covers every public constructor/method/property on AgentBuilder.

2.1 Constructor and Core Fluent Methods

| Method | Included | Purpose |
| --- | --- | --- |
| __init__(name, agent_id, personality) | Yes | Required identity and personality fields |
| llm(config) | Yes | Primary reasoning LLM |
| script_gen_llm(config) | Yes | Optional script-generation LLM |
| embeddings(...) | Yes | Automatic tokenizer/ONNX setup |
| memory(...) | Yes | Qdrant-backed long-term memory settings |
| storage(config) | Yes | Message store backend selection |
| sandbox(...) | Yes | Register sandbox specs |
| flows(flows) | Yes | Register flow definitions |
| domain_terms(terms) | Yes | Seed semantic memory terms |
| settings(**kwargs) | Yes | Override AgentSettings fields |
| version(version) | Yes | Startup display version label |
| on_intermediate(callback) | Yes | Sandbox event callback |
| logging(config) | Yes | Structured agent logging |
| security(config) | Yes | SecurityGuard setup |
| server(config) | Yes | HTTP server setup |
| agent_server (property) | Yes | Access running AgentServer after build |
| build() | Yes | Full runtime assembly |

2.2 Escape Hatch Injection Methods

| Method | Included | Purpose |
| --- | --- | --- |
| with_llm(llm) | Yes | Inject pre-built primary LLM |
| with_script_gen_llm(llm) | Yes | Inject pre-built script-gen LLM |
| with_message_store(store) | Yes | Inject pre-built message store |
| with_wm_store(store) | Yes | Inject pre-built working-memory store |
| redis_wm(redis_client, ttl_seconds=3600) | Yes | Create Redis working-memory store shortcut |
| with_registry(registry) | Yes | Inject tool registry |
| with_enhancer(enhancer) | Yes | Inject message enhancer |
| with_summarizer(summarizer) | Yes | Inject summarizer |
| with_flow_selector(selector) | Yes | Inject flow selector |
| with_episodic_memory(memory) | Yes | Inject episodic memory |
| with_semantic_memory(memory) | Yes | Inject semantic memory |
| with_pool(pool) | Yes | Inject sandbox pool |

3. Constructor and Fluent Methods

3.1 __init__(name, agent_id, personality)

Creates a builder with required identity fields.

Immediate validation:

  • name cannot be empty/whitespace,
  • agent_id cannot be empty/whitespace,
  • personality cannot be empty/whitespace.

Failure raises ConfigError immediately.
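The check can be sketched in plain Python. This is a stand-in for the real validation, not the library's code; ConfigError is assumed here to be an ordinary exception type.

```python
class ConfigError(ValueError):
    """Stand-in for jazzmine's ConfigError (assumed to be an exception type)."""


def check_identity(name, agent_id, personality):
    # Mirrors the documented rule: each field must contain non-whitespace text.
    for label, value in (
        ("name", name),
        ("agent_id", agent_id),
        ("personality", personality),
    ):
        if not value or not value.strip():
            raise ConfigError(f"{label} cannot be empty or whitespace")


check_identity("Aria", "aria-local", "Helpful and concise assistant.")
```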

3.2 llm(config: LLMConfig)

Sets the primary LLM used for:

  • core reasoning,
  • flow selection,
  • message enhancement,
  • summarization.

If neither .llm(...) nor .with_llm(...) is provided, build() fails validation.

3.3 script_gen_llm(config: LLMConfig)

Optional dedicated LLM for script generation in tool orchestration.

If omitted, script generation falls back to the primary LLM.

3.4 embeddings(...)

Configures automatic embedding model preparation. During build(), AgentBuilder can:

  • download tokenizer,
  • export ONNX model,
  • quantize ONNX (INT8) when requested,
  • patch memory config paths automatically.

Parameters:

  • model_id (default "BAAI/bge-small-en-v1.5")
  • output_dir (default empty; resolved to ~/.jazzmine/models/<slug>/)
  • tokenizer_only (default False)
  • quantized (default True)
  • opset (default 17)
  • force_rebuild (default False)

Notes:

  • tokenizer_only=True skips ONNX export.
  • If .embeddings() is called without .memory(), builder logs a warning but does not fail.
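The output_dir fallback can be sketched as follows. Note the slug derivation (lowercase, "/" replaced by "__") is a guess for illustration only; the docs fix only the ~/.jazzmine/models/<slug>/ pattern.

```python
from pathlib import Path


def resolve_output_dir(model_id, output_dir=""):
    # An explicit output_dir wins; otherwise fall back to the documented
    # ~/.jazzmine/models/<slug>/ location. The slug derivation below is
    # an assumption for illustration, not the library's actual rule.
    if output_dir:
        return Path(output_dir)
    slug = model_id.lower().replace("/", "__")
    return Path.home() / ".jazzmine" / "models" / slug


print(resolve_output_dir("BAAI/bge-small-en-v1.5"))
```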

3.5 memory(...)

Enables Qdrant-backed long-term memory (episodic/procedural/semantic).

Important behavior:

  • If .embeddings() is used, tokenizer_path and (for local mode) model_dir are auto-resolved.
  • Without .embeddings(), you must provide the embedding backend manually.

Backend modes:

  • automatic: call .embeddings() before .memory().
  • manual local: set model_dir.
  • manual remote: set embed_api_key (and optionally provider/base URL/model name).

Qdrant automation:

  • if qdrant_auto_start=True, qdrant_api_key is None, and URL host is local (localhost, 127.0.0.1, 0.0.0.0, ::1), builder can auto-start a local Docker Qdrant container.
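The backend-mode selection can be sketched as below. This is illustrative only; the real builder also enforces the remote provider allowlist and further conflict rules (see section 5.2).

```python
def resolve_embed_mode(embeddings_called, model_dir=None, embed_api_key=None):
    # Sketch of the three documented modes; not the library's actual code.
    if model_dir and embed_api_key:
        raise ValueError("model_dir (local) and embed_api_key (remote) conflict")
    if embeddings_called:
        return "automatic"
    if model_dir:
        return "manual-local"
    if embed_api_key:
        return "manual-remote"
    raise ValueError(
        "no embedding backend: call .embeddings() or set model_dir/embed_api_key"
    )
```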

3.6 storage(config: StorageConfig)

Sets message storage backend using one of:

  • JsonStorage,
  • PostgresStorage,
  • MongoDBStorage.

Default is JsonStorage(). If JSON storage path is empty, builder creates a temporary file and registers cleanup in teardown.

3.7 sandbox(...)

Registers one sandbox spec per call.

Use this to define:

  • runtime/python version,
  • resource limits,
  • networking allowlists,
  • file mounts,
  • package list,
  • secrets,
  • execution mode and pool sizing.

Call multiple times for multiple named sandboxes.

3.8 flows(flows: list[Any])

Registers flow definitions to be synchronized during build when memory is enabled.

Important: flow registry is weak-reference-backed, so keep a strong reference to your flow list in caller scope.
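The caveat exists because a weakly referenced flow can be garbage-collected once the caller drops its last strong reference. The standard-library behavior this relies on:

```python
import gc
import weakref


class Flow:
    def __init__(self, name):
        self.name = name


flows = [Flow("refund"), Flow("faq")]   # strong reference held by the caller
ref = weakref.ref(flows[0])             # registry-style weak reference

assert ref() is not None                # alive while `flows` is referenced

flows = None                            # caller drops the strong reference
gc.collect()
assert ref() is None                    # the flow object was collected
```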

3.9 domain_terms(terms: list[DomainTerm])

Defines semantic glossary entries.

Each entry must be a 5-tuple: (key, value, category, aliases, description).

Entries are diff-synced to semantic memory during build when memory is active.
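A minimal sketch of the shape check; the glossary entry shown is invented for illustration.

```python
def check_domain_term(term):
    # Mirrors validation rule 5.6: every entry is a 5-element tuple/list of
    # (key, value, category, aliases, description).
    if not isinstance(term, (tuple, list)) or len(term) != 5:
        raise ValueError(f"domain term must be a 5-tuple, got: {term!r}")


check_domain_term(
    (
        "qdrant",                                    # key
        "Vector database backing long-term memory",  # value
        "infrastructure",                            # category
        ["vector db"],                               # aliases
        "Stores episodic/semantic embeddings",       # description
    )
)
```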

3.10 settings(**kwargs)

Overrides AgentSettings fields by name.

Unknown key behavior:

  • raises ConfigError with a list of the valid AgentSettings fields.

3.11 version(version: str)

Sets optional version string used by startup display.

3.12 on_intermediate(callback: Callable)

Registers callback for sandbox intermediate events.

3.13 logging(config: dict)

Enables structured runtime logging via AgentLogger.from_config(...).

If configured, logger startup/shutdown is integrated into build/teardown.

3.14 security(config: SecurityConfig)

Enables SecurityGuard with input/output moderation and optional file sanitizer.

Notable rules:

  • input_moderator and toxicity_detector are mutually exclusive.
  • if .security() is not called, runtime uses NOOP_GUARD.

3.15 server(config: ServerConfig)

Attaches HTTP server lifecycle to the built agent.

Behavior:

  • server starts at end of build(),
  • server stop callback is registered in teardown,
  • effective server instance is available via builder.agent_server.

Default endpoint set (from ServerConfig):

  • POST {conversations_endpoint} create conversation
  • GET {conversations_endpoint} list conversations for user
  • GET {conversations_endpoint}/search search conversations
  • GET {conversations_endpoint}/{conversation_id}/messages list messages
  • PATCH {conversations_endpoint}/{conversation_id} update metadata
  • DELETE {conversations_endpoint}/{conversation_id} delete conversation
  • POST {chat_endpoint} full response
  • POST {chat_endpoint}/stream SSE response stream
  • GET {health_endpoint} health probe
  • GET {info_endpoint} agent metadata

3.16 agent_server property

Returns AgentServer | None.

Typical use:

  • after build(), inspect builder.agent_server.effective_port when using port=0 in tests.
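port=0 works because the OS assigns a free ephemeral port at bind time, so the concrete port is only knowable afterwards; a plain-socket sketch of that behavior:

```python
import socket

# Binding to port 0 asks the OS for a free ephemeral port; the concrete
# port is only known after the bind, which is what effective_port exposes.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.bind(("127.0.0.1", 0))
    effective_port = sock.getsockname()[1]

assert effective_port > 0
print(effective_port)
```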

4. Escape Hatches and Injection Strategy

Escape hatches let you bypass default construction and inject pre-built components.

Use cases:

  • unit/integration tests,
  • custom implementations,
  • shared connection pools/registries,
  • deterministic runtime wiring.

Coverage:

  • with_llm, with_script_gen_llm
  • with_message_store, with_wm_store, redis_wm
  • with_registry, with_pool
  • with_enhancer, with_summarizer, with_flow_selector
  • with_episodic_memory, with_semantic_memory

5. Validation Model (_validate)

Validation executes synchronously at the top of build() before opening runtime resources.

It accumulates all failures and raises one ConfigError containing bulleted details.
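The accumulate-then-raise pattern can be sketched as below; this is illustrative only, and the real _validate covers far more rules.

```python
class ConfigError(ValueError):
    pass


def validate(cfg):
    errors = []
    if not cfg.get("llm"):
        errors.append("an LLM is required: call .llm(...) or .with_llm(...)")
    if cfg.get("vector_size", 1) < 1:
        errors.append("vector_size must be >= 1")
    if errors:
        # One aggregate error with bulleted details, raised before any I/O.
        details = "\n".join(f"  - {e}" for e in errors)
        raise ConfigError(f"invalid configuration:\n{details}")


validate({"llm": object(), "vector_size": 384})
```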

5.1 LLM Validation

  • requires either .llm(...) or .with_llm(...).
  • validates provider config objects through _validate_llm_cfg.

5.2 Embedding and Memory Cross-Validation

Checks include:

  • embeddings.model_id non-empty,
  • embeddings.opset >= 11,
  • output dir is not an existing file,
  • tokenizer/backend consistency rules,
  • embedding mode conflicts (auto/local/remote).

Memory checks include:

  • qdrant_url non-empty,
  • vector_size >= 1,
  • quantization in {None, 4, 8},
  • valid distance_metric in {cosine, euclidean, dot},
  • positive shard/replication/index flush limits,
  • max_batch >= 2,
  • remote provider allowlist validation when API mode is used.

5.3 Storage Validation

  • Postgres: non-empty DSN, pool_min >= 1, pool_max >= pool_min.
  • MongoDB: non-empty URI and DB name.

5.4 Sandbox Validation

Per sandbox spec:

  • non-empty unique name,
  • resource lower bounds (memory, pids, timeout, output cap, CPU, pool, scratch),
  • swap rule (-1 or > memory_mb),
  • port ranges (1..65535),
  • non-empty file mount paths,
  • no blank pip package entries,
  • execution mode in {plan, interactive}.
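Two of the concrete rules above (the swap rule and the port range) sketched in plain Python; function and parameter names other than memory_mb are assumptions.

```python
def check_sandbox_limits(memory_mb, swap_mb, ports):
    # Swap rule: -1 disables swap, otherwise it must exceed memory_mb.
    if swap_mb != -1 and swap_mb <= memory_mb:
        raise ValueError("swap_mb must be -1 or greater than memory_mb")
    for port in ports:
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range 1..65535: {port}")


check_sandbox_limits(memory_mb=512, swap_mb=1024, ports=[443, 8080])
check_sandbox_limits(memory_mb=512, swap_mb=-1, ports=[])
```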

5.5 Settings Validation

Checks include:

  • default_execution_mode in {plan, interactive},
  • positive turn/retry limits,
  • enhancer_history_window >= 2,
  • summarizer_trigger >= 1,
  • episode_overlap < max_episode_size,
  • pool_max_overflow >= 0,
  • positive flow retrieval limits.

5.6 Domain Terms Validation

Each term must be a tuple/list of length 5.

5.7 Security Validation

Checks include:

  • input_moderator XOR toxicity_detector,
  • required interface methods (classify, predict, sanitize) when objects are provided,
  • threshold ranges in [0.0, 1.0],
  • positive moderation timeout,
  • non-empty block messages.

5.8 Server Validation (_validate_server_cfg)

Checks include:

  • port in 1..65535, non-empty host,
  • all endpoint paths start with /,
  • endpoint uniqueness,
  • no collision with auto stream path {chat_endpoint}/stream,
  • positive request limits,
  • SSL cert/key pair rules and file existence,
  • valid log level,
  • valid backlog/keep-alive/concurrency settings,
  • CORS credential + wildcard-origin conflict prevention,
  • non-empty CORS origin list,
  • non-blank API key string if set.
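A sketch of three of these checks (leading slash, uniqueness, and the stream-path collision); the function and parameter names are illustrative, not the library's.

```python
def check_endpoints(chat_endpoint, other_endpoints):
    paths = [chat_endpoint, *other_endpoints]
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"endpoint must start with '/': {path}")
    if len(set(paths)) != len(paths):
        raise ValueError("endpoint paths must be unique")
    stream_path = f"{chat_endpoint}/stream"  # auto-registered SSE route
    if stream_path in other_endpoints:
        raise ValueError(f"collides with auto stream path: {stream_path}")


check_endpoints("/chat", ["/conversations", "/health", "/info"])
```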

6. Build Lifecycle (await build())

Execution order is deterministic.

  1. run _validate().
  2. initialize startup display and register teardown banner callbacks.
  3. if configured, run embedding setup before memory construction.
  4. construct primary/script LLMs (or use injected instances).
  5. prepare tool registry and compute registry change report.
  6. create sandbox pool if sandboxes/tools are present and no pool injected.
  7. initialize message store (with temp JSON file support).
  8. initialize working memory store (default in-process unless injected).
  9. initialize long-term memory (_build_memory) with optional Qdrant auto-start.
  10. sync domain terms and flows when long-term memory is active.
  11. construct flow selector, enhancer, summarizer.
  12. build AgentConfig and then the runtime agent.
  13. optionally start HTTP server and expose builder.agent_server.
  14. register teardown callbacks and return (agent, teardown).

7. Runtime/Dependency Details

7.1 Embedding Setup Internals

_build_embeddings():

  • checks the cache (tokenizer.json and, where applicable, the ONNX model),
  • checks required Python packages (transformers; ONNX export additionally needs torch and onnxruntime-tools),
  • runs tokenizer/ONNX export in a thread executor,
  • patches self._mem_cfg.tokenizer_path (and, for local mode, model_dir/quantized) when relevant.

7.2 Memory Runtime Internals

_build_memory():

  • validates onnxruntime availability and version (requires >= 1.23.0),
  • attempts to resolve the ORT shared-library path from the installed wheel,
  • imports the Rust extension module memory,
  • creates QdrantManager, ensures collections, then builds:
      • EpisodicMemory,
      • ProceduralMemory,
      • SemanticMemory.

If .memory() is not configured, builder returns no-op episodic/semantic memory stubs.
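Such a stub can be sketched as follows; the method names here (store, recall) are assumptions for illustration, not the library's actual interface.

```python
import asyncio


class NoopEpisodicMemory:
    """Illustrative no-op stub; the real stub's method names may differ."""

    async def store(self, *args, **kwargs):
        return None

    async def recall(self, *args, **kwargs):
        return []


recalled = asyncio.run(NoopEpisodicMemory().recall("anything"))
```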

8. Error Handling and Failure Modes

8.1 ConfigError

Raised by the constructor (required identity fields) and by _validate() aggregate failures.

8.2 Embedding Setup Failures

Embedding export errors are wrapped into clear ConfigError messages including:

  • model,
  • output dir,
  • common causes (network, disk, missing deps, invalid model id).

8.3 Memory Runtime Import/Version Failures

Common explicit failures:

  • missing onnxruntime,
  • too-old onnxruntime,
  • missing Rust extension module memory.

8.4 HTTP Server Startup Failures

Server startup failures are raised as RuntimeError with host/port context.

9. Practical Examples

9.1 Minimal Agent (No Long-Term Memory)

```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder(
            name="Aria",
            agent_id="aria-local",
            personality="Helpful and concise assistant.",
        )
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
    )

    agent, teardown = await builder.build()
    try:
        result = await agent.chat(
            user_id="u1",
            conversation_id="c1",
            content="Hello",
        )
        print(result.response)
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```

9.2 Local Embeddings + Auto Qdrant

```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-mem", "Memory-enabled assistant")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .embeddings(
            model_id="BAAI/bge-small-en-v1.5",
            quantized=True,
        )
        .memory(
            qdrant_url="http://localhost:6334",
            qdrant_auto_start=True,
            vector_size=384,
        )
    )

    agent, teardown = await builder.build()
    try:
        await agent.chat(user_id="u1", conversation_id="c1", content="Remember this")
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```

9.3 Remote Embeddings (Tokenizer-Only Local Prep)

```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-remote-embed", "Uses remote embedding API")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .embeddings(tokenizer_only=True)
        .memory(
            qdrant_url="http://localhost:6334",
            embed_api_key="sk-remote-embed",
            embed_provider="openai",
            vector_size=1536,
        )
    )

    agent, teardown = await builder.build()
    try:
        await agent.chat(user_id="u1", conversation_id="c1", content="Store this fact")
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```

9.4 Server + Security + CORS

```python
import asyncio

from jazzmine.core.builder import (
    AgentBuilder,
    OpenAILLMConfig,
    SecurityConfig,
    ServerConfig,
    CORSConfig,
)


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-http", "Served assistant")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .security(SecurityConfig(fail_open=True))
        .server(
            ServerConfig(
                host="127.0.0.1",
                port=8000,
                api_key="super-secret-token",
                cors=CORSConfig(
                    origins=["https://app.example.com"],
                    allow_credentials=True,
                ),
            )
        )
    )

    agent, teardown = await builder.build()
    try:
        # Useful in tests, especially when using ServerConfig(port=0)
        print(builder.agent_server.effective_port)
        await asyncio.sleep(5)
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```

9.5 Testing With Escape Hatches

```python
import asyncio

from jazzmine.core.builder import AgentBuilder, JsonStorage
from jazzmine.core.llm.types import LLMResponse, LLMUsage


class FakeLLM:
    model = "fake"

    def generate(self, messages, *, stop=None, **kwargs):
        return LLMResponse(text="ok", usage=LLMUsage(), model=self.model)

    def stream(self, messages, *, stop=None, **kwargs):
        yield "ok"

    async def agenerate(self, messages, *, stop=None, **kwargs):
        return LLMResponse(text="ok", usage=LLMUsage(), model=self.model)

    async def astream(self, messages, *, stop=None, **kwargs):
        yield "ok"

    def close(self):
        return None

    async def aclose(self):
        return None


class FakeStore:
    async def store_message(self, *args, **kwargs):
        return None

    async def flag_message(self, *args, **kwargs):
        return None

    async def close_conversation(self, *args, **kwargs):
        return None

    async def list_conversations(self, *args, **kwargs):
        return []

    async def list_messages(self, *args, **kwargs):
        return []

    async def close(self):
        return None


async def main() -> None:
    fake_llm = FakeLLM()
    fake_store = FakeStore()

    builder = (
        AgentBuilder("TestAgent", "test-agent", "Testing profile")
        .storage(JsonStorage(path="./test_store.json"))
        .with_llm(fake_llm)
        .with_message_store(fake_store)
    )

    agent, teardown = await builder.build()
    try:
        # Run focused assertions on wiring without exercising full chat pipeline.
        assert agent is not None
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```

10. Operational Guidance

  • Always keep a strong reference to your flow instances/list after .flows(...).
  • Prefer .embeddings() + .memory() together to avoid path mismatch errors.
  • For remote embeddings, use .embeddings(tokenizer_only=True) plus embed_api_key in memory config.
  • Call await teardown() in finally blocks to guarantee cleanup.
  • Use escape hatches in tests to avoid external dependencies.