1. Overview
AgentBuilder is responsible for:
- validating configuration before I/O,
- instantiating LLMs, stores, memory, tools, and optional HTTP server,
- returning a runtime pair: (agent, teardown).
All fluent methods return self for chaining. build() is the only async public method.
2. Public API Coverage Checklist
This document covers every public constructor/method/property on AgentBuilder.
2.1 Constructor and Core Fluent Methods
| Method | Included | Purpose |
|---|---|---|
| __init__(name, agent_id, personality) | Yes | Required identity and personality fields |
| llm(config) | Yes | Primary reasoning LLM |
| script_gen_llm(config) | Yes | Optional script-generation LLM |
| embeddings(...) | Yes | Automatic tokenizer/ONNX setup |
| memory(...) | Yes | Qdrant-backed long-term memory settings |
| storage(config) | Yes | Message store backend selection |
| sandbox(...) | Yes | Register sandbox specs |
| flows(flows) | Yes | Register flow definitions |
| domain_terms(terms) | Yes | Seed semantic memory terms |
| settings(**kwargs) | Yes | Override AgentSettings fields |
| version(version) | Yes | Startup display version label |
| on_intermediate(callback) | Yes | Sandbox event callback |
| logging(config) | Yes | Structured agent logging |
| security(config) | Yes | SecurityGuard setup |
| server(config) | Yes | HTTP server setup |
| agent_server (property) | Yes | Access running AgentServer after build |
| build() | Yes | Full runtime assembly |
2.2 Escape Hatch Injection Methods
| Method | Included | Purpose |
|---|---|---|
| with_llm(llm) | Yes | Inject pre-built primary LLM |
| with_script_gen_llm(llm) | Yes | Inject pre-built script-gen LLM |
| with_message_store(store) | Yes | Inject pre-built message store |
| with_wm_store(store) | Yes | Inject pre-built working-memory store |
| redis_wm(redis_client, ttl_seconds=3600) | Yes | Create Redis working-memory store shortcut |
| with_registry(registry) | Yes | Inject tool registry |
| with_enhancer(enhancer) | Yes | Inject message enhancer |
| with_summarizer(summarizer) | Yes | Inject summarizer |
| with_flow_selector(selector) | Yes | Inject flow selector |
| with_episodic_memory(memory) | Yes | Inject episodic memory |
| with_semantic_memory(memory) | Yes | Inject semantic memory |
| with_pool(pool) | Yes | Inject sandbox pool |
3. Constructor and Fluent Methods
3.1 __init__(name, agent_id, personality)
Creates a builder with required identity fields.
Immediate validation:
- name cannot be empty/whitespace,
- agent_id cannot be empty/whitespace,
- personality cannot be empty/whitespace.
Failure raises ConfigError immediately.
3.2 llm(config: LLMConfig)
Sets the primary LLM used for:
- core reasoning,
- flow selection,
- message enhancement,
- summarization.
If neither .llm(...) nor .with_llm(...) is provided, build() fails validation.
3.3 script_gen_llm(config: LLMConfig)
Optional dedicated LLM for script generation in tool orchestration.
If omitted, script generation falls back to the primary LLM.
3.4 embeddings(...)
Configures automatic embedding-model preparation. During build(), AgentBuilder can:
- download tokenizer,
- export ONNX model,
- quantize ONNX (INT8) when requested,
- patch memory config paths automatically.
Parameters:
- model_id default "BAAI/bge-small-en-v1.5"
- output_dir default empty (resolved to ~/.jazzmine/models/<slug>/)
- tokenizer_only default False
- quantized default True
- opset default 17
- force_rebuild default False
Notes:
- tokenizer_only=True skips ONNX export.
- If .embeddings() is called without .memory(), builder logs a warning but does not fail.
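The empty-default output_dir resolves under ~/.jazzmine/models/<slug>/. A minimal sketch of that resolution, assuming a slug rule (lowercase, "/" replaced with "--") that is illustrative only; the layout itself comes from the defaults above:

```python
from pathlib import Path


def default_embeddings_dir(model_id: str) -> Path:
    """Resolve a default output_dir for .embeddings() when none is given.

    The slug transform (lowercase, "/" -> "--") is an illustrative
    assumption; only the ~/.jazzmine/models/<slug>/ layout is documented.
    """
    slug = model_id.lower().replace("/", "--")
    return Path.home() / ".jazzmine" / "models" / slug


print(default_embeddings_dir("BAAI/bge-small-en-v1.5").name)
# baai--bge-small-en-v1.5
```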
3.5 memory(...)
Enables Qdrant-backed long-term memory (episodic/procedural/semantic).
Important behavior:
- If .embeddings() is used, tokenizer_path and (for local mode) model_dir are auto-resolved.
- Without .embeddings(), you must provide the embedding backend manually.
Backend modes:
- automatic: call .embeddings() before .memory().
- manual local: set model_dir.
- manual remote: set embed_api_key (and optionally provider/base URL/model name).
Qdrant automation:
- if qdrant_auto_start=True, qdrant_api_key is None, and URL host is local (localhost, 127.0.0.1, 0.0.0.0, ::1), builder can auto-start a local Docker Qdrant container.
3.6 storage(config: StorageConfig)
Sets message storage backend using one of:
- JsonStorage,
- PostgresStorage,
- MongoDBStorage.
Default is JsonStorage(). If the JSON storage path is empty, the builder creates a temporary file and registers its cleanup in teardown.
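The temp-file-plus-cleanup behavior can be sketched as follows (a simplified stand-in, not the builder's actual implementation; the helper name is hypothetical):

```python
import os
import tempfile
from typing import Callable


def make_temp_json_store_path(teardown_callbacks: list[Callable[[], object]]) -> str:
    """Create a temp file for JSON storage and register its cleanup."""
    fd, path = tempfile.mkstemp(prefix="agent-store-", suffix=".json")
    os.close(fd)
    # Cleanup runs later, during teardown.
    teardown_callbacks.append(lambda: os.path.exists(path) and os.unlink(path))
    return path


callbacks: list = []
p = make_temp_json_store_path(callbacks)
print(os.path.exists(p))  # True
for cb in callbacks:
    cb()
print(os.path.exists(p))  # False
```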
3.7 sandbox(...)
Registers one sandbox spec per call.
Use this to define:
- runtime/python version,
- resource limits,
- networking allowlists,
- file mounts,
- package list,
- secrets,
- execution mode and pool sizing.
Call multiple times for multiple named sandboxes.
3.8 flows(flows: list[Any])
Registers flow definitions to be synchronized during build when memory is enabled.
Important: flow registry is weak-reference-backed, so keep a strong reference to your flow list in caller scope.
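Why the strong reference matters can be demonstrated with a weak-reference-backed set (weakref.WeakSet here is illustrative; the actual registry type is not specified):

```python
import gc
import weakref


class Flow:
    def __init__(self, name: str) -> None:
        self.name = name


# A weak-reference-backed registry, as the docs describe for flows.
registry: "weakref.WeakSet[Flow]" = weakref.WeakSet()


def register(flows: list[Flow]) -> None:
    for f in flows:
        registry.add(f)


register([Flow("triage"), Flow("billing")])  # no strong reference kept!
gc.collect()
print(len(registry))  # 0 -- the flows were garbage-collected

kept = [Flow("triage"), Flow("billing")]     # strong reference in caller scope
register(kept)
print(len(registry))  # 2
```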
3.9 domain_terms(terms: list[DomainTerm])
Defines semantic glossary entries.
Each entry must be a 5-tuple: (key, value, category, aliases, description).
Entries are diff-synced to semantic memory during build when memory is active.
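A sketch of the 5-tuple shape check (the helper name is hypothetical; the tuple layout is the one documented above):

```python
def validate_domain_terms(terms: list) -> list[str]:
    """Collect errors for entries that are not (key, value, category, aliases, description)."""
    errors = []
    for i, term in enumerate(terms):
        if not isinstance(term, (tuple, list)) or len(term) != 5:
            errors.append(f"domain term #{i} must be a 5-tuple, got {term!r}")
    return errors


good = ("SLA", "service level agreement", "ops", ["service-level"], "Uptime contract term")
bad = ("SLA", "service level agreement")  # only 2 elements -> rejected
print(validate_domain_terms([good, bad]))
```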
3.10 settings(**kwargs)
Overrides AgentSettings fields by name.
Unknown key behavior:
- raises ConfigError with list of valid settings fields.
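The unknown-key rejection can be sketched against a stand-in dataclass (the field names are borrowed from section 5.5, but the defaults and the AgentSettings shape here are assumptions):

```python
from dataclasses import dataclass, fields


class ConfigError(Exception):
    pass


@dataclass
class AgentSettings:  # illustrative stand-in, not the real class
    default_execution_mode: str = "plan"
    enhancer_history_window: int = 8
    summarizer_trigger: int = 12


def apply_settings(settings: AgentSettings, **kwargs) -> None:
    valid = {f.name for f in fields(settings)}
    for key, value in kwargs.items():
        if key not in valid:
            # Reject unknown keys with the list of valid fields.
            raise ConfigError(f"unknown setting {key!r}; valid fields: {sorted(valid)}")
        setattr(settings, key, value)


s = AgentSettings()
apply_settings(s, summarizer_trigger=20)
print(s.summarizer_trigger)  # 20
```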
3.11 version(version: str)
Sets optional version string used by startup display.
3.12 on_intermediate(callback: Callable)
Registers callback for sandbox intermediate events.
3.13 logging(config: dict)
Enables structured runtime logging via AgentLogger.from_config(...).
If configured, logger startup/shutdown is integrated into build/teardown.
3.14 security(config: SecurityConfig)
Enables SecurityGuard with input/output moderation and optional file sanitizer.
Notable rules:
- input_moderator and toxicity_detector are mutually exclusive.
- if .security() is not called, runtime uses NOOP_GUARD.
3.15 server(config: ServerConfig)
Attaches HTTP server lifecycle to the built agent.
Behavior:
- server starts at end of build(),
- server stop callback is registered in teardown,
- effective server instance is available via builder.agent_server.
Default endpoint set (from ServerConfig):
- POST {conversations_endpoint} create conversation
- GET {conversations_endpoint} list conversations for user
- GET {conversations_endpoint}/search search conversations
- GET {conversations_endpoint}/{conversation_id}/messages list messages
- PATCH {conversations_endpoint}/{conversation_id} update metadata
- DELETE {conversations_endpoint}/{conversation_id} delete conversation
- POST {chat_endpoint} full response
- POST {chat_endpoint}/stream SSE response stream
- GET {health_endpoint} health probe
- GET {info_endpoint} agent metadata
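The full route set can be derived mechanically from the four configurable prefixes; the default prefix strings below are assumptions used only for illustration, and the stream route is always {chat_endpoint}/stream:

```python
def endpoint_table(conversations: str = "/conversations",
                   chat: str = "/chat",
                   health: str = "/health",
                   info: str = "/info") -> list[tuple[str, str]]:
    """Expand the configured prefixes into the full (method, path) route set."""
    return [
        ("POST",   conversations),
        ("GET",    conversations),
        ("GET",    f"{conversations}/search"),
        ("GET",    f"{conversations}/{{conversation_id}}/messages"),
        ("PATCH",  f"{conversations}/{{conversation_id}}"),
        ("DELETE", f"{conversations}/{{conversation_id}}"),
        ("POST",   chat),
        ("POST",   f"{chat}/stream"),  # auto-derived; user endpoints must not collide
        ("GET",    health),
        ("GET",    info),
    ]


for method, path in endpoint_table():
    print(method, path)
```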
3.16 agent_server property
Returns AgentServer | None.
Typical use:
- after build(), inspect builder.agent_server.effective_port when using port=0 in tests.
4. Escape Hatches and Injection Strategy
Escape hatches let you bypass default construction and inject pre-built components.
Use cases:
- unit/integration tests,
- custom implementations,
- shared connection pools/registries,
- deterministic runtime wiring.
Coverage:
- with_llm, with_script_gen_llm
- with_message_store, with_wm_store, redis_wm
- with_registry, with_pool
- with_enhancer, with_summarizer, with_flow_selector
- with_episodic_memory, with_semantic_memory
5. Validation Model (_validate)
Validation executes synchronously at the top of build() before opening runtime resources.
It accumulates all failures and raises one ConfigError containing bulleted details.
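The accumulate-then-raise pattern can be sketched as (a simplified illustration; the real _validate runs many grouped checks, not a flat list):

```python
class ConfigError(Exception):
    pass


def validate_all(checks: list[tuple[bool, str]]) -> None:
    """Accumulate every failing check and raise one bulleted ConfigError."""
    failures = [msg for ok, msg in checks if not ok]
    if failures:
        raise ConfigError("invalid configuration:\n" + "\n".join(f"- {m}" for m in failures))


try:
    validate_all([
        (False, "either .llm(...) or .with_llm(...) is required"),
        (True,  "vector_size >= 1"),
        (False, "max_batch must be >= 2"),
    ])
except ConfigError as e:
    print(e)  # both failures listed, not just the first
```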
5.1 LLM Validation
- requires either .llm(...) or .with_llm(...).
- validates provider config objects through _validate_llm_cfg.
5.2 Embedding and Memory Cross-Validation
Checks include:
- embeddings.model_id non-empty,
- embeddings.opset >= 11,
- output dir is not an existing file,
- tokenizer/backend consistency rules,
- embedding mode conflicts (auto/local/remote).
Memory checks include:
- qdrant_url non-empty,
- vector_size >= 1,
- quantization in {None, 4, 8},
- valid distance_metric in {cosine, euclidean, dot},
- positive shard/replication/index flush limits,
- max_batch >= 2,
- remote provider allowlist validation when API mode is used.
5.3 Storage Validation
- Postgres: non-empty DSN, pool_min >= 1, pool_max >= pool_min.
- MongoDB: non-empty URI and DB name.
5.4 Sandbox Validation
Per sandbox spec:
- non-empty unique name,
- resource lower bounds (memory, pids, timeout, output cap, CPU, pool, scratch),
- swap rule (-1 or > memory_mb),
- port ranges (1..65535),
- non-empty file mount paths,
- no blank pip package entries,
- execution mode in {plan, interactive}.
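A few of these checks sketched as a helper (parameter names such as swap_mb are assumptions; the rules themselves are the ones listed above):

```python
def sandbox_errors(memory_mb: int, swap_mb: int, ports: list[int], mode: str) -> list[str]:
    """Sketch of some per-spec checks from section 5.4, not the real implementation."""
    errors = []
    # Swap must be disabled (-1) or strictly larger than the memory limit.
    if swap_mb != -1 and swap_mb <= memory_mb:
        errors.append("swap_mb must be -1 or greater than memory_mb")
    for p in ports:
        if not 1 <= p <= 65535:
            errors.append(f"port {p} outside 1..65535")
    if mode not in {"plan", "interactive"}:
        errors.append(f"invalid execution mode {mode!r}")
    return errors


print(sandbox_errors(memory_mb=512, swap_mb=256, ports=[80, 70000], mode="batch"))
```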
5.5 Settings Validation
Checks include:
- default_execution_mode in {plan, interactive},
- positive turn/retry limits,
- enhancer_history_window >= 2,
- summarizer_trigger >= 1,
- episode_overlap < max_episode_size,
- pool_max_overflow >= 0,
- positive flow retrieval limits.
5.6 Domain Terms Validation
Each term must be a tuple/list of length 5.
5.7 Security Validation
Checks include:
- input_moderator XOR toxicity_detector,
- required interface methods (classify, predict, sanitize) when objects are provided,
- threshold ranges in [0.0, 1.0],
- positive moderation timeout,
- non-empty block messages.
5.8 Server Validation (_validate_server_cfg)
Checks include:
- port in 1..65535, non-empty host,
- all endpoint paths start with /,
- endpoint uniqueness,
- no collision with the auto-derived stream path {chat_endpoint}/stream,
- positive request limits,
- SSL cert/key pair rules and file existence,
- valid log level,
- valid backlog/keep-alive/concurrency settings,
- CORS credential + wildcard-origin conflict prevention,
- non-empty CORS origin list,
- non-blank API key string if set.
6. Build Lifecycle (await build())
Execution order is deterministic.
- run _validate().
- initialize startup display and register teardown banner callbacks.
- if configured, run embedding setup before memory construction.
- construct primary/script LLMs (or use injected instances).
- prepare tool registry and compute registry change report.
- create sandbox pool if sandboxes/tools are present and no pool injected.
- initialize message store (with temp JSON file support).
- initialize working memory store (default in-process unless injected).
- initialize long-term memory (_build_memory) with optional Qdrant auto-start.
- sync domain terms and flows when long-term memory is active.
- construct flow selector, enhancer, summarizer.
- build AgentConfig and then the runtime agent.
- optionally start HTTP server and expose builder.agent_server.
- register teardown callbacks and return (agent, teardown).
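The teardown side of this lifecycle can be sketched as a callback stack; running cleanups in reverse registration order is an assumption here (the docs only say callbacks are registered during build):

```python
import asyncio

teardown_callbacks = []  # filled in build order


def register_teardown(cb) -> None:
    teardown_callbacks.append(cb)


async def teardown() -> None:
    # Assumed LIFO: the server registered last comes down first,
    # stores and banner callbacks last.
    for cb in reversed(teardown_callbacks):
        await cb()


stopped: list[str] = []
for name in ("message_store", "memory", "server"):
    async def stop(name=name):
        stopped.append(name)
    register_teardown(stop)

asyncio.run(teardown())
print(stopped)  # ['server', 'memory', 'message_store']
```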
7. Runtime/Dependency Details
7.1 Embedding Setup Internals
_build_embeddings():
- checks cache (tokenizer.json and ONNX where applicable),
- checks required Python packages (transformers, and for ONNX export also torch, onnxruntime-tools),
- executes tokenizer/ONNX export in thread executor,
- patches self._mem_cfg.tokenizer_path, and local model_dir/quantized when relevant.
7.2 Memory Runtime Internals
_build_memory():
- validates onnxruntime availability/version (requires >=1.23.0),
- attempts to resolve ORT shared library path from installed wheel,
- imports Rust extension module memory,
- creates QdrantManager, ensures collections, then builds:
- EpisodicMemory,
- ProceduralMemory,
- SemanticMemory.
If .memory() is not configured, builder returns no-op episodic/semantic memory stubs.
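The no-op stubs can be pictured like this minimal sketch (method names here are assumptions; the point is that memory calls succeed and return nothing):

```python
import asyncio


class NoopEpisodicMemory:
    """Illustrative no-op stub: accepts writes, returns empty reads."""

    async def store(self, *args, **kwargs) -> None:
        return None

    async def retrieve(self, *args, **kwargs) -> list:
        return []


stub = NoopEpisodicMemory()
print(asyncio.run(stub.retrieve("any query")))  # []
```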
8. Error Handling and Failure Modes
8.1 ConfigError
Thrown by constructor (required identity fields) and by _validate() aggregate failures.
8.2 Embedding Setup Failures
Embedding export errors are wrapped into clear ConfigError messages including:
- model,
- output dir,
- common causes (network, disk, missing deps, invalid model id).
8.3 Memory Runtime Import/Version Failures
Common explicit failures:
- missing onnxruntime,
- too-old onnxruntime,
- missing Rust extension module memory.
8.4 HTTP Server Startup Failures
Server startup failures are raised as RuntimeError with host/port context.
9. Practical Examples
9.1 Minimal Agent (No Long-Term Memory)
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder(
            name="Aria",
            agent_id="aria-local",
            personality="Helpful and concise assistant.",
        )
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
    )
    agent, teardown = await builder.build()
    try:
        result = await agent.chat(
            user_id="u1",
            conversation_id="c1",
            content="Hello",
        )
        print(result.response)
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.2 Local Embeddings + Auto Qdrant
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-mem", "Memory-enabled assistant")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .embeddings(
            model_id="BAAI/bge-small-en-v1.5",
            quantized=True,
        )
        .memory(
            qdrant_url="http://localhost:6334",
            qdrant_auto_start=True,
            vector_size=384,
        )
    )
    agent, teardown = await builder.build()
    try:
        await agent.chat(user_id="u1", conversation_id="c1", content="Remember this")
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.3 Remote Embeddings (Tokenizer-Only Local Prep)
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-remote-embed", "Uses remote embedding API")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .embeddings(tokenizer_only=True)
        .memory(
            qdrant_url="http://localhost:6334",
            embed_api_key="sk-remote-embed",
            embed_provider="openai",
            vector_size=1536,
        )
    )
    agent, teardown = await builder.build()
    try:
        await agent.chat(user_id="u1", conversation_id="c1", content="Store this fact")
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.4 Server + Security + CORS
```python
import asyncio

from jazzmine.core.builder import (
    AgentBuilder,
    OpenAILLMConfig,
    SecurityConfig,
    ServerConfig,
    CORSConfig,
)


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-http", "Served assistant")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .security(SecurityConfig(fail_open=True))
        .server(
            ServerConfig(
                host="127.0.0.1",
                port=8000,
                api_key="super-secret-token",
                cors=CORSConfig(
                    origins=["https://app.example.com"],
                    allow_credentials=True,
                ),
            )
        )
    )
    agent, teardown = await builder.build()
    try:
        # Useful in tests, especially when using ServerConfig(port=0)
        print(builder.agent_server.effective_port)
        await asyncio.sleep(5)
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.5 Testing With Escape Hatches
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, JsonStorage
from jazzmine.core.llm.types import LLMResponse, LLMUsage


class FakeLLM:
    model = "fake"

    def generate(self, messages, *, stop=None, **kwargs):
        return LLMResponse(text="ok", usage=LLMUsage(), model=self.model)

    def stream(self, messages, *, stop=None, **kwargs):
        yield "ok"

    async def agenerate(self, messages, *, stop=None, **kwargs):
        return LLMResponse(text="ok", usage=LLMUsage(), model=self.model)

    async def astream(self, messages, *, stop=None, **kwargs):
        yield "ok"

    def close(self):
        return None

    async def aclose(self):
        return None


class FakeStore:
    async def store_message(self, *args, **kwargs):
        return None

    async def flag_message(self, *args, **kwargs):
        return None

    async def close_conversation(self, *args, **kwargs):
        return None

    async def list_conversations(self, *args, **kwargs):
        return []

    async def list_messages(self, *args, **kwargs):
        return []

    async def close(self):
        return None


async def main() -> None:
    fake_llm = FakeLLM()
    fake_store = FakeStore()
    builder = (
        AgentBuilder("TestAgent", "test-agent", "Testing profile")
        .storage(JsonStorage(path="./test_store.json"))
        .with_llm(fake_llm)
        .with_message_store(fake_store)
    )
    agent, teardown = await builder.build()
    try:
        # Run focused assertions on wiring without exercising full chat pipeline.
        assert agent is not None
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
10. Operational Guidance
- Always keep a strong reference to your flow instances/list after .flows(...).
- Prefer .embeddings() + .memory() together to avoid path mismatch errors.
- For remote embeddings, use .embeddings(tokenizer_only=True) plus embed_api_key in memory config.
- Call await teardown() in finally blocks to guarantee cleanup.
- Use escape hatches in tests to avoid external dependencies.