1. Overview
AgentBuilder is responsible for:
- validating configuration before I/O,
- instantiating LLMs, stores, memory, tools, and optional HTTP server,
- returning a runtime pair: (agent, teardown).
All fluent methods return self for chaining. build() is the only async public method.
2. Public API Coverage Checklist
This document covers every public constructor/method/property on AgentBuilder.
2.1 Constructor and Core Fluent Methods
| Method | Included | Purpose |
|---|---|---|
| __init__(name, agent_id, personality) | Yes | Required identity and personality fields |
| llm(config) | Yes | Primary reasoning LLM |
| script_gen_llm(config) | Yes | Optional script-generation LLM |
| embeddings(...) | Yes | Automatic tokenizer/ONNX setup |
| memory(...) | Yes | Qdrant-backed long-term memory settings |
| storage(config) | Yes | Message store backend selection |
| sandbox(...) | Yes | Register sandbox specs |
| flows(flows) | Yes | Register flow definitions |
| domain_terms(terms) | Yes | Seed semantic memory terms |
| settings(**kwargs) | Yes | Override AgentSettings fields |
| version(version) | Yes | Startup display version label |
| on_intermediate(callback) | Yes | Sandbox event callback |
| logging(config) | Yes | Structured agent logging |
| security(config) | Yes | SecurityGuard setup |
| server(config) | Yes | HTTP server setup |
| agent_server (property) | Yes | Access running AgentServer after build |
| build() | Yes | Full runtime assembly |
2.2 Escape Hatch Injection Methods
| Method | Included | Purpose |
|---|---|---|
| with_llm(llm) | Yes | Inject pre-built primary LLM |
| with_script_gen_llm(llm) | Yes | Inject pre-built script-gen LLM |
| with_message_store(store) | Yes | Inject pre-built message store |
| with_wm_store(store) | Yes | Inject pre-built working-memory store |
| redis_wm(redis_client, ttl_seconds=3600) | Yes | Create Redis working-memory store shortcut |
| with_registry(registry) | Yes | Inject tool registry |
| with_enhancer(enhancer) | Yes | Inject message enhancer |
| with_summarizer(summarizer) | Yes | Inject summarizer |
| with_flow_selector(selector) | Yes | Inject flow selector |
| with_episodic_memory(memory) | Yes | Inject episodic memory |
| with_semantic_memory(memory) | Yes | Inject semantic memory |
| with_pool(pool) | Yes | Inject sandbox pool |
3. Constructor and Fluent Methods
3.1 __init__(name, agent_id, personality)
Creates a builder with required identity fields.
Immediate validation:
- name cannot be empty/whitespace,
- agent_id cannot be empty/whitespace,
- personality cannot be empty/whitespace.
Failure raises ConfigError immediately.
3.2 llm(config: LLMConfig)
Sets the primary LLM used for:
- core reasoning,
- flow selection,
- message enhancement,
- summarization.
If neither .llm(...) nor .with_llm(...) is provided, build() fails validation.
3.3 script_gen_llm(config: LLMConfig)
Optional dedicated LLM for script generation in tool orchestration.
If omitted, script generation falls back to the primary LLM.
3.4 embeddings(...)
Configures automatic embedding-model preparation. During build(), AgentBuilder can:
- download tokenizer,
- export ONNX model,
- quantize ONNX (INT8) when requested,
- patch memory config paths automatically.
Parameters:
- model_id default "BAAI/bge-small-en-v1.5"
- output_dir default empty (resolved to ~/.jazzmine/models/<slug>/)
- tokenizer_only default False
- quantized default True
- opset default 17
- force_rebuild default False
Notes:
- tokenizer_only=True skips ONNX export.
- If .embeddings() is called without .memory(), builder logs a warning but does not fail.
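The empty-default output_dir resolves under ~/.jazzmine/models/<slug>/. A minimal sketch of that resolution, assuming a slug rule (lowercase, "/" replaced with "--") that is illustrative only; the layout itself comes from the defaults above:

```python
from pathlib import Path


def default_embeddings_dir(model_id: str) -> Path:
    """Resolve a default output_dir for .embeddings() when none is given.

    The slug transform (lowercase, "/" -> "--") is an illustrative
    assumption; only the ~/.jazzmine/models/<slug>/ layout is documented.
    """
    slug = model_id.lower().replace("/", "--")
    return Path.home() / ".jazzmine" / "models" / slug


print(default_embeddings_dir("BAAI/bge-small-en-v1.5").name)
# baai--bge-small-en-v1.5
```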
3.5 memory(...)
Enables Qdrant-backed long-term memory (episodic/procedural/semantic).
Important behavior:
- If .embeddings() is used, tokenizer_path and (for local mode) model_dir are auto-resolved.
- Without .embeddings(), you must provide the embedding backend manually.
Backend modes:
- automatic: call .embeddings() before .memory().
- manual local: set model_dir.
- manual remote: set embed_api_key (and optionally provider/base URL/model name).
Qdrant automation:
- if qdrant_auto_start=True, qdrant_api_key is None, and URL host is local (localhost, 127.0.0.1, 0.0.0.0, ::1), builder can auto-start a local Docker Qdrant container.
3.6 storage(config: StorageConfig)
Sets message storage backend using one of:
- JsonStorage,
- PostgresStorage,
- MongoDBStorage.
Default is JsonStorage(). If the JSON storage path is empty, the builder creates a temporary file and registers its cleanup in teardown.
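The temp-file-plus-cleanup behavior can be sketched as follows (a simplified stand-in, not the builder's actual implementation; the helper name is hypothetical):

```python
import os
import tempfile
from typing import Callable


def make_temp_json_store_path(teardown_callbacks: list[Callable[[], object]]) -> str:
    """Create a temp file for JSON storage and register its cleanup."""
    fd, path = tempfile.mkstemp(prefix="agent-store-", suffix=".json")
    os.close(fd)
    # Cleanup runs later, during teardown.
    teardown_callbacks.append(lambda: os.path.exists(path) and os.unlink(path))
    return path


callbacks: list = []
p = make_temp_json_store_path(callbacks)
print(os.path.exists(p))  # True
for cb in callbacks:
    cb()
print(os.path.exists(p))  # False
```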
3.7 sandbox(...)
Registers one sandbox spec per call.
Use this to define:
- runtime/python version,
- resource limits,
- networking allowlists,
- file mounts,
- package list,
- secrets,
- execution mode and pool sizing.
Call multiple times for multiple named sandboxes.
3.8 flows(flows: list[Any])
Registers flow definitions to be synchronized during build when memory is enabled.
Important: flow registry is weak-reference-backed, so keep a strong reference to your flow list in caller scope.
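Why the strong reference matters can be demonstrated with a weak-reference-backed set (weakref.WeakSet here is illustrative; the actual registry type is not specified):

```python
import gc
import weakref


class Flow:
    def __init__(self, name: str) -> None:
        self.name = name


# A weak-reference-backed registry, as the docs describe for flows.
registry: "weakref.WeakSet[Flow]" = weakref.WeakSet()


def register(flows: list[Flow]) -> None:
    for f in flows:
        registry.add(f)


register([Flow("triage"), Flow("billing")])  # no strong reference kept!
gc.collect()
print(len(registry))  # 0 -- the flows were garbage-collected

kept = [Flow("triage"), Flow("billing")]     # strong reference in caller scope
register(kept)
print(len(registry))  # 2
```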
3.9 domain_terms(terms: list[DomainTerm])
Defines semantic glossary entries.
Each entry must be a 5-tuple: (key, value, category, aliases, description).
Entries are diff-synced to semantic memory during build when memory is active.
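A sketch of the 5-tuple shape check (the helper name is hypothetical; the tuple layout is the one documented above):

```python
def validate_domain_terms(terms: list) -> list[str]:
    """Collect errors for entries that are not (key, value, category, aliases, description)."""
    errors = []
    for i, term in enumerate(terms):
        if not isinstance(term, (tuple, list)) or len(term) != 5:
            errors.append(f"domain term #{i} must be a 5-tuple, got {term!r}")
    return errors


good = ("SLA", "service level agreement", "ops", ["service-level"], "Uptime contract term")
bad = ("SLA", "service level agreement")  # only 2 elements -> rejected
print(validate_domain_terms([good, bad]))
```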
3.10 settings(**kwargs)
Overrides AgentSettings fields by name.
Unknown key behavior:
- raises ConfigError with list of valid settings fields.
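The unknown-key rejection can be sketched against a stand-in dataclass (the field names are borrowed from section 5.5, but the defaults and the AgentSettings shape here are assumptions):

```python
from dataclasses import dataclass, fields


class ConfigError(Exception):
    pass


@dataclass
class AgentSettings:  # illustrative stand-in, not the real class
    default_execution_mode: str = "plan"
    enhancer_history_window: int = 8
    summarizer_trigger: int = 12


def apply_settings(settings: AgentSettings, **kwargs) -> None:
    valid = {f.name for f in fields(settings)}
    for key, value in kwargs.items():
        if key not in valid:
            # Reject unknown keys with the list of valid fields.
            raise ConfigError(f"unknown setting {key!r}; valid fields: {sorted(valid)}")
        setattr(settings, key, value)


s = AgentSettings()
apply_settings(s, summarizer_trigger=20)
print(s.summarizer_trigger)  # 20
```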
3.11 version(version: str)
Sets optional version string used by startup display.
3.12 on_intermediate(callback: Callable)
Registers callback for sandbox intermediate events.
3.13 logging(config: dict)
Enables structured runtime logging via AgentLogger.from_config(...).
If configured, logger startup/shutdown is integrated into build/teardown.
3.14 security(config: SecurityConfig)
Enables SecurityGuard with input/output moderation and optional file sanitizer.
Notable rules:
- input_moderator and toxicity_detector are mutually exclusive.
- if .security() is not called, runtime uses NOOP_GUARD.
3.15 server(config: ServerConfig)
Attaches HTTP server lifecycle to the built agent.
Behavior:
- server starts at end of build(),
- server stop callback is registered in teardown,
- effective server instance is available via builder.agent_server.
Default endpoint set (from ServerConfig):
- POST {conversations_endpoint} create conversation
- GET {conversations_endpoint} list conversations for user
- GET {conversations_endpoint}/search search conversations
- GET {conversations_endpoint}/{conversation_id}/messages list messages
- PATCH {conversations_endpoint}/{conversation_id} update metadata
- DELETE {conversations_endpoint}/{conversation_id} delete conversation
- POST {chat_endpoint} full response
- POST {chat_endpoint}/stream SSE response stream
- GET {health_endpoint} health probe
- GET {info_endpoint} agent metadata
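The full route set can be derived mechanically from the four configurable prefixes; the default prefix strings below are assumptions used only for illustration, and the stream route is always {chat_endpoint}/stream:

```python
def endpoint_table(conversations: str = "/conversations",
                   chat: str = "/chat",
                   health: str = "/health",
                   info: str = "/info") -> list[tuple[str, str]]:
    """Expand the configured prefixes into the full (method, path) route set."""
    return [
        ("POST",   conversations),
        ("GET",    conversations),
        ("GET",    f"{conversations}/search"),
        ("GET",    f"{conversations}/{{conversation_id}}/messages"),
        ("PATCH",  f"{conversations}/{{conversation_id}}"),
        ("DELETE", f"{conversations}/{{conversation_id}}"),
        ("POST",   chat),
        ("POST",   f"{chat}/stream"),  # auto-derived; user endpoints must not collide
        ("GET",    health),
        ("GET",    info),
    ]


for method, path in endpoint_table():
    print(method, path)
```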
3.16 agent_server property
Returns AgentServer | None.
Typical use:
- after build(), inspect builder.agent_server.effective_port when using port=0 in tests.
4. Escape Hatches and Injection Strategy
Escape hatches let you bypass default construction and inject pre-built components.
Use cases:
- unit/integration tests,
- custom implementations,
- shared connection pools/registries,
- deterministic runtime wiring.
Coverage:
- with_llm, with_script_gen_llm
- with_message_store, with_wm_store, redis_wm
- with_registry, with_pool
- with_enhancer, with_summarizer, with_flow_selector
- with_episodic_memory, with_semantic_memory
5. Validation Model (_validate)
Validation executes synchronously at the top of build() before opening runtime resources.
It accumulates all failures and raises one ConfigError containing bulleted details.
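The accumulate-then-raise pattern can be sketched as (a simplified illustration; the real _validate runs many grouped checks, not a flat list):

```python
class ConfigError(Exception):
    pass


def validate_all(checks: list[tuple[bool, str]]) -> None:
    """Accumulate every failing check and raise one bulleted ConfigError."""
    failures = [msg for ok, msg in checks if not ok]
    if failures:
        raise ConfigError("invalid configuration:\n" + "\n".join(f"- {m}" for m in failures))


try:
    validate_all([
        (False, "either .llm(...) or .with_llm(...) is required"),
        (True,  "vector_size >= 1"),
        (False, "max_batch must be >= 2"),
    ])
except ConfigError as e:
    print(e)  # both failures listed, not just the first
```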
5.1 LLM Validation
- requires either .llm(...) or .with_llm(...).
- validates provider config objects through _validate_llm_cfg.
5.2 Embedding and Memory Cross-Validation
Checks include:
- embeddings.model_id non-empty,
- embeddings.opset >= 11,
- output dir is not an existing file,
- tokenizer/backend consistency rules,
- embedding mode conflicts (auto/local/remote).
Memory checks include:
- qdrant_url non-empty,
- vector_size >= 1,
- quantization in {None, 4, 8},
- valid distance_metric in {cosine, euclidean, dot},
- positive shard/replication/index flush limits,
- max_batch >= 2,
- remote provider allowlist validation when API mode is used.
5.3 Storage Validation
- Postgres: non-empty DSN, pool_min >= 1, pool_max >= pool_min.
- MongoDB: non-empty URI and DB name.
5.4 Sandbox Validation
Per sandbox spec:
- non-empty unique name,
- resource lower bounds (memory, pids, timeout, output cap, CPU, pool, scratch),
- swap rule (-1 or > memory_mb),
- port ranges (1..65535),
- non-empty file mount paths,
- no blank pip package entries,
- execution mode in {plan, interactive}.
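A few of these checks sketched as a helper (parameter names such as swap_mb are assumptions; the rules themselves are the ones listed above):

```python
def sandbox_errors(memory_mb: int, swap_mb: int, ports: list[int], mode: str) -> list[str]:
    """Sketch of some per-spec checks from section 5.4, not the real implementation."""
    errors = []
    # Swap must be disabled (-1) or strictly larger than the memory limit.
    if swap_mb != -1 and swap_mb <= memory_mb:
        errors.append("swap_mb must be -1 or greater than memory_mb")
    for p in ports:
        if not 1 <= p <= 65535:
            errors.append(f"port {p} outside 1..65535")
    if mode not in {"plan", "interactive"}:
        errors.append(f"invalid execution mode {mode!r}")
    return errors


print(sandbox_errors(memory_mb=512, swap_mb=256, ports=[80, 70000], mode="batch"))
```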
5.5 Settings Validation
Checks include:
- default_execution_mode in {plan, interactive},
- positive turn/retry limits,
- enhancer_history_window >= 2,
- summarizer_trigger >= 1,
- episode_overlap < max_episode_size,
- pool_max_overflow >= 0,
- positive flow retrieval limits.
5.6 Domain Terms Validation
Each term must be a tuple/list of length 5.
5.7 Security Validation
Checks include:
- input_moderator XOR toxicity_detector,
- required interface methods (classify, predict, sanitize) when objects are provided,
- threshold ranges in [0.0, 1.0],
- positive moderation timeout,
- non-empty block messages.
5.8 Server Validation (_validate_server_cfg)
Checks include:
- port in 1..65535, non-empty host,
- all endpoint paths start with /,
- endpoint uniqueness,
- no collision with the auto-derived stream path {chat_endpoint}/stream,
- positive request limits,
- SSL cert/key pair rules and file existence,
- valid log level,
- valid backlog/keep-alive/concurrency settings,
- CORS credential + wildcard-origin conflict prevention,
- non-empty CORS origin list,
- non-blank API key string if set.
6. Build Lifecycle (await build())
Execution order is deterministic.
- run _validate().
- initialize startup display and register teardown banner callbacks.
- if configured, run embedding setup before memory construction.
- construct primary/script LLMs (or use injected instances).
- prepare tool registry and compute registry change report.
- create sandbox pool if sandboxes/tools are present and no pool injected.
- initialize message store (with temp JSON file support).
- initialize working memory store (default in-process unless injected).
- initialize long-term memory (_build_memory) with optional Qdrant auto-start.
- sync domain terms and flows when long-term memory is active.
- construct flow selector, enhancer, summarizer.
- build AgentConfig and then the runtime agent.
- optionally start HTTP server and expose builder.agent_server.
- register teardown callbacks and return (agent, teardown).
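The teardown side of this lifecycle can be sketched as a callback stack; running cleanups in reverse registration order is an assumption here (the docs only say callbacks are registered during build):

```python
import asyncio

teardown_callbacks = []  # filled in build order


def register_teardown(cb) -> None:
    teardown_callbacks.append(cb)


async def teardown() -> None:
    # Assumed LIFO: the server registered last comes down first,
    # stores and banner callbacks last.
    for cb in reversed(teardown_callbacks):
        await cb()


stopped: list[str] = []
for name in ("message_store", "memory", "server"):
    async def stop(name=name):
        stopped.append(name)
    register_teardown(stop)

asyncio.run(teardown())
print(stopped)  # ['server', 'memory', 'message_store']
```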
7. Runtime/Dependency Details
7.1 Embedding Setup Internals
_build_embeddings():
- checks cache (tokenizer.json and ONNX where applicable),
- checks required Python packages (transformers, and for ONNX export also torch, onnxruntime-tools),
- executes tokenizer/ONNX export in thread executor,
- patches self._mem_cfg.tokenizer_path, and local model_dir/quantized when relevant.
7.2 Memory Runtime Internals
_build_memory():
- validates onnxruntime availability/version (requires >=1.23.0),
- attempts to resolve ORT shared library path from installed wheel,
- imports Rust extension module memory,
- creates QdrantManager, ensures collections, then builds:
- EpisodicMemory,
- ProceduralMemory,
- SemanticMemory.
If .memory() is not configured, builder returns no-op episodic/semantic memory stubs.
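The no-op stubs can be pictured like this minimal sketch (method names here are assumptions; the point is that memory calls succeed and return nothing):

```python
import asyncio


class NoopEpisodicMemory:
    """Illustrative no-op stub: accepts writes, returns empty reads."""

    async def store(self, *args, **kwargs) -> None:
        return None

    async def retrieve(self, *args, **kwargs) -> list:
        return []


stub = NoopEpisodicMemory()
print(asyncio.run(stub.retrieve("any query")))  # []
```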
8. Error Handling and Failure Modes
8.1 ConfigError
Thrown by constructor (required identity fields) and by _validate() aggregate failures.
8.2 Embedding Setup Failures
Embedding export errors are wrapped into clear ConfigError messages including:
- model,
- output dir,
- common causes (network, disk, missing deps, invalid model id).
8.3 Memory Runtime Import/Version Failures
Common explicit failures:
- missing onnxruntime,
- too-old onnxruntime,
- missing Rust extension module memory.
8.4 HTTP Server Startup Failures
Server startup failures are raised as RuntimeError with host/port context.
9. Practical Examples
9.1 Minimal Agent (No Long-Term Memory)
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder(
            name="Aria",
            agent_id="aria-local",
            personality="Helpful and concise assistant.",
        )
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
    )
    agent, teardown = await builder.build()
    try:
        result = await agent.chat(
            user_id="u1",
            conversation_id="c1",
            content="Hello",
        )
        print(result.response)
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.2 Local Embeddings + Auto Qdrant
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-mem", "Memory-enabled assistant")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .embeddings(
            model_id="BAAI/bge-small-en-v1.5",
            quantized=True,
        )
        .memory(
            qdrant_url="http://localhost:6334",
            qdrant_auto_start=True,
            vector_size=384,
        )
    )
    agent, teardown = await builder.build()
    try:
        await agent.chat(user_id="u1", conversation_id="c1", content="Remember this")
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.3 Remote Embeddings (Tokenizer-Only Local Prep)
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, OpenAILLMConfig


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-remote-embed", "Uses remote embedding API")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .embeddings(tokenizer_only=True)
        .memory(
            qdrant_url="http://localhost:6334",
            embed_api_key="sk-remote-embed",
            embed_provider="openai",
            vector_size=1536,
        )
    )
    agent, teardown = await builder.build()
    try:
        await agent.chat(user_id="u1", conversation_id="c1", content="Store this fact")
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.4 Server + Security + CORS
```python
import asyncio

from jazzmine.core.builder import (
    AgentBuilder,
    OpenAILLMConfig,
    SecurityConfig,
    ServerConfig,
    CORSConfig,
)


async def main() -> None:
    builder = (
        AgentBuilder("Aria", "aria-http", "Served assistant")
        .llm(OpenAILLMConfig(model="gpt-4o-mini", api_key="sk-..."))
        .security(SecurityConfig(fail_open=True))
        .server(
            ServerConfig(
                host="127.0.0.1",
                port=8000,
                api_key="super-secret-token",
                cors=CORSConfig(
                    origins=["https://app.example.com"],
                    allow_credentials=True,
                ),
            )
        )
    )
    agent, teardown = await builder.build()
    try:
        # Useful in tests, especially when using ServerConfig(port=0)
        print(builder.agent_server.effective_port)
        await asyncio.sleep(5)
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
9.5 Testing With Escape Hatches
```python
import asyncio

from jazzmine.core.builder import AgentBuilder, JsonStorage
from jazzmine.core.llm.types import LLMResponse, LLMUsage


class FakeLLM:
    model = "fake"

    def generate(self, messages, *, stop=None, **kwargs):
        return LLMResponse(text="ok", usage=LLMUsage(), model=self.model)

    def stream(self, messages, *, stop=None, **kwargs):
        yield "ok"

    async def agenerate(self, messages, *, stop=None, **kwargs):
        return LLMResponse(text="ok", usage=LLMUsage(), model=self.model)

    async def astream(self, messages, *, stop=None, **kwargs):
        yield "ok"

    def close(self):
        return None

    async def aclose(self):
        return None


class FakeStore:
    async def store_message(self, *args, **kwargs):
        return None

    async def flag_message(self, *args, **kwargs):
        return None

    async def close_conversation(self, *args, **kwargs):
        return None

    async def list_conversations(self, *args, **kwargs):
        return []

    async def list_messages(self, *args, **kwargs):
        return []

    async def close(self):
        return None


async def main() -> None:
    fake_llm = FakeLLM()
    fake_store = FakeStore()
    builder = (
        AgentBuilder("TestAgent", "test-agent", "Testing profile")
        .storage(JsonStorage(path="./test_store.json"))
        .with_llm(fake_llm)
        .with_message_store(fake_store)
    )
    agent, teardown = await builder.build()
    try:
        # Run focused assertions on wiring without exercising full chat pipeline.
        assert agent is not None
    finally:
        await teardown()


if __name__ == "__main__":
    asyncio.run(main())
```
10. Operational Guidance
- Always keep a strong reference to your flow instances/list after .flows(...).
- Prefer .embeddings() + .memory() together to avoid path mismatch errors.
- For remote embeddings, use .embeddings(tokenizer_only=True) plus embed_api_key in memory config.
- Call await teardown() in finally blocks to guarantee cleanup.
- Use escape hatches in tests to avoid external dependencies.