Agent Builder: server (FastAPI / Uvicorn Integration)

The server.py module provides a fully integrated, production-ready HTTP REST and Server-Sent Events (SSE) server for jazzmine agents. Built on top of FastAPI and Uvicorn, it automatically exposes your agent's capabilities—including conversational chat, real-time streaming, conversation metadata management, and health probes—without requiring you to write any boilerplate API routing code.

1. Introduction

To keep the core framework lightweight, the dependencies (fastapi, uvicorn, pydantic) are imported lazily. The server is completely optional but easily attached to any agent via the AgentBuilder.server() method.
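The lazy-import pattern described above can be sketched with the standard library alone. This is a minimal illustration, not jazzmine's actual code: the `require` helper and its error message are hypothetical.

```python
import importlib
from types import ModuleType


def require(module_name: str, feature: str = "server") -> ModuleType:
    """Import a module on first use and fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Hypothetical wording; the real framework may phrase this differently.
        raise ImportError(
            f"'{module_name}' is needed for the optional {feature} feature; "
            f"install it before calling AgentBuilder.server()."
        ) from exc


# Nothing is imported at module load time; the cost is paid on first call.
json_module = require("json")
print(json_module.dumps({"lazy": True}))
```

With this shape, a project that never attaches a server never pays the import cost of fastapi, uvicorn, or pydantic.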


2. Configuration Classes

The server is configured using two strictly typed dataclasses (defined in configs.py). These classes govern the network bindings, routing, security, and tuning of the HTTP server.

2.1 CORSConfig

Defines the Cross-Origin Resource Sharing (CORS) policy for the API.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| origins | list[str] | ["*"] | List of allowed origins. Use ["*"] to permit any origin. Note: if allow_credentials is True, you cannot use a wildcard and must list exact URLs. |
| allow_credentials | bool | False | Set to True to allow cookies or Authorization headers cross-origin. |
| allow_methods | list[str] | ["*"] | HTTP methods permitted cross-origin (e.g., ["GET", "POST"]). |
| allow_headers | list[str] | ["*"] | Request headers permitted cross-origin. |
| expose_headers | list[str] | [] | Response headers the browser is allowed to access. |
| max_age | int | 600 | How long (in seconds) the browser may cache a CORS preflight response. |
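The wildcard restriction in the origins row is easy to trip over, so here is a small sketch of a pre-flight check. `CORSConfigSketch` is a stand-in that copies the documented fields and defaults; it is not the real jazzmine class, and `check_cors` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class CORSConfigSketch:
    """Stand-in mirroring the documented fields; not the real jazzmine class."""
    origins: list[str] = field(default_factory=lambda: ["*"])
    allow_credentials: bool = False
    allow_methods: list[str] = field(default_factory=lambda: ["*"])
    allow_headers: list[str] = field(default_factory=lambda: ["*"])
    expose_headers: list[str] = field(default_factory=list)
    max_age: int = 600


def check_cors(cfg: CORSConfigSketch) -> None:
    """Reject the wildcard-plus-credentials combination described above."""
    if cfg.allow_credentials and "*" in cfg.origins:
        raise ValueError("allow_credentials=True requires explicit origins, not '*'")


ok = CORSConfigSketch(origins=["https://app.example.com"], allow_credentials=True)
check_cors(ok)  # passes silently
```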

2.2 ServerConfig

The master configuration object passed to AgentBuilder.server().

Network & Binding

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| host | str | "0.0.0.0" | The bind address. "0.0.0.0" exposes the server to the network; "127.0.0.1" restricts it to localhost. |
| port | int | 1453 | The TCP port to listen on. Set to 0 to let the OS assign a random available port. |

Endpoint Routing

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| chat_endpoint | str | "/chat" | Base path for chat (creates POST /chat and POST /chat/stream). |
| conversations_endpoint | str | "/conversations" | Base path for all CRUD operations on conversation metadata. |
| health_endpoint | str | "/health" | Path for the liveness probe (returns 200 OK). |
| info_endpoint | str | "/info" | Path for retrieving agent metadata and capabilities. |

Authentication & Limits

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str \| None | None | If provided, enforces Authorization: Bearer <api_key> on all endpoints except the public routes listed in Section 4 (health, info, docs, and root). |
| request_timeout | float | 120.0 | Maximum seconds to wait for the agent's reasoning loop to finish a turn before returning 504 Gateway Timeout. |
| max_message_length | int | 32768 | Maximum character length allowed for incoming user messages. |

API Metadata & OpenAPI

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| title | str | "Jazzmine Agent API" | The title displayed in the auto-generated Swagger UI. |
| description | str | "" | Markdown description displayed in the Swagger UI. |
| api_version | str | "1.0.0" | The semantic version string for your API. |
| docs_url | str \| None | "/docs" | The route for the Swagger UI. Set to None to disable it in production. |
| redoc_url | str \| None | None | The route for ReDoc documentation. None (the default) disables it. |

TLS & Tuning

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| cors | CORSConfig | CORSConfig() | The CORS policy object. |
| ssl_certfile | str \| None | None | Path to the PEM certificate file for HTTPS. Requires ssl_keyfile. |
| ssl_keyfile | str \| None | None | Path to the PEM private-key file for HTTPS. |
| log_level | str | "info" | Uvicorn logging level ("critical", "error", "warning", "info", "debug", "trace"). |
| backlog | int | 2048 | Maximum number of connections in the socket accept queue. |
| limit_concurrency | int \| None | None | Maximum concurrent connections before Uvicorn returns 503 Service Unavailable. |
| timeout_keep_alive | int | 5 | Seconds to hold an idle keep-alive HTTP connection open. |
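Several of these fields share their names with Uvicorn's own settings (host, port, log_level, backlog, limit_concurrency, timeout_keep_alive, ssl_certfile, ssl_keyfile). The sketch below shows one plausible way such values could be merged and checked before being handed to Uvicorn. How jazzmine actually forwards them is an assumption, and `to_uvicorn_kwargs` is a hypothetical helper.

```python
from typing import Any

# Documented defaults from the tables above, held in a plain dict for illustration.
SERVER_DEFAULTS: dict[str, Any] = {
    "host": "0.0.0.0",
    "port": 1453,
    "log_level": "info",
    "backlog": 2048,
    "limit_concurrency": None,
    "timeout_keep_alive": 5,
    "ssl_certfile": None,
    "ssl_keyfile": None,
}


def to_uvicorn_kwargs(overrides: dict[str, Any]) -> dict[str, Any]:
    """Merge overrides onto the defaults and drop unset (None) options."""
    merged = {**SERVER_DEFAULTS, **overrides}
    # The table states that ssl_certfile requires ssl_keyfile.
    if merged["ssl_certfile"] and not merged["ssl_keyfile"]:
        raise ValueError("ssl_certfile requires ssl_keyfile")
    return {key: value for key, value in merged.items() if value is not None}


# Local-only binding on an OS-assigned port, as is typical in tests.
kwargs = to_uvicorn_kwargs({"host": "127.0.0.1", "port": 0})
print(kwargs)
```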

3. Data Models (Request & Response Schemas)

The server relies on strict Pydantic models for request validation and response serialization.

3.1 Chat Schemas

ChatRequest (Payload for POST /chat and POST /chat/stream)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| message | str | Required | The raw text input from the user (length must be > 0 and <= max_message_length characters). |
| conversation_id | str | "" | The session ID. If left blank, the server automatically generates a UUID. |
| user_id | str | "user" | Identity of the caller, used to scope memory isolation. |
| explicit_context | list[str] \| None | None | Optional RAG snippets injected directly into the agent's prompt for this turn. |
| metadata | dict[str, Any] | {} | Arbitrary key-value store for client-side correlation (ignored by the agent logic). |

ChatResponse (Returned by POST /chat)

Contains the full turn result: response (str), response_markdown (str), message_id (str), trace_id (str), conversation_id (str), user_id (str), invoked_flows (list[str]), invoked_tools (list[str]), errors (list[str]), turn_number (int), is_blocked (bool), and latency_ms (int).
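On the client side, a ChatRequest body can be assembled and pre-checked before it is sent. The following is a minimal sketch that mirrors the documented fields and the default length limit; `build_chat_payload` is a hypothetical helper, and the server's Pydantic validation remains the authority.

```python
import json
from typing import Any, Optional

MAX_MESSAGE_LENGTH = 32768  # documented default of ServerConfig.max_message_length


def build_chat_payload(
    message: str,
    conversation_id: str = "",
    user_id: str = "user",
    explicit_context: Optional[list[str]] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> str:
    """Serialize a ChatRequest body, mirroring the documented length rule."""
    if not 0 < len(message) <= MAX_MESSAGE_LENGTH:
        raise ValueError(f"message must contain 1 to {MAX_MESSAGE_LENGTH} characters")
    body = {
        "message": message,
        "conversation_id": conversation_id,  # blank lets the server pick a UUID
        "user_id": user_id,
        "explicit_context": explicit_context,
        "metadata": metadata or {},
    }
    return json.dumps(body)


print(build_chat_payload("What is my order status?", user_id="alice"))
```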

3.2 Conversation Management Schemas

ConversationCreateRequest

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| conversation_id | str | "" | Fixed ID to use. Auto-generated if omitted. |
| user_id | str | "user" | Owner of the conversation. |
| title | str | "New conversation" | Human-readable label for the UI. |

ConversationUpdateRequest

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| title | str | Required | The new human-readable label. Cannot be blank. |

4. HTTP Endpoints API

Below are the routes exposed by the server. When ServerConfig.api_key is set, authentication (Authorization: Bearer <api_key>) is required on all routes except /health, /info, /docs, and /.
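The authentication rule can be expressed as a small predicate. This is a sketch of the rule as stated, not the server's actual middleware: `is_authorized` is hypothetical, and the exempt set assumes the default endpoint paths.

```python
import hmac
from typing import Optional

# Exempt paths assume the default endpoint settings from ServerConfig.
PUBLIC_PATHS = {"/health", "/info", "/docs", "/"}


def is_authorized(path: str, authorization: Optional[str], api_key: Optional[str]) -> bool:
    """Return True when the request may proceed."""
    if api_key is None or path in PUBLIC_PATHS:
        return True  # auth disabled, or the route is public
    if authorization is None:
        return False
    scheme, _, token = authorization.partition(" ")
    # compare_digest avoids leaking the key through timing differences.
    return scheme.lower() == "bearer" and hmac.compare_digest(
        token.encode("utf-8"), api_key.encode("utf-8")
    )


print(is_authorized("/chat", "Bearer s3cret", "s3cret"))
```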

4.1 Chat Endpoints

  • POST {chat_endpoint}: Sends a message and blocks until the entire turn is complete. Returns a ChatResponse JSON object.
  • POST {chat_endpoint}/stream: Sends a message and returns a StreamingResponse (Server-Sent Events). See Section 5.
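A client call to either chat route can be prepared with the standard library. The sketch below only builds the request; the base URL, the API key, and the `make_chat_request` helper are placeholders, and the paths assume the default chat_endpoint.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:1453"  # assumed local server on the default port
API_KEY = "replace-me"              # placeholder; only needed if api_key is set


def make_chat_request(message: str, stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat or /chat/stream."""
    path = "/chat/stream" if stream else "/chat"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps({"message": message, "user_id": "alice"}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = make_chat_request("Hello")
print(request.get_method(), request.full_url)
# To send it against a running server:
# with urllib.request.urlopen(request, timeout=130) as resp:
#     print(json.load(resp)["response"])
```

The client-side timeout in the commented call is set slightly above the default request_timeout of 120 seconds so that the server's 504 arrives before the client gives up.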

4.2 Conversation Endpoints

  • POST {conversations_endpoint}: Creates a conversation shell upfront. Returns ConversationCreateResponse.
  • GET {conversations_endpoint}: Lists conversations for a user.
      • Query Params: user_id (Required), limit (Default 100), offset (Default 0).
  • GET {conversations_endpoint}/search: Case-insensitive substring search on titles and IDs.
      • Query Params: user_id (Required), query (Required), limit, offset.
  • GET {conversations_endpoint}/{conversation_id}/messages: Retrieves the full chronological transcript of messages.
      • Query Params: limit (Default 200), offset (Default 0).
  • PATCH {conversations_endpoint}/{conversation_id}: Updates metadata (e.g., title).
  • DELETE {conversations_endpoint}/{conversation_id}: Cascade-deletes the conversation, messages, turn traces, and working memory.
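The query parameters listed above translate into URLs as follows. This sketch assumes the default conversations_endpoint; the three helper functions are hypothetical.

```python
from urllib.parse import quote, urlencode

BASE = "/conversations"  # default conversations_endpoint


def list_url(user_id: str, limit: int = 100, offset: int = 0) -> str:
    query = urlencode({"user_id": user_id, "limit": limit, "offset": offset})
    return f"{BASE}?{query}"


def search_url(user_id: str, query_text: str) -> str:
    query = urlencode({"user_id": user_id, "query": query_text})
    return f"{BASE}/search?{query}"


def messages_url(conversation_id: str, limit: int = 200, offset: int = 0) -> str:
    # quote() guards against IDs containing characters that are unsafe in a path.
    safe_id = quote(conversation_id, safe="")
    query = urlencode({"limit": limit, "offset": offset})
    return f"{BASE}/{safe_id}/messages?{query}"


print(list_url("alice"))
print(search_url("alice", "order status"))
print(messages_url("abc-123"))
```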

4.3 Utility Endpoints

  • GET {health_endpoint}: Returns {"status": "ok", "agent_name": "...", "uptime_s": ...}. Safe for load balancers.
  • GET {info_endpoint}: Returns static metadata, versioning, capabilities (has_tools, has_memory), and a dictionary mapping of all registered HTTP routes.
  • GET /: Auto-redirects to the Swagger UI (/docs) or /info if docs are disabled.

5. Server-Sent Events (SSE) Streaming

The POST {chat_endpoint}/stream route returns a text/event-stream response. It yields three distinct event types, allowing frontend UIs to render intermediate tool executions while waiting for the final response.

  1. event: intermediate: Emitted whenever the agent calls a tool or generates an intermediate thought.

```json
data: {"type": "tool_call", "label": "Searching DB", "data": {...}}
```

  2. event: done: Emitted exactly once at the end of the turn. Contains the full ChatResponse payload.

```json
data: {"response": "Here is the data...", "trace_id": "...", "is_blocked": false, ...}
```

  3. event: error: Emitted on timeouts or fatal agent crashes. The stream terminates immediately after this event.

```json
data: {"error": "TimeoutError", "detail": "Agent did not respond within 120.0s."}
```
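A client consuming this stream needs to pair each event name with its data line. The following is a minimal parser sketch using only the standard library. It assumes each event carries JSON in its data field and that a blank line ends the event, as in standard SSE framing; the sample lines are illustrative, not captured server output.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event, payload) pairs from text/event-stream lines."""
    event, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            # A blank line terminates one event.
            yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []


sample = [
    'event: intermediate\n',
    'data: {"type": "tool_call", "label": "Searching DB", "data": {}}\n',
    '\n',
    'event: done\n',
    'data: {"response": "Here is the data", "is_blocked": false}\n',
    '\n',
]
for name, payload in parse_sse(sample):
    print(name, payload)
```

A UI would typically render intermediate events as progress indicators, replace them with the done payload, and stop reading after either done or error.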


6. The AgentServer Class (Lifecycle Management)

The AgentServer class wraps the FastAPI app and the Uvicorn server and runs them inside an asyncio.Task, so starting the server does not block the calling coroutine or the rest of the event loop.

Properties

  • app: The underlying FastAPI application instance.
  • is_running: Boolean indicating if the server is accepting requests.
  • effective_port: Returns the actual bound TCP port. This is useful in automated testing when ServerConfig(port=0) is used to let the OS assign a random free port.

Methods

  • start(): Builds the app, starts Uvicorn, and uses a polling loop to ensure the socket is successfully bound before returning control to the caller.
  • stop(): Flags Uvicorn to exit and gracefully awaits the background task. The AgentBuilder automatically registers this method with the Teardown sequence to ensure the server shuts down before database connections are dropped.
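The start/poll/stop lifecycle can be illustrated without Uvicorn. The sketch below uses asyncio.start_server as a stand-in for the real server, so `BackgroundServer` is hypothetical and only mirrors the documented property and method names.

```python
import asyncio
from typing import Optional


class BackgroundServer:
    """Stand-in for the lifecycle pattern; uses asyncio.start_server, not Uvicorn."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0) -> None:
        self.host, self.port = host, port
        self.effective_port: Optional[int] = None
        self._task: Optional[asyncio.Task] = None
        self._stop: Optional[asyncio.Event] = None

    @property
    def is_running(self) -> bool:
        return self._task is not None and not self._task.done()

    async def _serve(self) -> None:
        async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
            writer.close()  # no protocol here; the sketch only needs a bound socket

        server = await asyncio.start_server(handle, self.host, self.port)
        self.effective_port = server.sockets[0].getsockname()[1]
        async with server:
            await self._stop.wait()

    async def start(self) -> None:
        self._stop = asyncio.Event()
        self._task = asyncio.create_task(self._serve())
        # Poll until the socket is bound, so callers can connect immediately.
        while self.effective_port is None:
            if self._task.done():
                self._task.result()  # re-raise a bind failure
            await asyncio.sleep(0.01)

    async def stop(self) -> None:
        self._stop.set()
        if self._task is not None:
            await self._task


async def main() -> int:
    server = BackgroundServer(port=0)  # 0 lets the OS choose a free port
    await server.start()
    port = server.effective_port
    await server.stop()
    return port


print(asyncio.run(main()))
```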

7. Error Handling

  • 504 Gateway Timeout: If the agent.chat() internal execution exceeds ServerConfig.request_timeout, the server catches the asyncio.TimeoutError and returns a clean 504 JSON response.
  • 422 Unprocessable Entity: Malformed requests (e.g., a message exceeding max_message_length, a missing message field, or a missing user_id query parameter on GET {conversations_endpoint}) are automatically rejected by validation before touching the agent logic.
  • 500 Internal Server Error: A global exception handler catches any unhandled exception (like a database disconnect), logs the full traceback, and returns a sanitized ErrorResponse to the client to prevent exposing stack traces to end-users.
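The timeout and catch-all behavior above can be sketched with asyncio.wait_for. This illustrates the pattern only: `run_turn` is a hypothetical helper, and the response bodies are assumptions modeled on the SSE error example rather than the server's documented ErrorResponse schema.

```python
import asyncio
from typing import Any, Awaitable


async def run_turn(turn: Awaitable[Any], request_timeout: float) -> tuple[int, dict]:
    """Return (status_code, body) for one agent turn."""
    try:
        result = await asyncio.wait_for(turn, timeout=request_timeout)
        return 200, {"response": result}
    except asyncio.TimeoutError:
        return 504, {
            "error": "TimeoutError",
            "detail": f"Agent did not respond within {request_timeout}s.",
        }
    except Exception:
        # Sanitized body: the traceback would go to the server log, not the client.
        return 500, {"error": "InternalServerError", "detail": "Unexpected error."}


async def slow_agent() -> str:
    await asyncio.sleep(1.0)  # hypothetical agent that is too slow
    return "done"


print(asyncio.run(run_turn(slow_agent(), request_timeout=0.05)))
```

Keeping the exception message out of the 500 body is the point of the last branch: details such as a database hostname stay in the logs.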