1. Introduction
To keep the core framework lightweight, the server's dependencies (fastapi, uvicorn, pydantic) are imported lazily. The server is optional and can be attached to any agent via the AgentBuilder.server() method.
2. Configuration Classes
The server is configured using two strictly typed dataclasses (defined in configs.py). These classes govern the network bindings, routing, security, and tuning of the HTTP server.
2.1 CORSConfig
Defines the Cross-Origin Resource Sharing (CORS) policy for the API.
| Parameter | Type | Default | Meaning / Description |
|---|---|---|---|
| origins | list[str] | ["*"] | List of allowed origins. Use ["*"] to permit any origin. Note: if allow_credentials is True, a wildcard is not allowed and every origin must be listed explicitly. |
| allow_credentials | bool | False | Set to True to allow cookies or Authorization headers cross-origin. |
| allow_methods | list[str] | ["*"] | HTTP methods permitted cross-origin (e.g., ["GET", "POST"]). |
| allow_headers | list[str] | ["*"] | Request headers permitted cross-origin. |
| expose_headers | list[str] | [] | Response headers the browser is allowed to access. |
| max_age | int | 600 | How long (in seconds) the browser may cache a CORS preflight response. |
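As a rough illustration of the table above, the sketch below mirrors these fields in a stand-alone dataclass. It is not the class from configs.py: the name CORSConfigSketch and the __post_init__ check are assumptions, added only to show the wildcard-versus-credentials rule in code.

```python
from dataclasses import dataclass, field


@dataclass
class CORSConfigSketch:
    """Hypothetical stand-in mirroring the CORSConfig table; not the real class."""

    origins: list[str] = field(default_factory=lambda: ["*"])
    allow_credentials: bool = False
    allow_methods: list[str] = field(default_factory=lambda: ["*"])
    allow_headers: list[str] = field(default_factory=lambda: ["*"])
    expose_headers: list[str] = field(default_factory=list)
    max_age: int = 600

    def __post_init__(self) -> None:
        # Assumed check: credentials cannot be combined with a wildcard origin.
        if self.allow_credentials and "*" in self.origins:
            raise ValueError("allow_credentials=True requires explicit origins")


# Credentialed requests need an explicit origin list.
cors = CORSConfigSketch(origins=["https://app.example.com"], allow_credentials=True)
```

Whether the real CORSConfig validates this combination at construction time or leaves it to the browser is not stated in this document.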
2.2 ServerConfig
The master configuration object passed to AgentBuilder.server().
Network & Binding
| Parameter | Type | Default | Meaning / Description |
|---|---|---|---|
| host | str | "0.0.0.0" | The bind address. "0.0.0.0" exposes the server on all network interfaces; "127.0.0.1" restricts it to localhost. |
| port | int | 1453 | The TCP port to listen on. Set to 0 to let the OS assign a random available port. |
Endpoint Routing
| Parameter | Type | Default | Meaning / Description |
|---|---|---|---|
| chat_endpoint | str | "/chat" | Base path for chat (creates POST /chat and POST /chat/stream). |
| conversations_endpoint | str | "/conversations" | Base path for all CRUD operations on conversation metadata. |
| health_endpoint | str | "/health" | Path for the liveness probe (returns 200 OK). |
| info_endpoint | str | "/info" | Path for retrieving agent metadata and capabilities. |
Authentication & Limits
| Parameter | Type | Default | Meaning / Description |
|---|---|---|---|
| api_key | `str \| None` | None | If provided, enforces Authorization: Bearer <api_key> on all endpoints except docs and root. |
| request_timeout | float | 120.0 | Maximum seconds to wait for the agent's reasoning loop to finish a turn before returning 504 Gateway Timeout. |
| max_message_length | int | 32768 | Maximum character length allowed for incoming user messages. |
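The bearer-token rule for api_key can be sketched as a small header check. The helper name is_authorized is hypothetical, and the constant-time comparison is an assumption about good practice rather than a documented behavior of the server.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], api_key: Optional[str]) -> bool:
    """Hypothetical helper: check an Authorization header against a configured key."""
    if api_key is None:
        # No key configured, so nothing is enforced.
        return True
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison (an assumption, not a documented behavior).
    return hmac.compare_digest(token.strip().encode(), api_key.encode())
```

For example, is_authorized("Bearer s3cret", "s3cret") passes, while a missing header or a different scheme such as Basic is rejected.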
API Metadata & OpenAPI
| Parameter | Type | Default | Meaning / Description |
|---|---|---|---|
| title | str | "Jazzmine Agent API" | The title displayed in the auto-generated Swagger UI. |
| description | str | "" | Markdown description displayed in the Swagger UI. |
| api_version | str | "1.0.0" | The semantic version string for your API. |
| docs_url | `str \| None` | "/docs" | The route for the Swagger UI. Set to None to disable it in production. |
| redoc_url | `str \| None` | None | The route for ReDoc documentation. Set to None to disable. |
TLS & Tuning
| Parameter | Type | Default | Meaning / Description |
|---|---|---|---|
| cors | CORSConfig | CORSConfig() | The CORS policy object. |
| ssl_certfile | `str \| None` | None | Path to the PEM certificate file for HTTPS. Requires ssl_keyfile. |
| ssl_keyfile | `str \| None` | None | Path to the PEM private-key file for HTTPS. |
| log_level | str | "info" | Uvicorn logging level ("critical", "error", "warning", "info", "debug", "trace"). |
| backlog | int | 2048 | Maximum number of connections in the socket accept queue. |
| limit_concurrency | `int \| None` | None | Maximum concurrent connections before uvicorn returns 503 Service Unavailable. |
| timeout_keep_alive | int | 5 | Seconds to hold an idle keep-alive HTTP connection open. |
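Several of the fields above share their names with keyword arguments accepted by uvicorn.run. The sketch below collects them into a plain dictionary and checks the TLS pairing rule from the table. The helper uvicorn_kwargs is hypothetical, and whether ServerConfig forwards these values to uvicorn unchanged is an assumption.

```python
from typing import Any, Optional


def uvicorn_kwargs(
    host: str = "0.0.0.0",
    port: int = 1453,
    ssl_certfile: Optional[str] = None,
    ssl_keyfile: Optional[str] = None,
    log_level: str = "info",
    backlog: int = 2048,
    limit_concurrency: Optional[int] = None,
    timeout_keep_alive: int = 5,
) -> dict[str, Any]:
    """Hypothetical helper: gather network and tuning values as uvicorn keyword arguments."""
    if ssl_certfile and not ssl_keyfile:
        # Per the table, ssl_certfile requires ssl_keyfile.
        raise ValueError("ssl_certfile requires ssl_keyfile")
    return {
        "host": host,
        "port": port,
        "ssl_certfile": ssl_certfile,
        "ssl_keyfile": ssl_keyfile,
        "log_level": log_level,
        "backlog": backlog,
        "limit_concurrency": limit_concurrency,
        "timeout_keep_alive": timeout_keep_alive,
    }


# A local test setup: loopback only, OS-assigned port, verbose logging.
kwargs = uvicorn_kwargs(host="127.0.0.1", port=0, log_level="debug")
```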
3. Data Models (Request & Response Schemas)
The server relies on strict Pydantic models for request validation and response serialization.
3.1 Chat Schemas
ChatRequest (Payload for POST /chat and POST /chat/stream)
| Field | Type | Default | Description |
|---|---|---|---|
| message | str | Required | The raw text input from the user. Its length must be greater than 0 and at most max_message_length characters. |
| conversation_id | str | "" | The session ID. If left blank, the server automatically generates a UUID. |
| user_id | str | "user" | Identity of the caller, used to scope memory isolation. |
| explicit_context | `list[str] \| None` | None | Optional RAG snippets injected directly into the agent's prompt for this turn. |
| metadata | dict[str, Any] | {} | Arbitrary key-value store for client-side correlation (ignored by the agent logic). |
ChatResponse (Returned by POST /chat)
Contains the full turn result: response (str), response_markdown (str), message_id (str), trace_id (str), conversation_id (str), user_id (str), invoked_flows (list[str]), invoked_tools (list[str]), errors (list[str]), turn_number (int), is_blocked (bool), and latency_ms (int).
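To make the ChatRequest shape concrete, here is a minimal client-side sketch that builds, but does not send, a POST /chat request using only the standard library. The helper name build_chat_request, the base URL, and the key value are illustrative assumptions. The path assumes the default chat_endpoint.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(
    base_url: str,
    message: str,
    api_key: Optional[str] = None,
    conversation_id: str = "",
    user_id: str = "user",
) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request carrying a ChatRequest body."""
    payload = {
        "message": message,
        "conversation_id": conversation_id,  # blank lets the server generate a UUID
        "user_id": user_id,
        "metadata": {},
    }
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_chat_request("http://127.0.0.1:1453", "Hello", api_key="s3cret")
# Sending it would look like: urllib.request.urlopen(req, timeout=130)
```

A client-side timeout slightly above request_timeout lets the server's own 504 response arrive before the client gives up.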
3.2 Conversation Management Schemas
ConversationCreateRequest
| Field | Type | Default | Description |
|---|---|---|---|
| conversation_id | str | "" | Fixed ID to use. Auto-generated if omitted. |
| user_id | str | "user" | Owner of the conversation. |
| title | str | "New conversation" | Human-readable label for the UI. |
ConversationUpdateRequest
| Field | Type | Default | Description |
|---|---|---|---|
| title | str | Required | The new human-readable label. Cannot be blank. |
4. HTTP Endpoints API
Below are the routes exposed by the server. Authentication (Authorization: Bearer <api_key>) is required on all routes except /health, /info, /docs, and /.
4.1 Chat Endpoints
- POST {chat_endpoint}: Sends a message and blocks until the entire turn is complete. Returns a ChatResponse JSON object.
- POST {chat_endpoint}/stream: Sends a message and returns a StreamingResponse (Server-Sent Events). See Section 5.
4.2 Conversation Endpoints
- POST {conversations_endpoint}: Creates a conversation shell upfront. Returns ConversationCreateResponse.
- GET {conversations_endpoint}: Lists conversations for a user.
  - Query Params: user_id (Required), limit (Default 100), offset (Default 0).
- GET {conversations_endpoint}/search: Case-insensitive substring search on titles and IDs.
  - Query Params: user_id (Required), query (Required), limit, offset.
- GET {conversations_endpoint}/{conversation_id}/messages: Retrieves the full chronological transcript of messages.
  - Query Params: limit (Default 200), offset (Default 0).
- PATCH {conversations_endpoint}/{conversation_id}: Updates metadata (e.g., title).
- DELETE {conversations_endpoint}/{conversation_id}: Cascade-deletes the conversation, messages, turn traces, and working memory.
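The query parameters listed above can be assembled with the standard library. The two helpers below are hypothetical and assume the default conversations_endpoint of "/conversations".

```python
from urllib.parse import quote, urlencode


def list_conversations_url(
    base_url: str, user_id: str, limit: int = 100, offset: int = 0
) -> str:
    """URL for listing a user's conversations (user_id is required)."""
    query = urlencode({"user_id": user_id, "limit": limit, "offset": offset})
    return base_url.rstrip("/") + "/conversations?" + query


def messages_url(
    base_url: str, conversation_id: str, limit: int = 200, offset: int = 0
) -> str:
    """URL for fetching the transcript of one conversation."""
    # Percent-encode the ID so characters such as "/" cannot alter the path.
    cid = quote(conversation_id, safe="")
    query = urlencode({"limit": limit, "offset": offset})
    return base_url.rstrip("/") + "/conversations/" + cid + "/messages?" + query


page_two = list_conversations_url("http://localhost:1453", "alice", limit=50, offset=50)
```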
4.3 Utility Endpoints
- GET {health_endpoint}: Returns {"status": "ok", "agent_name": "...", "uptime_s": ...}. Safe for load balancers.
- GET {info_endpoint}: Returns static metadata, versioning, capabilities (has_tools, has_memory), and a dictionary mapping of all registered HTTP routes.
- GET /: Auto-redirects to the Swagger UI (/docs) or /info if docs are disabled.
5. Server-Sent Events (SSE) Streaming
The POST {chat_endpoint}/stream route returns a text/event-stream response. It yields three distinct event types, allowing frontend UIs to render intermediate tool executions while waiting for the final response.
- event: intermediate: Emitted whenever the agent calls a tool or generates an intermediate thought.
  ```json
  data: {"type": "tool_call", "label": "Searching DB", "data": {...}}
  ```
- event: done: Emitted exactly once at the end of the turn. Contains the full ChatResponse payload.
  ```json
  data: {"response": "Here is the data...", "trace_id": "...", "is_blocked": false, ...}
  ```
- event: error: Emitted on timeouts or fatal agent crashes. The stream terminates immediately after this event.
  ```json
  data: {"error": "TimeoutError", "detail": "Agent did not respond within 120.0s."}
  ```
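On the client side, the stream can be consumed with a small line-based parser. The sketch below follows the general text/event-stream framing (fields separated by a colon, events terminated by a blank line) and assumes every data payload is a single JSON object, as in the examples above. The function name parse_sse and the sample lines are illustrative.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, decoded_data) pairs from text/event-stream lines."""
    event, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Drop the single optional space that follows the colon.
            data_parts.append(line[len("data:"):].removeprefix(" "))


sample = [
    "event: intermediate\n",
    'data: {"type": "tool_call", "label": "Searching DB", "data": {}}\n',
    "\n",
    "event: done\n",
    'data: {"response": "Here is the data...", "is_blocked": false}\n',
    "\n",
]
events = list(parse_sse(sample))
```

A UI would typically render intermediate events as progress indicators and stop reading once it sees done or error.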
6. The AgentServer Class (Lifecycle Management)
The AgentServer class wraps the FastAPI app and the Uvicorn server, running them inside an asyncio.Task so that serving requests does not block the caller's event loop.
Properties
- app: The underlying FastAPI application instance.
- is_running: Boolean indicating if the server is accepting requests.
- effective_port: Returns the actual bound TCP port. This is useful in automated testing, where ServerConfig(port=0) lets the OS assign a random free port.
Methods
- start(): Builds the app, starts Uvicorn, and uses a polling loop to ensure the socket is successfully bound before returning control to the caller.
- stop(): Flags Uvicorn to exit and gracefully awaits the background task. The AgentBuilder automatically registers this method with the Teardown sequence to ensure the server shuts down before database connections are dropped.
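The start/stop pattern described above (background task, polling until the socket is bound, reading the OS-assigned port) can be sketched with plain asyncio. This is a stand-in built on asyncio.start_server, not the AgentServer implementation. The class name and the placeholder connection handler are assumptions.

```python
import asyncio
from typing import Optional


class BackgroundServerSketch:
    """Hypothetical stand-in for the start/stop pattern; not AgentServer itself."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0) -> None:
        self._host, self._port = host, port
        self._server: Optional[asyncio.AbstractServer] = None
        self._task: Optional[asyncio.Task] = None

    async def _handle(self, reader, writer) -> None:
        writer.close()  # placeholder handler: close every connection immediately

    async def _serve(self) -> None:
        self._server = await asyncio.start_server(self._handle, self._host, self._port)
        async with self._server:
            await self._server.serve_forever()

    @property
    def is_running(self) -> bool:
        return self._server is not None and self._server.is_serving()

    @property
    def effective_port(self) -> Optional[int]:
        if self._server is None or not self._server.sockets:
            return None
        return self._server.sockets[0].getsockname()[1]

    async def start(self, timeout: float = 5.0) -> None:
        loop = asyncio.get_running_loop()
        self._task = asyncio.create_task(self._serve())
        deadline = loop.time() + timeout
        while not self.is_running:  # poll until the socket is bound
            if self._task.done():
                self._task.result()  # re-raises bind errors, if any
            if loop.time() > deadline:
                raise TimeoutError("server did not start in time")
            await asyncio.sleep(0.01)

    async def stop(self) -> None:
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None


async def demo() -> int:
    server = BackgroundServerSketch(port=0)  # 0 lets the OS pick a free port
    await server.start()
    port = server.effective_port
    await server.stop()
    return port
```

Running asyncio.run(demo()) returns whichever port the OS assigned, which is the same idea as reading effective_port in a test that uses ServerConfig(port=0).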
7. Error Handling
- 504 Gateway Timeout: If a call to agent.chat() runs longer than ServerConfig.request_timeout, the server catches the asyncio.TimeoutError and returns a 504 JSON response.
- 422 Unprocessable Entity: Malformed requests (e.g., a message exceeding max_message_length, or a request that omits a required field such as message or the user_id query parameter) are automatically rejected by Pydantic validation before touching the agent logic.
- 500 Internal Server Error: A global exception handler catches any unhandled exception (like a database disconnect), logs the full traceback, and returns a sanitized ErrorResponse to the client to prevent exposing stack traces to end-users.
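The timeout and catch-all behaviors above can be approximated with asyncio.wait_for and a broad exception handler. This is a minimal sketch under assumptions: the helper run_turn, the logger name, and the 500 body fields are hypothetical, while the 504 body reuses the error/detail shape shown for the SSE error event.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger("agent_server_sketch")


async def run_turn(
    chat: Callable[[], Awaitable[Any]], request_timeout: float = 120.0
) -> tuple[int, dict]:
    """Hypothetical helper: return (status_code, json_body) for one chat turn."""
    try:
        result = await asyncio.wait_for(chat(), timeout=request_timeout)
        return 200, {"response": result}
    except asyncio.TimeoutError:
        detail = f"Agent did not respond within {request_timeout}s."
        return 504, {"error": "TimeoutError", "detail": detail}
    except Exception:
        # Log the traceback server-side; send only a generic message to the client.
        logger.exception("Unhandled error during chat turn")
        return 500, {"error": "InternalServerError", "detail": "Internal server error."}


async def slow_agent() -> str:
    await asyncio.sleep(1.0)  # simulates a turn that overruns the timeout
    return "done"


async def broken_agent() -> str:
    raise RuntimeError("database disconnected")
```

Calling run_turn(slow_agent, request_timeout=0.05) yields a 504 result, while run_turn(broken_agent) yields a 500 whose body does not mention the underlying database error.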