LLM Core: types
1. Behavior and Context
In the jazzmine architecture, these types act as the "Universal Language" of the LLM layer:
- Structured Communication: Every LLM response, regardless of whether it comes from OpenAI, Anthropic, or a local binary, is converted into an LLMResponse.
- Observability: The LLMUsage class provides a consistent way to track tokens and financial costs, which is eventually persisted in the TurnTrace.
2. Purpose
- Standardization: Mapping disparate provider outputs to a single internal format.
- Cost Tracking: Enabling per-turn and per-conversation financial auditing.
- Role Management: Defining the three-role system (system, user, assistant) used for prompt construction.
3. Class Breakdown
LLMUsage (Dataclass)
Tracks the resource consumption of a single completion request.
| Attribute | Type | Description |
|---|---|---|
| prompt_tokens | int | Number of tokens in the input prompt. |
| completion_tokens | int | Number of tokens generated by the model. |
| total_tokens | int | Sum of prompt and completion tokens. |
| cost | Optional[float] | The monetary cost of the request (if provided by the API). |
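Based on the table above, the dataclass can be sketched as follows. This is a local illustration, not the jazzmine source: the field defaults and the `aggregate_usage` helper are assumptions, shown here to illustrate the per-conversation cost auditing mentioned earlier.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LLMUsage:
    # Fields as documented in the table above; zero defaults are an assumption.
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    cost: Optional[float] = None

def aggregate_usage(usages: List[LLMUsage]) -> LLMUsage:
    # Hypothetical helper: sums per-turn usage into a conversation-level total.
    total = LLMUsage()
    for u in usages:
        total.prompt_tokens += u.prompt_tokens
        total.completion_tokens += u.completion_tokens
        total.total_tokens += u.total_tokens
        if u.cost is not None:
            total.cost = (total.cost or 0.0) + u.cost
    return total

turns = [
    LLMUsage(prompt_tokens=20, completion_tokens=10, total_tokens=30, cost=0.0005),
    LLMUsage(prompt_tokens=35, completion_tokens=15, total_tokens=50, cost=0.0008),
]
conversation_total = aggregate_usage(turns)
```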
LLMResponse (Dataclass)
The unified container for model output.
| Attribute | Type | Description |
|---|---|---|
| text | str | The primary text content generated by the model. |
| usage | LLMUsage | Token and cost metadata. |
| model | str | The specific model identifier that processed the request. |
| latency_ms | Optional[int] | Time elapsed for the API call in milliseconds. |
| finish_reason | Optional[str] | Why the model stopped (e.g., "stop", "length"). |
| raw | Optional[Any] | The original, unmodified response dictionary from the provider. |
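As a sketch of how a provider adapter might populate these fields from a raw OpenAI-style completion dict: the dataclasses are redefined locally so the snippet runs on its own, and both the `from_openai_dict` helper and the exact shape of the raw dict are assumptions, not part of the documented jazzmine API.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class LLMUsage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cost: Optional[float] = None

@dataclass
class LLMResponse:
    text: str
    usage: LLMUsage
    model: str
    latency_ms: Optional[int] = None
    finish_reason: Optional[str] = None
    raw: Optional[Any] = None

def from_openai_dict(raw: dict, latency_ms: int) -> LLMResponse:
    # Hypothetical adapter: maps an OpenAI-style chat completion dict
    # into the unified LLMResponse container.
    choice = raw["choices"][0]
    u = raw.get("usage", {})
    return LLMResponse(
        text=choice["message"]["content"],
        usage=LLMUsage(
            prompt_tokens=u.get("prompt_tokens", 0),
            completion_tokens=u.get("completion_tokens", 0),
            total_tokens=u.get("total_tokens", 0),
        ),
        model=raw.get("model", "unknown"),
        latency_ms=latency_ms,
        finish_reason=choice.get("finish_reason"),
        raw=raw,  # keep the original payload for debugging
    )

raw = {
    "model": "gpt-4o",
    "choices": [{"message": {"content": "Hi there!"}, "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 4, "total_tokens": 16},
}
resp = from_openai_dict(raw, latency_ms=450)
```

Keeping the original payload in `raw` means no provider-specific information is lost even though downstream code only touches the unified fields.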
MessagePart (Dataclass)
Represents a single turn in a conversation history.
| Attribute | Type | Description |
|---|---|---|
| role | str | The participant role: "system", "user", or "assistant". |
| content | str | The actual text of the message. |
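Most provider SDKs expect plain role/content dicts, so a thin conversion from MessagePart is typical. A minimal sketch (MessagePart is redefined locally, and `to_wire` is a hypothetical helper name, not a documented jazzmine function):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MessagePart:
    role: str      # "system", "user", or "assistant"
    content: str

def to_wire(messages: List[MessagePart]) -> List[dict]:
    # Hypothetical helper: converts MessagePart objects into the
    # role/content dicts most chat-completion APIs accept.
    return [{"role": m.role, "content": m.content} for m in messages]

wire = to_wire([
    MessagePart(role="system", content="You are a helpful assistant."),
    MessagePart(role="user", content="Hello!"),
])
```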
4. Usage Example
```python
from jazzmine.core.llm.types import MessagePart, LLMUsage, LLMResponse

# 1. Defining a conversation fragment
messages = [
    MessagePart(role="system", content="You are a helpful assistant."),
    MessagePart(role="user", content="Hello!"),
]

# 2. Reconstructing a response (usually handled by the Provider)
response = LLMResponse(
    text="Hello! How can I help you today?",
    usage=LLMUsage(prompt_tokens=20, completion_tokens=10, total_tokens=30),
    model="gpt-4o",
    latency_ms=450,
)

print(f"[{response.model}] {response.text}")
```
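A common follow-up check is whether the model stopped because it hit its token limit rather than finishing naturally. A minimal sketch using a local stand-in object (the `finish_reason` values are those listed in the table above):

```python
class _Resp:
    # Stand-in for an LLMResponse whose generation was cut off.
    finish_reason = "length"
    text = "Hello! How can I"

response = _Resp()

# "length" means the model exhausted its token budget, so the text is truncated;
# "stop" would indicate a natural end of generation.
truncated = response.finish_reason == "length"
```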