
LLM Core: utils

The utils module provides the helper logic required to handle token accounting and data normalization across different LLM providers. In a multi-provider environment, not every API returns token usage in the same format—or at all. This module ensures that the framework has a consistent baseline for tracking consumption and costs.

1. Behavior and Context

In the jazzmine architecture, the utils module acts as a "Data Sanitizer" for the LLM providers.

  • Fallback Estimation: When using local models (via LocalLLM) or providers that omit usage metadata, the module provides a heuristic estimator.
  • Standardization: It maps various provider-specific dictionaries into the unified LLMUsage dataclass defined in the types module.

2. Purpose

  • Usage Consistency: To provide a single source of truth for creating LLMUsage objects.
  • Predictability: To ensure that even if an API call fails to return metadata, the system can still estimate the "weight" of the turn for context-window management.
  • Abstraction: To keep provider-specific parsing logic out of the core agent loop.

3. High-Level API

The utility functions are used internally by classes like OpenAICompatibleLLM and GeminiLLM, but they can be used independently for pre-computation.

Example: Estimating costs before a call

```python
from jazzmine.core.llm.utils import estimate_tokens, normalize_usage

prompt = "Translate the following text to French: 'Hello world'"
# Get a quick estimate of tokens
tokens = estimate_tokens(prompt)
print(f"Estimated prompt tokens: {tokens}")

# Create a usage object manually
usage = normalize_usage(prompt=prompt, completion="Bonjour le monde", provider_usage=None)
print(f"Total Turn Tokens: {usage.total_tokens}")
```

4. Detailed Functionality

estimate_tokens(text: str) -> int

Functionality: Performs a conservative heuristic estimation of the number of tokens in a string.

Parameters:

  • text (str): The raw string to measure.

How it works: It uses a common industry "rule of thumb" where approximately 4 characters equate to 1 token (based on Byte-Pair Encoding averages). It ensures a minimum of 1 token is returned for non-empty strings.
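A minimal sketch of this heuristic, assuming integer division with a floor of 1 for non-empty input (the behavior for an empty string is not specified above; returning 0 is an assumption here):

```python
def estimate_tokens(text: str) -> int:
    """Heuristic token estimate: roughly 4 characters per token (BPE average)."""
    if not text:
        return 0  # assumed behavior for empty input
    # Integer-divide by 4, but never report less than 1 token for non-empty text.
    return max(1, len(text) // 4)
```

Because the ratio is an average, short strings are systematically overcounted relative to their length, which keeps the estimate conservative for context-window budgeting.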


normalize_usage(prompt: str, completion: str, provider_usage: Optional[dict]) -> LLMUsage

Functionality: Constructs a standardized LLMUsage object from either raw strings or provider-provided metadata.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| prompt | str | The original input text sent to the model. |
| completion | str | The resulting text generated by the model. |
| provider_usage | Optional[dict] | The raw usage dictionary returned by the API (e.g., OpenAI's usage field). |
How it works:

  1. Check Provider Data: If provider_usage is present, it attempts to extract prompt_tokens, completion_tokens, total_tokens, and cost directly from the dict keys.
  2. Fallback to Estimation: If no provider data is available, it calls estimate_tokens on both the prompt and completion strings to fill in the metrics.
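The two-branch logic above can be sketched as follows. The LLMUsage field names are assumed from the keys listed in step 1 (the real dataclass lives in the types module), and the inline estimate_tokens stand-in mirrors the heuristic described earlier:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LLMUsage:  # stand-in for the dataclass defined in the types module
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cost: float = 0.0

def estimate_tokens(text: str) -> int:
    # ~4 characters per token, minimum 1 for non-empty strings
    return max(1, len(text) // 4) if text else 0

def normalize_usage(prompt: str, completion: str,
                    provider_usage: Optional[dict] = None) -> LLMUsage:
    if provider_usage is not None:
        # Step 1: trust provider data; .get(..., 0) tolerates missing keys.
        return LLMUsage(
            prompt_tokens=provider_usage.get("prompt_tokens", 0),
            completion_tokens=provider_usage.get("completion_tokens", 0),
            total_tokens=provider_usage.get("total_tokens", 0),
            cost=provider_usage.get("cost", 0),
        )
    # Step 2: no provider data, so estimate both sides of the turn.
    pt = estimate_tokens(prompt)
    ct = estimate_tokens(completion)
    return LLMUsage(prompt_tokens=pt, completion_tokens=ct, total_tokens=pt + ct)
```

Keeping both branches behind one function means callers never have to branch on whether a provider reported usage.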

5. Error Handling

  • Missing Dictionary Keys: normalize_usage uses .get(..., 0) when reading from provider dictionaries. This prevents KeyError if a specific provider uses slightly different naming conventions or omits a field (like cost).
  • Type Safety: The functions assume the input text is a string. Passing None or other types will result in a standard Python TypeError.

6. Remarks

  • Accuracy Warning: estimate_tokens is a heuristic, not an exact count. Every model (Llama, GPT, Claude) uses a different tokenizer. For precise billing or strict context-limit enforcement, always rely on the provider_usage data if available.
  • BPE Approximation: The estimation logic is specifically tuned to approximate GPT-style tokenization, which is the most common standard for the providers supported by jazzmine.