1. Behavior and Context
In the jazzmine architecture, BaseLLM serves as the foundational layer for all "Brains" of the agent.
- Provider Pattern: Any specific model implementation (e.g., GeminiLLM, LocalLLM) must inherit from this class and implement its abstract methods.
- Synchronous & Asynchronous Symmetry: It defines both sync and async versions of generation and streaming to accommodate different execution environments.
- Lifecycle Management: It implements both standard and asynchronous Context Manager protocols (with and async with), ensuring that network connections or sub-processes are strictly cleaned up.
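The dual lifecycle protocol described above can be sketched as follows. This is a minimal illustrative stand-in, not jazzmine's actual implementation; the `closed` flag exists only for demonstration:

```python
import asyncio

class LifecycleSketch:
    """Illustrative stand-in showing the dual context-manager pattern."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True  # release sync resources (e.g., an HTTP client)

    async def aclose(self):
        self.closed = True  # release async resources

    # Synchronous protocol: `with LifecycleSketch() as llm: ...`
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    # Asynchronous protocol: `async with LifecycleSketch() as llm: ...`
    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.aclose()
```

Supporting both protocols lets the same provider class be used from plain scripts (`with`) and from event-loop code (`async with`) without wrapper shims.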
2. Purpose
- Abstraction: Decoupling the agent's reasoning logic from provider-specific SDKs.
- Resource Reliability: Providing a standardized way to open and close HTTP connection pools or subprocess handles.
- Consistency: Enforcing a common set of parameters (temperature, max_tokens, etc.) across all supported backends.
- Safety: Establishing a health_check interface to verify provider availability before starting complex tasks.
3. High-Level API (Subclassing & Usage)
Developers typically use concrete implementations of BaseLLM, but understanding the base API is essential for creating custom providers or writing provider-agnostic agent code.
Example: Implementing a Custom Provider (Conceptual)
```python
from jazzmine.core.llm.base import BaseLLM
from jazzmine.core.llm.types import LLMResponse, MessagePart

class MyCustomModel(BaseLLM):
    async def agenerate(self, messages, stop=None, **kwargs):
        # Implementation logic here
        ...
        return LLMResponse(...)

    # ... must implement generate, stream, and astream ...
```
```python
# Standard pattern to ensure connections are closed automatically
async with OpenAICompatibleLLM(api_key="...", model="gpt-4o") as llm:
    response = await llm.agenerate([
        MessagePart(role="user", content="Hello!")
    ])
    print(response.text)
```
4. Detailed Functionality
__init__(model, temperature=0.0, max_tokens=None, timeout=None, **kwargs)
Functionality: Initializes the base configuration for the LLM instance.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | Required | The model identifier (e.g., "gpt-4o" or "claude-3-opus"). |
| temperature | float | 0.0 | Controls sampling randomness (0.0 is near-deterministic; higher values produce more varied output). |
| max_tokens | Optional[int] | None | Hard limit on the number of tokens generated in the response. |
| timeout | Optional[float] | None | Maximum seconds to wait for a response before timing out. |
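A minimal sketch of what this initializer might store, mirroring the parameter table above. The class name and attribute names are illustrative assumptions, not jazzmine's real internals:

```python
from typing import Optional

class BaseLLMSketch:
    """Illustrative base initializer; mirrors the parameter table above."""

    def __init__(self, model: str, temperature: float = 0.0,
                 max_tokens: Optional[int] = None,
                 timeout: Optional[float] = None, **kwargs):
        self.model = model              # required model identifier
        self.temperature = temperature  # sampling randomness
        self.max_tokens = max_tokens    # hard output-length cap
        self.timeout = timeout          # per-request deadline in seconds
        self.extra = kwargs             # provider-specific pass-through options
```

Capturing `**kwargs` at construction time is what lets provider-specific options flow through without base-class changes.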
generate / agenerate [Abstract]
Functionality: Performs a single, complete interaction with the model.
Parameters:
- messages (List[MessagePart]): The conversation history.
- stop (Optional[Iterable[str]]): Sequences where the model should stop generating.
- **kwargs: Provider-specific parameters (e.g., top_p, presence_penalty).
Returns: An LLMResponse object.
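A conceptual subclass might honor stop sequences by truncating the provider's raw output. Everything below is a toy sketch: the echo backend, the simplified LLMResponse dataclass, and the dict-shaped messages stand in for jazzmine's real types:

```python
import asyncio
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass
class LLMResponse:  # simplified stand-in for jazzmine's response type
    text: str

class EchoModel:
    """Toy backend that echoes the last user message, with stop handling."""

    async def agenerate(self, messages: List[dict],
                        stop: Optional[Iterable[str]] = None,
                        **kwargs) -> LLMResponse:
        raw = messages[-1]["content"]  # pretend this came from a provider
        if stop:
            for seq in stop:           # truncate at the earliest stop sequence
                idx = raw.find(seq)
                if idx != -1:
                    raw = raw[:idx]
        return LLMResponse(text=raw)
```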
stream / astream [Abstract]
Functionality: Initiates a streaming request, yielding tokens incrementally as the model produces them.
Returns: An Iterator[str] (sync) or AsyncIterator[str] (async).
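The two streaming shapes can be sketched with plain generators. The token source here is hard-coded purely for illustration:

```python
import asyncio
from typing import AsyncIterator, Iterator

class StreamSketch:
    """Toy provider that yields pre-chunked tokens."""

    TOKENS = ["Hel", "lo", "!"]

    def stream(self) -> Iterator[str]:
        # Sync variant: a plain generator the caller consumes with `for`
        yield from self.TOKENS

    async def astream(self) -> AsyncIterator[str]:
        # Async variant: consumed with `async for`
        for tok in self.TOKENS:
            yield tok
```

In real providers the generator body would read chunks from an HTTP/SSE response instead of a fixed list, but the calling contract is the same.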
close() / aclose()
Functionality: Releases the underlying resources held by the instance.
- close: Closes synchronous httpx clients or file handles.
- aclose: Closes asynchronous clients.
How it works: These methods check for the existence of client or aclient attributes and call their respective close methods if they support them.
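The duck-typed cleanup described above might look like this. The attribute names client and aclient come from the text; the rest of the class is a sketch:

```python
import asyncio

class CleanupSketch:
    """Closes whichever client attributes exist and support closing."""

    def __init__(self, client=None, aclient=None):
        self.client = client
        self.aclient = aclient

    def close(self):
        # Only call close() if the sync client exists and supports it
        if self.client is not None and hasattr(self.client, "close"):
            self.client.close()

    async def aclose(self):
        # Async clients (e.g., httpx.AsyncClient) expose an awaitable aclose()
        if self.aclient is not None and hasattr(self.aclient, "aclose"):
            await self.aclient.aclose()
```

The hasattr guards make cleanup a safe no-op for providers that never opened a connection, so subclasses without network state need not override these methods.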
health_check() -> bool
Functionality: Verifies that the model provider is configured correctly and reachable.
- Default: Returns True. Concrete providers can override this to perform a "ping" or a low-cost metadata request.
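An override might issue a low-cost request and degrade to False on any failure. The injected ping callable below is an assumption standing in for a real metadata request:

```python
class HealthSketch:
    """Base default: assume healthy."""

    def health_check(self) -> bool:
        return True

class PingingSketch(HealthSketch):
    """Override that probes the backend before reporting healthy."""

    def __init__(self, ping):
        self._ping = ping  # injected callable standing in for a metadata call

    def health_check(self) -> bool:
        try:
            self._ping()       # e.g., list models or fetch account metadata
            return True
        except Exception:
            return False       # provider unreachable or misconfigured
```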
5. Error Handling
- BaseLLM defines the interface but does not raise specific errors. Subclasses are required to raise the exceptions defined in errors.py (e.g., LLMTimeoutError).
- Resource Safety: If a provider is used within a context manager (with or async with), close() or aclose() is guaranteed to be called even if an exception occurs during generation.
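The resource-safety guarantee follows directly from the context-manager protocol: __exit__ runs whether or not the body raised. A self-contained sketch, using a local stand-in for the LLMTimeoutError named in the text:

```python
class LLMTimeoutError(Exception):
    """Local stand-in for the exception type referenced in errors.py."""

class FailingLLM:
    """Toy provider whose generation always times out."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.closed = True  # cleanup runs regardless of the exception

    def generate(self, messages):
        raise LLMTimeoutError("provider did not respond in time")

llm = FailingLLM()
try:
    with llm:
        llm.generate([])
except LLMTimeoutError:
    pass
assert llm.closed  # close ran despite the failure
```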
6. Remarks
- Default Temperature: jazzmine defaults to 0.0 temperature across all models to prioritize reasoning stability and consistency for task-performing agents.
- Keyword Arguments: The use of **kwargs in all methods allows the framework to remain future-proof; if a new model parameter is introduced by a provider, it can be passed through without modifying the base class.