
LLM Core: BaseLLM

BaseLLM is the Abstract Base Class (ABC) that defines the universal interface for all Large Language Model providers in the jazzmine framework. It acts as a formal contract, ensuring that regardless of whether the agent is using a cloud-based API (like OpenAI or Anthropic) or a local model binary, the interaction patterns for generation, streaming, and resource management remain identical.

1. Behavior and Context

In the jazzmine architecture, BaseLLM serves as the foundational layer for all "Brains" of the agent.

  • Provider Pattern: Any specific model implementation (e.g., GeminiLLM, LocalLLM) must inherit from this class and implement its abstract methods.
  • Synchronous & Asynchronous Symmetry: It defines both sync and async versions of generation and streaming to accommodate different execution environments.
  • Lifecycle Management: It implements both standard and asynchronous Context Manager protocols (with and async with), ensuring that network connections or sub-processes are strictly cleaned up.

2. Purpose

  • Abstraction: Decoupling the agent's reasoning logic from provider-specific SDKs.
  • Resource Reliability: Providing a standardized way to open and close HTTP connection pools or subprocess handles.
  • Consistency: Enforcing a common set of parameters (temperature, max_tokens, etc.) across all supported backends.
  • Safety: Establishing a health_check interface to verify provider availability before starting complex tasks.

3. High-Level API (Subclassing & Usage)

Developers typically use concrete implementations of BaseLLM, but understanding the base API is essential for creating custom providers or writing provider-agnostic agent code.

Example: Implementing a Custom Provider (Conceptual)

```python
import asyncio

from jazzmine.core.llm.base import BaseLLM
from jazzmine.core.llm.types import LLMResponse, MessagePart

class MyCustomModel(BaseLLM):
    async def agenerate(self, messages, stop=None, **kwargs):
        # Implementation logic here
        ...
        return LLMResponse(...)

    # ... must implement generate, stream, and astream ...

# Standard pattern to ensure connections are closed automatically
# (OpenAICompatibleLLM stands for any concrete provider; import omitted)
async def main():
    async with OpenAICompatibleLLM(api_key="...", model="gpt-4o") as llm:
        response = await llm.agenerate([
            MessagePart(role="user", content="Hello!")
        ])
        print(response.text)

asyncio.run(main())
```

4. Detailed Functionality

__init__(model, temperature=0.0, max_tokens=None, timeout=None, **kwargs)

Functionality: Initializes the base configuration for the LLM instance.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | Required | The model identifier (e.g., "gpt-4o" or "claude-3-opus"). |
| temperature | float | 0.0 | Controls randomness (0.0 is deterministic, 1.0 is creative). |
| max_tokens | Optional[int] | None | Hard limit on the number of tokens generated in the response. |
| timeout | Optional[float] | None | Maximum seconds to wait for a response before timing out. |
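A plausible shape for this constructor is sketched below. The documented parameters are from this reference; the exact attribute names used for storage (including the `extra` dict for pass-through kwargs) are assumptions for illustration.

```python
from typing import Optional

class BaseLLMSketch:
    """Illustrative constructor mirroring the documented signature."""

    def __init__(self, model: str, temperature: float = 0.0,
                 max_tokens: Optional[int] = None,
                 timeout: Optional[float] = None, **kwargs):
        self.model = model
        self.temperature = temperature
        self.max_tokens = max_tokens
        self.timeout = timeout
        # Provider-specific extras are kept for subclasses to consume.
        self.extra = kwargs

llm = BaseLLMSketch("gpt-4o", temperature=0.2, timeout=30.0, top_p=0.9)
assert llm.model == "gpt-4o" and llm.extra == {"top_p": 0.9}
```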

generate / agenerate [Abstract]

Functionality: Performs a single, complete interaction with the model.

Parameters:

  • messages (List[MessagePart]): The conversation history.
  • stop (Optional[Iterable[str]]): Sequences where the model should stop generating.
  • **kwargs: Provider-specific parameters (e.g., top_p, presence_penalty).

Returns: An LLMResponse object.
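The stop-sequence semantics can be illustrated with a toy provider. `EchoModel` and the `Response` dataclass are stand-ins invented for this sketch (a real subclass would return the framework's LLMResponse and call an actual backend).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass
class Response:
    """Stand-in for LLMResponse."""
    text: str

class EchoModel:
    """Toy provider: echoes the last user message, honoring stop sequences."""

    def generate(self, messages: List[dict],
                 stop: Optional[Iterable[str]] = None, **kwargs) -> Response:
        text = messages[-1]["content"]
        # Repeatedly truncate at each stop sequence; the result is cut at
        # the earliest occurrence of any stop string.
        for s in stop or []:
            idx = text.find(s)
            if idx != -1:
                text = text[:idx]
        return Response(text=text)

model = EchoModel()
resp = model.generate([{"role": "user", "content": "alpha###beta"}], stop=["###"])
assert resp.text == "alpha"
```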


stream / astream [Abstract]

Functionality: Initiates a streaming request where tokens are yielded as they are generated by the model.

Returns: An Iterator[str] (sync) or AsyncIterator[str] (async).
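The sync/async streaming contract can be sketched with generators. `ToyStreamer` is a stand-in that yields pre-split tokens; a real provider would yield chunks as they arrive over the wire.

```python
import asyncio
from typing import AsyncIterator, Iterator

class ToyStreamer:
    """Toy provider showing the sync/async streaming contract."""

    def __init__(self, tokens):
        self.tokens = list(tokens)

    def stream(self, **kwargs) -> Iterator[str]:
        # Yield each token as soon as it is "generated".
        yield from self.tokens

    async def astream(self, **kwargs) -> AsyncIterator[str]:
        for tok in self.tokens:
            await asyncio.sleep(0)  # cooperatively yield to the event loop
            yield tok

s = ToyStreamer(["Hel", "lo", "!"])
assert "".join(s.stream()) == "Hello!"

async def collect():
    return "".join([t async for t in s.astream()])

assert asyncio.run(collect()) == "Hello!"
```

Consumers iterate with a plain `for` loop over stream and `async for` over astream, which is why the return types are Iterator[str] and AsyncIterator[str] respectively.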


close() / aclose()

Functionality: Performs the physical cleanup of resources.

  • close: Closes synchronous httpx clients or file handles.
  • aclose: Closes asynchronous clients.

How it works: These methods check for the existence of client or aclient attributes and call their respective close methods if they support them.
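That duck-typed cleanup might look like the following sketch. The attribute names `client`/`aclient` come from the description above; everything else (including `FakeHTTPClient`) is invented for illustration.

```python
class ClientHolder:
    """Sketch of duck-typed cleanup over optional client attributes."""

    def __init__(self, client=None, aclient=None):
        self.client = client
        self.aclient = aclient

    def close(self):
        # Only act if a sync client exists and exposes close().
        client = getattr(self, "client", None)
        if client is not None and hasattr(client, "close"):
            client.close()

    async def aclose(self):
        # Only act if an async client exists and exposes aclose().
        aclient = getattr(self, "aclient", None)
        if aclient is not None and hasattr(aclient, "aclose"):
            await aclient.aclose()

class FakeHTTPClient:
    closed = False
    def close(self):
        self.closed = True

holder = ClientHolder(client=FakeHTTPClient())
holder.close()
assert holder.client.closed
```

Because the checks are defensive, calling close() on an instance that never created a client is a harmless no-op.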


health_check() -> bool

Functionality: Verifies that the model provider is configured correctly and reachable.

  • Default: Returns True. Concrete providers can override this to perform a "ping" or a low-cost metadata request.
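A subclass override might look like this sketch, where a stored flag stands in for a real network probe (the base class and the probe are simplified assumptions).

```python
class Base:
    def health_check(self) -> bool:
        # Permissive default: assume the provider is reachable.
        return True

class PingingProvider(Base):
    """Sketch: override health_check with a cheap reachability probe."""

    def __init__(self, reachable: bool):
        self._reachable = reachable  # stands in for a real ping/metadata call

    def health_check(self) -> bool:
        return self._reachable

assert Base().health_check() is True
assert PingingProvider(reachable=False).health_check() is False
```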

5. Error Handling

  • BaseLLM defines the interface but does not raise specific errors. Subclasses are required to raise the exceptions defined in errors.py (e.g., LLMTimeoutError).
  • Resource Safety: If a provider is used within a context manager (with or async with), close() or aclose() is guaranteed to be called even if an exception occurs during generation.
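This guarantee follows directly from Python's context-manager semantics; a minimal demonstration with a stand-in class:

```python
class Managed:
    closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()   # runs even while an exception is propagating
        return False   # do not swallow the exception

m = Managed()
try:
    with m:
        raise RuntimeError("generation failed mid-request")
except RuntimeError:
    pass
assert m.closed  # cleanup ran despite the exception
```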

6. Remarks

  • Default Temperature: jazzmine defaults to 0.0 temperature across all models to prioritize reasoning stability and consistency for task-performing agents.
  • Keyword Arguments: The use of **kwargs in all methods allows the framework to remain future-proof; if a new model parameter is introduced by a provider, it can be passed through without modifying the base class.
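The pass-through behavior can be sketched as follows; the payload shape and the `build_payload` helper are assumptions for illustration, showing how unknown parameters reach the provider without any base-class change.

```python
class PayloadProvider:
    """Sketch of **kwargs pass-through into a provider request payload."""

    def __init__(self, model: str, **kwargs):
        self.model = model
        self.defaults = kwargs  # constructor-level extras become defaults

    def build_payload(self, messages, **kwargs):
        payload = {"model": self.model, "messages": messages}
        payload.update(self.defaults)
        payload.update(kwargs)  # per-call extras override defaults
        return payload

p = PayloadProvider("gpt-4o", top_p=0.9)
payload = p.build_payload([{"role": "user", "content": "hi"}],
                          presence_penalty=0.5)
assert payload["top_p"] == 0.9 and payload["presence_penalty"] == 0.5
```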