
LLM Core: BaseLLM

BaseLLM is the Abstract Base Class (ABC) that defines the universal interface for all Large Language Model providers in the jazzmine framework. It acts as a formal contract, ensuring that regardless of whether the agent is using a cloud-based API (like OpenAI or Anthropic) or a local model binary, the interaction patterns for generation, streaming, and resource management remain identical.

1. Behavior and Context

In the jazzmine architecture, BaseLLM serves as the foundational layer for all "Brains" of the agent.

  • Provider Pattern: Any specific model implementation (e.g., GeminiLLM, LocalLLM) must inherit from this class and implement its abstract methods.
  • Synchronous & Asynchronous Symmetry: It defines both sync and async versions of generation and streaming to accommodate different execution environments.
  • Lifecycle Management: It implements both standard and asynchronous Context Manager protocols (with and async with), ensuring that network connections or sub-processes are strictly cleaned up.

2. Purpose

  • Abstraction: Decoupling the agent's reasoning logic from provider-specific SDKs.
  • Resource Reliability: Providing a standardized way to open and close HTTP connection pools or subprocess handles.
  • Consistency: Enforcing a common set of parameters (temperature, max_tokens, etc.) across all supported backends.
  • Safety: Establishing a health_check interface to verify provider availability before starting complex tasks.

3. High-Level API (Subclassing & Usage)

Developers typically use concrete implementations of BaseLLM, but understanding the base API is essential for creating custom providers or writing provider-agnostic agent code.

Example: Implementing a Custom Provider (Conceptual)

```python
import asyncio

from jazzmine.core.llm.base import BaseLLM
from jazzmine.core.llm.types import LLMResponse, MessagePart

class MyCustomModel(BaseLLM):
    async def agenerate(self, messages, stop=None, **kwargs):
        # Implementation logic here
        ...
        return LLMResponse(...)

    # ... must implement generate, stream, and astream ...

# Standard pattern to ensure connections are closed automatically
# (OpenAICompatibleLLM stands for any concrete provider; import omitted)
async def main():
    async with OpenAICompatibleLLM(api_key="...", model="gpt-4o") as llm:
        response = await llm.agenerate([
            MessagePart(role="user", content="Hello!")
        ])
        print(response.text)

asyncio.run(main())
```

4. Detailed Functionality

__init__(model, temperature=0.0, max_tokens=None, timeout=None, **kwargs)

Functionality: Initializes the base configuration for the LLM instance.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | Required | The model identifier (e.g., "gpt-4o" or "claude-3-opus"). |
| temperature | float | 0.0 | Controls randomness (0.0 is deterministic, 1.0 is creative). |
| max_tokens | Optional[int] | None | Hard limit on the number of tokens generated in the response. |
| timeout | Optional[float] | None | Maximum seconds to wait for a response before timing out. |
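A plausible shape for this constructor is sketched below. The documented parameters are from this reference; the exact attribute names used for storage (including the `extra` dict for pass-through kwargs) are assumptions for illustration.

```python
from typing import Optional

class BaseLLMSketch:
    """Illustrative constructor mirroring the documented signature."""

    def __init__(self, model: str, temperature: float = 0.0,
                 max_tokens: Optional[int] = None,
                 timeout: Optional[float] = None, **kwargs):
        self.model = model
        self.temperature = temperature
        self.max_tokens = max_tokens
        self.timeout = timeout
        # Provider-specific extras are kept for subclasses to consume.
        self.extra = kwargs

llm = BaseLLMSketch("gpt-4o", temperature=0.2, timeout=30.0, top_p=0.9)
assert llm.model == "gpt-4o" and llm.extra == {"top_p": 0.9}
```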

generate / agenerate [Abstract]

Functionality: Performs a single, complete interaction with the model.

Parameters:

  • messages (List[MessagePart]): The conversation history.
  • stop (Optional[Iterable[str]]): Sequences where the model should stop generating.
  • **kwargs: Provider-specific parameters (e.g., top_p, presence_penalty).

Returns: An LLMResponse object.
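The stop-sequence semantics can be illustrated with a toy provider. `EchoModel` and the `Response` dataclass are stand-ins invented for this sketch (a real subclass would return the framework's LLMResponse and call an actual backend).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass
class Response:
    """Stand-in for LLMResponse."""
    text: str

class EchoModel:
    """Toy provider: echoes the last user message, honoring stop sequences."""

    def generate(self, messages: List[dict],
                 stop: Optional[Iterable[str]] = None, **kwargs) -> Response:
        text = messages[-1]["content"]
        # Repeatedly truncate at each stop sequence; the result is cut at
        # the earliest occurrence of any stop string.
        for s in stop or []:
            idx = text.find(s)
            if idx != -1:
                text = text[:idx]
        return Response(text=text)

model = EchoModel()
resp = model.generate([{"role": "user", "content": "alpha###beta"}], stop=["###"])
assert resp.text == "alpha"
```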


stream / astream [Abstract]

Functionality: Initiates a streaming request where tokens are yielded as they are generated by the model.

Returns: An Iterator[str] (sync) or AsyncIterator[str] (async).
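The sync/async streaming contract can be sketched with generators. `ToyStreamer` is a stand-in that yields pre-split tokens; a real provider would yield chunks as they arrive over the wire.

```python
import asyncio
from typing import AsyncIterator, Iterator

class ToyStreamer:
    """Toy provider showing the sync/async streaming contract."""

    def __init__(self, tokens):
        self.tokens = list(tokens)

    def stream(self, **kwargs) -> Iterator[str]:
        # Yield each token as soon as it is "generated".
        yield from self.tokens

    async def astream(self, **kwargs) -> AsyncIterator[str]:
        for tok in self.tokens:
            await asyncio.sleep(0)  # cooperatively yield to the event loop
            yield tok

s = ToyStreamer(["Hel", "lo", "!"])
assert "".join(s.stream()) == "Hello!"

async def collect():
    return "".join([t async for t in s.astream()])

assert asyncio.run(collect()) == "Hello!"
```

Consumers iterate with a plain `for` loop over stream and `async for` over astream, which is why the return types are Iterator[str] and AsyncIterator[str] respectively.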


close() / aclose()

Functionality: Performs the physical cleanup of resources.

  • close: Closes synchronous httpx clients or file handles.
  • aclose: Closes asynchronous clients.

How it works: These methods check for the existence of client or aclient attributes and call their respective close methods if they support them.
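That duck-typed cleanup might look like the following sketch. The attribute names `client`/`aclient` come from the description above; everything else (including `FakeHTTPClient`) is invented for illustration.

```python
class ClientHolder:
    """Sketch of duck-typed cleanup over optional client attributes."""

    def __init__(self, client=None, aclient=None):
        self.client = client
        self.aclient = aclient

    def close(self):
        # Only act if a sync client exists and exposes close().
        client = getattr(self, "client", None)
        if client is not None and hasattr(client, "close"):
            client.close()

    async def aclose(self):
        # Only act if an async client exists and exposes aclose().
        aclient = getattr(self, "aclient", None)
        if aclient is not None and hasattr(aclient, "aclose"):
            await aclient.aclose()

class FakeHTTPClient:
    closed = False
    def close(self):
        self.closed = True

holder = ClientHolder(client=FakeHTTPClient())
holder.close()
assert holder.client.closed
```

Because the checks are defensive, calling close() on an instance that never created a client is a harmless no-op.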


health_check() -> bool

Functionality: Verifies that the model provider is configured correctly and reachable.

  • Default: Returns True. Concrete providers can override this to perform a "ping" or a low-cost metadata request.
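A subclass override might look like this sketch, where a stored flag stands in for a real network probe (the base class and the probe are simplified assumptions).

```python
class Base:
    def health_check(self) -> bool:
        # Permissive default: assume the provider is reachable.
        return True

class PingingProvider(Base):
    """Sketch: override health_check with a cheap reachability probe."""

    def __init__(self, reachable: bool):
        self._reachable = reachable  # stands in for a real ping/metadata call

    def health_check(self) -> bool:
        return self._reachable

assert Base().health_check() is True
assert PingingProvider(reachable=False).health_check() is False
```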

5. Error Handling

  • BaseLLM defines the interface but does not raise specific errors. Subclasses are required to raise the exceptions defined in errors.py (e.g., LLMTimeoutError).
  • Resource Safety: If a provider is used within a context manager (with or async with), close() or aclose() is guaranteed to be called even if an exception occurs during generation.
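This guarantee follows directly from Python's context-manager semantics; a minimal demonstration with a stand-in class:

```python
class Managed:
    closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()   # runs even while an exception is propagating
        return False   # do not swallow the exception

m = Managed()
try:
    with m:
        raise RuntimeError("generation failed mid-request")
except RuntimeError:
    pass
assert m.closed  # cleanup ran despite the exception
```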

6. Remarks

  • Default Temperature: jazzmine defaults to 0.0 temperature across all models to prioritize reasoning stability and consistency for task-performing agents.
  • Keyword Arguments: The use of **kwargs in all methods allows the framework to remain future-proof; if a new model parameter is introduced by a provider, it can be passed through without modifying the base class.
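The pass-through behavior can be sketched as follows; the payload shape and the `build_payload` helper are assumptions for illustration, showing how unknown parameters reach the provider without any base-class change.

```python
class PayloadProvider:
    """Sketch of **kwargs pass-through into a provider request payload."""

    def __init__(self, model: str, **kwargs):
        self.model = model
        self.defaults = kwargs  # constructor-level extras become defaults

    def build_payload(self, messages, **kwargs):
        payload = {"model": self.model, "messages": messages}
        payload.update(self.defaults)
        payload.update(kwargs)  # per-call extras override defaults
        return payload

p = PayloadProvider("gpt-4o", top_p=0.9)
payload = p.build_payload([{"role": "user", "content": "hi"}],
                          presence_penalty=0.5)
assert payload["top_p"] == 0.9 and payload["presence_penalty"] == 0.5
```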