LLM Providers: AzureOpenAILLM

The AzureOpenAILLM provider is a specialized implementation tailored for the Microsoft Azure OpenAI Service. While it shares the core request/response logic of the standard OpenAI API, it handles the distinct URL routing, deployment-based model addressing, and non-standard authentication headers required by Azure's enterprise environment.

1. Behavior and Context

In the jazzmine architecture, AzureOpenAILLM inherits from OpenAICompatibleLLM. It acts as a specialized wrapper that reconfigures the base provider's behavior:

  • Deployment-Centric: Instead of using a global model name, you target a specific "Deployment Name" created in your Azure AI Studio.
  • Header Overriding: It automatically replaces the standard Authorization: Bearer header with the Azure-specific api-key header.
  • Automatic URL Construction: It assembles the complex Azure endpoint URL using the provided resource endpoint and API version.
  • Compatibility: Because it inherits from the OpenAI compatible class, it supports the same high-performance streaming and JSON response parsing logic.

2. Purpose

  • Enterprise Integration: Designed for organizations running agents within the Azure ecosystem that require VPC isolation and regional data residency.
  • Security: Leverages Azure-specific authentication and private endpoints.
  • Predictable Throughput: Connects to specific provisioned deployments with guaranteed capacity and latency.

3. High-Level API Examples

Example: Initializing the Azure Provider

```python
from jazzmine.core.llm import AzureOpenAILLM

# deployment_name matches the 'Deployment Name' in Azure AI Studio
llm = AzureOpenAILLM(
    api_key="your-azure-key",
    endpoint="https://your-resource-name.openai.azure.com",
    deployment_name="gpt-4o-prod",
    api_version="2024-02-15-preview",
    temperature=0.0,
    timeout=30.0
)

# Usage is identical to other jazzmine LLM providers
response = await llm.agenerate(messages)
print(f"Azure Response: {response.text}")
```

4. Detailed Functionality

__init__(api_key, endpoint, deployment_name, api_version, **kwargs)

Functionality: Configures the Azure-specific identity and initializes the parent connection pools.

Parameters:

  • api_key (str): The secret key found in the "Keys and Endpoint" section of your Azure resource.
  • endpoint (str): The base URL for your resource (e.g., https://my-ai.openai.azure.com).
  • deployment_name (str): The specific name you gave your model deployment.
  • api_version (str): The Azure API version string (e.g., "2024-02-15-preview").

How it works:

  • Constructs the azure_base_url following the pattern: {endpoint}/openai/deployments/{deployment_name}.
  • Calls the parent OpenAICompatibleLLM constructor with an empty key (to prevent default auth) and sets the chat_endpoint to /chat/completions.
  • Injects the api-key header and api-version query parameter into the httpx clients.
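The URL and header assembly described above can be sketched as plain functions. This is illustrative only: the helper names below are not part of jazzmine's API, and the provider's internals may differ.

```python
# Sketch of Azure URL and header construction (hypothetical helpers,
# not jazzmine internals).

def build_azure_chat_url(endpoint: str, deployment_name: str, api_version: str) -> str:
    """Build the deployment-scoped chat completions URL."""
    base = f"{endpoint.rstrip('/')}/openai/deployments/{deployment_name}"
    return f"{base}/chat/completions?api-version={api_version}"

def build_azure_headers(api_key: str) -> dict:
    """Azure expects 'api-key' rather than 'Authorization: Bearer'."""
    return {"api-key": api_key, "Content-Type": "application/json"}

url = build_azure_chat_url(
    "https://my-ai.openai.azure.com", "gpt-4o-prod", "2024-02-15-preview"
)
print(url)
# → https://my-ai.openai.azure.com/openai/deployments/gpt-4o-prod/chat/completions?api-version=2024-02-15-preview
```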

Inherited Methods

Since this class inherits from OpenAICompatibleLLM, the following methods work unchanged but route requests through Azure's deployment-scoped URL structure:

  • generate / agenerate: Executes a completion request against the deployment.
  • stream / astream: Processes real-time token events using the standard OpenAI-compatible SSE format.
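Consuming the inherited streaming interface follows the usual async-iterator pattern. The sketch below substitutes a stub async generator for llm.astream, since the real chunk objects depend on the provider; the fake_astream and collect helpers are hypothetical.

```python
import asyncio

# Stub standing in for llm.astream(messages); the real provider yields
# parsed SSE chunks rather than bare strings.
async def fake_astream():
    for token in ["Hello", ", ", "Azure"]:
        yield token

async def collect(stream):
    """Consume a token stream the same way you would llm.astream."""
    parts = []
    async for token in stream:
        parts.append(token)
    return "".join(parts)

print(asyncio.run(collect(fake_astream())))  # → Hello, Azure
```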

5. Error Handling

  • LLMInvalidRequestError: Raised if Azure's "Content Safety" filters block a prompt or completion. The error message will typically indicate that a safety policy was violated.
  • LLMRateLimitError: Raised if you exceed the Tokens Per Minute (TPM) or Requests Per Minute (RPM) quotas assigned to your specific deployment.
  • LLMConnectionError: Occurs if the endpoint URL is formatted incorrectly or if the Azure region is experiencing downtime.
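One plausible way a provider could map Azure HTTP responses onto these exceptions is sketched below. The classify_azure_error helper and the exact status-code/error-code mapping are assumptions based on typical Azure OpenAI behaviour, not jazzmine internals; the exception classes are local stand-ins for the ones named above.

```python
# Local stand-ins for jazzmine's exception classes.
class LLMInvalidRequestError(Exception): pass
class LLMRateLimitError(Exception): pass
class LLMConnectionError(Exception): pass

def classify_azure_error(status_code: int, body: dict) -> Exception:
    """Hypothetical mapping from an Azure HTTP response to an error class."""
    error = body.get("error", {})
    if status_code == 400 and error.get("code") == "content_filter":
        # Azure's Content Safety filters blocked the prompt or completion.
        return LLMInvalidRequestError("Blocked by Azure content safety policy")
    if status_code == 429:
        # Deployment-level TPM/RPM quota exceeded.
        return LLMRateLimitError("Deployment TPM/RPM quota exceeded")
    return LLMConnectionError(f"Unexpected status {status_code}")
```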

6. Remarks

  • API Versions: Azure OpenAI frequently releases new API versions. Ensure the api_version string you provide is compatible with your resource region and model type.
  • Model ID Mapping: In the resulting LLMResponse, the model field will contain your deployment_name rather than the underlying model name (e.g., it will say gpt-4o-prod, not gpt-4o).
  • Environment Variables: For production security, avoid hardcoding the api_key. Use environment variables and load them into the constructor at runtime.
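A minimal sketch of that pattern, assuming conventional environment variable names (the names are a common convention, not mandated by jazzmine):

```python
import os

# Assumed variable names; adjust to your deployment's conventions.
config = {
    "api_key": os.getenv("AZURE_OPENAI_API_KEY", ""),
    "endpoint": os.getenv("AZURE_OPENAI_ENDPOINT", "https://my-ai.openai.azure.com"),
    "deployment_name": os.getenv("AZURE_OPENAI_DEPLOYMENT", "gpt-4o-prod"),
    "api_version": os.getenv("AZURE_OPENAI_API_VERSION", "2024-02-15-preview"),
}

# The dict can then be unpacked into the constructor:
# llm = AzureOpenAILLM(**config)
```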