1. Behavior and Context
In the jazzmine architecture, AzureOpenAILLM inherits from OpenAICompatibleLLM. It acts as a specialized wrapper that reconfigures the base provider's behavior:
- Deployment-Centric: Instead of using a global model name, you target a specific "Deployment Name" created in your Azure AI Studio.
- Header Overriding: It automatically replaces the standard Authorization: Bearer header with the Azure-specific api-key header.
- Automatic URL Construction: It assembles the complex Azure endpoint URL using the provided resource endpoint and API version.
- Compatibility: Because it inherits from the OpenAI-compatible base class, it reuses the same high-performance streaming and JSON response-parsing logic.
2. Purpose
- Enterprise Integration: Designed for organizations running agents within the Azure ecosystem who require VPC isolation and regional data residency.
- Security: Leverages Azure-specific authentication and private endpoints.
- Predictable Throughput: Connects to specific provisioned deployments with guaranteed capacity and latency.
3. High-Level API Examples
Example: Initializing the Azure Provider
```python
from jazzmine.core.llm import AzureOpenAILLM

# deployment_name matches the 'Deployment Name' in Azure AI Studio
llm = AzureOpenAILLM(
    api_key="your-azure-key",
    endpoint="https://your-resource-name.openai.azure.com",
    deployment_name="gpt-4o-prod",
    api_version="2024-02-15-preview",
    temperature=0.0,
    timeout=30.0,
)

# Usage is identical to other jazzmine LLM providers
response = await llm.agenerate(messages)
print(f"Azure Response: {response.text}")
```
4. Detailed Functionality
__init__(api_key, endpoint, deployment_name, api_version, **kwargs)
Functionality: Configures the Azure-specific identity and initializes the parent connection pools.
Parameters:
- api_key (str): The secret key found in the "Keys and Endpoint" section of your Azure resource.
- endpoint (str): The base URL for your resource (e.g., https://my-ai.openai.azure.com).
- deployment_name (str): The specific name you gave your model deployment.
- api_version (str): The Azure API version string (e.g., "2024-02-15-preview").
How it works:
- Constructs the azure_base_url following the pattern: {endpoint}/openai/deployments/{deployment_name}.
- Calls the parent OpenAICompatibleLLM constructor with an empty key (so the default Authorization: Bearer header is never sent) and sets the chat_endpoint to /chat/completions.
- Injects the api-key header and api-version query parameter into the httpx clients.
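The URL and header assembly described above can be sketched in plain Python. This is an illustrative reconstruction of the documented pattern, not jazzmine's actual internals; the helper names are hypothetical.

```python
from urllib.parse import urlencode

def build_azure_chat_url(endpoint: str, deployment_name: str, api_version: str) -> str:
    """Assemble the chat-completions URL: {endpoint}/openai/deployments/{deployment_name}."""
    base = f"{endpoint.rstrip('/')}/openai/deployments/{deployment_name}"
    query = urlencode({"api-version": api_version})  # Azure requires this query parameter
    return f"{base}/chat/completions?{query}"

def build_azure_headers(api_key: str) -> dict:
    # Azure uses an 'api-key' header instead of 'Authorization: Bearer <key>'.
    return {"api-key": api_key, "Content-Type": "application/json"}
```

For the example resource above, this yields https://my-ai.openai.azure.com/openai/deployments/gpt-4o-prod/chat/completions?api-version=2024-02-15-preview.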
Inherited Methods
Since this class inherits from OpenAICompatibleLLM, the following methods work unchanged, routed through Azure's URL structure:
- generate / agenerate: Executes a completion request against the deployment.
- stream / astream: Processes real-time token events using the standard OpenAI-compatible SSE format.
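To illustrate the wire format that stream / astream consume, here is a minimal sketch of parsing one OpenAI-compatible SSE line. This is a simplified stand-in for the inherited parser, not jazzmine's actual implementation.

```python
import json

def parse_sse_chunk(line: str):
    """Extract the token delta from one OpenAI-compatible SSE line.

    Returns the text delta, or None for keep-alive lines and the
    [DONE] end-of-stream sentinel.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # comment or keep-alive line
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    event = json.loads(payload)
    return event["choices"][0]["delta"].get("content")
```

Azure's streaming responses follow this same format, which is why the inherited logic needs no changes.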
5. Error Handling
- LLMInvalidRequestError: Raised if Azure's "Content Safety" filters block a prompt or completion. The error message will typically indicate that a safety policy was violated.
- LLMRateLimitError: Raised if you exceed the Tokens Per Minute (TPM) or Requests Per Minute (RPM) quotas assigned to your specific deployment.
- LLMConnectionError: Occurs if the endpoint URL is formatted incorrectly or if the Azure region is experiencing downtime.
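A sketch of how Azure HTTP responses might map onto these exceptions is shown below. The exception classes are local stand-ins for jazzmine's error types, and the status-code logic is illustrative, not the library's exact dispatch.

```python
class LLMInvalidRequestError(Exception):
    """Stand-in for jazzmine's invalid-request error."""

class LLMRateLimitError(Exception):
    """Stand-in for jazzmine's rate-limit error."""

class LLMConnectionError(Exception):
    """Stand-in for jazzmine's connection error."""

def raise_for_azure_status(status_code: int, body: dict) -> None:
    """Translate an Azure HTTP error response into a jazzmine-style exception."""
    error_code = body.get("error", {}).get("code", "")
    if status_code == 400 and "content_filter" in error_code:
        raise LLMInvalidRequestError("Blocked by an Azure content-safety policy")
    if status_code == 429:
        raise LLMRateLimitError("Deployment TPM/RPM quota exceeded")
    if status_code >= 500:
        raise LLMConnectionError("Azure endpoint unavailable")
```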
6. Remarks
- API Versions: Azure OpenAI frequently releases new API versions. Ensure the api_version string you provide is compatible with your resource region and model type.
- Model ID Mapping: In the resulting LLMResponse, the model field will contain your deployment_name rather than the underlying model name (e.g., it will say gpt-4o-prod, not gpt-4o).
- Environment Variables: For production security, avoid hardcoding the api_key. Use environment variables and load them into the constructor at runtime.
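Loading the credentials from the environment can look like the sketch below. The variable names are hypothetical conventions, not ones mandated by jazzmine or Azure; use whatever your deployment pipeline provides.

```python
import os

def load_azure_credentials() -> tuple:
    """Read Azure OpenAI secrets from the environment instead of source code."""
    api_key = os.environ.get("AZURE_OPENAI_API_KEY")
    endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
    if not api_key or not endpoint:
        raise RuntimeError("Azure OpenAI credentials are not configured")
    return api_key, endpoint

# The values are then passed to the constructor at runtime, e.g.:
# api_key, endpoint = load_azure_credentials()
# llm = AzureOpenAILLM(api_key=api_key, endpoint=endpoint, ...)
```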