1. Behavior and Context
In the jazzmine architecture, AzureOpenAILLM inherits from OpenAICompatibleLLM. It acts as a specialized wrapper that reconfigures the base provider's behavior:
- Deployment-Centric: Instead of using a global model name, you target a specific "Deployment Name" created in your Azure AI Studio.
- Header Overriding: It automatically replaces the standard Authorization: Bearer header with the Azure-specific api-key header.
- Automatic URL Construction: It assembles the complex Azure endpoint URL using the provided resource endpoint and API version.
- Compatibility: Because it inherits from the OpenAI-compatible base class, it reuses the same high-performance streaming and JSON response-parsing logic.
2. Purpose
- Enterprise Integration: Designed for organizations running agents within the Azure ecosystem who require VPC isolation and regional data residency.
- Security: Leverages Azure-specific authentication and private endpoints.
- Predictable Throughput: Connects to specific provisioned deployments with guaranteed capacity and latency.
3. High-Level API Examples
Example: Initializing the Azure Provider
```python
from jazzmine.core.llm import AzureOpenAILLM

# deployment_name matches the 'Deployment Name' in Azure AI Studio
llm = AzureOpenAILLM(
    api_key="your-azure-key",
    endpoint="https://your-resource-name.openai.azure.com",
    deployment_name="gpt-4o-prod",
    api_version="2024-02-15-preview",
    temperature=0.0,
    timeout=30.0,
)

# Usage is identical to other jazzmine LLM providers
response = await llm.agenerate(messages)
print(f"Azure Response: {response.text}")
```
4. Detailed Functionality
__init__(api_key, endpoint, deployment_name, api_version, **kwargs)
Functionality: Configures the Azure-specific identity and initializes the parent connection pools.
Parameters:
- api_key (str): The secret key found in the "Keys and Endpoint" section of your Azure resource.
- endpoint (str): The base URL for your resource (e.g., https://my-ai.openai.azure.com).
- deployment_name (str): The specific name you gave your model deployment.
- api_version (str): The Azure API version string (e.g., "2024-02-15-preview").
How it works:
- Constructs the azure_base_url following the pattern: {endpoint}/openai/deployments/{deployment_name}.
- Calls the parent OpenAICompatibleLLM constructor with an empty key (so the default Authorization: Bearer header is never sent) and sets the chat_endpoint to /chat/completions.
- Injects the api-key header and api-version query parameter into the httpx clients.
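The URL and header assembly described above can be sketched in plain Python. This is an illustrative reconstruction of the documented pattern, not jazzmine's actual internals; the helper names are hypothetical.

```python
from urllib.parse import urlencode

def build_azure_chat_url(endpoint: str, deployment_name: str, api_version: str) -> str:
    """Assemble the chat-completions URL: {endpoint}/openai/deployments/{deployment_name}."""
    base = f"{endpoint.rstrip('/')}/openai/deployments/{deployment_name}"
    query = urlencode({"api-version": api_version})  # Azure requires this query parameter
    return f"{base}/chat/completions?{query}"

def build_azure_headers(api_key: str) -> dict:
    # Azure uses an 'api-key' header instead of 'Authorization: Bearer <key>'.
    return {"api-key": api_key, "Content-Type": "application/json"}
```

For the example resource above, this yields https://my-ai.openai.azure.com/openai/deployments/gpt-4o-prod/chat/completions?api-version=2024-02-15-preview.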
Inherited Methods
Since this class inherits from OpenAICompatibleLLM, the following methods work unchanged, routed through Azure's URL structure:
- generate / agenerate: Executes a completion request against the deployment.
- stream / astream: Processes real-time token events using the standard OpenAI-compatible SSE format.
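To illustrate the wire format that stream / astream consume, here is a minimal sketch of parsing one OpenAI-compatible SSE line. This is a simplified stand-in for the inherited parser, not jazzmine's actual implementation.

```python
import json

def parse_sse_chunk(line: str):
    """Extract the token delta from one OpenAI-compatible SSE line.

    Returns the text delta, or None for keep-alive lines and the
    [DONE] end-of-stream sentinel.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # comment or keep-alive line
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    event = json.loads(payload)
    return event["choices"][0]["delta"].get("content")
```

Azure's streaming responses follow this same format, which is why the inherited logic needs no changes.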
5. Error Handling
- LLMInvalidRequestError: Raised if Azure's "Content Safety" filters block a prompt or completion. The error message will typically indicate that a safety policy was violated.
- LLMRateLimitError: Raised if you exceed the Tokens Per Minute (TPM) or Requests Per Minute (RPM) quotas assigned to your specific deployment.
- LLMConnectionError: Occurs if the endpoint URL is formatted incorrectly or if the Azure region is experiencing downtime.
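A sketch of how Azure HTTP responses might map onto these exceptions is shown below. The exception classes are local stand-ins for jazzmine's error types, and the status-code logic is illustrative, not the library's exact dispatch.

```python
class LLMInvalidRequestError(Exception):
    """Stand-in for jazzmine's invalid-request error."""

class LLMRateLimitError(Exception):
    """Stand-in for jazzmine's rate-limit error."""

class LLMConnectionError(Exception):
    """Stand-in for jazzmine's connection error."""

def raise_for_azure_status(status_code: int, body: dict) -> None:
    """Translate an Azure HTTP error response into a jazzmine-style exception."""
    error_code = body.get("error", {}).get("code", "")
    if status_code == 400 and "content_filter" in error_code:
        raise LLMInvalidRequestError("Blocked by an Azure content-safety policy")
    if status_code == 429:
        raise LLMRateLimitError("Deployment TPM/RPM quota exceeded")
    if status_code >= 500:
        raise LLMConnectionError("Azure endpoint unavailable")
```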
6. Remarks
- API Versions: Azure OpenAI frequently releases new API versions. Ensure the api_version string you provide is compatible with your resource region and model type.
- Model ID Mapping: In the resulting LLMResponse, the model field will contain your deployment_name rather than the underlying model name (e.g., it will say gpt-4o-prod, not gpt-4o).
- Environment Variables: For production security, avoid hardcoding the api_key. Use environment variables and load them into the constructor at runtime.
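Loading the credentials from the environment can look like the sketch below. The variable names are hypothetical conventions, not ones mandated by jazzmine or Azure; use whatever your deployment pipeline provides.

```python
import os

def load_azure_credentials() -> tuple:
    """Read Azure OpenAI secrets from the environment instead of source code."""
    api_key = os.environ.get("AZURE_OPENAI_API_KEY")
    endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
    if not api_key or not endpoint:
        raise RuntimeError("Azure OpenAI credentials are not configured")
    return api_key, endpoint

# The values are then passed to the constructor at runtime, e.g.:
# api_key, endpoint = load_azure_credentials()
# llm = AzureOpenAILLM(api_key=api_key, endpoint=endpoint, ...)
```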