1. Behavior and Context
In the jazzmine architecture, BedrockLLM acts as an enterprise-grade gateway to foundation models hosted on AWS Bedrock.
- Converse API Integration: Unlike older AWS SDK methods that required model-specific JSON bodies, this class uses the converse and converse_stream methods. This ensures that switching from an Anthropic Claude model to a Meta Llama model requires only a change of the model ID.
- Native Async Support: It leverages the aioboto3 library to ensure that AWS calls are non-blocking, maintaining the high performance of the agent chat loop.
- IAM Authentication: Security is handled via standard AWS credentials (access keys, session tokens, or IAM roles), making it ideal for deployment on AWS infrastructure (Lambda, EC2, ECS).
2. Purpose
- Security & Compliance: Ideal for organizations with strict data-residency requirements that prefer to run models inside their own AWS perimeter.
- Model Diversity: Allows the agent to use a wide variety of models (Claude, Llama, Mistral, etc.) via a single, stable provider interface.
- Performance Tracking: Extracts standardized token metrics directly from the Bedrock response metadata.
3. High-Level API Examples
Example: Initializing with Claude 3.5 Sonnet on Bedrock
from jazzmine.core.llm import BedrockLLM
# boto3 automatically picks up credentials from your environment
# (~/.aws/credentials or IAM Role)
llm = BedrockLLM(
model="anthropic.claude-3-5-sonnet-20240620-v1:0",
region_name="us-east-1",
temperature=0.0,
max_tokens=2048
)
# Usage remains consistent with the jazzmine interface
response = await llm.agenerate(messages)
print(f"Bedrock ({response.model}) says: {response.text}")
4. Detailed Functionality
__init__(model, region_name, **kwargs)
Functionality: Initializes the AWS clients for both synchronous and asynchronous communication.
Parameters:
- model (str): The full Bedrock Model ID (e.g., anthropic.claude-3-sonnet-20240229-v1:0).
- region_name (str): The AWS region where the model is enabled (e.g., us-west-2).
- **kwargs: Inherited parameters like temperature and max_tokens.
_prepare_payload(messages) [Internal]
Functionality: Translates MessagePart objects into the AWS Converse API nested dictionary format.
How it works:
- System Prompts: It extracts all messages with the system role and places them in the top-level system list of the Converse call.
- Message Array: Maps user and assistant roles and wraps their content in the Bedrock-required [{"text": "..."}] structure.
- Inference Config: Packages temperature and maxTokens into the standardized inferenceConfig block.
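The translation steps above can be sketched as a standalone function. This is a simplified illustration, not the class's actual implementation: it assumes plain {"role": ..., "content": ...} dicts in place of jazzmine's MessagePart objects, and the function name prepare_payload is hypothetical.

```python
# Sketch of the payload translation described above (assumes plain dicts,
# not jazzmine's MessagePart type).

def prepare_payload(messages, temperature=0.0, max_tokens=2048):
    """Translate chat messages into Converse API keyword arguments."""
    # System prompts are lifted out into the top-level `system` list.
    system = [{"text": m["content"]} for m in messages if m["role"] == "system"]
    # User/assistant turns are wrapped in the Bedrock [{"text": ...}] structure.
    converse_messages = [
        {"role": m["role"], "content": [{"text": m["content"]}]}
        for m in messages
        if m["role"] in ("user", "assistant")
    ]
    return {
        "system": system,
        "messages": converse_messages,
        "inferenceConfig": {"temperature": temperature, "maxTokens": max_tokens},
    }
```

The returned dict can be splatted directly into a bedrock-runtime client call, e.g. client.converse(modelId=..., **payload).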
_parse_response(response, start_time) [Internal]
Functionality: Extracts the text content and token usage from the complex Boto3 response dictionary.
How it works:
- Text Extraction: Accesses the path output -> message -> content[0] -> text.
- Usage Mapping: Maps AWS's inputTokens and outputTokens directly to the LLMUsage model.
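The extraction logic can be sketched as a pure function over the Boto3 response dict. For illustration it returns a plain dict rather than jazzmine's LLMResponse/LLMUsage models, and the function name parse_response is hypothetical.

```python
import time

# Sketch of the response parsing described above; returns a plain dict
# instead of jazzmine's LLMResponse/LLMUsage types.

def parse_response(response: dict, start_time: float) -> dict:
    """Extract text and token usage from a Converse API response dict."""
    # Converse responses nest the text at output -> message -> content[0] -> text.
    text = response["output"]["message"]["content"][0]["text"]
    usage = response.get("usage", {})
    return {
        "text": text,
        "input_tokens": usage.get("inputTokens", 0),
        "output_tokens": usage.get("outputTokens", 0),
        "latency_s": time.monotonic() - start_time,
    }
```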
stream / astream
Functionality: Processes token events via the converse_stream API.
How it works: It iterates over the event stream yielded by AWS. It specifically monitors for contentBlockDelta events, extracting and yielding the string fragments as they are generated by the model.
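A minimal sketch of that event loop, written as a generator over the stream iterable that converse_stream returns (the function name iter_text_deltas is hypothetical):

```python
# Sketch of the streaming loop described above. `event_stream` is the
# iterable found under response["stream"] in a converse_stream response.

def iter_text_deltas(event_stream):
    """Yield text fragments from contentBlockDelta events, skipping the rest."""
    for event in event_stream:
        delta = event.get("contentBlockDelta")
        if delta is not None:
            text = delta.get("delta", {}).get("text")
            if text:
                yield text
```

Other event types (messageStart, contentBlockStop, messageStop, metadata) are ignored here; a fuller implementation would also read final usage numbers from the metadata event.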
5. Error Handling
- LLMRateLimitError: Raised if the provider returns a ThrottlingException. This is common if the AWS account has not requested a quota increase for the specific model.
- LLMInternalError: Raised for general AWS ClientError issues or server-side failures within the Bedrock service.
- AWS Credentials: If credentials are missing or expired, boto3 will raise a native error. Ensure aws configure has been run or environment variables (AWS_ACCESS_KEY_ID, etc.) are set.
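The first two mappings can be sketched as follows. The exception classes here stand in for jazzmine's real LLMRateLimitError and LLMInternalError; in practice the error code would come from a botocore ClientError's err.response["Error"]["Code"].

```python
# Sketch of the error mapping described above. The exception classes are
# placeholders for jazzmine's real types.

class LLMRateLimitError(Exception):
    """Stand-in for jazzmine's rate-limit exception."""

class LLMInternalError(Exception):
    """Stand-in for jazzmine's generic provider exception."""

def map_client_error(error_code: str) -> Exception:
    """Translate a botocore ClientError code into a jazzmine-style exception."""
    if error_code == "ThrottlingException":
        return LLMRateLimitError(
            "Bedrock throttled the request; consider requesting a quota increase."
        )
    return LLMInternalError(f"Bedrock ClientError: {error_code}")
```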
6. Remarks
- Region Specificity: Not all models are available in all regions. Always check the AWS Bedrock Console to ensure the model ID is active in your selected region_name.
- Model IDs: Unlike OpenAI model names, Bedrock model IDs are long, versioned strings (e.g., anthropic.claude-3-sonnet-20240229-v1:0). Ensure the full string, including the version suffix, is used.
- Resource Management: The close() and aclose() methods handle the shutdown of the boto3 client. Since Bedrock uses pooled HTTP connections, explicit closing is recommended during application shutdown.