1. Behavior and Context
In the jazzmine architecture, BedrockLLM acts as an enterprise-grade gateway to foundation models hosted on AWS Bedrock.
- Converse API Integration: Unlike older AWS SDK methods that required model-specific JSON bodies, this class uses the converse and converse_stream methods. This ensures that switching from an Anthropic Claude model to a Meta Llama model requires only a change of the model ID.
- Native Async Support: It leverages the aioboto3 library to ensure that AWS calls are non-blocking, maintaining the high performance of the agent chat loop.
- IAM Authentication: Security is handled via standard AWS credentials (access keys, session tokens, or IAM roles), making it ideal for deployment on AWS infrastructure (Lambda, EC2, ECS).
2. Purpose
- Security & Compliance: Ideal for organizations with strict data-residency requirements that prefer to run models inside their own AWS perimeter.
- Model Diversity: Allows the agent to use a wide variety of models (Claude, Llama, Mistral, etc.) via a single, stable provider interface.
- Performance Tracking: Extracts standardized token metrics directly from the Bedrock response metadata.
3. High-Level API Examples
Example: Initializing with Claude 3.5 Sonnet on Bedrock
from jazzmine.core.llm import BedrockLLM
# boto3 automatically picks up credentials from your environment
# (~/.aws/credentials or IAM Role)
llm = BedrockLLM(
model="anthropic.claude-3-5-sonnet-20240620-v1:0",
region_name="us-east-1",
temperature=0.0,
max_tokens=2048
)
# Usage remains consistent with the jazzmine interface
response = await llm.agenerate(messages)
print(f"Bedrock ({response.model}) says: {response.text}")
4. Detailed Functionality
__init__(model, region_name, **kwargs)
Functionality: Initializes the AWS clients for both synchronous and asynchronous communication.
Parameters:
- model (str): The full Bedrock Model ID (e.g., anthropic.claude-3-sonnet-20240229-v1:0).
- region_name (str): The AWS region where the model is enabled (e.g., us-west-2).
- **kwargs: Inherited parameters like temperature and max_tokens.
_prepare_payload(messages) [Internal]
Functionality: Translates MessagePart objects into the AWS Converse API nested dictionary format.
How it works:
- System Prompts: It extracts all messages with the system role and places them in the top-level system list of the Converse call.
- Message Array: Maps user and assistant roles and wraps their content in the Bedrock-required [{"text": "..."}] structure.
- Inference Config: Packages temperature and maxTokens into the standardized inferenceConfig block.
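The translation steps above can be sketched as a standalone function. This is a simplified illustration, not the class's actual implementation: it assumes plain {"role": ..., "content": ...} dicts in place of jazzmine's MessagePart objects, and the function name prepare_payload is hypothetical.

```python
# Sketch of the payload translation described above (assumes plain dicts,
# not jazzmine's MessagePart type).

def prepare_payload(messages, temperature=0.0, max_tokens=2048):
    """Translate chat messages into Converse API keyword arguments."""
    # System prompts are lifted out into the top-level `system` list.
    system = [{"text": m["content"]} for m in messages if m["role"] == "system"]
    # User/assistant turns are wrapped in the Bedrock [{"text": ...}] structure.
    converse_messages = [
        {"role": m["role"], "content": [{"text": m["content"]}]}
        for m in messages
        if m["role"] in ("user", "assistant")
    ]
    return {
        "system": system,
        "messages": converse_messages,
        "inferenceConfig": {"temperature": temperature, "maxTokens": max_tokens},
    }
```

The returned dict can be splatted directly into a bedrock-runtime client call, e.g. client.converse(modelId=..., **payload).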
_parse_response(response, start_time) [Internal]
Functionality: Extracts the text content and token usage from the complex Boto3 response dictionary.
How it works:
- Text Extraction: Accesses the path output -> message -> content[0] -> text.
- Usage Mapping: Maps AWS's inputTokens and outputTokens directly to the LLMUsage model.
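The extraction logic can be sketched as a pure function over the Boto3 response dict. For illustration it returns a plain dict rather than jazzmine's LLMResponse/LLMUsage models, and the function name parse_response is hypothetical.

```python
import time

# Sketch of the response parsing described above; returns a plain dict
# instead of jazzmine's LLMResponse/LLMUsage types.

def parse_response(response: dict, start_time: float) -> dict:
    """Extract text and token usage from a Converse API response dict."""
    # Converse responses nest the text at output -> message -> content[0] -> text.
    text = response["output"]["message"]["content"][0]["text"]
    usage = response.get("usage", {})
    return {
        "text": text,
        "input_tokens": usage.get("inputTokens", 0),
        "output_tokens": usage.get("outputTokens", 0),
        "latency_s": time.monotonic() - start_time,
    }
```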
stream / astream
Functionality: Processes token events via the converse_stream API.
How it works: It iterates over the event stream yielded by AWS. It specifically monitors for contentBlockDelta events, extracting and yielding the string fragments as they are generated by the model.
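A minimal sketch of that event loop, written as a generator over the stream iterable that converse_stream returns (the function name iter_text_deltas is hypothetical):

```python
# Sketch of the streaming loop described above. `event_stream` is the
# iterable found under response["stream"] in a converse_stream response.

def iter_text_deltas(event_stream):
    """Yield text fragments from contentBlockDelta events, skipping the rest."""
    for event in event_stream:
        delta = event.get("contentBlockDelta")
        if delta is not None:
            text = delta.get("delta", {}).get("text")
            if text:
                yield text
```

Other event types (messageStart, contentBlockStop, messageStop, metadata) are ignored here; a fuller implementation would also read final usage numbers from the metadata event.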
5. Error Handling
- LLMRateLimitError: Raised if the provider returns a ThrottlingException. This is common if the AWS account has not requested a quota increase for the specific model.
- LLMInternalError: Raised for general AWS ClientError issues or server-side failures within the Bedrock service.
- AWS Credentials: If credentials are missing or expired, boto3 will raise a native error. Ensure aws configure has been run or environment variables (AWS_ACCESS_KEY_ID, etc.) are set.
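The first two mappings can be sketched as follows. The exception classes here stand in for jazzmine's real LLMRateLimitError and LLMInternalError; in practice the error code would come from a botocore ClientError's err.response["Error"]["Code"].

```python
# Sketch of the error mapping described above. The exception classes are
# placeholders for jazzmine's real types.

class LLMRateLimitError(Exception):
    """Stand-in for jazzmine's rate-limit exception."""

class LLMInternalError(Exception):
    """Stand-in for jazzmine's generic provider exception."""

def map_client_error(error_code: str) -> Exception:
    """Translate a botocore ClientError code into a jazzmine-style exception."""
    if error_code == "ThrottlingException":
        return LLMRateLimitError(
            "Bedrock throttled the request; consider requesting a quota increase."
        )
    return LLMInternalError(f"Bedrock ClientError: {error_code}")
```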
6. Remarks
- Region Specificity: Not all models are available in all regions. Always check the AWS Bedrock Console to ensure the model ID is active in your selected region_name.
- Model IDs: Unlike OpenAI model names, Bedrock model IDs are long, versioned strings (e.g., anthropic.claude-3-sonnet-20240229-v1:0). Ensure the full string, including the version suffix, is used.
- Resource Management: The close() and aclose() methods handle the shutdown of the boto3 client. Since Bedrock uses pooled HTTP connections, explicit closing is recommended during application shutdown.