Introduction, Context, and Purpose
Purpose: This module abstracts away the boilerplate of configuring remote embedding APIs. It automatically maps simple string aliases (like "openai" or "cohere") to their exact API endpoints and default embedding models.
Context & Behavior: When a Python user instantiates a memory component (such as EpisodicMemory or ProceduralMemory) and passes remote configuration arguments, those arguments are passed down to Rust. provider_utils intercepts these optional arguments. If the user omits the base_url or model_name, this component injects sensible, provider-specific defaults based on the chosen provider. This allows users to switch between different AI providers with minimal code changes, while still retaining the ability to fully override endpoints (e.g., for local endpoints like vLLM).
High-Level API
Because this is an internal Rust module, it is not directly imported in Python. Instead, its logic is triggered via the constructors of the memory classes.
Example Usage (Triggering the Utilities via Python)
```python
from memory import SemanticMemory

# Example 1: Relying on provider_utils defaults
# Behind the scenes, provider_utils maps "mistral" to:
#   URL:   https://api.mistral.ai/v1
#   Model: mistral-embed
memory1 = SemanticMemory(
    qdrant_manager=manager,
    tokenizer_path="./tokenizer.json",
    provider="mistral",
    api_key="your-key",
)

# Example 2: Overriding provider_utils defaults
# Here, provider_utils respects the custom overrides for an
# OpenAI-compatible local server
memory2 = SemanticMemory(
    qdrant_manager=manager,
    tokenizer_path="./tokenizer.json",
    provider="openai",
    base_url="http://localhost:8000/v1",
    model_name="my-custom-local-model",
    api_key="not-needed",
)
```
Detailed Functionality
resolve_remote_provider_config
The primary entry point of this module. It takes the optional configuration provided by the user and fills in any omitted values with provider-specific defaults.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | Option<String> | Some("openai") | The string identifier of the AI provider passed from Python. |
| base_url | Option<String> | None | A user‑provided override for the API endpoint URL. |
| model_name | Option<String> | None | A user‑provided override for the embedding model name. |
How it works
- It unwraps the provider string, defaulting to "openai" if None was provided.
- It converts the string into a strongly‑typed Provider enum.
- If base_url is None, it calls default_base_url() to get the correct URL for the Provider.
- If model_name is None, it calls default_model_name() to get the recommended model.
- It returns a tuple (Provider, String, String) containing the resolved Provider Enum, URL, and Model Name.
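Although the actual implementation lives in Rust, the resolution flow above can be sketched in Python. This is a hypothetical, simplified port for illustration only: the dictionary names and function signature are illustrative, not the real API, and only a subset of providers is shown.

```python
# Hypothetical Python port of the Rust resolution logic (illustrative only).
# Defaults are taken from the tables in this document; subset of providers shown.
DEFAULT_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "mistral": "https://api.mistral.ai/v1",
    "cohere": "https://api.cohere.com/v1",
}
DEFAULT_MODEL_NAMES = {
    "openai": "text-embedding-3-small",
    "mistral": "mistral-embed",
    "cohere": "embed-english-v3.0",
}

def resolve_remote_provider_config(provider=None, base_url=None, model_name=None):
    # Step 1: unwrap the provider string, defaulting to "openai".
    name = provider or "openai"
    # Step 2: parse into a known provider, falling back to "openai"
    # (mirrors the graceful Provider::from_str fallback).
    if name not in DEFAULT_BASE_URLS:
        name = "openai"
    # Steps 3-4: fill in any omitted URL / model with provider defaults.
    url = base_url or DEFAULT_BASE_URLS[name]
    model = model_name or DEFAULT_MODEL_NAMES[name]
    # Step 5: return the resolved triple (Provider, URL, Model Name).
    return name, url, model
```

For example, `resolve_remote_provider_config("mistral")` resolves to `("mistral", "https://api.mistral.ai/v1", "mistral-embed")`, while explicit `base_url` or `model_name` overrides pass through untouched.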
default_base_url
A mapping function that returns the official API endpoint for a given AI provider.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | &Provider | Required | A reference to the strongly‑typed Provider enum. |
How it works
Matches the provided enum against a hardcoded list of official base URLs. For AzureOpenAI, it returns a placeholder string that requires the user to replace "YOUR_RESOURCE_NAME" and "YOUR_DEPLOYMENT_NAME".
| Provider | Default Base URL |
|---|---|
| OpenAI | https://api.openai.com/v1 |
| Gemini | https://generativelanguage.googleapis.com/v1beta |
| Cohere | https://api.cohere.com/v1 |
| HuggingFace | https://api-inference.huggingface.co |
| Mistral | https://api.mistral.ai/v1 |
| Together | https://api.together.xyz/v1 |
| VoyageAI | https://api.voyageai.com/v1 |
| JinaAI | https://api.jina.ai/v1 |
| Nomic | https://api-atlas.nomic.ai/v1 |
| OpenRouter | https://openrouter.ai/api/v1 |
| DeepInfra | https://api.deepinfra.com/v1/openai |
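In Python terms, the mapping above amounts to a simple lookup table. The sketch below is hypothetical (the real code is a Rust `match` on the Provider enum); the string keys are illustrative aliases, and the AzureOpenAI entry is omitted because its value is a template containing YOUR_RESOURCE_NAME / YOUR_DEPLOYMENT_NAME placeholders defined in the Rust source.

```python
# Hypothetical lookup table mirroring default_base_url (illustrative only).
# AzureOpenAI is omitted: its entry is a placeholder template that the
# user must replace with a concrete resource/deployment URL.
DEFAULT_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "gemini": "https://generativelanguage.googleapis.com/v1beta",
    "cohere": "https://api.cohere.com/v1",
    "huggingface": "https://api-inference.huggingface.co",
    "mistral": "https://api.mistral.ai/v1",
    "together": "https://api.together.xyz/v1",
    "voyageai": "https://api.voyageai.com/v1",
    "jinaai": "https://api.jina.ai/v1",
    "nomic": "https://api-atlas.nomic.ai/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "deepinfra": "https://api.deepinfra.com/v1/openai",
}

def default_base_url(provider: str) -> str:
    """Return the official API endpoint for a known provider."""
    return DEFAULT_BASE_URLS[provider]
```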
default_model_name
A mapping function that returns the officially recommended embedding model for a given AI provider.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | &Provider | Required | A reference to the strongly‑typed Provider enum. |
How it works
Matches the provided enum against a hardcoded list of high‑performing embedding models.
| Provider | Default Model Name |
|---|---|
| OpenAI / Azure | text-embedding-3-small |
| Gemini | gemini-embedding-001 |
| Cohere | embed-english-v3.0 |
| HuggingFace | BAAI/bge-m3 |
| Mistral | mistral-embed |
| Together | togethercomputer/m2-bert-80M-32k-retrieval |
| VoyageAI | voyage-3-lite |
| JinaAI | jina-embeddings-v3 |
| Nomic | nomic-embed-text-v1.5 |
| OpenRouter | openai/text-embedding-3-small |
| DeepInfra | BAAI/bge-large-en-v1.5 |
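Like the URL mapping, this is effectively a static lookup. The Python sketch below is hypothetical (the real code is a Rust `match`); the string keys, including the "azure_openai" spelling, are assumed aliases rather than the exact enum variants.

```python
# Hypothetical lookup table mirroring default_model_name (illustrative only).
# Keys are assumed provider aliases; OpenAI and Azure share the same default.
DEFAULT_MODEL_NAMES = {
    "openai": "text-embedding-3-small",
    "azure_openai": "text-embedding-3-small",
    "gemini": "gemini-embedding-001",
    "cohere": "embed-english-v3.0",
    "huggingface": "BAAI/bge-m3",
    "mistral": "mistral-embed",
    "together": "togethercomputer/m2-bert-80M-32k-retrieval",
    "voyageai": "voyage-3-lite",
    "jinaai": "jina-embeddings-v3",
    "nomic": "nomic-embed-text-v1.5",
    "openrouter": "openai/text-embedding-3-small",
    "deepinfra": "BAAI/bge-large-en-v1.5",
}

def default_model_name(provider: str) -> str:
    """Return the recommended embedding model for a known provider."""
    return DEFAULT_MODEL_NAMES[provider]
```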
Error Handling
- Graceful Fallbacks: This module is designed never to panic. If a user passes an unrecognized provider string (e.g., "unknown-ai"), the Provider::from_str parser falls back to Provider::OpenAI rather than returning an error.
- Azure URL Warning: For AzureOpenAI, default_base_url returns a template URL (https://YOUR_RESOURCE_NAME...). This will result in an HTTP resolution error downstream if the user fails to provide an explicit base_url override in Python.
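The fallback behavior can be sketched as follows. This is a hypothetical Python mirror of Provider::from_str, not the real parser: the set of alias strings and the strip/lowercase normalization are assumptions, and should be checked against the enum logic in embed.rs.

```python
# Hypothetical mirror of the graceful Provider::from_str fallback
# (alias spellings and normalization are assumptions, not the real parser).
KNOWN_PROVIDERS = {
    "openai", "azure_openai", "gemini", "cohere", "huggingface",
    "mistral", "together", "voyageai", "jinaai", "nomic",
    "openrouter", "deepinfra",
}

def parse_provider(raw: str) -> str:
    # Unrecognized strings never raise; they fall back to "openai".
    name = raw.strip().lower()
    return name if name in KNOWN_PROVIDERS else "openai"
```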
Remarks
- Extensibility: Adding a new provider to the jazzmine ecosystem only requires updating the mapping tables in this file and extending the Provider enum logic in embed.rs.
- Statelessness: These utilities are pure, stateless functions, making them cheap to call and thread-safe during memory class initialization.