1. Behavior and Context
In the jazzmine framework, the QdrantManager is typically the first component initialized. It acts as an "Infrastructure Administrator" that:
- Establishes and maintains the connection pool to the database.
- Enforces schema consistency by validating vector sizes on startup.
- Automates the creation of specialized indexes (HNSW for dense vectors and IDF-weighted sparse indexes for BM25-style keyword search).
- Manages "Collection Prefixes" to allow multiple agents or environments (e.g., dev, prod) to share a single Qdrant instance without data collisions.
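To illustrate how a prefix keeps environments apart, here is a minimal sketch of the naming rule. It assumes the prefix is joined to the base name with an underscore, as the "test_conversation_summaries" example in the Remarks suggests. The helper `collection_name` and the empty-prefix behavior are hypothetical and not part of the QdrantManager API.

```python
def collection_name(prefix: str, base: str) -> str:
    """Join a collection prefix and a base name with an underscore."""
    # Assumption: an empty prefix yields the bare base name.
    return f"{prefix}_{base}" if prefix else base


# Two environments sharing one Qdrant instance get distinct collections.
dev = collection_name("jazzmine_dev", "conversation_summaries")
prod = collection_name("jazzmine_prod", "conversation_summaries")
print(dev)
print(prod)
```

Because the names differ only by prefix, switching environments should only require changing `collection_prefix` in the constructor.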
2. Purpose
- Collection Provisioning: Idempotently creates the three core memory collections required for a task-performing agent.
- Performance Tuning: Configures memory-footprint optimizations such as Product Quantization (PQ) and Scalar Quantization (SQ).
- Hybrid Search Readiness: Sets up both dense vector configurations and sparse vector modifiers (IDF) to support advanced retrieval strategies.
- Data Integrity: Creates payload indexes for metadata fields (like user_id, agent_id, timestamp) so that filtered queries remain fast.
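To give a rough feel for what quantization buys, the following back-of-the-envelope sketch estimates raw vector storage. It assumes unquantized vectors are stored as 32-bit floats, that scalar quantization stores one byte per component, and that product quantization reaches the roughly 8× ratio quoted in the Remarks. The helper name and the one-million-vector workload are illustrative only, and index and payload overhead are ignored.

```python
def vector_memory_bytes(num_vectors: int, dim: int, bytes_per_component: float) -> int:
    """Estimate raw vector storage, ignoring index and payload overhead."""
    return int(num_vectors * dim * bytes_per_component)


NUM_VECTORS = 1_000_000  # illustrative workload, not a measured figure
DIM = 384                # matches the vector_size used in the example

full = vector_memory_bytes(NUM_VECTORS, DIM, 4.0)     # float32 baseline
scalar = vector_memory_bytes(NUM_VECTORS, DIM, 1.0)   # int8 scalar quantization
product = vector_memory_bytes(NUM_VECTORS, DIM, 0.5)  # assumed 8x product quantization

for label, size in [("float32", full), ("scalar", scalar), ("product", product)]:
    print(f"{label:>8}: {size / 1e6:.0f} MB ({full / size:.0f}x smaller than float32)")
```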
3. High-Level API
The QdrantManager is exposed to Python via PyO3. It should be initialized at the start of your application and passed to the various Memory classes.
Example: Initializing the Infrastructure
```python
import asyncio

from memory import QdrantManager


async def main() -> None:
    # 1. Connect to Qdrant and set global vector defaults
    manager = QdrantManager(
        url="http://localhost:6334",
        vector_size=384,           # Size of your embedding model (e.g. MiniLM)
        quantization=8,            # Use 8-bit scalar quantization
        distance_metric="cosine",
        collection_prefix="jazzmine_dev",
    )

    # 2. Provision the memory collections (idempotent calls)
    await manager.ensure_conversation_summaries_collection()
    await manager.ensure_flows_collection()
    await manager.ensure_semantic_collection()


# The ensure_* methods are awaited, so they need a running event loop
asyncio.run(main())
```
4. Detailed Functionality
QdrantManager(...) [Constructor]
Initializes the client and sets the policy for all subsequent collection creations.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| url | str | Required | The endpoint of the Qdrant server. |
| vector_size | int | Required | The dimension of dense embeddings (1 to 65536). |
| use_indexing | bool | True | If True, builds HNSW indexes for fast retrieval. |
| quantization | Optional[int] | 8 | Compression level: 4 (product quantization), 8 (scalar quantization), or None (no quantization). |
| on_disk | bool | False | If True, stores vectors and payloads on disk to save RAM. |
| api_key | Optional[str] | None | Authentication key for protected Qdrant instances. |
| distance_metric | str | "cosine" | Method: "cosine", "euclidean", or "dot". |
| collection_prefix | str | "" | String prepended to collection names. |
| replication_factor | int | 1 | Number of copies of each shard kept in the cluster (high availability). |
| shard_number | int | 1 | Number of shards to distribute data across. |
| indexing_threshold | int | 2000 | Points required before background indexing triggers. |
| use_sparse_vectors | bool | True | Enables BM25 support via sparse vector configurations. |
ensure_conversation_summaries_collection()
Target: EpisodicMemory
- Creates a collection with two dense vectors: short_summary and long_summary.
- Configures a sparse vector named bm25 with an IDF Modifier for hybrid keyword search.
- Automatically indexes payload fields: user_id, agent_id, conversation_id, timestamp_begin, timestamp_end, flows_activated, and tools_invoked.
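For readers who want to picture the resulting schema, the sketch below builds a dictionary shaped like the body of Qdrant's REST "create collection" request. It is only an illustration of the layout described above: this document does not show how QdrantManager constructs the collection internally, and the helper name `conversation_summaries_body` is hypothetical.

```python
import json


def conversation_summaries_body(vector_size: int, distance: str = "Cosine") -> dict:
    """Build a create-collection body with two dense vectors and one sparse vector."""
    dense = {"size": vector_size, "distance": distance}
    return {
        "vectors": {
            "short_summary": dict(dense),
            "long_summary": dict(dense),
        },
        "sparse_vectors": {
            # The IDF modifier enables BM25-style term weighting at query time.
            "bm25": {"modifier": "idf"},
        },
    }


# Payload fields listed above; their index types are not stated in this document.
PAYLOAD_FIELDS = [
    "user_id", "agent_id", "conversation_id",
    "timestamp_begin", "timestamp_end",
    "flows_activated", "tools_invoked",
]

body = conversation_summaries_body(384)
print(json.dumps(body, indent=2))
```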
ensure_flows_collection()
Target: ProceduralMemory
- Creates a collection with a single dense vector named flow.
- Configures the bm25 sparse vector.
- Indexes payload fields: flow_id, agent_id, flow_name, and flow_type.
- Includes validation: If the collection exists but has a different vector_size, it raises a ValueError.
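The idempotent create-or-validate behavior can be sketched with an in-memory stand-in for the database. This is not the library's implementation: the `registry` dictionary, the `ensure_collection` helper, and the collection name "jazzmine_dev_flows" are hypothetical and only mirror the rule described above.

```python
from typing import Dict


def ensure_collection(existing: Dict[str, int], name: str, vector_size: int) -> bool:
    """Create `name` if it is missing; return True only when it was created."""
    if name not in existing:
        existing[name] = vector_size
        return True
    if existing[name] != vector_size:
        raise ValueError(
            f"collection {name!r} exists with vector size {existing[name]}, "
            f"expected {vector_size}"
        )
    return False  # already present with a matching size: nothing to do


registry: Dict[str, int] = {}
created_first = ensure_collection(registry, "jazzmine_dev_flows", 384)
created_again = ensure_collection(registry, "jazzmine_dev_flows", 384)

try:
    ensure_collection(registry, "jazzmine_dev_flows", 768)
    conflict = None
except ValueError as exc:
    conflict = str(exc)

print(created_first, created_again, conflict)
```

Calling the function repeatedly with the same size is harmless, while switching embedding models without recreating the collection fails loudly instead of corrupting search results.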
ensure_semantic_collection()
Target: SemanticMemory
- Note: This is a sparse-only collection; its dense vector configuration is left empty.
- Configures the bm25 sparse vector with the IDF modifier.
- Designed for terminology and jargon lookup where exact token overlap is the primary metric.
- Indexes payload fields: agent_id, key, and category.
5. Error Handling
- ValueError: Raised if vector_size is 0 or exceeds 65536, or if replication_factor/shard_number is less than 1. Also raised during ensure_* calls if a collection exists with a conflicting vector dimension.
- ConnectionError: Raised if the manager cannot establish a handshake with the Qdrant server at the provided url.
- RuntimeError: Raised if the manager fails to create payload indexes or if the collection existence check fails due to database‑side errors.
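The documented ValueError conditions for the constructor can be mirrored in a standalone checker. This is a sketch of the rules as stated above, not the actual validation code, and `validate_config` is a hypothetical name. It can be handy for failing fast in configuration tests before any connection is attempted.

```python
MAX_VECTOR_SIZE = 65536


def validate_config(vector_size: int, replication_factor: int = 1, shard_number: int = 1) -> None:
    """Raise ValueError for the constructor arguments documented as invalid."""
    if not 1 <= vector_size <= MAX_VECTOR_SIZE:
        raise ValueError(f"vector_size must be in 1..{MAX_VECTOR_SIZE}, got {vector_size}")
    if replication_factor < 1:
        raise ValueError(f"replication_factor must be >= 1, got {replication_factor}")
    if shard_number < 1:
        raise ValueError(f"shard_number must be >= 1, got {shard_number}")


attempts = [
    {"vector_size": 384},
    {"vector_size": 0},
    {"vector_size": 65537},
    {"vector_size": 384, "replication_factor": 0},
    {"vector_size": 384, "shard_number": 0},
]

results = []
for kwargs in attempts:
    try:
        validate_config(**kwargs)
        results.append("ok")
    except ValueError as exc:
        results.append(f"ValueError: {exc}")

for kwargs, outcome in zip(attempts, results):
    print(kwargs, "->", outcome)
```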
6. Remarks
- Deterministic Naming: Collection names are derived automatically. For example, if collection_prefix is "test", the episodic store will be named "test_conversation_summaries".
- Quantization Strategy:
  - Bit8 (Scalar, quantization=8): Best for most use cases; provides ~4× memory reduction with minimal accuracy loss.
  - Bit4 (Product, quantization=4): Provides 8× or more compression; use this only for extremely large datasets where RAM is the primary constraint.
- Sparse Vectors: The use of Modifier::Idf in sparse configurations is crucial. It ensures that rare terms (like specific project names) carry more weight in search results than common words.
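To see why the IDF modifier favors rare terms, the sketch below computes the standard BM25 inverse document frequency over a toy corpus. The formula is the textbook BM25 variant and is assumed here to approximate what the modifier computes; the corpus, the whitespace tokenizer, and the helper names are illustrative only.

```python
import math


def bm25_idf(num_docs: int, docs_with_term: int) -> float:
    """Standard BM25 IDF: rarer terms receive larger weights."""
    return math.log((num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0)


corpus = [
    "the deploy pipeline for project zephyr failed",
    "the agent restarted the pipeline",
    "the user asked about the weather",
    "the agent summarized the conversation",
]
# Naive whitespace tokenization, used only to count document frequencies.
tokenized = [set(doc.split()) for doc in corpus]


def document_frequency(term: str) -> int:
    """Count how many documents contain the term at least once."""
    return sum(term in doc for doc in tokenized)


idf_common = bm25_idf(len(corpus), document_frequency("the"))
idf_rare = bm25_idf(len(corpus), document_frequency("zephyr"))
print(f"idf('the')    = {idf_common:.3f}")
print(f"idf('zephyr') = {idf_rare:.3f}")
```

A project name that appears in one document out of four ends up weighted far more heavily than a word that appears in all of them, which is the behavior the remark above relies on.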