QdrantManager | Jazzmine Core

1. Behavior and Context

In the jazzmine framework, the QdrantManager is typically the first component initialized. It acts as an "Infrastructure Administrator" that:

Establishes and maintains the connection pool to the database.
Enforces schema consistency by validating vector sizes on startup.
Automates index creation: HNSW graphs for dense vectors and IDF-weighted sparse vector indexes for BM25-style keyword search.
Manages "Collection Prefixes" to allow multiple agents or environments (e.g., dev, prod) to share a single Qdrant instance without data collisions.

2. Purpose

Collection Provisioning: Idempotently creates the three core memory collections required for a task-performing agent.
Performance Tuning: Configures memory-level optimizations such as Product Quantization (PQ) and Scalar Quantization (SQ), which compress stored vectors.
Hybrid Search Readiness: Sets up both dense vector configurations and sparse vector modifiers (IDF) to support advanced retrieval strategies.
Data Integrity: Creates payload indexes for metadata fields (such as user_id, agent_id, and the timestamp fields) so that filtered queries stay fast.

3. High-Level API

The QdrantManager is exposed to Python via PyO3. It should be initialized at the start of your application and passed to the various Memory classes.

Example: Initializing the Infrastructure

python

import asyncio

from memory import QdrantManager


async def main() -> None:
    # 1. Connect to Qdrant and set global vector defaults
    manager = QdrantManager(
        url="http://localhost:6334",
        vector_size=384,          # Dimension of your embedding model (e.g. MiniLM)
        quantization=8,           # Use 8-bit scalar quantization
        distance_metric="cosine",
        collection_prefix="jazzmine_dev",
    )

    # 2. Provision the memory collections (idempotent calls)
    await manager.ensure_conversation_summaries_collection()
    await manager.ensure_flows_collection()
    await manager.ensure_semantic_collection()


# The ensure_* methods are awaited, so they must run inside an event loop
asyncio.run(main())

4. Detailed Functionality

QdrantManager(...) [Constructor]

Initializes the client and sets the policy for all subsequent collection creations.

Parameters:

Parameter	Type	Default	Description
url	str	Required	The endpoint of the Qdrant server.
vector_size	int	Required	The dimension of dense embeddings (1 to 65536).
use_indexing	bool	True	If True, builds HNSW indexes for fast retrieval.
quantization	Optional[int]	8	Compression level: 4 (Product Quantization), 8 (Scalar Quantization), or None (no quantization).
on_disk	bool	False	If True, stores vectors and payloads on disk to save RAM.
api_key	Optional[str]	None	Authentication key for protected Qdrant instances.
distance_metric	str	"cosine"	Method: "cosine", "euclidean", or "dot".
collection_prefix	str	""	String prepended to collection names.
replication_factor	int	1	Number of copies of each shard kept across nodes (high availability).
shard_number	int	1	Number of shards to distribute data across.
indexing_threshold	int	2000	Points required before background indexing triggers.
use_sparse_vectors	bool	True	Enables BM25 support via sparse vector configurations.

ensure_conversation_summaries_collection()

Target: EpisodicMemory

Creates a collection with two dense vectors: short_summary and long_summary.
Configures a sparse vector named bm25 with an IDF Modifier for hybrid keyword search.
Automatically indexes payload fields: user_id, agent_id, conversation_id, timestamp_begin, timestamp_end, flows_activated, and tools_invoked.

ensure_flows_collection()

Target: ProceduralMemory

Creates a collection with a single dense vector named flow.
Configures the bm25 sparse vector.
Indexes payload fields: flow_id, agent_id, flow_name, and flow_type.
Validates existing collections: if the collection already exists with a different vector_size, the call raises a ValueError.
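
The idempotent creation and size validation described above follow a common check-then-create pattern. The sketch below illustrates that pattern against a plain in-memory dictionary instead of a real Qdrant server. `FakeStore`, `ensure_collection`, and the collection name `demo_flows` are hypothetical names chosen for illustration; they are not part of the jazzmine API, and the real manager may implement the check differently.

```python
from typing import Dict


class FakeStore:
    """Hypothetical in-memory stand-in for a Qdrant server (illustration only)."""

    def __init__(self) -> None:
        # Maps collection name -> dense vector size
        self.collections: Dict[str, int] = {}


def ensure_collection(store: FakeStore, name: str, vector_size: int) -> bool:
    """Create the collection if it is missing; return True only when created.

    Raises ValueError when the collection already exists with a different
    vector size, mirroring the behaviour documented for ensure_flows_collection().
    """
    existing = store.collections.get(name)
    if existing is None:
        store.collections[name] = vector_size
        return True
    if existing != vector_size:
        raise ValueError(
            f"collection {name!r} has vector size {existing}, expected {vector_size}"
        )
    return False  # Already present and compatible: nothing to do


store = FakeStore()
created_first = ensure_collection(store, "demo_flows", 384)  # creates the collection
created_again = ensure_collection(store, "demo_flows", 384)  # no-op on the second call
```

Calling the helper a third time with a different size, for example `ensure_collection(store, "demo_flows", 768)`, raises `ValueError`, which is the situation to expect after switching embedding models without changing the prefix.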

ensure_semantic_collection()

Target: SemanticMemory

Note: This is a sparse-only collection; its dense vector configuration is empty.
Configures the bm25 sparse vector with the IDF modifier.
Designed for terminology and jargon lookup where exact token overlap is the primary metric.
Indexes payload fields: agent_id, key, and category.

5. Error Handling

ValueError: Raised if vector_size is 0 or exceeds 65536, or if replication_factor/shard_number is less than 1. Also raised during ensure_* calls if a collection exists with a conflicting vector dimension.
ConnectionError: Raised if the manager cannot connect to the Qdrant server at the provided url.
RuntimeError: Raised if the manager fails to create payload indexes or if the collection existence check fails due to database‑side errors.
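
The constructor-time ValueError rules listed above can also be checked in application code before a manager is built, for instance when settings come from a config file. The following is a minimal sketch of such a pre-flight check. `validate_settings` is a hypothetical helper, not part of QdrantManager, and it covers only the three limits stated in this section.

```python
MAX_VECTOR_SIZE = 65536  # Upper bound documented for vector_size


def validate_settings(
    vector_size: int,
    replication_factor: int = 1,
    shard_number: int = 1,
) -> None:
    """Raise ValueError for settings the documentation says are rejected."""
    if vector_size < 1 or vector_size > MAX_VECTOR_SIZE:
        raise ValueError(
            f"vector_size must be between 1 and {MAX_VECTOR_SIZE}, got {vector_size}"
        )
    if replication_factor < 1:
        raise ValueError(f"replication_factor must be >= 1, got {replication_factor}")
    if shard_number < 1:
        raise ValueError(f"shard_number must be >= 1, got {shard_number}")


# Valid settings pass silently
validate_settings(vector_size=384, replication_factor=1, shard_number=1)
```

Running the check early keeps configuration mistakes separate from ConnectionError and RuntimeError, which depend on the state of the Qdrant server.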

6. Remarks

Deterministic Naming: Collection names are derived automatically. For example, if collection_prefix is "test", the episodic store will be named "test_conversation_summaries".
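
The naming rule can be sketched as a small helper. Only the "test" to "test_conversation_summaries" mapping comes from this page; the function name `collection_name` is hypothetical, and the handling of an empty prefix (returning the base name unchanged) is an assumption.

```python
def collection_name(prefix: str, base: str) -> str:
    """Join prefix and base with an underscore.

    Assumption: an empty prefix leaves the base name unchanged.
    """
    return f"{prefix}_{base}" if prefix else base


episodic = collection_name("test", "conversation_summaries")
unprefixed = collection_name("", "conversation_summaries")
```

Because the result depends only on the prefix, two environments with different prefixes never address the same collection.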

Quantization Strategy:
Bit8 (Scalar, quantization=8): Best for most use cases; provides ~4× memory reduction with minimal accuracy loss.
Bit4 (Product, quantization=4): Provides roughly 8× or more compression; use it only for very large datasets where RAM is the primary constraint.
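
To make the ratios above concrete, the sketch below estimates raw dense-vector storage for one million 384-dimensional vectors. The point count is an arbitrary example and `dense_vector_bytes` is a hypothetical helper. The estimate covers vector data only: it ignores the HNSW graph, payloads, and any original vectors that may be kept alongside the quantized ones.

```python
def dense_vector_bytes(num_points: int, vector_size: int, bytes_per_dim: float = 4.0) -> float:
    """Approximate storage for dense vectors, excluding indexes and payloads."""
    return num_points * vector_size * bytes_per_dim


num_points = 1_000_000  # Example workload (assumption)
vector_size = 384       # Same dimension as the MiniLM example

raw = dense_vector_bytes(num_points, vector_size)               # float32: 4 bytes per dimension
scalar_8bit = dense_vector_bytes(num_points, vector_size, 1.0)  # int8: 1 byte per dimension
product_approx = raw / 8                                        # Using the ~8x figure quoted above

for label, size in [("float32", raw), ("8-bit scalar", scalar_8bit), ("product (~8x)", product_approx)]:
    print(f"{label:>14}: {size / 1e6:.0f} MB")
```

The 4× figure for scalar quantization follows directly from storing one byte per dimension instead of four; the product quantization figure is taken from the text as-is.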

Sparse Vectors: The use of Modifier::Idf in sparse configurations is crucial. It ensures that rare terms (like specific project names) carry more weight in search results than common words.
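
The effect of IDF weighting can be illustrated with the standard BM25 form of inverse document frequency. This is a sketch of the general formula, not a copy of Qdrant's internals, and the corpus size and document counts are made-up example values.

```python
import math


def bm25_idf(total_docs: int, docs_with_term: int) -> float:
    """BM25-style inverse document frequency for a single term."""
    return math.log((total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0)


total_docs = 10_000                    # Example corpus size (assumption)
rare = bm25_idf(total_docs, 3)         # e.g. a project name found in 3 documents
common = bm25_idf(total_docs, 9_000)   # e.g. a word found in 9,000 documents

print(f"rare term weight:   {rare:.3f}")
print(f"common term weight: {common:.3f}")
```

The rare term receives a much larger weight than the common one, which is why a query containing a specific project name is dominated by that term and not by filler words.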