ProceduralMemory

ProceduralMemory is the "skills and logic" repository of the jazzmine agent. It stores, manages, and retrieves "Flows": Standard Operating Procedures (SOPs) that define how an agent should handle specific tasks. Its hybrid retrieval system (dense embeddings plus BM25 keyword matching) lets the agent find the correct procedure whether the user's request is semantically similar to a stored flow or contains specific technical keywords.

1. Behavior and Context

In the jazzmine architecture, ProceduralMemory acts as the bridge between the agent's reasoning engine and its executable skills.

  • Skill Registry: It stores flow definitions, including their names, descriptions, conditions for use, and desired effects.
  • Hybrid Selection: It performs simultaneous searches across dense "intent" vectors and sparse "keyword" vectors, merging them via Reciprocal Rank Fusion (RRF).
  • Idempotency: Every flow is indexed by a flow_id, which is parsed into a UUID. Storing a flow again under the same flow_id therefore overwrites the old version rather than creating a duplicate. This holds only when flow_id is a valid UUID string (see Error Handling).
  • Lifecycle Management: It provides the list_flows and delete_flow methods, which the framework's sync_registry utility uses to keep the database in sync with the Python source code.

2. Purpose

  • Flow Discovery: Dynamically identifying which internal procedure matches a user's natural language goal.
  • Standardization: Ensuring the agent adheres to predefined logic and constraints (conditions) before executing a task.
  • Contextual Guardrails: Storing desired_effects so the agent knows exactly what the successful completion of a procedure looks like.
  • Synchronized Intelligence: Enabling the agent's capabilities to be updated by modifying the Python @tool or @flow definitions, with sync_registry keeping the stored flows aligned with the code.

3. High-Level API (Python)

ProceduralMemory is exposed to Python and requires a QdrantManager.

Example: Initialization, Flow Storage, and Recall

python
import asyncio

from memory import ProceduralMemory, QdrantManager


async def main():
    # 1. Set up infrastructure. vector_size matches the embedding dimension
    #    (1536 for text-embedding-3-small) to avoid a dimension mismatch.
    mgr = QdrantManager(url="http://localhost:6334", vector_size=1536)

    # 2. Initialize (remote OpenAI example)
    proc_mem = ProceduralMemory(
        qdrant_manager=mgr,
        tokenizer_path="./models/tokenizer.json",
        api_key="sk-...",
        provider="openai",
        model_name="text-embedding-3-small",
        hidden_size=1536,
    )

    # 3. Store a new flow (usually handled automatically by sync_registry)
    await proc_mem.memorize(
        embedding_text="Process a refund for a customer order based on order ID.",
        flow_id="refund_v1",  # see Error Handling for non-UUID IDs
        flow_name="OrderRefund",
        flow_type="transactional",
        description="Validates order eligibility and processes a refund to the original payment method.",
        condition="User wants their money back or has a defective item.",
        agent_id="support_bot_01",
        desired_effects=["Order marked as refunded", "Notification sent to user"],
        checksum="a1b2c3d4...",
    )

    # 4. Find the procedure that best matches the user's current request
    flows = await proc_mem.recall(
        query="I'd like to get a return for this broken laptop",
        agent_id="support_bot_01",
        top_k=2,
        rrf_k=60,
    )

    for flow in flows:
        print(f"Match: {flow['flow_name']} (Fusion Score: {flow['score']})")


# memorize and recall are coroutines, so they need a running event loop.
asyncio.run(main())

4. Detailed Functionality

ProceduralMemory(...) [Constructor]

Initializes the memory module. It sets up the Embedder service using either the provided local model directory or remote API credentials.

Parameters:

| Parameter      | Type              | Default  | Description                              |
|----------------|-------------------|----------|------------------------------------------|
| qdrant_manager | Py<QdrantManager> | Required | Provides the Qdrant connection.          |
| tokenizer_path | str               | Required | Path to the tokenizer for BM25 vectors.  |
| model_dir      | Optional[str]     | None     | Path for local ONNX models.              |
| quantized      | bool              | False    | Use INT8 local models.                   |
| api_key        | Optional[str]     | None     | API key for cloud embedding providers.   |
| provider       | str               | "openai" | Remote provider name.                    |
| hidden_size    | int               | 384      | Embedding dimensions.                    |

memorize(...) / update_memory(...)

Functionality: Indexes a procedure definition into the vector store.

  • Hybrid Embedding: Generates two vectors from embedding_text: a dense vector for the flow field and a sparse vector for the bm25 field.
  • Idempotent Update: Parses the flow_id string into a UUID. If a point with that ID is already present in the collection, Qdrant overwrites it in place instead of adding a second entry.
  • Metadata: Persists the full procedural context (conditions, effects, types) as payload for the agent to inspect after retrieval.

recall(...)

Functionality: Performs a hybrid search and merges the dense and sparse results with Reciprocal Rank Fusion (RRF).

  • Two‑Stage Search:
      • Searches the dense flow vector space for semantic similarity.
      • Searches the sparse bm25 index for keyword overlap.
  • RRF Fusion: Combines the two ranked lists into one. Flows that appear near the top of both lists receive significantly higher scores.
  • Scaling: The search uses a stage1_limit (4× the requested top_k) to ensure a diverse set of candidates is considered for the fusion stage.
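The fusion step can be illustrated with a minimal sketch. It assumes the standard RRF formulation, in which each list adds 1 / (rrf_k + rank) to a candidate's score with ranks starting at 1; the actual ranking convention used inside jazzmine or Qdrant may differ. The function name rrf_fuse and the sample flow IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, top_k=2, rrf_k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion (illustrative sketch)."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks start at 1 here; this is an assumption of the sketch.
        for rank, flow_id in enumerate(ranking, start=1):
            scores[flow_id] += 1.0 / (rrf_k + rank)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:top_k]


top_k = 2
stage1_limit = 4 * top_k  # each stage-1 search may return up to 8 candidates

# Hypothetical stage-1 results, best match first.
dense_hits = ["refund_v1", "exchange_v1", "cancel_v1"][:stage1_limit]
sparse_hits = ["warranty_v1", "refund_v1", "exchange_v1"][:stage1_limit]

fused = rrf_fuse([dense_hits, sparse_hits], top_k=top_k)
for flow_id, score in fused:
    print(f"{flow_id}: {score:.6f}")
```

Here refund_v1 ranks first because it appears near the top of both lists, while warranty_v1 drops out of the top two even though it leads the sparse list, since it is absent from the dense one.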

list_flows(agent_id)

Functionality: Returns a summary of all procedures stored for a specific agent.

  • Use Case: Used primarily by the sync_registry utility to compare the database state with the local code’s @flow definitions to detect changes.
  • Output: Returns a list of dictionaries containing flow_id, flow_name, flow_type, and checksum.
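The comparison that a sync utility might perform on this output can be sketched as follows. This is not the sync_registry implementation, only an illustration of a checksum-based diff; the function plan_sync and the sample data are hypothetical, and the only assumption about list_flows is the dictionary shape described above.

```python
def plan_sync(stored_flows, local_flows):
    """Compare stored flow summaries with local definitions.

    Both arguments are lists of dicts with at least "flow_id" and "checksum".
    Returns (to_upsert, to_delete) as sorted lists of flow_id values.
    """
    stored = {flow["flow_id"]: flow["checksum"] for flow in stored_flows}
    local = {flow["flow_id"]: flow["checksum"] for flow in local_flows}

    # New flows and flows whose checksum changed need to be (re)stored.
    to_upsert = sorted(fid for fid, chk in local.items() if stored.get(fid) != chk)
    # Flows that no longer exist in the code should be removed.
    to_delete = sorted(fid for fid in stored if fid not in local)
    return to_upsert, to_delete


# Hypothetical output of list_flows(agent_id) ...
stored_flows = [
    {"flow_id": "refund_v1", "flow_name": "OrderRefund",
     "flow_type": "transactional", "checksum": "aaa"},
    {"flow_id": "legacy_v1", "flow_name": "LegacyFlow",
     "flow_type": "transactional", "checksum": "zzz"},
]
# ... and hypothetical summaries built from the local @flow definitions.
local_flows = [
    {"flow_id": "refund_v1", "checksum": "bbb"},    # changed
    {"flow_id": "exchange_v1", "checksum": "ccc"},  # new
]

to_upsert, to_delete = plan_sync(stored_flows, local_flows)
print("upsert:", to_upsert)
print("delete:", to_delete)
```

The upsert list would then be passed to memorize and the delete list to delete_flow.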

5. Error Handling

  • PyRuntimeError (Inference): Raised if the local ONNX session panics or if a remote provider returns an HTTP error (e.g., 401 Unauthorized or 429 Rate Limit).
  • PyValueError: Raised if the hidden_size provided during initialization does not match the actual dimension of the existing Qdrant collection.
  • UUID Parsing: If a provided flow_id is not a valid UUID string, the system generates a random UUID v4 so the point can still be stored. Idempotency is lost in that case, because storing the same flow_id again produces a different UUID and therefore a new point.
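The UUID fallback described above can be mimicked in plain Python to show why non-UUID identifiers lose idempotency, and how a caller could avoid that. The helper stable_flow_uuid and the choice of UUID v5 with NAMESPACE_URL are assumptions made for this sketch; this reference does not state how jazzmine or sync_registry derive IDs.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID works,
# as long as it never changes between runs.
FLOW_NAMESPACE = uuid.NAMESPACE_URL


def parse_or_random(flow_id):
    """Mimic the documented fallback: parse the string, else use a random v4."""
    try:
        return uuid.UUID(flow_id)
    except ValueError:
        return uuid.uuid4()


def stable_flow_uuid(flow_id):
    """Derive the same UUID string for the same name every time (UUID v5)."""
    return str(uuid.uuid5(FLOW_NAMESPACE, flow_id))


# A human-readable ID is not a valid UUID, so each call yields a new random ID.
first = parse_or_random("refund_v1")
second = parse_or_random("refund_v1")
print(first == second)  # False: repeated stores would create separate points

# Converting the name to a UUID string up front keeps the ID stable.
stable = stable_flow_uuid("refund_v1")
print(parse_or_random(stable) == parse_or_random(stable))  # True
```

Passing the stable string as flow_id means the parse step succeeds and repeated memorize calls target the same point.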

6. Remarks

  • Why Hybrid? Procedural memory often involves technical jargon (e.g., "SQL", "API", "Refund"). Dense vectors are great for intent, but sparse vectors (BM25) are superior for ensuring that when a user mentions a specific technical procedure name, it is retrieved correctly.
  • rrf_k Parameter: The rrf_k parameter (default 60) acts as a smoothing factor. In the standard RRF formulation, each list contributes 1 / (rrf_k + rank) to a flow's score. Higher values flatten the difference between ranks, so items lower in the rankings carry relatively more weight; lower values concentrate the score on items at the very top of the lists.
  • Performance: All search operations are executed concurrently using tokio::try_join!, minimizing latency for real‑time agent responses.
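The smoothing effect of rrf_k can be checked with a few lines of arithmetic. The sketch assumes the standard RRF contribution of 1 / (rrf_k + rank) with ranks starting at 1, and compares how much more a rank-1 hit is worth than a rank-10 hit for several values of rrf_k. The helper names are hypothetical.

```python
def rrf_contribution(rank, rrf_k):
    """Score that a single ranked list contributes for an item at `rank`."""
    return 1.0 / (rrf_k + rank)


def top_vs_tenth_ratio(rrf_k):
    """How many times more a rank-1 hit is worth than a rank-10 hit."""
    return rrf_contribution(1, rrf_k) / rrf_contribution(10, rrf_k)


for k in (1, 60, 1000):
    print(f"rrf_k={k}: rank 1 is worth {top_vs_tenth_ratio(k):.2f}x rank 10")
# rrf_k=1: rank 1 is worth 5.50x rank 10
# rrf_k=60: rank 1 is worth 1.15x rank 10
# rrf_k=1000: rank 1 is worth 1.01x rank 10
```

With a small rrf_k the top position dominates; with a large one, appearing in both lists matters far more than the exact position in either.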