1. Behavior and Context
In the jazzmine architecture, SemanticMemory acts as the agent's "Living Glossary."
- Keyword Saliency: It uses BM25 with IDF (Inverse Document Frequency) weighting. This naturally handles term rarity: uncommon internal project names score highly when mentioned, while common stop-words are downweighted.
- Deterministic Identity: Entry IDs are computed using a UUID5 hash of the agent_id and the lowercased key. This makes the memory inherently idempotent: upserting the same key twice simply overwrites the previous definition, and deleting a term requires no prior search.
- Lightweight Footprint: Since it only requires a tokenizer and no heavy ONNX models or API calls for embedding, it is the fastest and most resource-light memory component in the framework.
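To make the Keyword Saliency point concrete, here is a minimal, self-contained sketch of Okapi BM25 scoring. It is not the jazzmine implementation: the whitespace tokenizer, the k1 and b constants, the function names, and the sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace splitting stands in for the real tokenizer.
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Score every document against the query with classic Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            # Rare terms get a large IDF; terms found in most documents get a small one.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(toks) / avg_len))
            score += idf * norm
        scores.append(score)
    return scores


docs = [
    "jazzmine is the internal agent framework",
    "the token is the session credential",
    "the glossary is the source of definitions",
]
scores = bm25_scores("the jazzmine", docs)
print(scores)
```

In this toy corpus the word "the" appears in every document and contributes little, while "jazzmine" appears in only one and dominates that document's score. A query with no overlapping terms scores zero everywhere.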
2. Purpose
- Jargon Resolution: Providing the agent with exact definitions for industry or company-specific terms that may not exist in a general-purpose LLM's training data.
- Alias Mapping: Linking multiple surface forms (e.g., "JSON Web Token" and "jwt") to a single canonical definition.
- Deterministic Knowledge Management: Allowing for reliable "forgetting" or updating of specific terminology by key name.
- Contextual Grounding: Ensuring the agent uses the correct internal terminology when generating responses.
3. High-Level API (Python)
SemanticMemory is exposed to Python and requires a QdrantManager and a path to a standard tokenizer file.
Example: Defining and Retrieving Jargon
import asyncio

from memory import SemanticMemory, QdrantManager

async def main():
    # 1. Set up infrastructure
    mgr = QdrantManager(url="http://localhost:6334", vector_size=384)
    # 2. Initialize Semantic Memory (no model_dir needed)
    semantic = SemanticMemory(
        qdrant_manager=mgr,
        tokenizer_path="./models/tokenizer.json",
    )
    # 3. Memorize a term with aliases
    await semantic.memorize(
        key="PII",
        value="Personally Identifiable Information",
        category="abbreviation",
        agent_id="compliance_bot",
        aliases=["private data", "sensitive info"],
        description="Any data that could be used to identify a specific individual.",
    )
    # 4. Recall based on a query
    results = await semantic.recall(
        query="How do we handle private data?",
        agent_id="compliance_bot",
        top_k=3,
        score_threshold=1.5,
    )
    for term in results:
        print(f"Found: {term['key']} -> {term['value']} (BM25 Score: {term['score']})")

# memorize() and recall() are awaited, so run them inside an event loop
asyncio.run(main())
4. Detailed Functionality
SemanticMemory(qdrant_manager, tokenizer_path) [Constructor]
Initializes the memory module. It loads the tokenizer from the filesystem.
- Tokenizer Requirement: This should be the same tokenizer used by your other memory modules to ensure token consistency, though SemanticMemory uses it strictly for term frequency counting.
memorize(...)
Functionality: Stores or overwrites a term definition.
- Indexing Text: It joins the key, value, every alias, and the description into a single block of text for tokenization. This maximizes recall; searching for a word in the description will still surface the key.
- Deterministic ID: Generates a UUID5 from "{agent_id}::{key.to_lowercase()}".
- Wait for Consistency: Uses wait=true during the Qdrant upsert to ensure the term is searchable immediately after the call returns.
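The first two steps of memorize can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the UUID5 namespace is not given in this document, so NAMESPACE_OID is a placeholder, and the single-space join and the helper names entry_id and indexing_text are hypothetical.

```python
import uuid

# Assumption: the namespace jazzmine actually uses is not stated in this
# document; NAMESPACE_OID is only a placeholder.
NAMESPACE = uuid.NAMESPACE_OID


def entry_id(agent_id: str, key: str) -> str:
    """Derive a stable ID from "{agent_id}::{lowercased key}"."""
    return str(uuid.uuid5(NAMESPACE, f"{agent_id}::{key.lower()}"))


def indexing_text(key: str, value: str, aliases: list[str], description: str) -> str:
    """Join every searchable field into one block of text for tokenization."""
    # Assumption: fields are joined with single spaces and empty fields are skipped.
    parts = [key, value, *aliases, description]
    return " ".join(p for p in parts if p)


text = indexing_text(
    key="PII",
    value="Personally Identifiable Information",
    aliases=["private data", "sensitive info"],
    description="Any data that could be used to identify a specific individual.",
)
print(text)
print(entry_id("compliance_bot", "PII"))
```

Because the key is lowercased before hashing, "PII" and "pii" resolve to the same ID for a given agent, while the same key under a different agent_id yields a different ID.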
recall(query, agent_id, top_k=5, score_threshold=None)
Functionality: Retrieves the most relevant term definitions.
- Scoring: Results are ranked by BM25 score.
- Filtering: Strictly scoped to the provided agent_id.
- Thresholding: Setting score_threshold is highly recommended for SemanticMemory. A threshold between 1.0 and 3.0 usually filters out irrelevant "near-miss" keyword matches.
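The interaction between ranking, score_threshold, and top_k can be sketched as a post-processing step over scored hits. This is a rough illustration only: the helper name apply_threshold, the inclusive comparison, and the sample scores are assumptions, and the real filtering may happen inside Qdrant instead of in Python.

```python
def apply_threshold(hits, top_k=5, score_threshold=None):
    """Rank hits by score, drop weak matches, and keep at most top_k."""
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
    if score_threshold is not None:
        # Assumption: the threshold is an inclusive lower bound.
        ranked = [h for h in ranked if h["score"] >= score_threshold]
    return ranked[:top_k]


# Hypothetical scores, chosen only to show the effect of the threshold.
hits = [
    {"key": "PII", "score": 4.2},
    {"key": "API", "score": 0.6},
    {"key": "JWT", "score": 1.8},
]
kept = apply_threshold(hits, top_k=3, score_threshold=1.5)
print([h["key"] for h in kept])
```

With no threshold, all three hits would be returned, including the weak "API" match that only shares an incidental keyword with the query.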
forget(key, agent_id) / forget_many(keys, agent_id)
Functionality: Permanently deletes definitions.
- Efficiency: Because IDs are deterministic, these methods do not perform a search. They compute the target UUID(s) and send a direct delete command to Qdrant.
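The search-free delete can be illustrated with a toy in-memory store. The class below is a hypothetical stand-in for the Qdrant collection, not part of jazzmine, and the UUID5 namespace is again a placeholder assumption.

```python
import uuid


class InMemoryGlossary:
    """Toy stand-in for a point store addressed by deterministic IDs."""

    def __init__(self) -> None:
        self._points: dict[str, dict] = {}

    @staticmethod
    def _point_id(agent_id: str, key: str) -> str:
        # Assumption: NAMESPACE_OID is a placeholder namespace.
        return str(uuid.uuid5(uuid.NAMESPACE_OID, f"{agent_id}::{key.lower()}"))

    def memorize(self, key: str, value: str, agent_id: str) -> None:
        # The same agent and key always map to the same ID, so this is an upsert.
        self._points[self._point_id(agent_id, key)] = {
            "key": key, "value": value, "agent_id": agent_id,
        }

    def forget_many(self, keys: list[str], agent_id: str) -> None:
        # No lookup by content: recompute each ID and delete it directly.
        for key in keys:
            self._points.pop(self._point_id(agent_id, key), None)

    def entry_count(self, agent_id: str) -> int:
        return sum(1 for p in self._points.values() if p["agent_id"] == agent_id)


store = InMemoryGlossary()
store.memorize("PII", "Personal info", "compliance_bot")
store.memorize("pii", "Personally Identifiable Information", "compliance_bot")
count_after_upserts = store.entry_count("compliance_bot")
store.forget_many(["Pii", "not-stored"], "compliance_bot")
count_after_forget = store.entry_count("compliance_bot")
print(count_after_upserts, count_after_forget)
```

The second memorize call overwrites the first because both keys hash to the same ID, and forgetting a key that was never stored is a harmless no-op in this sketch.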
list_all(agent_id, category=None)
Functionality: Scrolls through all definitions for an agent.
- Sorting: Returns a list of dictionaries sorted alphabetically by key.
- Filtering: Can optionally return only terms within a specific category (e.g., "technical").
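The shape of the list_all result can be sketched over a plain list of dictionaries. The helper name list_entries and the sample entries are hypothetical, and the case-insensitive ordering is an assumption: this document only says the result is sorted alphabetically by key.

```python
def list_entries(entries, category=None):
    """Return entries sorted by key, optionally restricted to one category."""
    selected = [e for e in entries if category is None or e["category"] == category]
    # Assumption: ordering ignores case, so "k8s" sorts between "JWT" and "PII".
    return sorted(selected, key=lambda e: e["key"].lower())


entries = [
    {"key": "k8s", "category": "technical"},
    {"key": "PII", "category": "abbreviation"},
    {"key": "JWT", "category": "technical"},
]
print([e["key"] for e in list_entries(entries)])
print([e["key"] for e in list_entries(entries, category="technical")])
```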
entry_count(agent_id)
Functionality: Returns the exact number of terms stored for a specific agent.
5. Error Handling
- PyRuntimeError (Tokenization): Raised if the input string cannot be processed by the tokenizer (rare) or if the tokenizer file is corrupted.
- PyRuntimeError (Qdrant): Raised if a batch delete or scroll operation fails because of a network error or a database-side timeout.
- Case Insensitivity: The system automatically lowercases the key for ID generation, but preserves the original casing in the payload for display.
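On the Python side, these errors can be handled with an ordinary try/except. The sketch below assumes the bindings are built with PyO3, where PyRuntimeError surfaces as the built-in RuntimeError. The with_retries helper and the flaky_recall stand-in are hypothetical and only simulate a transient Qdrant failure.

```python
import asyncio


async def with_retries(operation, attempts=3, base_delay=0.5):
    """Await operation() and retry on RuntimeError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except RuntimeError:
            if attempt == attempts:
                raise  # Out of attempts: let the caller see the error.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical stand-in for semantic.recall(...) that fails twice, then succeeds.
calls = {"count": 0}


async def flaky_recall():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated Qdrant timeout")
    return [{"key": "PII", "score": 4.2}]


result = asyncio.run(with_retries(flaky_recall, base_delay=0.01))
print(result)
```

In real use the operation would be something like lambda: semantic.recall(query=..., agent_id=...). Retrying only helps with transient Qdrant failures. A tokenization error is deterministic and will fail the same way on every attempt, and since both kinds share one exception type, telling them apart would require inspecting the error message.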
6. Remarks
- Why No Dense Vectors? SemanticMemory handles "entity-like" data. With dense vectors, a search for "Session Token" might retrieve the entry for "JWT": the two are conceptually similar, but in a technical context they are different things. BM25 ensures that if the user did not mention the term or one of its aliases, the agent does not guess incorrectly.
- Alias Power: Use aliases extensively. Adding "k8s" as an alias for "Kubernetes" allows SemanticMemory to act as a bridge between user shorthand and formal documentation.
- Memory Usage: This is the most memory‑efficient module. Thousands of terms can be stored and searched with negligible RAM usage on the agent side, as only the tokenizer is resident in memory.