Memory
Core reference

SemanticMemory

SemanticMemory is a specialized, high-efficiency memory module for storing domain-specific terminology, technical abbreviations, and company jargon. Unlike the Episodic and Procedural modules, SemanticMemory is BM25-only (sparse). It intentionally avoids dense embeddings in order to prioritize exact and near-exact keyword matching, so that unique terms like "JWT", "KYC", or "k8s" are retrieved precisely, without the "semantic noise" that dense vector models often introduce.

1. Behavior and Context

In the jazzmine architecture, SemanticMemory acts as the agent's "Living Glossary."

  • Keyword Saliency: It uses BM25 with IDF (Inverse Document Frequency) weighting. This naturally handles term rarity: uncommon internal project names score highly when mentioned, while common stop-words are downweighted.
  • Deterministic Identity: Entry IDs are computed using a UUID5 hash of the agent_id and the lowercased key. This makes the memory inherently idempotent: upserting the same key twice simply overwrites the previous definition, and deleting a term requires no prior search.
  • Lightweight Footprint: Since it only requires a tokenizer and no heavy ONNX models or API calls for embedding, it is the fastest and most resource-light memory component in the framework.
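The IDF weighting behind "Keyword Saliency" can be sketched in a few lines. This is an illustrative toy, not the module's implementation; the corpus is invented, and the formula is the standard BM25 IDF variant (an assumption about which variant the module uses):

```python
import math

def bm25_idf(term: str, corpus: list[list[str]]) -> float:
    """Standard BM25 IDF: rare terms score high, ubiquitous terms near zero."""
    n = len(corpus)
    df = sum(1 for doc in corpus if term in doc)  # document frequency
    return math.log((n - df + 0.5) / (df + 0.5) + 1)

# Toy corpus of tokenized glossary entries
corpus = [
    ["jwt", "json", "web", "token", "auth"],
    ["kyc", "know", "your", "customer", "compliance"],
    ["k8s", "kubernetes", "container", "orchestration"],
    ["auth", "login", "session", "token"],
]

# A rare internal term outweighs a more common one
print(bm25_idf("kyc", corpus))   # high: appears in 1 of 4 entries
print(bm25_idf("auth", corpus))  # lower: appears in 2 of 4 entries
```

This is why an uncommon project name dominates the ranking the moment it appears in a query, with no tuning required.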

2. Purpose

  • Jargon Resolution: Providing the agent with exact definitions for industry or company-specific terms that may not exist in a general-purpose LLM's training data.
  • Alias Mapping: Linking multiple surface forms (e.g., "JSON Web Token" and "jwt") to a single canonical definition.
  • Deterministic Knowledge Management: Allowing for reliable "forgetting" or updating of specific terminology by key name.
  • Contextual Grounding: Ensuring the agent uses the correct internal terminology when generating responses.

3. High-Level API (Python)

SemanticMemory is exposed to Python and requires a QdrantManager and a path to a standard tokenizer file.

Example: Defining and Retrieving Jargon

```python
from memory import SemanticMemory, QdrantManager

# 1. Setup Infrastructure
mgr = QdrantManager(url="http://localhost:6334", vector_size=384)

# 2. Initialize Semantic Memory (No model_dir needed!)
semantic = SemanticMemory(
    qdrant_manager=mgr,
    tokenizer_path="./models/tokenizer.json"
)

# 3. Memorize a term with aliases
await semantic.memorize(
    key="PII",
    value="Personally Identifiable Information",
    category="abbreviation",
    agent_id="compliance_bot",
    aliases=["private data", "sensitive info"],
    description="Any data that could be used to identify a specific individual."
)

# 4. Recall based on a query
results = await semantic.recall(
    query="How do we handle private data?",
    agent_id="compliance_bot",
    top_k=3,
    score_threshold=1.5
)

for term in results:
    print(f"Found: {term['key']} -> {term['value']} (BM25 Score: {term['score']})")
```

4. Detailed Functionality

SemanticMemory(qdrant_manager, tokenizer_path) [Constructor]

Initializes the memory module. It loads the tokenizer from the filesystem.

  • Tokenizer Requirement: This should be the same tokenizer used by your other memory modules to ensure token consistency, though SemanticMemory uses it strictly for term frequency counting.
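What "term frequency counting" means in practice can be sketched with the standard library. The real module counts tokens produced by the loaded tokenizer.json; the whitespace split below is a stand-in assumption for illustration only:

```python
from collections import Counter

def term_frequencies(text: str) -> Counter:
    """Tokenize (crudely, by whitespace) and count occurrences of each token.
    BM25 only needs these counts -- no embeddings, no model inference."""
    tokens = text.lower().split()
    return Counter(tokens)

tf = term_frequencies("PII means Personally Identifiable Information. PII is sensitive.")
print(tf["pii"])  # 2
```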

memorize(...)

Functionality: Stores or overwrites a term definition.

  • Indexing Text: It joins the key, value, every alias, and the description into a single block of text for tokenization. This maximizes recall; searching for a word in the description will still surface the key.
  • Deterministic ID: Generates a UUID5 from the string "{agent_id}::{key}", with the key lowercased first.
  • Wait for Consistency: Uses wait=true during the Qdrant upsert to ensure the term is searchable immediately after the call returns.
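The deterministic ID scheme can be sketched with the standard library. The namespace the module actually uses is not specified here, so `uuid.NAMESPACE_DNS` below is a placeholder assumption; the idempotency property holds for any fixed namespace:

```python
import uuid

def entry_id(agent_id: str, key: str) -> uuid.UUID:
    """Deterministic UUID5 over agent + lowercased key.
    NAMESPACE_DNS is a stand-in; any fixed namespace preserves idempotency."""
    return uuid.uuid5(uuid.NAMESPACE_DNS, f"{agent_id}::{key.lower()}")

# Same agent + same key (any casing) -> same ID, so upserts overwrite in place
assert entry_id("compliance_bot", "PII") == entry_id("compliance_bot", "pii")
# Different agent -> different ID, so each agent's glossary stays isolated
assert entry_id("compliance_bot", "PII") != entry_id("sales_bot", "PII")
```

This is also what makes forget/forget_many searchless: the target ID can always be recomputed from the key alone.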

recall(query, agent_id, top_k=5, score_threshold=None)

Functionality: Retrieves the most relevant term definitions.

  • Scoring: Results are ranked by BM25 score.
  • Filtering: Strictly scoped to the provided agent_id.
  • Thresholding: The score_threshold is highly recommended for SemanticMemory. A score of 1.0 to 3.0 usually filters out irrelevant “near‑miss” keyword matches.
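How BM25 scores interact with score_threshold can be shown end-to-end on a toy corpus. The scoring function below is the standard BM25 formula with conventional k1/b defaults, and the corpus is invented; the module's internal parameters may differ:

```python
import math

def bm25_score(query: list[str], doc: list[str], corpus: list[list[str]],
               k1: float = 1.2, b: float = 0.75) -> float:
    """Standard BM25: sum of IDF-weighted, length-normalized term frequencies."""
    n = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n
    score = 0.0
    for term in query:
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)
        score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
    return score

corpus = [
    ["pii", "personally", "identifiable", "information", "private", "data"],
    ["jwt", "json", "web", "token"],
    ["k8s", "kubernetes", "container", "orchestration"],
]

query = ["private", "data"]
scored = [(doc, bm25_score(query, doc, corpus)) for doc in corpus]
# Apply a score_threshold: only genuine keyword matches survive
hits = [(doc, s) for doc, s in scored if s >= 1.0]
print(hits)  # only the PII entry matches
```

Entries that share no tokens with the query score exactly zero, which is why even a modest threshold cleanly separates real matches from noise.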

forget(key, agent_id) / forget_many(keys, agent_id)

Functionality: Permanently deletes definitions.

  • Efficiency: Because IDs are deterministic, these methods do not perform a search. They compute the target UUID(s) and send a direct delete command to Qdrant.

list_all(agent_id, category=None)

Functionality: Scrolls through all definitions for an agent.

  • Sorting: Returns a list of dictionaries sorted alphabetically by key.
  • Filtering: Can optionally return only terms within a specific category (e.g., "technical").

entry_count(agent_id)

Functionality: Returns the exact number of terms stored for a specific agent.


5. Error Handling

  • PyRuntimeError (Tokenization): Raised if the provided string is incompatible with the tokenizer (rare) or if the tokenizer file is corrupted.
  • PyRuntimeError (Qdrant): Raised if the batch delete or scroll operations fail due to network or database‑side timeouts.
  • Case Insensitivity: The system automatically lowercases the key for ID generation, but preserves the original casing in the payload for display.

6. Remarks

  • Why No Dense Vectors? SemanticMemory handles “Entity‑like” data. If you use dense vectors for "JWT", a search for "Session Token" might retrieve it. While conceptually similar, in a technical context, they are different things. BM25 ensures that if the user didn’t mention the term or its specific aliases, the agent doesn’t guess incorrectly.
  • Alias Power: Use aliases extensively. Adding "k8s" as an alias for "Kubernetes" allows SemanticMemory to act as a bridge between user shorthand and formal documentation.
  • Memory Usage: This is the most memory‑efficient module. Thousands of terms can be stored and searched with negligible RAM usage on the agent side, as only the tokenizer is resident in memory.