Working Memory System | Jazzmine Core

The Working Memory Stack

The system is divided into four distinct layers:

Enums: Defining the state machine boundaries.
Records: The atomic data units (Tool calls, slots, entity entries).
State: The live aggregate objects representing a conversation.
Stores: The persistence drivers (In-Memory for dev, Redis for production).

I. The State Machine

1. FlowStatus

Introduction

FlowStatus represents the fine-grained lifecycle of a specific task (Flow) currently being executed by the agent.

Purpose

It governs the internal logic of the ActiveFlowState. It tells the system whether a skill is currently running, waiting for data, or completed.

Behavior and Context

Transition logic: A flow typically starts as ACTIVE. If the agent realizes a parameter is missing, it moves to COLLECTING_SLOTS. Before a dangerous action, it moves to AWAITING_CONFIRMATION.
Terminal States: Once a flow reaches COMPLETED, CANCELLED, or FAILED, it is archived into a FlowRecord and removed from active memory.
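The terminal-state rule above can be sketched as a small enum plus a membership check. The member names come from the text; the string values, TERMINAL_STATUSES, and is_terminal are illustrative assumptions, not the library's actual definitions.

```python
from enum import Enum


class FlowStatus(str, Enum):
    # Member names follow the text above; the string values are assumed.
    ACTIVE = "active"
    COLLECTING_SLOTS = "collecting_slots"
    AWAITING_CONFIRMATION = "awaiting_confirmation"
    COMPLETED = "completed"
    CANCELLED = "cancelled"
    FAILED = "failed"


# Hypothetical helper: statuses after which a flow leaves active memory.
TERMINAL_STATUSES = frozenset(
    {FlowStatus.COMPLETED, FlowStatus.CANCELLED, FlowStatus.FAILED}
)


def is_terminal(status: FlowStatus) -> bool:
    """Return True once the flow should be archived into a FlowRecord."""
    return status in TERMINAL_STATUSES
```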

2. ConversationStatus

Introduction

ConversationStatus is a high-level status that summarizes the overall state (the "mood") of the conversation thread.

Purpose

This is the primary signal used by the Agent Router and Prompt Builder. It determines which specific instructions (e.g., "ask for the next slot") are injected into the LLM's system prompt.

Values

IDLE: The agent is waiting for a new request.
IN_FLOW: A skill is currently being executed.
COLLECTING_SLOTS: The agent is in "Form-Filling" mode.
AWAITING_CONFIRMATION: The agent is waiting for a "Yes/No" to proceed.
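To illustrate how a Prompt Builder might use this signal, here is a minimal sketch that maps each status to one instruction line. The instruction strings, STATUS_INSTRUCTIONS, and instruction_for are hypothetical; the real prompt text is not shown in this document.

```python
from enum import Enum


class ConversationStatus(str, Enum):
    # Member names follow the list above; the string values are assumed.
    IDLE = "idle"
    IN_FLOW = "in_flow"
    COLLECTING_SLOTS = "collecting_slots"
    AWAITING_CONFIRMATION = "awaiting_confirmation"


# Illustrative instruction text only.
STATUS_INSTRUCTIONS = {
    ConversationStatus.IDLE: "Wait for a new request.",
    ConversationStatus.IN_FLOW: "Continue the current skill.",
    ConversationStatus.COLLECTING_SLOTS: "Ask for the next slot.",
    ConversationStatus.AWAITING_CONFIRMATION: "Ask the user to confirm with yes or no.",
}


def instruction_for(status: ConversationStatus) -> str:
    """Pick the system-prompt instruction for the current status."""
    return STATUS_INSTRUCTIONS[status]
```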

II. Atomic Data Blocks

1. ToolCallRecord

Introduction

Records the details of a single tool invocation within a sandbox environment.

Purpose

It provides the agent with a memory of its own technical actions. It distinguishes between the Raw Result (data for logic) and the Result Summary (text for the LLM).

High-Level API

python

record = ToolCallRecord(
    tool_name="get_inventory",
    arguments={"sku": "LAP-123"},
    result={"count": 5, "warehouse": "A1"},
    result_summary="Found 5 units in Warehouse A1.",
    success=True
)

Detailed Functionality

to_prompt_xml():

Converts the record into an XML fragment. It includes only the result_summary, not the raw result, so the LLM is not overwhelmed by large JSON payloads.

Contextual Tracking:

Stores step (for Sequential flows) and branch (for Parallel flows) to help the agent understand where in a complex process the tool was called.
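The summary-only rendering can be sketched as follows. ToolCallSketch is a stand-in, not the real ToolCallRecord; the tag name, attribute names, and the types of step and branch are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional
from xml.sax.saxutils import escape, quoteattr


@dataclass
class ToolCallSketch:
    tool_name: str
    arguments: dict
    result: Any                    # raw data, kept for program logic
    result_summary: str            # short text, the only part shown to the LLM
    success: bool
    step: Optional[int] = None     # assumed: position in a sequential flow
    branch: Optional[str] = None   # assumed: branch name in a parallel flow

    def to_prompt_xml(self) -> str:
        attrs = [("name", self.tool_name), ("success", str(self.success).lower())]
        if self.step is not None:
            attrs.append(("step", str(self.step)))
        if self.branch is not None:
            attrs.append(("branch", self.branch))
        rendered = " ".join(f"{key}={quoteattr(value)}" for key, value in attrs)
        # The raw result is deliberately left out of the fragment.
        return f"<tool_call {rendered}>{escape(self.result_summary)}</tool_call>"
```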

2. Slot

Introduction

Represents a single variable (parameter) the agent needs to collect to satisfy a flow's requirements.

Behavior and Context

Form Filling: When a flow is active, the agent checks pending_slots to see which values it still needs to ask for.
Validation: If a user provides an invalid value, the validation_error field is populated. The agent will then see this error and re-ask the user with specific feedback.
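The validation loop described above might look like the sketch below. SlotSketch, fill_slot, and pending_slots are hypothetical stand-ins; only the validation_error field and the idea of pending slots come from the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class SlotSketch:
    name: str
    required: bool = True
    value: Any = None
    validation_error: Optional[str] = None

    @property
    def is_filled(self) -> bool:
        return self.value is not None and self.validation_error is None


def fill_slot(slot: SlotSketch, raw: Any,
              validator: Callable[[Any], Optional[str]]) -> bool:
    """Try to store raw; validator returns an error message or None."""
    error = validator(raw)
    if error is not None:
        # Keep the error so the agent can re-ask with specific feedback.
        slot.validation_error = error
        return False
    slot.value = raw
    slot.validation_error = None
    return True


def pending_slots(slots: list) -> list:
    """Required slots that still need a valid value."""
    return [s for s in slots if s.required and not s.is_filled]
```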

3. PendingConfirmation

Introduction

A safety gate for irreversible or destructive actions (e.g., payments, deletions).

Purpose

To hold a "Yes/No" request in memory. If the user drifts to a different topic, the request will automatically expire after timeout_turns (default 3), ensuring the agent doesn't execute a stale action.
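A minimal sketch of the expiry rule, assuming the store counts unanswered turns. The turns_waited counter, the tick method, and the choice to expire exactly when the count reaches timeout_turns are assumptions; only timeout_turns and its default of 3 come from the text.

```python
from dataclasses import dataclass


@dataclass
class PendingConfirmationSketch:
    prompt: str
    timeout_turns: int = 3   # default stated in the text
    turns_waited: int = 0    # hypothetical counter

    def tick(self) -> bool:
        """Count one turn without an answer; return True if now expired."""
        self.turns_waited += 1
        return self.turns_waited >= self.timeout_turns
```

A store could call tick() once per turn while the request is unanswered and drop the request as soon as it returns True.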

4. WorkspaceEntity

Introduction

The cross-turn coreference bridge.

Purpose

It allows the agent to resolve "it," "that," or "him" across turns. If "Order #555" was mentioned 3 turns ago, it remains in the workspace.

Mention Count: Tracks how often an entity is discussed.
Attribute Merging: If an entity is mentioned again with new details, the workspace performs a union merge of the attributes.
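The mention count and union merge can be sketched like this. EntitySketch and its field names are hypothetical, and the conflict rule (the newer value wins when both mentions set the same key) is an assumption the text does not confirm.

```python
from dataclasses import dataclass, field


@dataclass
class EntitySketch:
    entity_id: str
    attributes: dict = field(default_factory=dict)
    mention_count: int = 1
    last_turn: int = 0

    def merge_mention(self, new_attributes: dict, turn: int) -> None:
        # Union merge: keep existing keys and add the new ones.
        # Assumed: on a key conflict, the newer value replaces the older one.
        self.attributes = {**self.attributes, **new_attributes}
        self.mention_count += 1
        self.last_turn = turn
```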

III. Live Context

1. ActiveFlowState

Introduction

The complete execution context for the "Skill" currently in progress.

Behavior and Context

Serialization Safety: It never stores live Python objects. The flow itself is referenced only by flow_id and flow_name, so the state can be saved to Redis and reloaded without losing progress.
Flow Types: It dynamically handles SequentialFlow (tracking current_step) and ParallelFlow (tracking branch_results).
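To show why storing only identifiers and plain data matters, here is a sketch of a JSON round trip. ActiveFlowSketch, to_json, and from_json are illustrative; the field names flow_id, flow_name, current_step, and branch_results come from the text, while the status field and all types are assumed.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ActiveFlowSketch:
    flow_id: str
    flow_name: str
    status: str = "active"
    current_step: Optional[int] = None                   # sequential flows
    branch_results: dict = field(default_factory=dict)   # parallel flows

    def to_json(self) -> str:
        # Works because every field is plain JSON-compatible data.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "ActiveFlowSketch":
        return cls(**json.loads(payload))
```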

2. TurnCache

Introduction

A temporary data container that is populated at the very beginning of a turn.

Purpose

It caches the results of heavy operations (Episodic recall, Semantic recall, Flow selection). Instead of passing these results through every function, the agent reads them once from the cache.

Lifecycle: The TurnCache is strictly per-turn. It is completely replaced when the next user message arrives.

3. WorkingMemory

Introduction

The root aggregate object for a single conversation.

Detailed Accessors (The "Routing Surface")

is_in_flow: Boolean helper to check if the agent is busy with a task.
salient_entities: Returns entities sorted by recency, providing the LLM with "Coreference Hints."
build_prompt_tool_sections(): Assembles the XML context. It prevents redundancy by ensuring that if a tool was just called this turn, it isn't duplicated in the long-term flow history section.
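The first two accessors can be sketched as read-only properties. WorkingMemorySketch, Entity, and the active_flow and last_turn fields are hypothetical; the text only states that is_in_flow is a boolean helper and that salient_entities is sorted by recency.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    name: str
    last_turn: int   # assumed: the turn in which the entity was last mentioned


@dataclass
class WorkingMemorySketch:
    active_flow: Optional[str] = None
    entities: list = field(default_factory=list)

    @property
    def is_in_flow(self) -> bool:
        return self.active_flow is not None

    @property
    def salient_entities(self) -> list:
        # Most recently mentioned first.
        return sorted(self.entities, key=lambda e: e.last_turn, reverse=True)
```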

IV. The Storage Engine

1. WorkingMemoryStore (Base)

Introduction

An Abstract Base Class (ABC) that defines the protocol for managing Working Memory.

Detailed Method Functionality

begin_turn(conversation_id, turn_cache): The most complex method. It installs the cache, merges entities, evicts stale data, records topic shifts, and handles confirmation timeouts in one atomic sequence.
collect_slot(...): Updates a specific slot. If all required slots are now full, it automatically moves the flow status back to ACTIVE.
complete_flow(): Finalizes the flow, creates a FlowRecord for the history, and cleans up the active state.
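The collect_slot rule can be illustrated with plain dictionaries. The flow layout used here (a "status" key and a "slots" mapping with "required" and "value") is an assumption made for the example, not the library's storage format.

```python
def collect_slot(flow: dict, name: str, value) -> dict:
    """Store one slot value; return to ACTIVE once no required slot is empty."""
    flow["slots"][name]["value"] = value
    missing = [
        slot_name
        for slot_name, slot in flow["slots"].items()
        if slot["required"] and slot["value"] is None
    ]
    if not missing:
        flow["status"] = "ACTIVE"
    return flow
```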

2. RedisWorkingMemoryStore

Introduction

The production-grade storage driver. It persists working memory as JSON blobs in Redis.

Behavior and Context: The Distributed Lock

To prevent data corruption in multi-worker environments, this class implements a Distributed Lock Pattern:

Acquisition: Uses SET NX with a unique UUID token and an expiry of LOCK_TIMEOUT (15s), so a crashed worker cannot hold the lock forever.
Safety: If a worker tries to update a conversation that is already being processed by another worker, it will wait (with exponential backoff and jitter) until the lock is free.
Atomic Release: Uses a Lua Script to verify the token before deleting the lock. This ensures a worker cannot accidentally delete a lock that has already been taken over by another worker after a timeout.
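The three steps above follow a common Redis locking pattern, sketched below. This is not the store's actual code: acquire_lock, release_lock, the key format, and the backoff constants are assumptions. The client argument is expected to behave like a redis.asyncio.Redis instance, whose set(name, value, nx=..., ex=...) and eval(script, numkeys, *keys_and_args) calls are the only ones used.

```python
import asyncio
import random
import time
import uuid

LOCK_TIMEOUT = 15  # seconds, as stated above

# Delete the lock only if it still holds our token.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


async def acquire_lock(client, key: str, wait_limit: float = 15.0) -> str:
    """Retry SET NX with exponential backoff and jitter; return the token."""
    token = str(uuid.uuid4())
    deadline = time.monotonic() + wait_limit
    delay = 0.05
    while True:
        if await client.set(key, token, nx=True, ex=LOCK_TIMEOUT):
            return token
        if time.monotonic() >= deadline:
            raise TimeoutError(f"could not acquire lock {key}")
        await asyncio.sleep(delay + random.uniform(0, delay))
        delay = min(delay * 2, 1.0)


async def release_lock(client, key: str, token: str) -> bool:
    """Release the lock only if this worker still owns it."""
    return bool(await client.eval(RELEASE_SCRIPT, 1, key, token))
```

The token check matters when a worker runs longer than LOCK_TIMEOUT: its lock expires, another worker acquires a new one, and a plain DEL from the first worker would remove the second worker's lock.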

Error Handling

TimeoutError: Raised if a lock cannot be acquired within 15 seconds. This usually indicates an LLM provider is stalling or a deadlock has occurred.
Corruption Recovery: If WorkingMemory.from_dict fails (due to corrupted JSON), the store logs the error and deletes the key, allowing the conversation to reset rather than being permanently broken.
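The recovery path might be sketched as follows. load_or_reset is a hypothetical helper, from_dict stands in for WorkingMemory.from_dict, and the exception types caught are an assumption; client only needs async get and delete methods, as a redis.asyncio.Redis client provides.

```python
import json
import logging

logger = logging.getLogger(__name__)


async def load_or_reset(client, key: str, from_dict):
    """Load working memory, or delete the key if the payload is unusable."""
    raw = await client.get(key)
    if raw is None:
        return None
    try:
        return from_dict(json.loads(raw))
    except (ValueError, KeyError, TypeError) as exc:
        # json.JSONDecodeError is a subclass of ValueError.
        logger.error("Corrupted working memory at %s: %s", key, exc)
        await client.delete(key)
        return None
```

Returning None lets the caller treat the conversation as new, which matches the reset behavior described above.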

3. InMemoryWorkingMemoryStore

Introduction

A dictionary-backed store for single-process environments.

Purpose

Testing: Ideal for unit tests where Redis is not available.
Local CLI: Perfect for single-user local bots.
Locking: Uses asyncio.Lock per conversation ID to ensure that even in a single process, two concurrent messages from the same user are processed sequentially.
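Per-conversation locking with asyncio.Lock can be sketched as below. PerConversationLocks and handle are hypothetical names; the real store's internals are not shown in this document.

```python
import asyncio
from collections import defaultdict


class PerConversationLocks:
    def __init__(self) -> None:
        # One lock per conversation ID, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)

    def for_conversation(self, conversation_id: str) -> asyncio.Lock:
        return self._locks[conversation_id]


async def handle(locks: PerConversationLocks, log: list,
                 conversation_id: str, label: str) -> None:
    """Process one message while holding that conversation's lock."""
    async with locks.for_conversation(conversation_id):
        log.append(f"{label}:start")
        await asyncio.sleep(0.01)   # stands in for the agent's work
        log.append(f"{label}:end")
```

Two messages for the same conversation run one after the other, while messages for different conversations can still interleave.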

Remarks

The Working Memory system is the only part of Jazzmine that is "Aware" of the conversation's current moment. Because the logic is strictly separated from the storage, the same state handling runs on either store, and the Redis lock keeps concurrent workers from corrupting a conversation's state.

Standard Usage Pattern:

python

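# Runs inside a coroutine. `store` is any WorkingMemoryStore implementation;
# cid is the conversation ID passed to every store call.
# 1. Fetch current memory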
wm = await store.get_or_create(cid, uid, aid)

# 2. Update it at turn start
wm = await store.begin_turn(cid, turn_cache)

# 3. Agent performs work...
await store.record_tool_call(cid, my_record)

# 4. Agent finishes
await store.end_turn(cid, "Hello world", "greeting")