Flows
Core reference

Flow Selector

The Flow Selector is the "Strategic Router" of the jazzmine framework. Its primary responsibility is to bridge the gap between a high-level user request and the specific technical procedures (Flows) required to fulfill it. It manages a Two-Stage Resolution process that combines the speed of vector retrieval with the reasoning capabilities of a Large Language Model (LLM).

1. Behavior and Context

In the jazzmine execution loop, the Selector is invoked after the user's message has been enhanced and relevant context has been recalled from memory.

Key behaviors:

  • Two-Stage Filter:
  1. Stage 1 (Recall): It retrieves the top candidates from ProceduralMemory using semantic similarity.
  2. Stage 2 (Selection): It presents the metadata of those candidates (Name, Description, Condition, Desired Effects) to an LLM to make the final "intelligent" decision.
  • Contextual Augmentation: When performing initial retrieval, it appends recent "Episode Summaries" to the query. This allows the selector to resolve vague requests like "I need help with that" by identifying what "that" referred to in previous turns.
  • Multi-Flow Support: The selector can identify when a single user request requires multiple parallel or sequential flows (e.g., "Check my balance and then pay my bill").
  • Prompt Optimization: It caches the generated XML prompts for flows (build_prompt) to minimize redundant string processing and reduce the token footprint of the final agent context.
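The two-stage filter described above can be sketched in a few lines. All names here (`two_stage_select`, `recall_candidates`, `llm_pick_ids`) are hypothetical stand-ins for illustration, not the actual jazzmine API:

```python
def two_stage_select(query, summaries, recall_candidates, llm_pick_ids, top_k=3):
    """Hypothetical sketch of the two-stage resolution pipeline."""
    # Stage 1 (Recall): augment the query with episode summaries so vague
    # references ("help with that") resolve against recent context.
    augmented = query + "\n" + "\n".join(summaries)
    candidates = recall_candidates(augmented, top_k=top_k)

    # Stage 2 (Selection): hand candidate metadata to the LLM and keep
    # only the IDs it returns, preserving candidate order.
    chosen_ids = llm_pick_ids(candidates)
    by_id = {c["id"]: c for c in candidates}
    return [by_id[i] for i in chosen_ids if i in by_id]
```

The key property is that the LLM never sees the full flow instructions at this stage, only compact metadata, which keeps the selection call cheap.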

2. Purpose

  • Accuracy: Vector search alone often returns "near misses." The LLM selection stage ensures that the chosen flow's trigger conditions strictly match the user's intent.
  • Tie-Breaking: Uses desired_effects as a secondary logic layer to distinguish between flows with similar descriptions.
  • Hallucination Prevention: Ensures that the agent only attempts to execute flows that actually exist in the local Python registry.
  • Token Efficiency: By only loading the full instructions for the selected flows into the main agent prompt, it keeps the reasoning loop focused and cost-effective.

3. High-Level API (Usage)

The FlowSelector requires a BaseLLM provider, access to the BaseFlow.registry, and an instance of ProceduralMemory.

Example: Selecting Flows for a Request

```python
from jazzmine.core.flows import FlowSelector, BaseFlow

# 1. Initialize the Selector
selector = FlowSelector(
    llm=my_reasoning_llm,
    flow_registry=BaseFlow.registry,
    proc_mem=my_procedural_memory,
    agent_id="support_bot_01",
    top_k=3,  # Retrieve top 3 candidates for LLM review
)

# 2. Perform selection
# 'episode_summaries' come from EpisodicMemory.recall()
result = await selector.select(
    enhanced_query="Process a refund for my last order",
    episode_summaries=["User discussed a broken item in the previous episode."]
)

# 3. Handle the result
if result.chosen_flows:
    print(f"Executing: {[f.name for f in result.chosen_flows]}")
    # This block is ready to be injected into the Agent's system prompt
    agent_context = result.combined_prompt
```

4. Detailed Functionality

FlowSelectionResult [Dataclass]

The structured return value of a selection operation.

  • chosen_flows: A list of live BaseFlow objects.
  • chosen_prompts: A list of XML strings generated from the chosen flows (using verbose=False for token savings).
  • combined_prompt: All chosen prompts joined by newlines, ready for immediate injection into the Agent's reasoning window.
  • unresolved_ids: A list of IDs the LLM selected that were not found in the registry (signals stale database entries).
  • candidates: The raw metadata dictionaries retrieved from Qdrant during Stage 1.
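A plausible shape for this dataclass, inferred from the field list above (the real jazzmine definition may differ; in particular, `combined_prompt` is modeled here as a derived property rather than a stored field):

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class FlowSelectionResult:
    """Illustrative reconstruction of the selection result structure."""
    chosen_flows: list[Any] = field(default_factory=list)    # live BaseFlow objects
    chosen_prompts: list[str] = field(default_factory=list)  # per-flow XML (verbose=False)
    unresolved_ids: list[str] = field(default_factory=list)  # IDs missing from the registry
    candidates: list[dict] = field(default_factory=list)     # raw Stage-1 metadata

    @property
    def combined_prompt(self) -> str:
        # All chosen prompts joined by newlines for direct injection.
        return "\n".join(self.chosen_prompts)
```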

select(...)

Functionality: Orchestrates the two-stage resolution process.

How it works:

  1. Recall: If candidates are not provided manually, it calls the internal _recall method. This method augments the query with episode summaries to find the most relevant skills in ProceduralMemory.
  2. Fast Path: If only one candidate is found, it skips the LLM call and returns that candidate immediately to save latency.
  3. LLM Reasoning: If multiple candidates exist, it builds a compact XML block of their metadata and asks the LLM to return a JSON array of the best-fitting IDs.
  4. Registry Mapping: Converts the selected IDs back into live Python objects.
  5. Fallback: If the LLM returns an empty list or invalid data, the selector automatically falls back to the #1 semantically ranked candidate to ensure the conversation doesn't stall.
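The control flow of steps 2-5 can be sketched as follows. This is an illustrative, synchronous simplification; the real method is async and wraps an LLM call:

```python
def select_flows(candidates, llm_select, registry):
    """Sketch of the fast path, LLM reasoning, mapping, and fallback steps."""
    # Fast path: a single candidate skips the LLM round-trip entirely.
    if len(candidates) == 1:
        ids = [candidates[0]["id"]]
    else:
        ids = llm_select(candidates)
        # Fallback: empty or invalid LLM output -> top semantic candidate,
        # so the conversation never stalls on a bad selection.
        if not ids:
            ids = [candidates[0]["id"]]

    # Registry mapping: resolve IDs to live objects; stale IDs (present in
    # the vector DB but absent from code) are reported, not raised.
    chosen = [registry[i] for i in ids if i in registry]
    unresolved = [i for i in ids if i not in registry]
    return chosen, unresolved
```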

_llm_select(...) [Private]

Functionality: Communicates with the LLM using a specialized, high-density system prompt.

Rules enforced:

  • Pick flows whose condition best matches the request.
  • Use desired_effects to break ties.
  • Only use IDs from the provided candidate list.
  • Return only a raw JSON array.
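One way the candidate metadata might be serialized into the compact XML block mentioned above. The exact tag names and layout are an assumption, not jazzmine's actual format:

```python
def build_candidate_block(candidates: list[dict]) -> str:
    """Hypothetical serializer for the Stage-2 candidate metadata block."""
    lines = ["<candidates>"]
    for c in candidates:
        lines.append(
            f'  <flow id="{c["id"]}" name="{c["name"]}">'
            f'<condition>{c["condition"]}</condition>'
            f'<desired_effects>{c["desired_effects"]}</desired_effects>'
            "</flow>"
        )
    lines.append("</candidates>")
    return "\n".join(lines)
```

Keeping the block to one line per flow keeps the selection prompt small even with several candidates.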

5. Error Handling

  • LLM Parsing Errors: If the LLM returns markdown fences or conversational text instead of a JSON array, the selector uses a robust parser to strip fences and attempt a recovery. If parsing still fails, it triggers the semantic fallback.
  • Stale Entry Warnings: If the LLM picks an ID that exists in the vector database but has been deleted from the Python source code, the selector logs a warning and marks the ID as "unresolved."
  • Missing Infrastructure: If proc_mem is not provided and no manual candidates are passed, the selector returns an empty result rather than raising an exception, allowing the agent to handle the "no skill found" scenario gracefully.

6. Remarks

  • Token Compression: The selector uses verbose=False when calling flow.build_prompt. This removes instructional comments intended for developers, providing the agent with only the core technical logic.
  • Prompt Caching: The _prompt_cache is keyed by the flow's deterministic uuid5. Since flow definitions are static, this avoids repeated XML generation across turns.
  • RRF Parameter: The rrf_k parameter (default 10) controls the Reciprocal Rank Fusion used during retrieval. Lower values prioritize the absolute top-ranked results across the dense and sparse search streams.
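Reciprocal Rank Fusion scores each document by summing 1/(k + rank) across the dense and sparse result lists; a lower k makes the top ranks dominate. A minimal sketch of the standard formula (not jazzmine's internal implementation):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 10) -> list[str]:
    """Fuse multiple ranked ID lists via Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each appearance contributes 1/(k + rank); top ranks weigh most.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

With k=10, a document ranked #1 in one stream contributes 1/11 ≈ 0.091, while one ranked #5 contributes only 1/15 ≈ 0.067, so agreement near the top of both streams wins.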