Security System: Errors

1. Behavior and Context

At the core of this module is the SecurityError base class, which inherits from Python's standard Exception. Every error raised by the jazzmine-security package derives from this base class.

Key behaviors:

Context-Awareness: Unlike standard exceptions, Jazzmine's exceptions natively accept a context dictionary. This allows internal modules to attach crucial state information (e.g., input_len, model_path, batch_size) directly to the exception object before it propagates up the stack.
Exception Chaining: The cause parameter explicitly captures the underlying system or third-party error (e.g., a FileNotFoundError or a urllib3 timeout) without losing the higher-level context of why the operation failed.
Categorized Hierarchies: Exceptions are grouped logically by component (e.g., Models, Sanitizers, Moderators, Data Pipelines). This allows developers to catch broad categories of errors (like SanitizationError) or highly specific ones (like PDFSanitizationError).
Formatted Outputs: The base class overrides __str__ to automatically format the error message, its underlying cause, and the attached context dictionary into a readable multi-line string for console output and logs.

2. Purpose

Type-Safe Error Handling: Allows agent developers to write precise try...except blocks that react differently to different failure modes (e.g., falling back to a default response if ModeratorError occurs, versus crashing the app if a ModelLoadError occurs).
Telemetry Enrichment: By capturing context data natively within the exception, logging systems (like BaseLogger) can extract rich metadata when an error is recorded, improving observability.
Framework Consistency: Ensures all components within the jazzmine-security package emit standardized errors, abstracting away the messy underlying errors generated by third-party libraries like pypdf, bleach, or transformers.
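The telemetry point can be sketched with Python's standard logging module. This is a minimal illustration, not the package's BaseLogger integration: SketchModeratorError is a hypothetical stand-in that only assumes the message, cause, and context attributes described on this page, and error_to_fields plus the logger name are invented for the example.

```python
import logging
from typing import Any, Dict, Optional


class SketchModeratorError(Exception):
    """Hypothetical stand-in exposing message, cause and context."""

    def __init__(
        self,
        message: str,
        cause: Optional[Exception] = None,
        context: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.cause = cause
        self.context = context or {}


def error_to_fields(err: SketchModeratorError) -> Dict[str, Any]:
    """Flatten an error into fields a structured log handler can index."""
    return {
        "error_type": type(err).__name__,
        "error_message": err.message,
        "error_cause": repr(err.cause) if err.cause is not None else None,
        "error_context": dict(err.context),
    }


logger = logging.getLogger("agent.security")  # hypothetical logger name

try:
    raise SketchModeratorError(
        "Batch classification failed.",
        cause=RuntimeError("device unavailable"),
        context={"batch_size": 32, "hardware_device": "cuda:0"},
    )
except SketchModeratorError as err:
    fields = error_to_fields(err)
    # `extra` attaches the fields to the LogRecord; the key names are chosen
    # so they do not collide with built-in LogRecord attributes.
    logger.error("security check failed", extra=fields)
```

With a JSON formatter attached to the handler, the same fields become searchable keys instead of text buried in a traceback.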

3. High-Level API & Examples

Example 1: Raising a Context-Aware Error (Internal Use)

If you are extending the framework, you can raise detailed exceptions that capture exact failure states.

python

from jazzmine.security.errors import ModelLoadError

def load_custom_model(path: str) -> None:
    try:
        # Simulate a failed load
        raise FileNotFoundError("model.bin not found")
    except Exception as e:
        # Pass the original error as `cause` and also chain it with `from e`
        # so the traceback shows both exceptions.
        raise ModelLoadError(
            message="Failed to initialize the custom safety model.",
            cause=e,
            context={"attempted_path": path, "retries": 3},
        ) from e

Example 2: Granular Exception Handling (External Use)

When building a conversational agent, you can catch specific exceptions to trigger targeted fallback behaviors.

python

from jazzmine.security.output_sanitizer import JazzminePDFSanitizer
from jazzmine.security.errors import PDFSanitizationError, SanitizationError, SecurityError

sanitizer = JazzminePDFSanitizer()

# Placeholder input: "upload.pdf" stands in for whatever PDF your agent received.
with open("upload.pdf", "rb") as f:
    raw_bytes = f.read()

try:
    safe_pdf = sanitizer.sanitize(file_input=raw_bytes)
except PDFSanitizationError as e:
    # Handle PDF-specific errors (e.g., password-protected files)
    print(f"PDF sanitization failed (input_len={e.context.get('input_len')}).")
except SanitizationError:
    # Handle any other sanitization failure (CSV, HTML)
    print("A general sanitization error occurred.")
except SecurityError as e:
    # Catch-all for ANY error generated by jazzmine-security
    print(f"Framework error: {e.message}")

5. Detailed Class Functionality

SecurityError [Base Class]

The foundation of the exception tree. Inherits from Exception.

Parameters:
message (str): Human-readable description of what went wrong.
cause (Optional[Exception]): The original exception that triggered this error.
context (Optional[Dict[str, Any]]): Dictionary containing variable states, metadata, or inputs relevant to the failure.
Functionality: Stores these attributes and provides a detailed string representation via __str__().
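To make the description concrete, here is a minimal sketch of a base class with the same three parameters and a multi-line __str__. It is an illustration only, not the package's source: the class name SketchSecurityError, the exact output layout, and the choice to normalize a missing context to an empty dict are all assumptions.

```python
from typing import Any, Dict, Optional


class SketchSecurityError(Exception):
    """Hypothetical stand-in for SecurityError; not the package's code."""

    def __init__(
        self,
        message: str,
        cause: Optional[Exception] = None,
        context: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.cause = cause
        # Assumption: a missing context becomes an empty dict so callers
        # can always use .get() on it.
        self.context = dict(context) if context else {}

    def __str__(self) -> str:
        # Assumed layout: message first, then cause and context if present.
        lines = [self.message]
        if self.cause is not None:
            lines.append(f"  Caused by: {type(self.cause).__name__}: {self.cause}")
        if self.context:
            lines.append(f"  Context: {self.context}")
        return "\n".join(lines)


err = SketchSecurityError(
    "Failed to initialize the custom safety model.",
    cause=FileNotFoundError("model.bin not found"),
    context={"attempted_path": "/models/custom", "retries": 3},
)
rendered = str(err)
print(rendered)
```

The real class may format its output differently; the point is only that one str() call yields the message, the root cause, and the attached state together.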

Component-Specific Exception Families

All of the following classes inherit from their respective category base classes (which in turn inherit from SecurityError).

1. Model Related Exceptions

Base Class: ModelError

ModelNotFoundError: Raised when local model artifacts or directories cannot be located. Inherits from FileNotFoundError as well.
ModelLoadError: Raised when Hugging Face transformers/tokenizers or custom model binaries fail to load into memory.
ModelSaveError: Raised during failure to write a checkpoint or model to disk.
ChecksumMismatchError: Raised during model verification if downloaded/loaded artifacts do not match expected cryptographic hashes. Inherits from ValueError.
ModelDecompressError: Raised when unpacking a .tar.gz or .zip model archive fails.
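The dual inheritance noted for ModelNotFoundError and ChecksumMismatchError has a practical consequence: handlers written for the built-in types also catch these errors. The sketch below uses hypothetical stand-in classes (the Sketch* names are not part of the package) that mirror only the inheritance relationships stated above.

```python
class SketchSecurityError(Exception):
    """Stand-in for SecurityError."""


class SketchModelError(SketchSecurityError):
    """Stand-in for ModelError."""


class SketchModelNotFoundError(SketchModelError, FileNotFoundError):
    """Stand-in for ModelNotFoundError (also a FileNotFoundError)."""


class SketchChecksumMismatchError(SketchModelError, ValueError):
    """Stand-in for ChecksumMismatchError (also a ValueError)."""


class SketchModelLoadError(SketchModelError):
    """Stand-in for ModelLoadError (no built-in parent)."""


def which_handler(exc: Exception) -> str:
    """Report which except clause catches the given exception."""
    try:
        raise exc
    except FileNotFoundError:
        return "FileNotFoundError"
    except ValueError:
        return "ValueError"
    except SketchSecurityError:
        return "SketchSecurityError"


results = [
    which_handler(SketchModelNotFoundError("weights directory missing")),
    which_handler(SketchChecksumMismatchError("hash mismatch")),
    which_handler(SketchModelLoadError("tokenizer failed")),
]
print(results)
```

Under this layout, existing code that already handles FileNotFoundError or ValueError keeps working, but a generic built-in handler placed before a SecurityError handler will intercept these errors first.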

2. Explainability Exceptions

Base Class: ExplainabilityError

SHAPComputeError: Raised when the SHAP (SHapley Additive exPlanations) engine fails to compute values, usually due to shape mismatches or unsupported model types.
FeatureImportanceError: Raised when extracting global feature importance fails or cannot be mapped back to human-readable feature names.

3. Detector Exceptions

Base Class: DetectorError

DetectorInitializationError: Raised when a specific ML detector (e.g., Toxicity Detector) fails to initialize due to missing config or weights.
PredictionError: Raised when the inference step or post-processing probability aggregation fails.
FeatureNameMismatchError: Raised when the extracted text features do not match the expected input layer names of the model. Inherits from KeyError.

4. Sanitizer Exceptions

Base Class: SanitizationError

PDFSanitizationError: Raised for corrupted PDFs, password protection locks, or memory overruns during PDF macro extraction.
CSVSanitizationError: Raised when CSV matrices are malformed or string parsing fails.
HTMLSanitizationError: Raised when the bleach engine fails to parse an HTML DOM.

5. Moderator Exceptions

Base Class: ModeratorError

TokenizerLoadError: Raised when AutoTokenizer fails to initialize.
PipelineLoadError: Raised when the HF pipeline() constructor fails.
ModelNotLoadedError: Raised if a classify or classify_batch method is called before the model is loaded into memory or onto the GPU.
InvalidModelError: Raised when the target HF repository exists but lacks the required classification heads or architecture.

6. Data Pipeline & Training Exceptions

Base Classes: DataPipelineError and TrainingError

CSVParseError: Raised when parsing training datasets fails.
FeatureExtractionError: Raised when text tokenization/vectorization fails on an input prompt.
MergeError: Raised when attempting to combine incompatible datasets.
SaveCheckpointError / LoadCheckpointError: Raised when training loops fail to read/write state dictionaries.

7. Utility Exceptions

FileLockError: Raised when concurrent processes fail to acquire a safe lock on a shared model file or dataset.
CompressionError: Raised during general archiving operations.

6. Error Handling

When integrating jazzmine-security into your conversational agent architecture, it is best practice to catch errors hierarchically.

Because LLM generation and network-dependent model loading are inherently unpredictable, you should always wrap security validation steps in try...except blocks.

Best Practice Pattern:

Catch specific operational errors first (e.g., ModelLoadError during setup).
Catch categorical errors (e.g., ModeratorError during runtime inference) to implement broad fallbacks, like returning a standard safe response to the user.
Catch SecurityError as a final fail-safe to guarantee that no unhandled Jazzmine exceptions crash your main application loop.
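The three-step pattern can be sketched as a small wrapper around a security check. Everything here is illustrative: the Sketch* classes are hypothetical stand-ins (simplified so that the load error derives directly from the base), and guarded_reply, SAFE_RESPONSE, and the status labels are invented for the example.

```python
from typing import Callable, Tuple

SAFE_RESPONSE = "Sorry, I can't help with that request."  # hypothetical fallback


class SketchSecurityError(Exception):
    """Stand-in for SecurityError."""


class SketchModelLoadError(SketchSecurityError):
    """Stand-in for ModelLoadError (hierarchy simplified)."""


class SketchModeratorError(SketchSecurityError):
    """Stand-in for ModeratorError."""


def guarded_reply(check: Callable[[str], None], reply: str) -> Tuple[str, str]:
    """Run a security check and map failures to a status and a reply."""
    try:
        check(reply)
    except SketchModelLoadError:
        # 1. Specific operational error: a setup problem, so re-raise it.
        raise
    except SketchModeratorError:
        # 2. Categorical error: degrade gracefully with a safe response.
        return ("moderation_fallback", SAFE_RESPONSE)
    except SketchSecurityError:
        # 3. Final fail-safe for any other framework error.
        return ("security_fallback", SAFE_RESPONSE)
    return ("ok", reply)


def passing_check(text: str) -> None:
    return None


def failing_moderator(text: str) -> None:
    raise SketchModeratorError("classifier unavailable")


def failing_other(text: str) -> None:
    raise SketchSecurityError("unexpected failure")


def failing_load(text: str) -> None:
    raise SketchModelLoadError("weights missing")


print(guarded_reply(passing_check, "hello"))
print(guarded_reply(failing_moderator, "hello"))
```

Order matters: Python evaluates except clauses top to bottom, so the base-class handler must come last or it will shadow the more specific ones.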

7. Remarks

The Context Dictionary

When inspecting caught errors in production, always check the context attribute, because internal components populate it extensively. For instance, if an asynchronous batch evaluation fails, the ModeratorError.context will often contain the exact batch_size and hardware_device where the failure occurred, which narrows down the search considerably when debugging.

Python 3 Exception Chaining (raise ... from ...)

Standard Python 3 chains exceptions with raise NewException() from old_exception, which stores the original exception on the new exception's __cause__ attribute. The SecurityError's cause parameter serves a similar purpose but explicitly attaches the old exception as a readable attribute next to the message and context. This makes it easier for structured logging frameworks (like JSON loggers) to serialize the root cause without having to parse Python traceback strings.
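The two mechanisms can be combined. The sketch below passes the original exception as cause and also uses raise ... from, then serializes the root cause to JSON. SketchSecurityError and to_log_record are hypothetical names for illustration, and the attribute name cause is an assumption that mirrors the constructor parameter.

```python
import json
from typing import Any, Dict, Optional


class SketchSecurityError(Exception):
    """Hypothetical stand-in with message, cause and context attributes."""

    def __init__(
        self,
        message: str,
        cause: Optional[Exception] = None,
        context: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.cause = cause
        self.context = context or {}


def to_log_record(err: SketchSecurityError) -> str:
    """Serialize the error and its root cause without touching tracebacks."""
    payload = {
        "type": type(err).__name__,
        "message": err.message,
        "cause_type": type(err.cause).__name__ if err.cause is not None else None,
        "cause_message": str(err.cause) if err.cause is not None else None,
        "context": err.context,
    }
    # default=str guards against context values that are not JSON-native.
    return json.dumps(payload, default=str)


try:
    try:
        raise FileNotFoundError("model.bin not found")
    except FileNotFoundError as original:
        raise SketchSecurityError(
            "Failed to load model.",
            cause=original,
            context={"retries": 3},
        ) from original
except SketchSecurityError as err:
    caught = err  # keep a reference; `err` is unbound after the block
    record = json.loads(to_log_record(err))

print(record)
```

Using both keeps the standard traceback output ("The above exception was the direct cause of...") while giving loggers a plain attribute to read.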