grounding

Grounding hooks for external validation of sheet outputs.

Provides a mechanism for validating sheet outputs against external sources (APIs, databases, file checksums, etc.) to prevent model drift and ensure output quality beyond internal validation.

This addresses the mathematical necessity of external validators documented in arXiv 2601.05280 (entropy decay in self-training) and DGM objective hacking.

Classes

GroundingPhase

Bases: str, Enum

When the grounding hook should execute relative to validation.

Attributes
PRE_VALIDATION class-attribute instance-attribute
PRE_VALIDATION = 'pre_validation'

Run before internal validation engine.

POST_VALIDATION class-attribute instance-attribute
POST_VALIDATION = 'post_validation'

Run after internal validation engine.

BOTH class-attribute instance-attribute
BOTH = 'both'

Run both before and after internal validation.
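As a minimal sketch of how these phase values select hooks, the snippet below re-declares the enum locally (a stand-in, not an import from the library) and uses a hypothetical helper, `runs_in`, that mirrors the membership test shown later in `run_hooks`.

```python
from enum import Enum


class GroundingPhase(str, Enum):
    """Local stand-in mirroring the enum documented above."""

    PRE_VALIDATION = "pre_validation"
    POST_VALIDATION = "post_validation"
    BOTH = "both"


def runs_in(hook_phase: GroundingPhase, current: GroundingPhase) -> bool:
    """Hypothetical helper: True if a hook with hook_phase runs during current."""
    return hook_phase in (current, GroundingPhase.BOTH)


# The enum also subclasses str, so a phase read from configuration as a
# plain string converts directly and compares equal to its raw value.
phase = GroundingPhase("post_validation")
print(phase is GroundingPhase.POST_VALIDATION)
print(GroundingPhase.PRE_VALIDATION == "pre_validation")

# A BOTH hook runs in either phase; a PRE_VALIDATION hook does not run
# during POST_VALIDATION.
print(runs_in(GroundingPhase.BOTH, GroundingPhase.PRE_VALIDATION))
print(runs_in(GroundingPhase.PRE_VALIDATION, GroundingPhase.POST_VALIDATION))
```

The str mixin is what makes the string round trip possible; a plain Enum would need an explicit lookup by value in the same way but would not compare equal to the raw string.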

GroundingContext dataclass

GroundingContext(job_id, sheet_num, prompt, output, output_files=list(), validation_passed=None, validation_details=list(), metadata=dict())

Context provided to grounding hooks for validation.

Contains all relevant information about the sheet execution that the hook can use to perform external validation.

Attributes
job_id instance-attribute
job_id

Unique identifier for the job.

sheet_num instance-attribute
sheet_num

Sheet number being validated.

prompt instance-attribute
prompt

The prompt that was used for sheet execution.

output instance-attribute
output

The raw output from sheet execution.

output_files class-attribute instance-attribute
output_files = field(default_factory=list)

List of file paths created/modified by the sheet.

validation_passed class-attribute instance-attribute
validation_passed = None

Result of internal validation (None when the hook runs in the pre_validation phase, before the internal engine has produced a result).

validation_details class-attribute instance-attribute
validation_details = field(default_factory=list)

Detailed validation results from internal engine.

metadata class-attribute instance-attribute
metadata = field(default_factory=dict)

Additional context metadata (e.g., config values, timing).

GroundingResult dataclass

GroundingResult(passed, hook_name, message='', confidence=1.0, details=dict(), timestamp=(lambda: now(UTC))(), recovery_guidance=None, should_escalate=False)

Result from a grounding hook execution.

Contains the validation outcome and optional guidance for recovery.

Attributes
passed instance-attribute
passed

Whether the grounding validation passed.

hook_name instance-attribute
hook_name

Name/identifier of the hook that produced this result.

message class-attribute instance-attribute
message = ''

Human-readable description of the result.

confidence class-attribute instance-attribute
confidence = 1.0

Confidence in the grounding result (0.0-1.0).

details class-attribute instance-attribute
details = field(default_factory=dict)

Additional details about the validation.

timestamp class-attribute instance-attribute
timestamp = field(default_factory=lambda: now(UTC))

When the grounding check was performed.

recovery_guidance class-attribute instance-attribute
recovery_guidance = None

Optional guidance for what to do on failure.

should_escalate class-attribute instance-attribute
should_escalate = False

Whether this failure should trigger escalation (if available).
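The defaults listed above rely on `field(default_factory=...)`, so every result gets its own `details` dict and its own creation timestamp. The following is a sketch only: the class is a local stand-in, the field types are assumptions (the rendered signature does not show them), and `datetime.now(timezone.utc)` is assumed to be equivalent to the `now(UTC)` default shown above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class GroundingResult:
    """Local stand-in with the documented fields (types are assumed)."""

    passed: bool
    hook_name: str
    message: str = ""
    confidence: float = 1.0
    details: dict[str, Any] = field(default_factory=dict)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    recovery_guidance: str | None = None
    should_escalate: bool = False


ok = GroundingResult(passed=True, hook_name="demo")
failed = GroundingResult(
    passed=False,
    hook_name="demo",
    message="checksum mismatch",
    recovery_guidance="Re-generate the affected files",
    should_escalate=True,
)

# Each instance owns its details dict, so mutating one result
# never leaks into another.
ok.details["files_checked"] = 1
print(failed.details)
print(ok.timestamp.tzinfo)
```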

GroundingHook

Bases: Protocol

Protocol for external grounding hooks.

Implementations can check outputs against external sources such as:

- API endpoints for data freshness
- File checksums for artifact integrity
- External validators for output quality
- Database queries for data consistency

Attributes
name property
name

Unique name for this grounding hook.

phase property
phase

When this hook should run relative to validation.

Functions
validate async
validate(context)

Perform external validation on the sheet output.

Parameters:

- context (GroundingContext, required): Full context about the sheet execution.

Returns:

- GroundingResult: GroundingResult with pass/fail status and details.

Source code in src/marianne/execution/grounding.py
async def validate(self, context: GroundingContext) -> GroundingResult:
    """Perform external validation on the sheet output.

    Args:
        context: Full context about the sheet execution.

    Returns:
        GroundingResult with pass/fail status and details.
    """
    ...
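Because GroundingHook is a Protocol, any class with matching `name`, `phase`, and `validate` members satisfies it without inheriting from it. The sketch below illustrates this with trimmed local stand-ins for the types documented above and a hypothetical `JsonOutputHook`. A real hook would typically consult an external service; a local JSON parse stands in here so the example stays runnable.

```python
from __future__ import annotations

import asyncio
import json
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Protocol


class GroundingPhase(str, Enum):
    PRE_VALIDATION = "pre_validation"
    POST_VALIDATION = "post_validation"
    BOTH = "both"


@dataclass
class GroundingContext:  # trimmed stand-in: only the required fields
    job_id: str
    sheet_num: int
    prompt: str
    output: str


@dataclass
class GroundingResult:  # trimmed stand-in
    passed: bool
    hook_name: str
    message: str = ""
    details: dict[str, Any] = field(default_factory=dict)


class GroundingHook(Protocol):
    @property
    def name(self) -> str: ...

    @property
    def phase(self) -> GroundingPhase: ...

    async def validate(self, context: GroundingContext) -> GroundingResult: ...


class JsonOutputHook:
    """Hypothetical hook: passes only if the sheet output parses as JSON."""

    @property
    def name(self) -> str:
        return "json_output"

    @property
    def phase(self) -> GroundingPhase:
        return GroundingPhase.POST_VALIDATION

    async def validate(self, context: GroundingContext) -> GroundingResult:
        try:
            json.loads(context.output)
        except json.JSONDecodeError as exc:
            return GroundingResult(
                passed=False,
                hook_name=self.name,
                message=f"Output is not valid JSON: {exc.msg}",
                details={"position": exc.pos},
            )
        return GroundingResult(passed=True, hook_name=self.name, message="Valid JSON")


# Structural typing: JsonOutputHook never subclasses GroundingHook.
hook: GroundingHook = JsonOutputHook()
good = asyncio.run(hook.validate(GroundingContext("job-1", 1, "p", '{"a": 1}')))
bad = asyncio.run(hook.validate(GroundingContext("job-1", 2, "p", "not json")))
print(good.passed, bad.passed)
```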

FileChecksumGroundingHook

FileChecksumGroundingHook(expected_checksums=None, checksum_algorithm='sha256', name=None, *, allowed_root=None)

Example grounding hook that validates file checksums.

Checks that output files have the expected checksums, preventing the model from incorrectly overwriting important files.

Initialize the file checksum hook.

Parameters:

- expected_checksums (dict[str, str] | None, default None): Map of file path to expected checksum.
- checksum_algorithm (Literal['md5', 'sha256'], default 'sha256'): Algorithm to use for checksums.
- name (str | None, default None): Custom name for this hook (defaults to "file_checksum").
- allowed_root (str | Path | None, default None): If set, all file paths must resolve within this directory. Rejects absolute paths and ".." traversal.
Source code in src/marianne/execution/grounding.py
def __init__(
    self,
    expected_checksums: dict[str, str] | None = None,
    checksum_algorithm: Literal["md5", "sha256"] = "sha256",
    name: str | None = None,
    *,
    allowed_root: str | Path | None = None,
) -> None:
    """Initialize the file checksum hook.

    Args:
        expected_checksums: Map of file path to expected checksum.
        checksum_algorithm: Algorithm to use for checksums.
        name: Custom name for this hook (defaults to "file_checksum").
        allowed_root: If set, all file paths must resolve within this
            directory.  Rejects absolute paths and ``..`` traversal.
    """
    self._expected_checksums = expected_checksums or {}
    self._algorithm = checksum_algorithm
    self._name = name or "file_checksum"
    self._allowed_root = Path(allowed_root).resolve() if allowed_root else None
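The containment rule behind `allowed_root` can be illustrated independently of the hook. This is a sketch under assumptions: `stays_within` is a hypothetical helper, and the paths are made up and need not exist. It compares whole path components with `Path.is_relative_to` (Python 3.9+), which is stricter than comparing string prefixes.

```python
from pathlib import Path


def stays_within(root: Path, candidate: str) -> bool:
    """Hypothetical helper: True if candidate resolves inside root."""
    root = root.resolve()
    resolved = (root / candidate).resolve()
    # Component-wise comparison: a sibling such as "/srv/out-evil" is not
    # mistaken for a child of "/srv/out", which a string-prefix test would do.
    return resolved.is_relative_to(root)


root = Path("/srv/out")
print(stays_within(root, "report.txt"))      # relative path inside the root
print(stays_within(root, "../secrets.txt"))  # ".." traversal
print(stays_within(root, "/etc/passwd"))     # absolute path replaces the root
print(stays_within(root, "../out-evil/x"))   # sibling sharing a string prefix
```

Joining an absolute candidate onto the root discards the root entirely, which is why absolute paths fall outside it after resolution.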
Functions
validate async
validate(context)

Validate file checksums match expected values.

Source code in src/marianne/execution/grounding.py
async def validate(self, context: GroundingContext) -> GroundingResult:
    """Validate file checksums match expected values."""
    import hashlib
    from pathlib import Path

    if not self._expected_checksums:
        return GroundingResult(
            passed=True,
            hook_name=self.name,
            message="No checksums configured",
        )

    mismatches: list[str] = []
    checked = 0

    for file_path, expected_hash in self._expected_checksums.items():
        path = Path(file_path)

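        # Security: when allowed_root is set, validate paths stay within bounds.
        # Compare path components rather than string prefixes so that a sibling
        # directory sharing the root's prefix is not treated as inside the root.
        if self._allowed_root is not None:
            resolved = (self._allowed_root / path).resolve()
            if not resolved.is_relative_to(self._allowed_root):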
                mismatches.append(
                    f"{file_path}: path escapes allowed root"
                )
                continue

        if not path.exists():
            mismatches.append(f"{file_path}: file not found")
            continue

        hasher = (
            hashlib.md5() if self._algorithm == "md5" else hashlib.sha256()
        )
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(FILE_HASH_CHUNK_SIZE), b""):
                hasher.update(chunk)

        actual_hash = hasher.hexdigest()
        if actual_hash != expected_hash:
            mismatches.append(
                f"{file_path}: expected {expected_hash[:16]}..., "
                f"got {actual_hash[:16]}..."
            )
        checked += 1

    if mismatches:
        return GroundingResult(
            passed=False,
            hook_name=self.name,
            message=f"Checksum validation failed for {len(mismatches)} file(s)",
            details={"mismatches": mismatches, "files_checked": checked},
            recovery_guidance="Re-generate the affected files or verify expected checksums",
        )

    return GroundingResult(
        passed=True,
        hook_name=self.name,
        message=f"All {checked} file checksum(s) validated",
        details={"files_checked": checked},
    )
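To populate `expected_checksums`, the digests have to be computed the same way the hook computes them. The sketch below hashes a temporary file in chunks with SHA-256. `sha256_of` is a hypothetical helper, and `CHUNK_SIZE` is an assumed value because the module's `FILE_HASH_CHUNK_SIZE` is not shown on this page.

```python
import hashlib
import tempfile
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # assumed; the module's FILE_HASH_CHUNK_SIZE is not shown


def sha256_of(path: Path) -> str:
    """Hash a file in fixed-size chunks so it is never fully loaded into memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "output.txt"
    artifact.write_bytes(b"hello grounding\n")
    # Mapping in the shape the hook expects: file path -> hex digest.
    expected_checksums = {str(artifact): sha256_of(artifact)}
    digest = expected_checksums[str(artifact)]

print(len(digest))
```

The hook compares full hex digests, so the expected values must be complete lowercase hex strings from the same algorithm.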

GroundingEngine

GroundingEngine(hooks=None, config=None)

Engine for executing grounding hooks.

Coordinates multiple hooks and aggregates their results.

Initialize the grounding engine.

Parameters:

- hooks (list[GroundingHook] | None, default None): List of grounding hooks to execute.
- config (GroundingConfig | None, default None): Configuration for grounding behavior (from core.config).
Source code in src/marianne/execution/grounding.py
def __init__(
    self,
    hooks: list[GroundingHook] | None = None,
    config: "GroundingConfig | None" = None,
) -> None:
    """Initialize the grounding engine.

    Args:
        hooks: List of grounding hooks to execute.
        config: Configuration for grounding behavior (from core.config).
    """
    self._hooks = hooks or []
    # Default to a basic config if none provided
    if config is None:
        from marianne.core.config import GroundingConfig as GC
        self._config = GC()
    else:
        self._config = config
Functions
add_hook
add_hook(hook)

Add a grounding hook to the engine.

Source code in src/marianne/execution/grounding.py
def add_hook(self, hook: GroundingHook) -> None:
    """Add a grounding hook to the engine."""
    self._hooks.append(hook)
get_hook_count
get_hook_count()

Get the number of registered hooks.

Source code in src/marianne/execution/grounding.py
def get_hook_count(self) -> int:
    """Get the number of registered hooks."""
    return len(self._hooks)
run_hooks async
run_hooks(context, phase)

Run all hooks matching the specified phase.

Includes a circuit breaker: after CIRCUIT_BREAKER_THRESHOLD consecutive failures (a failed result, a timeout, or an exception), remaining hooks are skipped to avoid wasting O(hooks x timeout) when an external service is down. Each skipped hook is still reported as a failed GroundingResult.

Parameters:

- context (GroundingContext, required): Context for grounding validation.
- phase (GroundingPhase, required): Which phase to run hooks for.

Returns:

- list[GroundingResult]: List of GroundingResult from all matching hooks.

Source code in src/marianne/execution/grounding.py
async def run_hooks(
    self,
    context: GroundingContext,
    phase: GroundingPhase,
) -> list[GroundingResult]:
    """Run all hooks matching the specified phase.

    Includes a circuit breaker: after CIRCUIT_BREAKER_THRESHOLD consecutive
    failures (a failed result, a timeout, or an exception), remaining hooks
    are skipped to avoid wasting O(hooks x timeout) when an external service
    is down.

    Args:
        context: Context for grounding validation.
        phase: Which phase to run hooks for.

    Returns:
        List of GroundingResult from all matching hooks.
    """
    import asyncio

    results: list[GroundingResult] = []
    consecutive_failures = 0

    for hook in self._hooks:
        # Check if hook should run in this phase
        if hook.phase not in (phase, GroundingPhase.BOTH):
            continue

        # Circuit breaker: skip remaining hooks after too many failures
        if consecutive_failures >= self.CIRCUIT_BREAKER_THRESHOLD:
            results.append(
                GroundingResult(
                    passed=False,
                    hook_name=hook.name,
                    message=(
                        f"Skipped: circuit breaker open after "
                        f"{consecutive_failures} consecutive failures"
                    ),
                    should_escalate=self._config.escalate_on_failure,
                )
            )
            continue

        try:
            # Run hook with timeout
            result = await asyncio.wait_for(
                hook.validate(context),
                timeout=self._config.timeout_seconds,
            )
            results.append(result)
            if result.passed:
                consecutive_failures = 0
            else:
                consecutive_failures += 1
        except TimeoutError:
            consecutive_failures += 1
            results.append(
                GroundingResult(
                    passed=False,
                    hook_name=hook.name,
                    message=f"Hook timed out after {self._config.timeout_seconds}s",
                    should_escalate=self._config.escalate_on_failure,
                )
            )
        except Exception as e:
            consecutive_failures += 1
            results.append(
                GroundingResult(
                    passed=False,
                    hook_name=hook.name,
                    message=f"Hook error: {e!s}",
                    details={"error_type": type(e).__name__},
                    should_escalate=self._config.escalate_on_failure,
                )
            )

    return results
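The circuit-breaker behavior can be seen in isolation with a simplified sketch. Everything here is illustrative: `THRESHOLD` and `TIMEOUT_SECONDS` are assumed values standing in for `CIRCUIT_BREAKER_THRESHOLD` and `timeout_seconds`, and the checks are plain coroutines rather than grounding hooks.

```python
import asyncio
from typing import Awaitable, Callable

THRESHOLD = 2           # assumed; the real CIRCUIT_BREAKER_THRESHOLD is not shown
TIMEOUT_SECONDS = 0.05  # assumed; stands in for the configured timeout


async def passes() -> bool:
    return True


async def hangs() -> bool:
    await asyncio.sleep(1)  # far longer than the timeout
    return True


async def run_checks(checks: list[Callable[[], Awaitable[bool]]]) -> list[str]:
    outcomes: list[str] = []
    consecutive_failures = 0
    for check in checks:
        # Breaker open: record a skip instead of waiting on another timeout.
        if consecutive_failures >= THRESHOLD:
            outcomes.append("skipped")
            continue
        try:
            ok = await asyncio.wait_for(check(), timeout=TIMEOUT_SECONDS)
            outcome = "passed" if ok else "failed"
        except asyncio.TimeoutError:
            outcome = "timeout"
        # A pass resets the streak; any other outcome extends it.
        consecutive_failures = 0 if outcome == "passed" else consecutive_failures + 1
        outcomes.append(outcome)
    return outcomes


outcomes = asyncio.run(run_checks([passes, hangs, hangs, hangs, passes]))
print(outcomes)  # ['passed', 'timeout', 'timeout', 'skipped', 'skipped']
```

Once the breaker opens it stays open for the rest of the run, so the final check is skipped even though it would have passed. The sketch follows the same pattern as the source above, where skipped hooks never reset the failure count.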
aggregate_results
aggregate_results(results)

Aggregate multiple grounding results into overall status.

Parameters:

- results (list[GroundingResult], required): List of grounding results to aggregate.

Returns:

- tuple[bool, str]: Tuple of (overall_passed, summary_message).

Source code in src/marianne/execution/grounding.py
def aggregate_results(
    self,
    results: list[GroundingResult],
) -> tuple[bool, str]:
    """Aggregate multiple grounding results into overall status.

    Args:
        results: List of grounding results to aggregate.

    Returns:
        Tuple of (overall_passed, summary_message).
    """
    if not results:
        return True, "No grounding hooks executed"

    passed = all(r.passed for r in results)
    failed = [r for r in results if not r.passed]

    if passed:
        return True, f"All {len(results)} grounding check(s) passed"

    # Build failure summary
    failures = ", ".join(f"{r.hook_name}: {r.message}" for r in failed)
    return False, f"{len(failed)}/{len(results)} grounding check(s) failed: {failures}"
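A small worked example makes the aggregation rules concrete. This sketch re-implements the logic above as a free function over a trimmed stand-in dataclass; the hook name `api_freshness` and the messages are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:  # trimmed stand-in for GroundingResult
    passed: bool
    hook_name: str
    message: str = ""


def aggregate(results: list[Result]) -> tuple[bool, str]:
    """Same rules as aggregate_results above, written as a free function."""
    if not results:
        return True, "No grounding hooks executed"
    failed = [r for r in results if not r.passed]
    if not failed:
        return True, f"All {len(results)} grounding check(s) passed"
    failures = ", ".join(f"{r.hook_name}: {r.message}" for r in failed)
    return False, f"{len(failed)}/{len(results)} grounding check(s) failed: {failures}"


overall, summary = aggregate([
    Result(True, "file_checksum", "All 2 file checksum(s) validated"),
    Result(False, "api_freshness", "Hook timed out after 30s"),
])
print(overall)  # False
print(summary)  # 1/2 grounding check(s) failed: api_freshness: Hook timed out after 30s
```

Note that an empty result list counts as a pass, so callers that require at least one grounding check need to test for that case themselves.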

Functions

create_hook_from_config

create_hook_from_config(hook_config)

Factory function to create a grounding hook from configuration.

This is the integration point for hook registration. The factory reads hook configuration from YAML and instantiates the appropriate hook class.

Parameters:

- hook_config (GroundingHookConfig, required): Configuration for the hook (from GroundingConfig.hooks).

Returns:

- GroundingHook: An instantiated GroundingHook ready for registration.

Raises:

- TypeError: If hook_config is not a GroundingHookConfig instance.
- ValueError: If hook type is unknown.

Example

from marianne.core.config import GroundingHookConfig

config = GroundingHookConfig(
    type="file_checksum",
    expected_checksums={"output.txt": "abc123..."},
)
hook = create_hook_from_config(config)
grounding_engine.add_hook(hook)

Source code in src/marianne/execution/grounding.py
def create_hook_from_config(
    hook_config: "GroundingHookConfig",
) -> GroundingHook:
    """Factory function to create a grounding hook from configuration.

    This is the integration point for hook registration. The factory reads
    hook configuration from YAML and instantiates the appropriate hook class.

    Args:
        hook_config: Configuration for the hook (from GroundingConfig.hooks).

    Returns:
        An instantiated GroundingHook ready for registration.

    Raises:
        TypeError: If hook_config is not a GroundingHookConfig instance.
        ValueError: If hook type is unknown.

    Example:
        from marianne.core.config import GroundingHookConfig
        config = GroundingHookConfig(
            type="file_checksum",
            expected_checksums={"output.txt": "abc123..."},
        )
        hook = create_hook_from_config(config)
        grounding_engine.add_hook(hook)
    """
    # Import here to avoid TYPE_CHECKING import issues
    from marianne.core.config import GroundingHookConfig as GHC

    if not isinstance(hook_config, GHC):
        raise TypeError(f"Expected GroundingHookConfig, got {type(hook_config)}")

    if hook_config.type == "file_checksum":
        return FileChecksumGroundingHook(
            expected_checksums=hook_config.expected_checksums,
            checksum_algorithm=hook_config.checksum_algorithm,
            name=hook_config.name,
        )

    # Future hook types can be added here:
    # elif hook_config.type == "api_validator":
    #     return ApiValidatorGroundingHook(...)

    raise ValueError(f"Unknown grounding hook type: {hook_config.type}")