spec

Specification corpus configuration models.

Defines SpecFragment (a single spec document) and SpecCorpusConfig (the collection of fragments loaded from a project's spec directory). These models are the data layer for the spec corpus pipeline:

YAML/MD files → SpecCorpusLoader → list[SpecFragment]
                                     ↓
                          SpecCorpusConfig (in JobConfig)
                                     ↓
                          PromptBuilder (injected per-sheet)

Classes

SpecFragment

Bases: BaseModel

A single specification fragment loaded from the spec corpus.

Each fragment corresponds to one file in the project's spec directory. Structured YAML files produce fragments with parsed data; markdown files produce text fragments.

Fragments are tagged for per-sheet filtering: a score can declare spec_tags: {1: ["goals", "safety"]} so sheet 1 only receives fragments matching those tags.
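The per-sheet filtering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the `Fragment` dataclass is a stand-in for SpecFragment (only `name`, `content`, and `tags` are assumed), `fragments_for_sheet` is a hypothetical helper, and the behavior for a sheet with no entry in `spec_tags` (it receives every fragment) is a guess.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fragment:
    """Stand-in for SpecFragment; only these three fields are assumed."""
    name: str
    content: str
    tags: list[str] = field(default_factory=list)


def fragments_for_sheet(fragments, spec_tags, sheet_num):
    """Hypothetical helper: pick the fragments a given sheet should receive."""
    wanted = set(spec_tags.get(sheet_num, []))
    if not wanted:
        # Assumption: a sheet without declared tags gets the whole corpus.
        return list(fragments)
    # Keep a fragment if it shares at least one tag with the sheet's list.
    return [f for f in fragments if wanted & set(f.tags)]


corpus = [
    Fragment("goals", "Ship the MVP.", ["goals"]),
    Fragment("style", "Use type hints.", ["conventions"]),
    Fragment("limits", "Never delete user data.", ["safety"]),
]
spec_tags = {1: ["goals", "safety"]}

sheet_1 = [f.name for f in fragments_for_sheet(corpus, spec_tags, 1)]
sheet_2 = [f.name for f in fragments_for_sheet(corpus, spec_tags, 2)]
print(sheet_1)  # ['goals', 'limits']
print(sheet_2)  # ['goals', 'style', 'limits']
```

Sheet 1 sees only the fragments tagged "goals" or "safety"; the "style" fragment is filtered out because it carries neither tag.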

Functions
name_not_empty classmethod
name_not_empty(v)

Ensure fragment name is not empty or whitespace-only.

Source code in src/marianne/core/config/spec.py
@field_validator("name")
@classmethod
def name_not_empty(cls, v: str) -> str:
    """Ensure fragment name is not empty or whitespace-only."""
    if not v.strip():
        raise ValueError("SpecFragment name must not be empty")
    return v
content_not_empty classmethod
content_not_empty(v)

Ensure fragment content is not empty or whitespace-only.

Source code in src/marianne/core/config/spec.py
@field_validator("content")
@classmethod
def content_not_empty(cls, v: str) -> str:
    """Ensure fragment content is not empty."""
    if not v.strip():
        raise ValueError("SpecFragment content must not be empty")
    return v
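
Both validators rely on the same `if not v.strip()` check. The sketch below isolates that check in a plain function so it runs without Pydantic; `require_non_blank` is a hypothetical name used only for illustration and is not part of the module.

```python
def require_non_blank(value: str, label: str) -> str:
    """Mirror the `if not v.strip()` check used by both validators."""
    if not value.strip():
        raise ValueError(f"SpecFragment {label} must not be empty")
    # The value is returned unchanged; surrounding whitespace is not trimmed.
    return value


results = {}
for candidate in ["goals", "  goals  ", "", "   ", "\n\t"]:
    try:
        results[candidate] = require_non_blank(candidate, "name")
    except ValueError as exc:
        results[candidate] = f"rejected: {exc}"

for candidate, outcome in results.items():
    print(repr(candidate), "->", repr(outcome))
```

Note that the check rejects whitespace-only input but does not normalize accepted values: `"  goals  "` passes through with its padding intact.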

SpecCorpusConfig

Bases: BaseModel

Configuration for the specification corpus.

Controls where spec fragments are loaded from and how they are filtered for injection into agent prompts.

Functions
get_fragments_by_tags
get_fragments_by_tags(tags)

Filter fragments by tags.

Parameters:

tags (list[str], required): Tags to filter by. A fragment matches if it has at least one tag in common with the filter list. An empty filter list returns all fragments (no filtering).

Returns:

list[SpecFragment]: List of matching fragments.

Source code in src/marianne/core/config/spec.py
def get_fragments_by_tags(self, tags: list[str]) -> list[SpecFragment]:
    """Filter fragments by tags.

    Args:
        tags: Tags to filter by. A fragment matches if it has at least
            one tag in common with the filter list. An empty filter list
            returns all fragments (no filtering).

    Returns:
        List of matching fragments.
    """
    if not tags:
        return list(self.fragments)

    tag_set = set(tags)
    return [f for f in self.fragments if tag_set & set(f.tags)]
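
The matching rules can be exercised with a small stand-alone sketch. The `Fragment` and `Corpus` dataclasses here are simplified stand-ins for SpecFragment and SpecCorpusConfig (their real field sets are not shown on this page); only the method body is copied from the listing above.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """Stand-in for SpecFragment with only the fields the filter needs."""
    name: str
    tags: list[str] = field(default_factory=list)


@dataclass
class Corpus:
    """Stand-in for SpecCorpusConfig."""
    fragments: list[Fragment] = field(default_factory=list)

    def get_fragments_by_tags(self, tags: list[str]) -> list[Fragment]:
        # Same logic as the listing above.
        if not tags:
            return list(self.fragments)
        tag_set = set(tags)
        return [f for f in self.fragments if tag_set & set(f.tags)]


corpus = Corpus([
    Fragment("goals", ["goals", "product"]),
    Fragment("limits", ["safety"]),
    Fragment("notes", []),
])


def names(frags):
    return [f.name for f in frags]


any_match = names(corpus.get_fragments_by_tags(["product", "safety"]))
no_match = names(corpus.get_fragments_by_tags(["unknown"]))
unfiltered = names(corpus.get_fragments_by_tags([]))
print(any_match)   # ['goals', 'limits']
print(no_match)    # []
print(unfiltered)  # ['goals', 'limits', 'notes']
```

One consequence worth noticing: an untagged fragment such as "notes" is returned only when the filter list is empty, since it can never share a tag with a non-empty filter.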
corpus_hash
corpus_hash()

Compute a deterministic hash of the corpus content.

The hash is order-independent: the same set of fragments produces the same hash regardless of insertion order. This prevents false drift detection when filesystem listing order varies across operating systems.

Returns:

str: Hex digest string. An empty corpus always produces the same value, the SHA-256 digest of empty input.

Source code in src/marianne/core/config/spec.py
def corpus_hash(self) -> str:
    """Compute a deterministic hash of the corpus content.

    The hash is order-independent: the same set of fragments produces
    the same hash regardless of insertion order. This prevents false
    drift detection when filesystem listing order varies across OS.

    Returns:
        Hex digest string. Empty corpus produces a consistent empty hash.
    """
    if not self.fragments:
        return hashlib.sha256(b"").hexdigest()

    # Sort by name for order independence, then hash content
    sorted_fragments = sorted(self.fragments, key=lambda f: f.name)
    hasher = hashlib.sha256()
    for frag in sorted_fragments:
        hasher.update(frag.name.encode("utf-8"))
        hasher.update(b"\x00")
        hasher.update(frag.content.encode("utf-8"))
        hasher.update(b"\x00")
    return hasher.hexdigest()
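
The properties claimed for the hash can be checked with a short sketch. It reuses the algorithm from the listing above as a free function; the `Fragment` dataclass is a simplified stand-in for SpecFragment, and the sample fragment contents are made up for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Stand-in for SpecFragment with only the hashed fields."""
    name: str
    content: str


def corpus_hash(fragments):
    # Same algorithm as the method above, written as a free function.
    if not fragments:
        return hashlib.sha256(b"").hexdigest()
    hasher = hashlib.sha256()
    for frag in sorted(fragments, key=lambda f: f.name):
        hasher.update(frag.name.encode("utf-8"))
        hasher.update(b"\x00")
        hasher.update(frag.content.encode("utf-8"))
        hasher.update(b"\x00")
    return hasher.hexdigest()


a = Fragment("goals", "Ship the MVP.")
b = Fragment("limits", "Never delete user data.")

# Insertion order does not matter.
order_independent = corpus_hash([a, b]) == corpus_hash([b, a])

# Any content change produces a different digest.
edited = Fragment("limits", "Never delete data.")
detects_change = corpus_hash([a, edited]) != corpus_hash([a, b])

# The NUL separators keep the name/content boundary unambiguous.
boundary_safe = corpus_hash([Fragment("ab", "c")]) != corpus_hash([Fragment("a", "bc")])

empty_hash = corpus_hash([])
print(order_independent, detects_change, boundary_safe)
```

Because `sorted` is stable and the key is the name alone, two fragments sharing a name would keep their input order, so order independence depends on fragment names being unique. The page does not say whether uniqueness is enforced elsewhere.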