execution
Execution and retry configuration models.
Defines models for retry behavior, rate limiting, circuit breaker, cost limits, and parallel execution.
Classes
RetryConfig
Bases: BaseModel
Configuration for retry behavior including partial completion recovery.
RateLimitConfig
Bases: BaseModel
Configuration for rate limit detection and handling.
CircuitBreakerConfig
Bases: BaseModel
Configuration for the circuit breaker pattern.
The circuit breaker prevents cascading failures by temporarily blocking requests after repeated failures. This gives the backend time to recover before retrying.
State transitions:
- CLOSED (normal): Requests flow through, failures are tracked
- OPEN (blocking): Requests are blocked after failure_threshold exceeded
- HALF_OPEN (testing): Single request allowed to test recovery
Evolution #8: Cross-Workspace Circuit Breaker adds coordination between parallel Marianne jobs via the global learning store. When one job hits a rate limit, other jobs will honor that limit and wait.
Example
circuit_breaker:
  enabled: true
  failure_threshold: 5
  recovery_timeout_seconds: 300
  cross_workspace_coordination: true
  honor_other_jobs_rate_limits: true
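The state transitions described above can be sketched as a small state machine. This is an illustration only, not the library's implementation: the `CircuitBreaker` class, its method names, and the injectable `clock` are hypothetical, and it assumes the breaker opens once consecutive failures reach `failure_threshold`. Cross-workspace coordination is not modeled.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    """Hypothetical sketch of the CLOSED -> OPEN -> HALF_OPEN cycle."""

    def __init__(self, failure_threshold=5, recovery_timeout_seconds=300.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout_seconds = recovery_timeout_seconds
        self._clock = clock
        self.state = State.CLOSED
        self._failures = 0
        self._opened_at = 0.0

    def allow_request(self):
        if self.state is State.OPEN:
            elapsed = self._clock() - self._opened_at
            if elapsed >= self.recovery_timeout_seconds:
                # Recovery window has passed: let a single probe through.
                self.state = State.HALF_OPEN
                return True
            return False
        if self.state is State.HALF_OPEN:
            # A probe is already in flight; block everything else.
            return False
        return True

    def record_success(self):
        self._failures = 0
        self.state = State.CLOSED

    def record_failure(self):
        self._failures += 1
        # A failed probe reopens immediately (assumption).
        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()
```

Passing a fake `clock` makes the recovery timeout testable without sleeping.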
CostLimitConfig
Bases: BaseModel
Configuration for cost tracking and limits.
Prevents runaway costs by tracking token usage and optionally enforcing cost limits per sheet or per job. Cost is estimated from token counts using configurable rates.
When cost limits are exceeded:
- The current sheet is marked as failed with reason "cost_limit"
- For per-job limits, the job is paused to prevent further execution
- All cost data is recorded in checkpoint state for analysis
Example
cost_limits:
  enabled: true
  max_cost_per_sheet: 5.00
  max_cost_per_job: 100.00
  cost_per_1k_input_tokens: 0.003
  cost_per_1k_output_tokens: 0.015
Default rates are for Claude Sonnet. For Opus, use:
cost_per_1k_input_tokens: 0.015
cost_per_1k_output_tokens: 0.075
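To make the rate fields concrete, here is a rough sketch of how a cost estimate could be derived from token counts. The function names are hypothetical, and the formula (tokens divided by 1000, times the per-1k rate) is an assumption inferred from the field names rather than a statement of how the library computes cost.

```python
def estimate_cost(input_tokens, output_tokens,
                  cost_per_1k_input_tokens=0.003,
                  cost_per_1k_output_tokens=0.015):
    """Estimate cost from token counts (assumed formula)."""
    input_cost = (input_tokens / 1000) * cost_per_1k_input_tokens
    output_cost = (output_tokens / 1000) * cost_per_1k_output_tokens
    return input_cost + output_cost


def exceeds_limit(cost, max_cost):
    """Return True when a limit is set and the cost is above it.

    Treating None as "no limit" is an assumption of this sketch.
    """
    return max_cost is not None and cost > max_cost


# 200k input tokens and 50k output tokens at the rates shown above.
sheet_cost = estimate_cost(200_000, 50_000)
over_sheet_limit = exceeds_limit(sheet_cost, max_cost=5.00)
```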
StaleDetectionConfig
Bases: BaseModel
Configuration for detecting stale (hung) sheet executions.
When enabled, monitors execution activity and fails sheets that produce
no output for longer than idle_timeout_seconds. This catches hung
processes that the per-sheet timeout alone may not detect quickly enough
(e.g., a 30-minute timeout sheet that hangs after 2 minutes of output).
Example
stale_detection:
  enabled: true
  idle_timeout_seconds: 300
  check_interval_seconds: 30
Note: The idle timeout should be generous enough to accommodate legitimate pauses (e.g., waiting for API responses). A minimum of 120 seconds is recommended for LLM-based workloads.
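The idle check can be pictured as a watchdog that is reset whenever the sheet produces output and polled every `check_interval_seconds`. The sketch below is illustrative only; the `IdleWatchdog` class and its methods are hypothetical names, not part of the documented API.

```python
import time


class IdleWatchdog:
    """Hypothetical sketch: flags an execution as stale after a quiet period."""

    def __init__(self, idle_timeout_seconds=300.0, clock=time.monotonic):
        self.idle_timeout_seconds = idle_timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        # Call whenever the sheet emits output.
        self._last_activity = self._clock()

    def idle_seconds(self):
        return self._clock() - self._last_activity

    def is_stale(self):
        # Polled periodically by a monitor loop (not shown here).
        return self.idle_seconds() > self.idle_timeout_seconds
```

In the 30-minute example above, a sheet that goes quiet after 2 minutes would be flagged roughly `idle_timeout_seconds` later instead of at the 30-minute mark.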
PreflightConfig
Bases: BaseModel
Configuration for pre-flight prompt analysis before sheet execution.
Controls token count thresholds for warning and error during preflight checks. Different instruments have different context windows — a 150K threshold that's correct for a 200K-context model is wrong for a 1M-context model. Set thresholds appropriate for the instruments in use.
This config lives at the daemon level because the conductor manages all execution and knows which instruments are available. Individual scores inherit the daemon's thresholds unless overridden.
Example (daemon config for large-context instruments):
preflight:
  token_warning_threshold: 200000
  token_error_threshold: 800000

Example (disable the error threshold entirely):
preflight:
  token_error_threshold: 0
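A minimal sketch of how the two thresholds might be applied to a prompt's token count. The function name and return values are hypothetical. Treating 0 as "disabled" follows the example above for the error threshold; applying the same convention to the warning threshold, and using a strict greater-than comparison, are assumptions.

```python
def classify_prompt(token_count, token_warning_threshold, token_error_threshold):
    """Return "error", "warning", or "ok" for a prompt's token count."""
    # A threshold of 0 is treated as disabled.
    if token_error_threshold and token_count > token_error_threshold:
        return "error"
    if token_warning_threshold and token_count > token_warning_threshold:
        return "warning"
    return "ok"


# With the large-context thresholds from the first example:
status = classify_prompt(250_000, 200_000, 800_000)
```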
ParallelConfig
Bases: BaseModel
Configuration for parallel sheet execution (v17 evolution).
Enables running multiple sheets concurrently when the dependency DAG permits. Requires sheet dependencies to be configured for meaningful parallel execution.
Example YAML
parallel:
  enabled: true
  max_concurrent: 3
  fail_fast: true

sheet:
  dependencies:
    2: [1]
    3: [1]
    4: [2, 3]
With this config, sheets 2 and 3 can run in parallel after sheet 1 completes, then sheet 4 runs after both 2 and 3 complete.
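The ordering in this example can be reproduced with a simple wave-based planner over the dependency map. This is a sketch under stated assumptions: `plan_batches` is a hypothetical helper, and a real scheduler would more likely start each sheet as soon as its own dependencies finish rather than waiting for a whole wave.

```python
def plan_batches(sheets, dependencies, max_concurrent):
    """Group sheets into waves whose dependencies are already complete."""
    done = set()
    remaining = list(sheets)
    batches = []
    while remaining:
        # A sheet is ready when every sheet it depends on has completed.
        ready = [s for s in remaining if set(dependencies.get(s, [])) <= done]
        if not ready:
            raise ValueError("dependency cycle or reference to unknown sheet")
        batch = ready[:max_concurrent]
        batches.append(batch)
        done.update(batch)
        remaining = [s for s in remaining if s not in batch]
    return batches


deps = {2: [1], 3: [1], 4: [2, 3]}
batches = plan_batches([1, 2, 3, 4], deps, max_concurrent=3)
# batches == [[1], [2, 3], [4]]
```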
ValidationRule
Bases: BaseModel
A single validation rule for checking sheet outputs.
Supports staged execution via the stage field. Validations are run
in stage order (1, 2, 3...). If any validation in a stage fails,
higher stages are skipped (fail-fast behavior).
Typical stage layout:
- Stage 1: Syntax & compilation (cargo check, cargo fmt --check)
- Stage 2: Testing (cargo test, pytest)
- Stage 3: Code quality (clippy -D warnings, ruff check)
- Stage 4: Security (cargo audit, npm audit)
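The fail-fast behavior across stages can be sketched as follows. `run_staged` and the `(stage, name, check)` tuple shape are hypothetical and chosen only for illustration. The sketch assumes every rule within a failing stage still runs and only higher stages are skipped.

```python
from itertools import groupby


def run_staged(rules):
    """Run (stage, name, check) rules in stage order with fail-fast.

    Each check is a zero-argument callable returning True on success.
    """
    results = {}
    ordered = sorted(rules, key=lambda rule: rule[0])
    for _stage, group in groupby(ordered, key=lambda rule: rule[0]):
        stage_ok = True
        for _, name, check in group:
            passed = check()
            results[name] = "passed" if passed else "failed"
            stage_ok = stage_ok and passed
        if not stage_ok:
            break  # higher stages are skipped
    # Anything that never ran is reported as skipped.
    for _, name, _ in ordered:
        results.setdefault(name, "skipped")
    return results


outcome = run_staged([
    (2, "tests", lambda: True),
    (1, "fmt", lambda: False),
    (1, "check", lambda: True),
    (3, "lint", lambda: True),
])
```

Here `fmt` fails in stage 1, so `tests` and `lint` are never executed.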
SkipWhenCommand
Bases: BaseModel
A command-based conditional skip rule for sheet execution.
When the command exits 0, the sheet is SKIPPED. When the command exits non-zero, the sheet RUNS. On timeout or error, the sheet RUNS (fail-open for safety).
The command field supports {workspace} template expansion,
following the same pattern as validation commands.
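The exit-code semantics above can be illustrated with a short sketch. `should_skip` and its `timeout_seconds` parameter are hypothetical names. Running the command through a shell and quoting the workspace path with `shlex.quote` are assumptions of this sketch, not documented behavior.

```python
import shlex
import subprocess


def should_skip(command, workspace, timeout_seconds=10.0):
    """Return True only when the command exits 0; otherwise the sheet runs."""
    # Expand the {workspace} placeholder, quoting the path for the shell.
    expanded = command.replace("{workspace}", shlex.quote(workspace))
    try:
        completed = subprocess.run(
            expanded,
            shell=True,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Fail-open: on timeout or error the sheet still runs.
        return False
    return completed.returncode == 0
```

For example, `should_skip("test -f {workspace}/done.marker", workspace)` would skip a sheet once a marker file exists (the file name is made up for illustration).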