Index
core
¶
Core domain models and configuration.
Classes¶
CheckpointState
¶
Bases: BaseModel
Complete checkpoint state for a job run.
This is the primary state object that gets persisted and restored for resumable job execution.
Zombie Detection
A job is considered a "zombie" when the state shows RUNNING status
but the associated process (tracked by pid) is no longer alive.
This can happen when:
- An external timeout wrapper sends SIGKILL
- The system crashes or the process is forcibly terminated
- WSL shuts down while the job is running
Use is_zombie() to detect this state, and mark_zombie_detected()
to recover from it.
Worktree Isolation
When isolation is enabled, jobs execute in a separate git worktree. The worktree tracking fields record the worktree state for:
- Resume operations (reuse existing worktree)
- Cleanup on completion (remove or preserve based on outcome)
- Debugging (know which worktree was used)
Functions¶
record_hook_result
¶
Append a hook result to the checkpoint state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | dict[str, Any] | Serialized HookResult dict from hook execution. | required |
Source code in src/marianne/core/checkpoint.py
record_circuit_breaker_change
¶
Record a circuit breaker state transition.
Persists circuit breaker state changes so that mzt status
can display ground-truth CB state instead of inferring it from
failure patterns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| state | str | Current CB state after transition ("closed", "open", "half_open"). | required |
| trigger | str | What caused the transition (e.g., "failure_recorded", "success_recorded"). | required |
| consecutive_failures | int | Number of consecutive failures at time of transition. | required |
Source code in src/marianne/core/checkpoint.py
add_synthesis
¶
Add or update a synthesis result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch_id | str | The batch identifier. | required |
| result | SynthesisResultDict | Synthesis result as dict (from SynthesisResult.to_dict()). | required |
Source code in src/marianne/core/checkpoint.py
get_next_sheet
¶
Determine the next sheet to process.
Returns None if all sheets are complete.
Source code in src/marianne/core/checkpoint.py
mark_sheet_started
¶
Mark a sheet as started.
Source code in src/marianne/core/checkpoint.py
mark_sheet_completed
¶
mark_sheet_completed(sheet_num, validation_passed=True, validation_details=None, execution_duration_seconds=None)
Mark a sheet as completed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sheet_num | int | Sheet number that completed. | required |
| validation_passed | bool | Whether validation checks passed. | True |
| validation_details | list[ValidationDetailDict] \| None | Detailed validation results. | None |
| execution_duration_seconds | float \| None | How long the sheet execution took. | None |
Source code in src/marianne/core/checkpoint.py
mark_sheet_failed
¶
mark_sheet_failed(sheet_num, error_message, error_category=None, exit_code=None, exit_signal=None, exit_reason=None, execution_duration_seconds=None, error_code=None)
Mark a sheet as failed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sheet_num | int | Sheet number that failed. | required |
| error_message | str | Human-readable error description. | required |
| error_category | ErrorCategory \| str \| None | Error category from ErrorClassifier (e.g., "signal", "timeout"). | None |
| exit_code | int \| None | Process exit code (None if killed by signal). | None |
| exit_signal | int \| None | Signal number if killed by signal (e.g., 9=SIGKILL, 15=SIGTERM). | None |
| exit_reason | ExitReason \| None | Why execution ended ("completed", "timeout", "killed", "error"). | None |
| execution_duration_seconds | float \| None | How long the sheet execution took. | None |
| error_code | str \| None | Structured error code (e.g., "E001", "E006"). More specific than error_category: it distinguishes stale (E006) from timeout (E001). | None |
Source code in src/marianne/core/checkpoint.py
mark_sheet_skipped
¶
Mark a sheet as skipped.
v21 Evolution: Proactive Checkpoint System - supports skipping sheets via checkpoint response.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sheet_num | int | Sheet number to skip. | required |
| reason | str \| None | Optional reason for skipping (stored in error_message field). | None |
Source code in src/marianne/core/checkpoint.py
mark_job_failed
¶
Mark the entire job as failed.
Source code in src/marianne/core/checkpoint.py
mark_job_paused
¶
Mark the job as paused.
Source code in src/marianne/core/checkpoint.py
get_progress
¶
Get progress as (completed, total).
get_progress_percent
¶
is_zombie
¶
Check if this job is a zombie (RUNNING but process dead).
A zombie state occurs when:
1. Status is RUNNING
2. PID is set
3. Process with that PID is no longer alive
Note: This only checks if the PID is dead. It does NOT use time-based stale detection, as jobs can legitimately run for hours or days.
Returns:
| Type | Description |
|---|---|
| bool | True if job appears to be a zombie, False otherwise. |
Source code in src/marianne/core/checkpoint.py
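The three conditions above can be sketched with a PID liveness probe. This is a minimal illustration, not the code in checkpoint.py: the helper names `pid_is_alive` and `looks_like_zombie` and the lowercase status string `"running"` are assumptions, and the probe relies on POSIX `os.kill(pid, 0)` semantics.

```python
import os
from typing import Optional


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only performs existence/permission checks
    except ProcessLookupError:
        return False  # no process with that PID
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def looks_like_zombie(status: str, pid: Optional[int]) -> bool:
    """Hypothetical mirror of the documented rule: RUNNING, PID set, process dead."""
    return status == "running" and pid is not None and not pid_is_alive(pid)


# The current process is alive, so it is never reported as a zombie.
print(looks_like_zombie("running", os.getpid()))
```

Because the check looks only at PID liveness, a job that has been running for days is not flagged as long as its process still exists.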
mark_zombie_detected
¶
Mark this job as recovered from zombie state.
Changes status from RUNNING to PAUSED, clears PID, and records the zombie recovery in the error message.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reason | str \| None | Optional additional context about why zombie was detected. | None |
Source code in src/marianne/core/checkpoint.py
set_running_pid
¶
Set the PID of the running orchestrator process.
Call this when starting job execution to enable zombie detection. If pid is None, uses the current process PID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pid | int \| None | Process ID to record. Defaults to current process. | None |
Source code in src/marianne/core/checkpoint.py
SheetState
¶
Bases: BaseModel
State for a single sheet.
Attributes¶
applied_pattern_ids
property
writable
¶
Backward-compatible accessor for pattern IDs.
applied_pattern_descriptions
property
writable
¶
Backward-compatible accessor for pattern descriptions.
has_fallback_available
property
¶
Whether there is another instrument in the fallback chain to try.
Functions¶
record_attempt
¶
Record an attempt result and update tracking state.
Only non-successful, non-rate-limited attempts increment
normal_attempts. Successes and rate-limited attempts are
recorded in attempt_results for history but don't consume
retry budget.
Source code in src/marianne/core/checkpoint.py
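The budget rule can be illustrated with a small stand-in class. `AttemptTracker` and the shape of the recorded entries are hypothetical. Only the field names `normal_attempts` and `attempt_results` come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class AttemptTracker:
    """Hypothetical stand-in for the attempt-tracking part of SheetState."""

    normal_attempts: int = 0
    attempt_results: list = field(default_factory=list)

    def record_attempt(self, success: bool, rate_limited: bool = False) -> None:
        # Every attempt is kept for history, whatever its outcome.
        self.attempt_results.append({"success": success, "rate_limited": rate_limited})
        # Only ordinary failures consume retry budget.
        if not success and not rate_limited:
            self.normal_attempts += 1


tracker = AttemptTracker()
tracker.record_attempt(success=False)                     # ordinary failure
tracker.record_attempt(success=False, rate_limited=True)  # rate limited
tracker.record_attempt(success=True)                      # success
print(tracker.normal_attempts, len(tracker.attempt_results))
```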
advance_fallback
¶
Advance to the next instrument in the fallback chain.
Records the transition, switches instrument_name, resets retry budget, and increments current_instrument_index.
Returns the new instrument name, or None if the chain is exhausted.
Source code in src/marianne/core/checkpoint.py
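A rough sketch of the fallback walk, under the assumption that the chain is an ordered list with the primary instrument first. `FallbackChain`, its `instruments` field, and the instrument names are hypothetical. Here `instrument_name` is derived from the index rather than stored, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FallbackChain:
    """Hypothetical model of a sheet's instrument fallback chain."""

    instruments: list  # primary instrument first, then fallbacks in order
    current_instrument_index: int = 0
    normal_attempts: int = 0
    history: list = field(default_factory=list)

    @property
    def instrument_name(self) -> str:
        return self.instruments[self.current_instrument_index]

    @property
    def has_fallback_available(self) -> bool:
        return self.current_instrument_index + 1 < len(self.instruments)

    def advance_fallback(self, reason: str) -> Optional[str]:
        if not self.has_fallback_available:
            return None  # chain exhausted
        previous = self.instrument_name
        self.current_instrument_index += 1
        self.normal_attempts = 0  # the new instrument gets a fresh retry budget
        self.history.append({"from": previous, "to": self.instrument_name, "reason": reason})
        return self.instrument_name


chain = FallbackChain(instruments=["primary", "backup"], normal_attempts=3)
print(chain.advance_fallback("retries exhausted"))
print(chain.advance_fallback("retries exhausted"))
```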
to_dict
¶
from_dict
classmethod
¶
Restore from a dict. Provided for compatibility with the dataclass SheetExecutionState.
capture_output
¶
Capture tail of stdout/stderr for debugging.
Stores the last max_bytes of each output stream. Sets output_truncated
to True if either stream was larger than the limit.
Credential scanning (F-003): Before storing, both streams are scanned for API key patterns (sk-ant-, AKIA, AIzaSy, Bearer tokens) and matches are replaced with [REDACTED_] placeholders. This prevents leaked credentials from propagating to learning store, dashboard, diagnostics, and MCP resources.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stdout | str | Full stdout string from execution. | required |
| stderr | str | Full stderr string from execution. | required |
| max_bytes | int | Maximum bytes to capture per stream (default 10KB). | MAX_OUTPUT_CAPTURE_BYTES |
Source code in src/marianne/core/checkpoint.py
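The capture-and-redact behavior might look roughly like the sketch below. The regular expressions, the `[REDACTED]` placeholder text, and the `stdout_tail`/`stderr_tail` keys are assumptions for illustration. The real scanner's patterns and placeholder format are not shown on this page. Redacting before truncating is a choice made here so that a key cut in half by the tail boundary cannot slip through.

```python
import re

MAX_OUTPUT_CAPTURE_BYTES = 10 * 1024  # matches the documented 10KB default

# Illustrative patterns only, one per credential family named in the docs.
_CREDENTIAL_PATTERNS = [
    re.compile(r"sk-ant-[A-Za-z0-9_\-]+"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"AIzaSy[A-Za-z0-9_\-]+"),
    re.compile(r"Bearer\s+[A-Za-z0-9._\-]+"),
]


def redact(text: str) -> str:
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in _CREDENTIAL_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def capture_tail(stream: str, max_bytes: int = MAX_OUTPUT_CAPTURE_BYTES) -> tuple:
    """Return (tail, truncated) where tail is at most max_bytes of UTF-8."""
    raw = redact(stream).encode("utf-8")
    start = max(len(raw) - max_bytes, 0)
    # errors="ignore" drops a multi-byte character split by the cut.
    return raw[start:].decode("utf-8", errors="ignore"), start > 0


def capture_output(stdout: str, stderr: str, max_bytes: int = MAX_OUTPUT_CAPTURE_BYTES) -> dict:
    out, out_cut = capture_tail(stdout, max_bytes)
    err, err_cut = capture_tail(stderr, max_bytes)
    return {"stdout_tail": out, "stderr_tail": err, "output_truncated": out_cut or err_cut}


print(capture_output("token sk-ant-abc123 used", "", max_bytes=64))
```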
add_error_to_history
¶
Append an error record and enforce the history size limit.
All callers that add errors to error_history should use this
method instead of appending directly so that the list never exceeds
MAX_ERROR_HISTORY entries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| error | CheckpointErrorRecord | The error record to add. | required |
Source code in src/marianne/core/checkpoint.py
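A bounded history of this kind can be sketched as append-then-trim. The limit value of 10 and the choice to drop the oldest entries first are assumptions. The page states only that the list never exceeds MAX_ERROR_HISTORY.

```python
MAX_ERROR_HISTORY = 10  # assumed value, the real constant is not shown here


def add_error_to_history(history: list, error: dict) -> None:
    """Append an error record, then drop the oldest entries beyond the limit."""
    history.append(error)
    overflow = len(history) - MAX_ERROR_HISTORY
    if overflow > 0:
        del history[:overflow]


errors = []
for i in range(15):
    add_error_to_history(errors, {"message": f"error {i}"})
print(len(errors), errors[0]["message"])
```

Routing every write through one helper keeps the invariant in a single place, which is why direct appends are discouraged.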
add_fallback_to_history
¶
Append a fallback record and enforce the history size limit.
All callers that add entries to instrument_fallback_history
should use this method instead of appending directly so the list
never exceeds MAX_INSTRUMENT_FALLBACK_HISTORY entries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| record | dict[str, str] | Dict with keys from, to, reason, timestamp. | required |
Source code in src/marianne/core/checkpoint.py
SheetStatus
¶
Bases: str, Enum
Status of a single sheet.
The baton tracks 11 scheduling states. Status display and persistence
use all 11. Consumers that only care about terminal/non-terminal can
check is_terminal.
BackendConfig
¶
Bases: BaseModel
Configuration for the execution backend.
Uses a flat structure with cross-field validation to ensure type-specific
fields are only meaningful when the corresponding backend type is selected.
The _validate_type_specific_fields validator warns when fields for an
unselected backend are set to non-default values.
JobConfig
¶
Bases: BaseModel
Complete configuration for an orchestration job.
Functions¶
to_yaml
¶
Serialize this JobConfig to valid score YAML.
The output is semantically equivalent to the original config:
from_yaml_string(config.to_yaml()) produces an equivalent config
(compared via model_dump()). String-level identity with the
original YAML file is NOT guaranteed because workspace paths are
resolved to absolute at parse time and fan-out configs are expanded.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| exclude_defaults | bool | If True, omit fields that match their default values for cleaner output. Defaults to False (lossless). | False |
Returns:
| Type | Description |
|---|---|
| str | A valid YAML string that |
Source code in src/marianne/core/config/job.py
from_yaml
classmethod
¶
Load job configuration from a YAML file.
Source code in src/marianne/core/config/job.py
from_yaml_string
classmethod
¶
Load job configuration from a YAML string.
Source code in src/marianne/core/config/job.py
get_state_path
¶
Get the resolved state path.
Source code in src/marianne/core/config/job.py
get_outcome_store_path
¶
Get the resolved outcome store path for learning.
Source code in src/marianne/core/config/job.py
NotificationConfig
¶
Bases: BaseModel
Configuration for a notification channel.
PromptConfig
¶
Bases: BaseModel
Configuration for prompt templating.
Functions¶
at_least_one_template
¶
Warn when no template source is provided (falls back to default prompt).
Source code in src/marianne/core/config/job.py
RateLimitConfig
¶
Bases: BaseModel
Configuration for rate limit detection and handling.
RetryConfig
¶
Bases: BaseModel
Configuration for retry behavior including partial completion recovery.
SheetConfig
¶
Bases: BaseModel
Configuration for sheet processing.
In Marianne's musical theme, a composition is divided into sheets, each containing a portion of the work to be performed.
Fan-out support: When fan_out is specified, stages are expanded into
concrete sheets at parse time. For example, total_items=7, fan_out={2: 3}
produces 9 concrete sheets (stage 2 instantiated 3 times). After expansion,
total_items and dependencies reflect expanded values, and fan_out
is cleared to {} to prevent re-expansion on resume.
Attributes¶
total_stages
property
¶
Return the original stage count.
After fan-out expansion, total_items reflects expanded sheet count. total_stages preserves the original logical stage count from fan_out_stage_map. When no fan-out was used, total_stages == total_sheets (identity).
Functions¶
strip_computed_fields
classmethod
¶
Strip computed properties that users may include in YAML.
total_sheets is computed from size/total_items, not configurable. Accept it silently for backward compatibility — rejecting it would break existing scores that include it.
Source code in src/marianne/core/config/job.py
validate_per_sheet_instruments
classmethod
¶
Validate per-sheet instrument assignments.
Source code in src/marianne/core/config/job.py
validate_per_sheet_fallbacks
classmethod
¶
Validate per-sheet fallback chain keys are positive integers.
Source code in src/marianne/core/config/job.py
validate_instrument_map
classmethod
¶
Validate instrument_map: no duplicate sheets, valid names.
Source code in src/marianne/core/config/job.py
get_fan_out_metadata
¶
Get fan-out metadata for a specific sheet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sheet_num | int | Concrete sheet number (1-indexed). | required |
Returns:
| Type | Description |
|---|---|
| FanOutMetadata | FanOutMetadata with stage, instance, and fan_count. When no fan-out is configured, returns identity metadata (stage=sheet_num, instance=1, fan_count=1). |
Source code in src/marianne/core/config/job.py
validate_fan_out
classmethod
¶
Validate fan_out field values.
Source code in src/marianne/core/config/job.py
validate_dependencies
classmethod
¶
Validate dependency declarations.
Note: Full validation (range checks, cycle detection) happens when the DependencyDAG is built at runtime, since total_sheets isn't available during field validation.
Source code in src/marianne/core/config/job.py
expand_fan_out_config
¶
Expand fan_out declarations into concrete sheet assignments.
This runs after field validators. When fan_out is non-empty:
1. Validates constraints (size=1, start_item=1)
2. Calls expand_fan_out() to compute concrete sheet assignments
3. Overwrites total_items and dependencies with expanded values
4. Stores metadata in fan_out_stage_map for resume support
5. Clears fan_out={} to prevent re-expansion on resume
Source code in src/marianne/core/config/job.py
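The sheet-count arithmetic from the SheetConfig example (total_items=7 with fan_out={2: 3} giving 9 sheets) can be reproduced with a short sketch. `expand_stages` and `FanOutMeta` are hypothetical names, and the sketch assumes concrete sheets are numbered sequentially in stage order. Dependency rewriting is left out.

```python
from typing import NamedTuple


class FanOutMeta(NamedTuple):
    """Hypothetical counterpart of FanOutMetadata."""

    stage: int
    instance: int
    fan_count: int


def expand_stages(total_stages: int, fan_out: dict) -> dict:
    """Map concrete sheet numbers (1-indexed) to their stage metadata."""
    sheets = {}
    sheet_num = 1
    for stage in range(1, total_stages + 1):
        count = fan_out.get(stage, 1)  # stages without fan-out run once
        for instance in range(1, count + 1):
            sheets[sheet_num] = FanOutMeta(stage, instance, count)
            sheet_num += 1
    return sheets


stage_map = expand_stages(total_stages=7, fan_out={2: 3})
print(len(stage_map))  # 6 single stages + 3 instances of stage 2
print(stage_map[3])
```

With an empty fan_out the mapping is the identity (stage equals sheet number, instance 1, fan_count 1), which matches the identity metadata described for get_fan_out_metadata.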
validate_dependency_range
¶
Validate that dependency sheet numbers are within the valid range.
Runs after fan-out expansion so total_sheets reflects the final count.
Source code in src/marianne/core/config/job.py
ValidationRule
¶
Bases: BaseModel
A single validation rule for checking sheet outputs.
Supports staged execution via the stage field. Validations are run
in stage order (1, 2, 3...). If any validation in a stage fails,
higher stages are skipped (fail-fast behavior).
Typical stage layout:
- Stage 1: Syntax & compilation (cargo check, cargo fmt --check)
- Stage 2: Testing (cargo test, pytest)
- Stage 3: Code quality (clippy -D warnings, ruff check)
- Stage 4: Security (cargo audit, npm audit)
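The fail-fast ordering can be sketched as follows. `run_staged` and the `(stage, name, check)` tuple shape are hypothetical. The sketch also assumes that every rule in the failing stage still runs before higher stages are skipped, which the description does not spell out.

```python
from collections import defaultdict


def run_staged(rules: list) -> dict:
    """Run (stage, name, check) rules in stage order with fail-fast between stages."""
    by_stage = defaultdict(list)
    for stage, name, check in rules:
        by_stage[stage].append((name, check))

    results = {}
    blocked = False  # becomes True once any rule in an earlier stage has failed
    for stage in sorted(by_stage):
        if blocked:
            for name, _ in by_stage[stage]:
                results[name] = "skipped"
            continue
        for name, check in by_stage[stage]:
            ok = check()
            results[name] = "passed" if ok else "failed"
            if not ok:
                blocked = True  # finish this stage, skip the higher ones
    return results


outcome = run_staged([
    (1, "syntax", lambda: True),
    (2, "unit-tests", lambda: False),
    (2, "integration-tests", lambda: True),
    (3, "lint", lambda: True),
])
print(outcome)
```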
ClassifiedError
dataclass
¶
ClassifiedError(category, message, error_code=UNKNOWN, original_error=None, exit_code=None, exit_signal=None, exit_reason=None, retriable=True, suggested_wait_seconds=None, error_info=None)
An error with its classification and metadata.
ClassifiedError combines high-level category (for retry logic) with specific error codes (for diagnostics and logging). The error_code provides stable identifiers for programmatic handling while category determines retry behavior.
ErrorCategory
¶
Bases: str, Enum
Categories of errors with different retry behaviors.
Attributes¶
RATE_LIMIT
class-attribute
instance-attribute
¶
Retriable with long wait - API/service is rate limiting.
TRANSIENT
class-attribute
instance-attribute
¶
Retriable with backoff - temporary network/service issues.
VALIDATION
class-attribute
instance-attribute
¶
Retriable - Claude ran but didn't produce expected output.
AUTH
class-attribute
instance-attribute
¶
Fatal - authentication/authorization failure, needs user intervention.
NETWORK
class-attribute
instance-attribute
¶
Retriable with backoff - network connectivity issues.
SIGNAL
class-attribute
instance-attribute
¶
Process killed by signal - may be retriable depending on signal.
CONFIGURATION
class-attribute
instance-attribute
¶
Non-retriable - configuration error needs user intervention (e.g., MCP setup).
PREFLIGHT
class-attribute
instance-attribute
¶
Pre-execution check failure — config or environment not ready.
ESCALATION
class-attribute
instance-attribute
¶
Escalation-based abort — grounding or escalation policy triggered.
ErrorClassifier
¶
Classifies errors based on patterns and exit codes.
Pattern matching follows the approach from run-sheet-review.sh, which checks output for rate limit indicators.
Initialize classifier with detection patterns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rate_limit_patterns | list[str] \| None | Regex patterns indicating rate limiting | None |
| auth_patterns | list[str] \| None | Regex patterns indicating auth failures | None |
| network_patterns | list[str] \| None | Regex patterns indicating network issues | None |
Source code in src/marianne/core/errors/classifier.py
Functions¶
parse_reset_time
¶
Parse reset time from message and return seconds until reset.
Supports patterns like:
- "resets at 9pm" -> seconds until 9pm (or next day if past)
- "resets at 21:00" -> seconds until 21:00
- "resets in 3 hours" -> 3 * 3600 seconds
- "resets in 30 minutes" -> 30 * 60 seconds
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Error message that may contain reset time info. | required |
Returns:
| Type | Description |
|---|---|
| float \| None | Seconds until reset, or None if no reset time found. The result is never less than RESET_TIME_MINIMUM_WAIT_SECONDS, to avoid immediate retries. |
Source code in src/marianne/core/errors/classifier.py
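The four documented phrasings can be handled with two regular expressions, as in this sketch. It is not the classifier's code: `parse_reset_seconds`, the injected `now` argument (used to make the example deterministic), and the 60-second floor are assumptions.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

RESET_TIME_MINIMUM_WAIT_SECONDS = 60.0  # assumed value for illustration


def parse_reset_seconds(text: str, now: datetime) -> Optional[float]:
    lowered = text.lower()

    # Relative form: "resets in 3 hours", "resets in 30 minutes"
    m = re.search(r"resets in (\d+)\s*(hour|minute)s?", lowered)
    if m:
        unit = 3600 if m.group(2) == "hour" else 60
        return max(float(int(m.group(1)) * unit), RESET_TIME_MINIMUM_WAIT_SECONDS)

    # Clock form: "resets at 9pm", "resets at 21:00"
    m = re.search(r"resets at (\d{1,2})(?::(\d{2}))?\s*(am|pm)?", lowered)
    if m:
        hour, minute = int(m.group(1)), int(m.group(2) or 0)
        if m.group(3) == "pm" and hour < 12:
            hour += 12
        elif m.group(3) == "am" and hour == 12:
            hour = 0
        if hour > 23 or minute > 59:
            return None
        target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if target <= now:
            target += timedelta(days=1)  # already past today, so use tomorrow
        return max((target - now).total_seconds(), RESET_TIME_MINIMUM_WAIT_SECONDS)

    return None


evening = datetime(2024, 1, 1, 20, 0, 0)
print(parse_reset_seconds("Limit reached, resets at 9pm", evening))
print(parse_reset_seconds("resets in 3 hours", evening))
```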
extract_rate_limit_wait
¶
Extract wait duration from rate limit error text.
Supports common patterns from Anthropic, Claude Code, and generic APIs:
- "retry after N seconds/minutes/hours"
- "try again in N seconds/minutes/hours"
- "wait N seconds/minutes/hours"
- "Retry-After: N" (header value)
- "resets in N hours/minutes" (delegates to parse_reset_time)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Error message or combined stdout/stderr. | required |
Returns:
| Type | Description |
|---|---|
| float \| None | Seconds to wait, clamped to [MIN, MAX], or None if no pattern matches. |
Source code in src/marianne/core/errors/classifier.py
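A simplified version of the extraction and clamping might look like this. The bounds `MIN_WAIT_SECONDS` and `MAX_WAIT_SECONDS` and the exact expressions are assumptions, and the "resets in" delegation to parse_reset_time is omitted.

```python
import re
from typing import Optional

MIN_WAIT_SECONDS = 1.0     # assumed lower clamp
MAX_WAIT_SECONDS = 3600.0  # assumed upper clamp

_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}
_PHRASE = re.compile(
    r"(?:retry after|try again in|wait)\s+(\d+(?:\.\d+)?)\s*(second|minute|hour)s?",
    re.IGNORECASE,
)
_HEADER = re.compile(r"retry-after:\s*(\d+)", re.IGNORECASE)


def extract_wait(text: str) -> Optional[float]:
    """Return a clamped wait in seconds, or None if no known phrasing matches."""
    m = _PHRASE.search(text)
    if m:
        seconds = float(m.group(1)) * _UNIT_SECONDS[m.group(2).lower()]
    else:
        m = _HEADER.search(text)
        if not m:
            return None
        seconds = float(m.group(1))  # a numeric Retry-After value is in seconds
    return min(max(seconds, MIN_WAIT_SECONDS), MAX_WAIT_SECONDS)


print(extract_wait("429 Too Many Requests. Please retry after 30 seconds."))
print(extract_wait("wait 5 hours"))  # exceeds the assumed maximum, so it is clamped
```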
classify
¶
classify(stdout='', stderr='', exit_code=None, exit_signal=None, exit_reason=None, exception=None, output_format=None)
Classify an error based on output, exit code, and signal.
Delegates to sub-classifiers in priority order:
1. Signal-based exits (_classify_signal)
2. Timeout exit reason
3. Pattern-matching on output (_classify_by_pattern)
4. Exit code analysis (_classify_by_exit_code)
5. Unknown fallback
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stdout | str | Standard output from the command | '' |
| stderr | str | Standard error from the command | '' |
| exit_code | int \| None | Process exit code (0 = success), None if killed by signal | None |
| exit_signal | int \| None | Signal number if killed by signal | None |
| exit_reason | ExitReason \| None | Why execution ended (completed, timeout, killed, error) | None |
| exception | Exception \| None | Optional exception that was raised | None |
| output_format | str \| None | Backend output format ("text", "json", "stream-json"). When "text", exit code 1 is classified as E209 (validation) instead of E009 (unknown). | None |
Returns:
| Type | Description |
|---|---|
| ClassifiedError | ClassifiedError with category, error_code, and metadata |
Source code in src/marianne/core/errors/classifier.py
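The priority order means an earlier check wins even when a later one would also match. The sketch below shows that short-circuit structure only. `classify_sketch`, its string return labels, and the single rate-limit pattern are stand-ins, not the real sub-classifiers or their results.

```python
import re
from typing import Optional

RATE_LIMIT_RE = re.compile(r"rate.?limit|429|quota", re.IGNORECASE)  # illustrative pattern


def classify_sketch(
    output: str = "",
    exit_code: Optional[int] = None,
    exit_signal: Optional[int] = None,
    exit_reason: Optional[str] = None,
) -> str:
    """Return a label for the first rule that matches, in documented priority order."""
    if exit_signal is not None:        # 1. signal-based exits
        return "signal"
    if exit_reason == "timeout":       # 2. timeout exit reason
        return "timeout"
    if RATE_LIMIT_RE.search(output):   # 3. pattern matching on output
        return "rate_limit"
    if exit_code not in (None, 0):     # 4. exit code analysis
        return "exit_code"
    return "unknown"                   # 5. fallback


# A killed process is classified by its signal even if the output mentions a rate limit.
print(classify_sketch(output="rate limit exceeded", exit_signal=9))
```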
classify_execution
¶
classify_execution(stdout='', stderr='', exit_code=None, exit_signal=None, exit_reason=None, exception=None, output_format=None, *, input=None)
Classify execution errors using structured JSON parsing with fallback.
This is the new multi-error classification method that:
1. Parses structured JSON errors[] from CLI output (if present)
2. Classifies each error independently (no short-circuiting)
3. Analyzes exit code and signal for additional context
4. Selects root cause using priority-based scoring
5. Returns all errors with primary/secondary designation
This method returns ClassificationResult which provides access to
all detected errors while maintaining backward compatibility through
the primary attribute.
Supports two calling conventions:
- Keyword args (legacy): classify_execution(stdout=..., stderr=..., ...)
- Bundled (preferred): classify_execution(input=ClassificationInput(...))
When input is supplied, its fields take precedence over individual keyword arguments.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stdout | str | Standard output from the command (may contain JSON). | '' |
| stderr | str | Standard error from the command. | '' |
| exit_code | int \| None | Process exit code (0 = success), None if killed by signal. | None |
| exit_signal | int \| None | Signal number if killed by signal. | None |
| exit_reason | ExitReason \| None | Why execution ended (completed, timeout, killed, error). | None |
| exception | Exception \| None | Optional exception that was raised. | None |
| output_format | str \| None | Expected output format (e.g. "json"). | None |
| input | ClassificationInput \| None | Bundled classification input (preferred over individual kwargs). | None |
Returns:
| Type | Description |
|---|---|
| ClassificationResult | ClassificationResult with primary error, secondary errors, and metadata. |
Example
    result = classifier.classify_execution(stdout, stderr, exit_code)

    # Access primary (root cause) error
    if result.primary.category == ErrorCategory.RATE_LIMIT:
        wait_time = result.primary.suggested_wait_seconds

    # Access all errors for debugging
    for error in result.all_errors:
        logger.info(f"{error.error_code.value}: {error.message}")
Source code in src/marianne/core/errors/classifier.py
from_config
classmethod
¶
ErrorCode
¶
Bases: str, Enum
Structured error codes for comprehensive error classification.
Error codes are organized by category using numeric prefixes:
- E0xx: Execution errors (timeouts, crashes, kills)
- E1xx: Rate limit / capacity errors
- E2xx: Validation errors
- E3xx: Configuration errors
- E4xx: State errors
- E5xx: Backend errors
- E6xx: Preflight errors

Error codes are stable identifiers that can be used for:
- Programmatic error handling and routing
- Log aggregation and alerting
- Documentation and troubleshooting guides
- Metrics and observability dashboards
Attributes¶
EXECUTION_TIMEOUT
class-attribute
instance-attribute
¶
Command execution exceeded timeout limit.
EXECUTION_KILLED
class-attribute
instance-attribute
¶
Process was killed by a signal (external termination).
EXECUTION_CRASHED
class-attribute
instance-attribute
¶
Process crashed (segfault, bus error, abort, etc.).
EXECUTION_INTERRUPTED
class-attribute
instance-attribute
¶
Process was interrupted by user (SIGINT/Ctrl+C).
EXECUTION_OOM
class-attribute
instance-attribute
¶
Process was killed due to out of memory condition.
EXECUTION_STALE
class-attribute
instance-attribute
¶
Execution killed by stale detection — no output for too long.
EXECUTION_UNKNOWN
class-attribute
instance-attribute
¶
Unknown execution error with non-zero exit code.
RATE_LIMIT_API
class-attribute
instance-attribute
¶
API rate limit exceeded (429, quota, throttling).
RATE_LIMIT_CLI
class-attribute
instance-attribute
¶
CLI-level rate limiting detected.
CAPACITY_EXCEEDED
class-attribute
instance-attribute
¶
Service capacity exceeded (overloaded, try again later).
QUOTA_EXHAUSTED
class-attribute
instance-attribute
¶
Token/usage quota exhausted - wait until reset time.
VALIDATION_FILE_MISSING
class-attribute
instance-attribute
¶
Expected output file does not exist.
VALIDATION_CONTENT_MISMATCH
class-attribute
instance-attribute
¶
Output content does not match expected pattern.
VALIDATION_COMMAND_FAILED
class-attribute
instance-attribute
¶
Validation command returned non-zero exit code.
VALIDATION_TIMEOUT
class-attribute
instance-attribute
¶
Validation check timed out.
VALIDATION_GENERIC
class-attribute
instance-attribute
¶
Generic validation failure (output validation needed).
CONFIG_INVALID
class-attribute
instance-attribute
¶
Configuration file is malformed or invalid.
CONFIG_MISSING_FIELD
class-attribute
instance-attribute
¶
Required configuration field is missing.
CONFIG_PATH_NOT_FOUND
class-attribute
instance-attribute
¶
Configuration file path does not exist.
CONFIG_PARSE_ERROR
class-attribute
instance-attribute
¶
Failed to parse configuration file (YAML/JSON syntax error).
CONFIG_MCP_ERROR
class-attribute
instance-attribute
¶
MCP server/plugin configuration error (missing env vars, invalid config).
CONFIG_CLI_MODE_ERROR
class-attribute
instance-attribute
¶
Claude CLI mode mismatch (e.g., streaming mode incompatible with operation).
STATE_CORRUPTION
class-attribute
instance-attribute
¶
Checkpoint state file is corrupted or inconsistent.
STATE_LOAD_FAILED
class-attribute
instance-attribute
¶
Failed to load checkpoint state from storage.
STATE_SAVE_FAILED
class-attribute
instance-attribute
¶
Failed to save checkpoint state to storage.
STATE_VERSION_MISMATCH
class-attribute
instance-attribute
¶
Checkpoint state version is incompatible.
BACKEND_CONNECTION
class-attribute
instance-attribute
¶
Failed to connect to backend service.
BACKEND_AUTH
class-attribute
instance-attribute
¶
Backend authentication or authorization failed.
BACKEND_RESPONSE
class-attribute
instance-attribute
¶
Invalid or unexpected response from backend.
BACKEND_TIMEOUT
class-attribute
instance-attribute
¶
Backend request timed out.
BACKEND_NOT_FOUND
class-attribute
instance-attribute
¶
Backend executable or service not found.
PREFLIGHT_PATH_MISSING
class-attribute
instance-attribute
¶
Required path does not exist (working_dir, referenced file).
PREFLIGHT_PROMPT_TOO_LARGE
class-attribute
instance-attribute
¶
Prompt exceeds recommended token limit.
PREFLIGHT_WORKING_DIR_INVALID
class-attribute
instance-attribute
¶
Working directory is not accessible or not a directory.
PREFLIGHT_VALIDATION_SETUP
class-attribute
instance-attribute
¶
Validation target path or pattern is invalid.
NETWORK_CONNECTION_FAILED
class-attribute
instance-attribute
¶
Network connection failed (refused, reset, unreachable).
NETWORK_DNS_ERROR
class-attribute
instance-attribute
¶
DNS resolution failed.
NETWORK_SSL_ERROR
class-attribute
instance-attribute
¶
SSL/TLS handshake or certificate error.
NETWORK_TIMEOUT
class-attribute
instance-attribute
¶
Network operation timed out.
UNKNOWN
class-attribute
instance-attribute
¶
Unclassified error - requires investigation.
category
property
¶
Get the category prefix (first digit) of this error code.
Returns:
| Type | Description |
|---|---|
| str | Category string like "execution", "rate_limit", "validation", etc. |
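The prefix scheme lends itself to a lookup on the digit after the leading "E". In this sketch only "execution", "rate_limit", and "validation" are category strings named on this page. The remaining strings, the `"unknown"` fallback, and the `category_for` helper are assumptions.

```python
# First digit after "E" -> category name. Entries 3-6 use assumed names.
_PREFIX_CATEGORY = {
    "0": "execution",
    "1": "rate_limit",
    "2": "validation",
    "3": "configuration",
    "4": "state",
    "5": "backend",
    "6": "preflight",
}


def category_for(code: str) -> str:
    """Map a code such as "E209" to its category via the hundreds digit."""
    return _PREFIX_CATEGORY.get(code[1:2], "unknown")


for code in ("E001", "E006", "E209"):
    print(code, category_for(code))
```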
is_retriable
property
¶
Check if this error code is generally retriable.
Returns:
| Type | Description |
|---|---|
| bool | True if errors with this code are typically retriable. |
Functions¶
get_retry_behavior
¶
Get precise retry behavior for this error code.
Returns error-code-specific delay and retry recommendations. Uses module-level _RETRY_BEHAVIORS constant to avoid rebuilding the lookup table on every call.
Returns:
| Type | Description |
|---|---|
| RetryBehavior | RetryBehavior with delay, retriability, and reason. |
Source code in src/marianne/core/errors/codes.py
get_severity
¶
Get the severity level for this error code.
Severity assignments:
- CRITICAL: Fatal errors requiring immediate attention
- ERROR: Most error codes (default)
- WARNING: Degraded but potentially temporary conditions
- INFO: Reserved for future diagnostic codes
Returns:
| Type | Description |
|---|---|
| Severity | Severity level for this error code. |
Source code in src/marianne/core/errors/codes.py
FatalError
¶
Bases: Exception
Non-recoverable error that should stop the job.
GracefulShutdownError
¶
Bases: Exception
Raised when Ctrl+C is pressed to trigger graceful shutdown.
This exception is caught by the runner to save state before exiting.
RateLimitExhaustedError
¶
Bases: FatalError
Rate limit or quota exhaustion — job should PAUSE, not FAIL.
Subclasses FatalError for backward compatibility: existing
except FatalError blocks still catch it, but more specific
except RateLimitExhaustedError blocks intercept first when
ordered before except FatalError.
Attributes:
| Name | Type | Description |
|---|---|---|
| resume_after | | When the rate limit resets (ISO datetime), or None. |
| backend_type | | Which backend hit the limit (e.g., "claude-cli"). |
| quota_exhaustion | | True if daily/monthly quota is exhausted, False if it's a per-minute rate limit. |
Source code in src/marianne/core/errors/exceptions.py
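The handler-ordering point can be shown with stand-in classes that mirror the documented hierarchy. The two classes below are local copies for illustration rather than imports from the package, and `handle` and `legacy_handle` are hypothetical.

```python
class FatalError(Exception):
    """Stand-in for the documented FatalError."""


class RateLimitExhaustedError(FatalError):
    """Stand-in for the documented subclass."""


def handle(exc: Exception) -> str:
    try:
        raise exc
    except RateLimitExhaustedError:
        return "pause"  # the more specific handler must come first
    except FatalError:
        return "fail"


def legacy_handle(exc: Exception) -> str:
    try:
        raise exc
    except FatalError:  # older code still catches the subclass here
        return "fail"


print(handle(RateLimitExhaustedError("quota exhausted")))
print(handle(FatalError("unrecoverable")))
print(legacy_handle(RateLimitExhaustedError("quota exhausted")))
```

If the `except FatalError` clause were listed first in `handle`, it would also catch the subclass and the pause branch would never run.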
JobCompletionSummary
¶
Bases: BaseModel
Summary of a completed job run.
Pydantic v2 model tracking key metrics for display at job completion:
- Sheet success/failure/skip counts
- Validation pass rate
- Cost tracking
- Duration and retry statistics
- Hook execution results
Attributes¶
success_rate
property
¶
Calculate sheet success rate as percentage.
Skipped sheets are excluded from the denominator since they were never attempted (e.g., skip_when_command conditions met).
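As a worked example of the denominator rule, the sketch below computes the rate from three counts. The function and parameter names are hypothetical, and returning 0.0 when nothing was attempted is an assumption.

```python
def success_rate(total_sheets: int, completed_sheets: int, skipped_sheets: int) -> float:
    """Percentage of attempted sheets that completed; skipped sheets are not counted."""
    attempted = total_sheets - skipped_sheets
    if attempted <= 0:
        return 0.0  # assumed result when every sheet was skipped
    return 100.0 * completed_sheets / attempted


# 10 sheets, 2 skipped, 6 completed: 6 of the 8 attempted sheets succeeded.
print(success_rate(total_sheets=10, completed_sheets=6, skipped_sheets=2))
```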
Functions¶
from_checkpoint
classmethod
¶
Construct a summary from checkpoint state.
Computes rates from sheet states, sums costs and durations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| checkpoint | CheckpointState | The checkpoint state to summarize. | required |
Returns:
| Type | Description |
|---|---|
| JobCompletionSummary | JobCompletionSummary with computed metrics. |
Source code in src/marianne/core/models.py
to_dict
¶
Convert summary to dictionary for JSON output.
Source code in src/marianne/core/models.py
GroundingDecisionContext
dataclass
¶
GroundingDecisionContext(passed, message, confidence=1.0, should_escalate=False, recovery_guidance=None, hooks_executed=0)
Context from grounding hooks for completion mode decisions.
Encapsulates grounding results to inform decision-making about whether to retry, complete, or escalate.
Functions¶
from_results
classmethod
¶
Build context from grounding results list.
Source code in src/marianne/core/summary.py
disabled
classmethod
¶
SheetExecutionMode
¶
Bases: str, Enum
Mode of sheet execution.