Opus Convergence Analysis
Internal Reference — This is a meta-analysis document, not user documentation.
Date: 2026-01-16
Analysis: Patterns to converge between Marianne Opus (v20, 20 cycles) and RLF Opus (v15, 15 cycles)
Executive Summary
Two independent self-improving opus scores have evolved in parallel:
- Marianne Opus (v20): Orchestration framework evolution, 9-movement structure
- RLF Opus (v15): HTTP API wiring evolution, 5-sheet structure
Both have independently discovered similar patterns, which is evidence that the patterns are useful. This document identifies the patterns worth converging so that each score can adopt what the other has already proven.
Patterns FROM RLF → Marianne
1. Proactive Discovery Mode ⭐ HIGH VALUE
RLF Pattern:
proactive_mode:
  evaluation_cycle_threshold: 3  # After 3 consecutive eval cycles
  graduated_model:
    cycle_3: "LIGHT PROACTIVE"  # ONE small improvement
    cycle_4_plus: "FULL PROACTIVE"  # Multiple improvements
  light_proactive_options:
    A: "Error Message Review - audit 3-5 error paths"
    B: "Performance Spot Check - profile ONE endpoint"
    C: "Test Gap Assessment - identify ONE untested edge case"
    D: "Documentation Audit - review ONE public API doc"
Why Marianne needs this:
- Marianne has no mechanism to prevent idle cycles
- When no high-CV candidates emerge, Marianne should still improve
- Proactive discovery provides bounded, safe improvement work
Integration approach:
- Add to Sheet 1 (External Discovery) phase
- Check for consecutive evaluation cycles in header
- If threshold reached, mandate ONE proactive task from options
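As a rough illustration of the graduated model above, the following Python sketch maps a count of consecutive evaluation cycles to a proactive mode. The function name `proactive_mode` and the way the counter is passed in are assumptions made for illustration; neither opus score defines such a function, and a real check would read the counter from the opus header.

```python
from typing import Optional

# Threshold and mode names are taken from the RLF config above.
EVALUATION_CYCLE_THRESHOLD = 3

LIGHT_PROACTIVE_OPTIONS = {
    "A": "Error Message Review - audit 3-5 error paths",
    "B": "Performance Spot Check - profile ONE endpoint",
    "C": "Test Gap Assessment - identify ONE untested edge case",
    "D": "Documentation Audit - review ONE public API doc",
}


def proactive_mode(consecutive_eval_cycles: int) -> Optional[str]:
    """Hypothetical helper: return the proactive mode, or None below the threshold."""
    if consecutive_eval_cycles < EVALUATION_CYCLE_THRESHOLD:
        return None
    if consecutive_eval_cycles == EVALUATION_CYCLE_THRESHOLD:
        return "LIGHT PROACTIVE"  # mandate ONE task from LIGHT_PROACTIVE_OPTIONS
    return "FULL PROACTIVE"  # multiple improvements allowed
```

When the result is "LIGHT PROACTIVE", exactly one entry from `LIGHT_PROACTIVE_OPTIONS` would be selected for the cycle.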
2. Variance Tier Model ⭐ HIGH VALUE
RLF Pattern:
variance_tiers:
  TIGHT:
    expected_variance: "5-50%"
    task_types: ["pure delegation", "simple wiring"]
  TIGHT_MODERATE:
    expected_variance: "10-50%"
    task_types: ["light transformation", "DTO reuse"]
  MODERATE:
    expected_variance: "25-100%"
    task_types: ["manual parsing", "new DTOs"]
  LOOSE:
    expected_variance: "50-200%"
    task_types: ["embedded logic", "auth/branching"]
Why Marianne needs this:
- Marianne tracks LOC accuracy but doesn't set expectations by task type
- A single accuracy target creates unrealistic pressure for high accuracy on complex tasks
- A tier-based approach acknowledges irreducible variance
Integration approach:
- Add variance tier classification to Sheet 5 (Specification)
- Use the tier to set pass/fail criteria in Sheet 7 (Validation)
- Track variance within tier as the success metric (not raw accuracy)
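One way the tier-based pass/fail check could look is sketched below in Python. The tier bounds and task types come from the RLF config above; everything else is an assumption: variance is taken to mean the absolute percentage deviation of actual LOC from estimated LOC, and a task is taken to pass when that deviation does not exceed the upper bound of its tier. The function names are hypothetical.

```python
# (lower %, upper %) per tier, copied from the RLF variance_tiers config.
VARIANCE_TIERS = {
    "TIGHT": (5, 50),
    "TIGHT_MODERATE": (10, 50),
    "MODERATE": (25, 100),
    "LOOSE": (50, 200),
}

TASK_TYPE_TO_TIER = {
    "pure delegation": "TIGHT",
    "simple wiring": "TIGHT",
    "light transformation": "TIGHT_MODERATE",
    "DTO reuse": "TIGHT_MODERATE",
    "manual parsing": "MODERATE",
    "new DTOs": "MODERATE",
    "embedded logic": "LOOSE",
    "auth/branching": "LOOSE",
}


def variance_pct(estimated_loc: int, actual_loc: int) -> float:
    """Assumed definition: absolute deviation as a percentage of the estimate."""
    return abs(actual_loc - estimated_loc) / estimated_loc * 100


def passes_tier(task_type: str, estimated_loc: int, actual_loc: int) -> bool:
    """Assumed criterion: pass when variance is within the tier's upper bound."""
    tier = TASK_TYPE_TO_TIER[task_type]
    _, upper = VARIANCE_TIERS[tier]
    return variance_pct(estimated_loc, actual_loc) <= upper
```

Under these assumptions the same 150% miss would fail a "simple wiring" task but pass an "auth/branching" task, which is the behavior the tier model is meant to capture.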
3. Task Type CV Modifiers ⭐ MEDIUM VALUE
RLF Pattern:
task_type_cv_modifiers:
  wiring: +0.05  # Infrastructure exists
  bug_fix: +0.05  # Regression prevention
  refactoring: +0.03  # No new features
  documentation: +0.10  # No code changes
  proactive_discovery: +0.03  # Polish work
  true_integration: -0.05  # Cascading changes
Why Marianne needs this:
- Marianne calculates CV without task type adjustment
- Documentation improvements have artificially low CV
- Maintenance work gets undervalued relative to features
Integration approach:
- Add to CV calculation in Sheet 3 (TDF Synthesis)
- Apply AFTER domain/boundary average calculation
- Document modifier applied in synthesis output
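The integration steps above can be sketched in Python as follows. The modifier values are the RLF ones listed earlier; the function `apply_cv_modifier`, its return shape, and the choice to give unknown task types a modifier of 0.0 are assumptions. The base CV is taken as an input because the domain/boundary averaging is not specified in this document, and no clamping is applied because the valid CV range is not stated here.

```python
TASK_TYPE_CV_MODIFIERS = {
    "wiring": 0.05,
    "bug_fix": 0.05,
    "refactoring": 0.03,
    "documentation": 0.10,
    "proactive_discovery": 0.03,
    "true_integration": -0.05,
}


def apply_cv_modifier(base_cv: float, task_type: str) -> dict:
    """Hypothetical helper: adjust an already-averaged CV by task type.

    base_cv is assumed to be the result of the domain/boundary average.
    The returned record keeps the modifier so it can be documented in
    the synthesis output.
    """
    modifier = TASK_TYPE_CV_MODIFIERS.get(task_type, 0.0)  # assumption: unknown types are unmodified
    return {
        "task_type": task_type,
        "base_cv": base_cv,
        "modifier": modifier,
        "adjusted_cv": base_cv + modifier,
    }
```

Returning the modifier alongside the adjusted value keeps the adjustment auditable rather than folding it silently into the score.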
4. Task Type LOC Multipliers ⭐ MEDIUM VALUE
RLF Pattern:
task_type_loc_multipliers:
  wiring:
    impl: "× 0.8"  # Infrastructure exists
    test: "× 1.0"
  proactive_discovery:
    impl: "× 0.8"  # Polish work is small
    test: "× 0.5"  # May not need extensive tests
  green_field:
    impl: "× 1.35"
    test: "× 1.5"
  true_integration:
    impl: "× 1.5"
    test: "× 2.0"
Why Marianne needs this:
- Marianne applies multipliers inconsistently
- Proactive/maintenance work is systematically overestimated
- Clear multipliers enable accurate planning
Integration approach:
- Add to Sheet 5 (Specification) LOC formulas
- Apply based on task classification from Sheet 3
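A minimal Python sketch of applying these multipliers is shown below. The multiplier table mirrors the RLF config; the function `estimate_loc`, the notion of a "base" estimate, and rounding to whole lines are assumptions made for illustration.

```python
TASK_TYPE_LOC_MULTIPLIERS = {
    "wiring": {"impl": 0.8, "test": 1.0},
    "proactive_discovery": {"impl": 0.8, "test": 0.5},
    "green_field": {"impl": 1.35, "test": 1.5},
    "true_integration": {"impl": 1.5, "test": 2.0},
}


def estimate_loc(task_type: str, base_impl_loc: int, base_test_loc: int) -> dict:
    """Hypothetical helper: scale base LOC estimates by task type.

    Raises KeyError for a task type without a multiplier entry, so a
    missing classification is surfaced rather than silently ignored.
    """
    multipliers = TASK_TYPE_LOC_MULTIPLIERS[task_type]
    return {
        "impl": round(base_impl_loc * multipliers["impl"]),
        "test": round(base_test_loc * multipliers["test"]),
    }
```

For example, `estimate_loc("true_integration", 100, 50)` scales the implementation estimate by 1.5 and the test estimate by 2.0.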
5. Stable Deferral Threshold ⭐ MEDIUM VALUE
RLF Pattern:
stable_deferral:
  threshold_cycles: 10  # After 10 cycles deferred
  behavior: "STOP RE-EVALUATING"
  rationale: |
    If something has been deferred 10+ cycles without user friction,
    it's not actually needed. Stop wasting discovery cycles on it.
  reactivation_triggers:
    - explicit_user_request
    - new_user_friction_evidence
    - product_decision_made
Why Marianne needs this:
- Marianne re-evaluates deferred candidates every cycle
- Wastes discovery effort on perpetually low-priority items
- Stable deferrals should be marked as CLOSED
Integration approach:
- Add deferral cycle counter to research candidates
- At 10+ cycles, mark as STABLE_DEFERRAL
- Only re-evaluate if explicit trigger
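The counter and reactivation logic described above might look like the following Python sketch. The threshold and trigger names come from the RLF config; the `DeferredCandidate` class, the `DEFERRED` status label, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

STABLE_DEFERRAL_THRESHOLD = 10
REACTIVATION_TRIGGERS = {
    "explicit_user_request",
    "new_user_friction_evidence",
    "product_decision_made",
}


@dataclass
class DeferredCandidate:
    """Hypothetical record for a research candidate that keeps being deferred."""
    name: str
    deferred_cycles: int = 0
    status: str = "DEFERRED"  # assumed label for an ordinary deferral


def defer(candidate: DeferredCandidate) -> None:
    """Record one more deferred cycle and mark the candidate stable at the threshold."""
    candidate.deferred_cycles += 1
    if candidate.deferred_cycles >= STABLE_DEFERRAL_THRESHOLD:
        candidate.status = "STABLE_DEFERRAL"


def should_reevaluate(candidate: DeferredCandidate, triggers: Iterable[str] = ()) -> bool:
    """Stable deferrals are re-evaluated only when a known trigger is present."""
    if candidate.status != "STABLE_DEFERRAL":
        return True
    return bool(REACTIVATION_TRIGGERS & set(triggers))
```

Discovery would call `should_reevaluate` before spending effort on a candidate, passing any triggers observed in the current cycle.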
6. Test Coverage Multiplier for Enum Complexity ⭐ MEDIUM VALUE
RLF Pattern:
test_coverage_multiplier:
  simple_types: 1.0  # Single struct request/response
  single_enum: 1.5  # One enum type in request/response
  multi_variant_enum: 2.5  # Enum with 4+ variants - each needs test
  nested_enum: 3.0  # Recursive/nested enum variants
Why Marianne needs this:
- Marianne's test LOC formulas miss enum complexity
- Multi-variant types need per-variant test coverage
- Test LOC is systematically underestimated when enums are involved
Integration approach:
- Add enum detection to Sheet 5 (Specification)
- Apply multiplier to test LOC calculation
- Document detected enum complexity
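A possible shape for the enum-aware test estimate is sketched below in Python. The multipliers are the RLF values; the classification inputs (`enum_count`, `max_variants`, `has_nested`) and the precedence order (nested, then 4+ variants, then any enum) are assumptions, since the source does not say how the categories are detected or how overlaps are resolved.

```python
from typing import Tuple

TEST_COVERAGE_MULTIPLIER = {
    "simple_types": 1.0,
    "single_enum": 1.5,
    "multi_variant_enum": 2.5,
    "nested_enum": 3.0,
}


def classify_enum_complexity(enum_count: int, max_variants: int, has_nested: bool) -> str:
    """Hypothetical classifier; the most complex matching category wins."""
    if enum_count > 0 and has_nested:
        return "nested_enum"
    if enum_count > 0 and max_variants >= 4:
        return "multi_variant_enum"
    if enum_count > 0:
        return "single_enum"
    return "simple_types"


def estimate_test_loc(
    base_test_loc: int, enum_count: int, max_variants: int, has_nested: bool
) -> Tuple[int, str]:
    """Return the scaled test LOC plus the detected complexity for documentation."""
    complexity = classify_enum_complexity(enum_count, max_variants, has_nested)
    scaled = round(base_test_loc * TEST_COVERAGE_MULTIPLIER[complexity])
    return scaled, complexity
```

Returning the detected category alongside the number supports the third integration step, documenting the enum complexity that drove the estimate.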
7. LOC Accuracy Trend Tracking ⭐ LOW VALUE (already partial)
RLF Pattern:
# Tracked in opus header
LOC_ACCURACY_TREND:
  v4: "23% variance (baseline)"
  v5: "52% variance (handler type + patterns) <- BIG WIN"
  v6: "52% variance (plateau)"
  v7: "37.8% variance (REGRESSION - soft LOC noise)"
  v8: "6.6% variance (2-step model validated)"
  # ...
Why Marianne might benefit:
- Visualizes LOC model evolution over time
- Identifies regression points in estimation accuracy
- Marianne tracks accuracy but doesn't plot trend
Integration approach:
- Add trend section to opus header
- Update each cycle with variance achieved
- Flag regressions for formula adjustment
Patterns FROM Marianne → RLF
1. Pattern Trust Scoring ⭐ HIGH VALUE
Marianne Pattern:
pattern_trust:
  formula: "base + quarantine_penalty + validation_bonus + age_factor + effectiveness_modifier"
  range: [0, 1]
  relevance_adjustments:
    quarantined: "-0.3 score penalty"
    high_trust: "+0.1 to +0.2 bonus"
    validated: "+0.05 bonus"
Why RLF could benefit:
- RLF doesn't track pattern effectiveness over time
- Patterns discovered in early cycles may become stale
- Trust scoring enables selective pattern application
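The trust formula above translates fairly directly into code. In this Python sketch the sum of components follows the formula string from the Marianne config, and the result is clamped to the stated [0, 1] range. Clamping is an assumption about how the range is enforced, and the function name `trust_score` and its keyword defaults are hypothetical.

```python
def trust_score(
    base: float,
    quarantine_penalty: float = 0.0,
    validation_bonus: float = 0.0,
    age_factor: float = 0.0,
    effectiveness_modifier: float = 0.0,
) -> float:
    """Sum the trust components and clamp the result to [0, 1].

    Penalties are passed as negative numbers, matching the
    "base + quarantine_penalty + ..." form of the formula.
    """
    raw = (
        base
        + quarantine_penalty
        + validation_bonus
        + age_factor
        + effectiveness_modifier
    )
    return min(1.0, max(0.0, raw))
```

A port to RLF could then skip or down-rank patterns whose score falls below a chosen cutoff; no such cutoff is defined in this document.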
2. Synergy Pair Validation ⭐ HIGH VALUE
Marianne Pattern:
synergy_pair:
  criteria: "Both candidates address same problem space"
  implementation_order: "Foundation first, dependent second"
  combined_cv: "Average of pair CVs"
  example:
    pair: ["Pattern Quarantine", "Pattern Trust"]
    shared_space: "Safe autonomous learning"
    order: "Quarantine first (provides status), Trust second (uses status)"
Why RLF could benefit:
- RLF implements single candidates per cycle
- Related features could be paired for coherent evolution
- Reduces context switching between cycles
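To make the pairing rules concrete, here is a small Python sketch that orders a pair foundation-first and averages the two CVs. The `PairCandidate` class, its `depends_on` field, and the CV values in the usage note are hypothetical; only the ordering rule, the averaging rule, and the Quarantine/Trust example come from the Marianne pattern above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PairCandidate:
    """Hypothetical candidate record; depends_on names the foundation, if any."""
    name: str
    cv: float
    depends_on: Optional[str] = None


def plan_synergy_pair(a: PairCandidate, b: PairCandidate) -> dict:
    """Order the pair foundation-first and compute the combined CV as the average."""
    if a.depends_on == b.name:
        first, second = b, a
    elif b.depends_on == a.name:
        first, second = a, b
    else:
        raise ValueError("neither candidate is the foundation of the other")
    return {
        "order": [first.name, second.name],
        "combined_cv": (a.cv + b.cv) / 2,
    }
```

With illustrative CVs of 0.80 for "Pattern Quarantine" and 0.70 for "Pattern Trust" (which depends on Quarantine), the plan puts Quarantine first regardless of argument order.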
3. TDF Quadruplet Analysis ⭐ MEDIUM VALUE
Marianne Pattern:
- Marianne uses full TDF (COMP, SCI, CULT, EXP, META)
- Boundary analysis between all domain pairs
- Recognition level assessment (P0-P5)
Why RLF could benefit:
- RLF uses simplified TDF without META domain
- No explicit recognition level tracking
- Full TDF could improve candidate selection
Shared Patterns (Independently Discovered)
Both opus scores arrived at the following patterns independently, which supports their importance:
- Coverage Gates (Marianne: ≥80% new code, RLF: >70% coverage)
- Code Review During Implementation (both track early catch ratio)
- Evaluation Cycles (both allow cycles without implementation)
- Multi-Tier LOC Estimation (both use multipliers for complexity)
- Fixture Catalog Tracking (both maintain test fixture inventory)
Integration Priority
| Pattern | From | To | Priority | Complexity | Value |
|---|---|---|---|---|---|
| Proactive Discovery Mode | RLF | Marianne | 1 | Medium | High |
| Variance Tier Model | RLF | Marianne | 2 | Low | High |
| Task Type CV Modifiers | RLF | Marianne | 3 | Low | Medium |
| Task Type LOC Multipliers | RLF | Marianne | 4 | Low | Medium |
| Stable Deferral Threshold | RLF | Marianne | 5 | Low | Medium |
| Test Coverage (Enum) | RLF | Marianne | 6 | Medium | Medium |
| Pattern Trust Scoring | Marianne | RLF | 7 | High | High |
| Synergy Pair Validation | Marianne | RLF | 8 | Medium | High |
| TDF Quadruplet | Marianne | RLF | 9 | High | Medium |
Recommended Convergence Strategy
Phase 1: Quick Wins (v21)
- Add Task Type CV Modifiers to Marianne
- Add Variance Tier Model to Marianne
- Add Stable Deferral Threshold to Marianne
Phase 2: Structural (v22)
- Add Proactive Discovery Mode to Marianne
- Add Task Type LOC Multipliers to Marianne
- Add Test Coverage (Enum) Multiplier to Marianne
Phase 3: Cross-Pollination (v23+)
- Port Pattern Trust Scoring to RLF
- Port Synergy Pair Validation to RLF
- Consider unified opus template
Next Steps
- Create marianne-opus-convergence.yaml score to implement convergence
- Create score-creation-skill.md documenting patterns from both
- Update example scores with lessons learned