tokens
Token estimation and budget tracking for prompt assembly.
Provides a centralized token estimation utility and a budget tracker that enforces context window limits during prompt construction. This is the single source of truth for token estimation — all other modules (preflight, backends) should import from here rather than maintaining their own ratios.
The estimation uses a conservative chars-per-token ratio (3.5) that deliberately overestimates token counts by ~15%. This is intentional: underestimation causes context window overflow (agent gets truncated mid-instruction), while overestimation merely wastes budget (safe).
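The size of that safety margin can be checked with a little arithmetic. The sketch below is illustrative only: the 4.0 chars-per-token baseline for typical English text and the ceiling rounding are assumptions, not values taken from this module.

```python
import math

CHARS_PER_TOKEN = 3.5        # ratio documented for this module
ASSUMED_TRUE_RATIO = 4.0     # assumption: rough chars/token for English text

text = "x" * 1400            # 1400 characters of placeholder content

# Conservative estimate, rounding up (rounding mode is an assumption).
estimated = math.ceil(len(text) / CHARS_PER_TOKEN)

# What the count would be if the text really averaged 4 chars per token.
assumed_actual = math.ceil(len(text) / ASSUMED_TRUE_RATIO)

# Relative overestimate: 400 / 350 - 1, roughly 14%.
overestimate = estimated / assumed_actual - 1
print(f"estimated={estimated}, assumed_actual={assumed_actual}, margin={overestimate:.1%}")
```

Under these assumptions the margin comes out near 14%, in line with the ~15% figure quoted above.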
Classes
BudgetAllocation (dataclass)
TokenBudgetTracker (dataclass)
Tracks token budget usage during prompt assembly.
Enforces context window limits by tracking allocations as prompt components
are added. Each allocation is named (e.g., 'template', 'patterns', 'specs')
for diagnostic visibility via breakdown().
The tracker prevents silent over-allocation: allocate() returns False
when content would exceed the remaining budget, and can_fit() checks
without side effects.
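The behavior described above could be sketched roughly as follows. This is a minimal illustration, not the code in src/marianne/core/tokens.py: the class name `SketchTracker`, the `budget` field, the internal allocation list, and exposing `remaining` and `utilization` as properties are all assumptions. The estimator is a simplified stand-in with ceiling rounding assumed.

```python
import math
from dataclasses import dataclass, field
from typing import Any

_CHARS_PER_TOKEN = 3.5  # ratio documented for this module


def _estimate(text: Any) -> int:
    """Simplified stand-in for estimate_tokens (ceil rounding is assumed)."""
    return 0 if text is None else math.ceil(len(str(text)) / _CHARS_PER_TOKEN)


@dataclass
class SketchTracker:
    """Hypothetical tracker mirroring the documented behavior."""

    budget: int  # hypothetical field name for the total token budget
    _allocations: list[tuple[str, int]] = field(default_factory=list)

    @property
    def used(self) -> int:
        return sum(tokens for _, tokens in self._allocations)

    @property
    def remaining(self) -> int:
        # Clamped so it never reports a negative value.
        return max(0, self.budget - self.used)

    @property
    def utilization(self) -> float:
        # Zero-budget trackers report 0.0 instead of dividing by zero.
        if self.budget <= 0:
            return 0.0
        return min(1.0, self.used / self.budget)

    def can_fit(self, text: Any) -> bool:
        # Pure check: no state is modified.
        return _estimate(text) <= self.remaining

    def allocate(self, text: Any, component: str) -> bool:
        if not self.can_fit(text):
            return False  # rejected, state left untouched
        self._allocations.append((component, _estimate(text)))
        return True

    def breakdown(self) -> dict[str, int]:
        # Repeated allocations under one component name are summed.
        totals: dict[str, int] = {}
        for component, tokens in self._allocations:
            totals[component] = totals.get(component, 0) + tokens
        return totals


tracker = SketchTracker(budget=100)
ok_template = tracker.allocate("a" * 70, "template")     # 20 tokens, accepted
ok_patterns = tracker.allocate("b" * 140, "patterns")    # 40 tokens, accepted
ok_specs = tracker.allocate("c" * 350, "specs")          # 100 tokens, rejected
ok_patterns_2 = tracker.allocate("d" * 35, "patterns")   # 10 tokens, accepted
print(tracker.breakdown(), tracker.remaining, tracker.utilization)
```

In this sketch the rejected `specs` allocation leaves no trace in the breakdown, and the two `patterns` allocations appear as a single summed entry.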
Attributes
Functions
__post_init__
remaining
Tokens remaining in the budget.
Returns:
| Type | Description |
|---|---|
| int | Non-negative remaining token count. Never goes below 0 even if over-allocation somehow occurred. |
utilization
Fraction of budget used (0.0 to 1.0).
Returns:
| Type | Description |
|---|---|
| float | Utilization ratio. Returns 0.0 for zero-budget trackers to avoid division by zero. |
Source code in src/marianne/core/tokens.py
can_fit
Check if content fits within the remaining budget.
Does not modify the tracker state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | Any | Content to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the estimated token count fits within remaining budget. |
allocate
Allocate tokens for a prompt component.
If the content fits within the remaining budget, the allocation is recorded and True is returned. If it does not fit, the allocation is rejected and False is returned — no state is modified.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | Any | Content to allocate budget for. | required |
| component | str | Name of the prompt component (for diagnostics). | required |
Returns:
| Type | Description |
|---|---|
| bool | True if allocation succeeded, False if it would exceed budget. |
breakdown
Get per-component token allocation breakdown.
Returns:
| Type | Description |
|---|---|
| dict[str, int] | Dict mapping component names to their allocated token counts. Components with multiple allocations are summed. |
Functions
estimate_tokens
Estimate token count for arbitrary input.
Converts the input to a string representation and applies a conservative chars-per-token ratio. The estimate deliberately overestimates to prevent context window overflow.
Accepted types: str, dict, list, None. Other types are
coerced via str().
.. warning:: CJK / Non-Latin Text Underestimation
The _CHARS_PER_TOKEN = 3.5 ratio is calibrated for English text.
CJK characters (Chinese, Japanese, Korean) typically tokenize to
approximately 1-2 tokens per character, meaning this function
underestimates CJK token counts by roughly 3.5-7x. For example, 600 CJK
characters produce ~172 estimated tokens but consume 600-1200 actual
tokens. This can cause context window overflow for non-English content.
Fix planned: InstrumentProfile.ModelCapacity will provide per-model tokenizers or script-aware estimation ratios.
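The numbers in the warning can be reproduced with a short calculation. This is a worked example only: the ceiling rounding is an assumption, and the 600-1200 range for actual tokens is taken from the warning text rather than measured with a real tokenizer.

```python
import math

CHARS_PER_TOKEN = 3.5                 # ratio documented for this module

cjk_text = "漢" * 600                 # 600 CJK characters
estimated = math.ceil(len(cjk_text) / CHARS_PER_TOKEN)

# Actual token range quoted in the warning (1 to 2 tokens per character).
low_actual, high_actual = 600, 1200

low_factor = low_actual / estimated    # underestimation factor at 1 token/char
high_factor = high_actual / estimated  # underestimation factor at 2 tokens/char
print(f"estimated={estimated}, factor={low_factor:.2f}x to {high_factor:.2f}x")
```

Under these assumptions the estimate is 172 tokens, so real usage could be between about 3.5 and 7 times higher than the tracker believes.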
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | Any | Input to estimate. Strings are measured directly. Dicts and lists are serialized to JSON. None returns 0. | required |
Returns:
| Type | Description |
|---|---|
| int | Estimated token count (always >= 0). |
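The type handling described for this function might look something like the sketch below. It is not the real implementation: the name `estimate_tokens_sketch`, the use of `json.dumps` with default settings plus `default=str`, and the ceiling rounding are assumptions made for illustration.

```python
import json
import math
from typing import Any

_CHARS_PER_TOKEN = 3.5  # ratio documented for this module


def estimate_tokens_sketch(text: Any) -> int:
    """Hypothetical re-creation of the documented estimation rules."""
    if text is None:
        return 0
    if isinstance(text, str):
        rendered = text                          # strings are measured directly
    elif isinstance(text, (dict, list)):
        # Assumption: default separators; default=str guards odd values.
        rendered = json.dumps(text, default=str)
    else:
        rendered = str(text)                     # other types are coerced via str()
    # Assumption: round up so partial tokens count as whole tokens.
    return math.ceil(len(rendered) / _CHARS_PER_TOKEN)


print(estimate_tokens_sketch("a" * 7))      # 7 chars
print(estimate_tokens_sketch({"a": 1}))     # serialized as '{"a": 1}', 8 chars
print(estimate_tokens_sketch(None))
```

Rounding up is in keeping with the conservative intent: a fractional token is charged as a full one.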
get_effective_window_size
Get the effective input token budget for a model/instrument combination.
Returns the context window size minus output token reservation. When both model and instrument are provided, returns the minimum of the two — the instrument may impose a stricter limit than the model's native window.
For unknown models/instruments, returns a conservative default.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | str \| None | Model name or identifier. None uses the default window. | None |
| instrument | str \| None | Instrument (backend) name. None imposes no instrument limit. Unknown instruments impose no additional limit. | None |
Returns:
| Type | Description |
|---|---|
| int | Effective input token budget (always > 0). |
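One way to picture the lookup logic is the sketch below. Everything in it is hypothetical: the model and instrument names, the window sizes, the default, the output reservation, and the choice to take the minimum before subtracting the reservation are assumptions, since the documentation above does not give these details.

```python
from typing import Optional

# All values below are placeholders invented for this sketch.
_DEFAULT_WINDOW = 128_000
_OUTPUT_RESERVE = 8_000
_MODEL_WINDOWS = {"example-large": 200_000, "example-small": 32_000}
_INSTRUMENT_LIMITS = {"example-cli": 100_000}


def effective_window_sketch(
    model: Optional[str] = None,
    instrument: Optional[str] = None,
) -> int:
    """Hypothetical illustration of the documented min-of-two behavior."""
    # Unknown or missing models fall back to the default window.
    window = _MODEL_WINDOWS.get(model, _DEFAULT_WINDOW) if model else _DEFAULT_WINDOW

    # Unknown or missing instruments impose no additional limit.
    limit = _INSTRUMENT_LIMITS.get(instrument) if instrument else None
    if limit is not None:
        window = min(window, limit)

    # Reserve room for output and keep the result strictly positive.
    return max(1, window - _OUTPUT_RESERVE)


print(effective_window_sketch())
print(effective_window_sketch("example-large", "example-cli"))
print(effective_window_sketch("example-small", "example-cli"))
```

With these placeholder values, the instrument limit wins for the larger model while the smaller model's native window remains the binding constraint.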