Skip to main content

Timeouts

Timeouts are the time-budget layer for governed runs. They stop a run from drifting indefinitely when a turn, phase, or full run takes too long.

This is not adapter process timeout tuning. It is governance-level timeout enforcement for turn, phase, and run progress.

Config shape

Add an optional top-level timeouts section to agentxchain.json:

agentxchain.json
{
"timeouts": {
"per_turn_minutes": 30,
"per_phase_minutes": 120,
"per_run_minutes": 480,
"action": "warn"
}
}

All fields are optional. Missing fields mean that scope has no timeout.

You can also set timeouts via the CLI instead of editing JSON directly:

agentxchain config --set timeouts.per_turn_minutes 30
agentxchain config --set timeouts.per_phase_minutes 120
agentxchain config --set timeouts.per_run_minutes 480
agentxchain config --set timeouts.action warn
FieldMeaning
per_turn_minutesMaximum wall-clock time for one assigned turn
per_phase_minutesMaximum wall-clock time spent in the current phase
per_run_minutesMaximum wall-clock time for the whole governed run
actionDefault response when a timeout is exceeded: escalate or warn

Three scopes

Turn

Turn timeout compares each active turn's started_at against the current time at a governance boundary.

Use this when you want to catch a role that is hanging or taking far longer than expected.

Phase

Phase timeout compares phase_entered_at against the current time. This is the right budget for phases that can absorb several turns but still need a hard ceiling.

Run

Run timeout compares the run created_at timestamp against the current time. This is the global ceiling that stops a governed run from stretching without bound.

Actions

escalate

  • Blocks the run with blocked_on: "timeout:<scope>"
  • Persists a structured recovery descriptor
  • Records a type: "timeout" entry in .agentxchain/decision-ledger.jsonl
  • Recovery is agentxchain resume

warn

  • Does not block the run
  • Records a type: "timeout_warning" entry in the decision ledger
  • Surfaces in agentxchain status and agentxchain report

skip_phase

skip_phase is valid for phase timeouts only. It is intentionally routing-only, not a global timeout action.

agentxchain.json
{
"timeouts": {
"per_phase_minutes": 120,
"action": "warn"
},
"routing": {
"planning": {
"entry_role": "pm",
"allowed_next_roles": ["pm", "architect"],
"timeout_minutes": 60,
"timeout_action": "skip_phase"
}
}
}

Why the restriction matters:

  • Global skip_phase would make no sense for turn or run timeouts
  • Phase skip still runs gate evaluation
  • If the gate fails, the skip fails closed and escalates instead of silently bypassing governance

Where timeouts evaluate

Timeouts do not rely on a background daemon. They evaluate at governed execution boundaries:

  1. agentxchain run / runLoop for in-flight turn dispatch blocking when per_turn_minutes is configured and the effective action is escalate
  2. agentxchain accept-turn for mutating enforcement of accepted turn, phase, and run limits
  3. agentxchain status for proactive read-only visibility of current timeout pressure and persisted timeout blocks

approve-transition and approve-completion do not currently re-run timeout mutation. The accepted-turn boundary remains the authoritative mutation point for accepted work, while run / runLoop closes the narrower but important gap of a turn that hangs forever before acceptance.

That keeps timeout evidence tied to governed state transitions instead of pretending a free-running background timer is the source of truth.

Operator surfaces

agentxchain status

When a run is active and timeouts are configured, agentxchain status shows remaining timeout budget without mutating state.

Turn-level budget is shown inline with the active turn info:

  • When within limits: remaining time, limit, and deadline (e.g., Budget: 18m 30s remaining of 30m (deadline 2:30:00 PM))
  • When exceeded: Budget: EXCEEDED — was 30m, over by 5m

Dispatch activity is shown when an adapter is actively dispatching a turn:

  • When producing output: Activity: Producing output (142 lines, last 2s ago)
  • When silent: Activity: Silent for 45s (142 lines total, last output 45s ago)
  • When API request in flight: Activity: API request in flight (8s ago)

This lets operators distinguish "adapter is working" from "adapter is hung" without waiting for the timeout to fire. Progress tracking writes to .agentxchain/dispatch-progress.json (best-effort, never blocks dispatch).

Phase/run-level budget is shown in the Timeouts section:

  • When within limits: elapsed/limit and remaining time with (e.g., ✓ phase (planning): 15m/120m — 105m remaining)
  • When exceeded: ⚠ EXCEEDED phase: 135m/120m (action: escalate)
  • Warnings render with

agentxchain turn show

When timeouts are configured, turn show displays per-scope remaining budget for the active turn:

  • Each configured scope (turn, phase, run) shows remaining time or exceeded status
  • JSON output (--json) includes a timeout_budget array with scope, remaining_minutes, remaining_seconds, deadline_iso, and exceeded for each scope

agentxchain report

Governance reports surface timeout evidence from the decision ledger:

  • governed-run reports expose subject.run.timeout_events
  • coordinator reports expose top-level subject.timeout_events
  • child repo drill-down exposes subject.repos[].timeout_events
  • the dashboard exposes repo-local Timeouts and multi-repo Coordinator Timeouts views
  • dashboard live-pressure tables preserve turn identity for turn-scope entries (turn_id and role), so operators can see exactly which active turn is over budget

Text and markdown reports render a dedicated Timeout Events section when timeout entries exist.

Recovery

Escalated timeouts block the run after the accepted work is preserved. The timeout is a governance response, not a pre-acceptance validation failure.

Typical operator flow:

  1. Inspect whether the timeout reflects a real stall or simply an unrealistic budget.
  2. If the limit was wrong, raise it with agentxchain config --set timeouts.per_turn_minutes <min> (or per_phase_minutes / per_run_minutes), or adjust a routing-level phase override.
  3. Resume with agentxchain resume.

For the broader blocked-state matrix, see Recovery.

Approval-pending exemption

Timeouts do not fire on runs paused for human approval (pending_phase_transition or pending_run_completion).

When a run is waiting for an operator to approve a phase transition or run completion, that is an explicit governance pause — the operator is sovereign, and no agent work is happening. Timing out that pause would punish operators for not responding instantly, which contradicts the governance model.

The structural guarantee is simple: evaluateTimeouts() is only called from accept-turn. Approval commands (approve-transition, approve-completion) never re-run timeout evaluation. This is by design, not by accident.

Approval waits also do not pause the wall clock. If a run spends three days in pending_phase_transition or pending_run_completion, the next accepted turn can immediately block on timeout:phase or timeout:run. agentxchain status and the dashboard timeout view should surface that pressure during the approval wait so the operator is not surprised later.

If your organization needs approval response-time enforcement (e.g., "approvals must happen within 4 hours"), that would be a separate approval_sla concern — not a timeout extension.

Relationship to other controls

MechanismPurposeDifference from timeouts
PoliciesTurn acceptance rulesPolicies constrain accepted work; timeouts constrain elapsed time
Approval PolicyConditional gate auto-approvalApproval policy relaxes human gates; timeouts enforce time ceilings
GatesPhase/completion artifact checksGates validate readiness; timeouts validate duration
BudgetSpend ceilingsBudget caps cost; timeouts cap wall-clock duration