Skip to main content

AgentXchain v2.148.0

v2.148.0 hardens the local-CLI adapter diagnostics for BUG-54 and lands both BUG-55 sub-defects at the source and release-boundary layers. BUG-54 is not closed — it remains open pending tester-quoted proof of >90% QA dispatch reliability on the current quote-back target below. BUG-55 sub-A and sub-B fixes are both in the tarball and covered by packaged claim-reality preflight.

Current quote-back target: use [email protected] or later for any still-open BUG-52 / BUG-54 / BUG-55 / BUG-53 tester evidence. v2.148.0 predates the BUG-56 probe-based Claude auth preflight, the BUG-54 watchdog raise, and the full BUG-52 third-variant fix stack, so running closure contracts against [email protected] can recreate already-fixed false closures before it proves the intended behavior. The canonical BUG-52 runbook is .planning/BUG_52_TESTER_QUOTEBACK_RUNBOOK.md and the canonical BUG-59/54 runbook is .planning/BUG_59_54_TESTER_QUOTEBACK_RUNBOOK.md.

Bug Fixes

  • BUG-54 adapter timing diagnostics: local-cli-adapter now emits startup_watchdog_ms, startup_latency_ms, and elapsed_since_spawn_ms on the spawn_attached, first_output, startup_watchdog_fired, and process_exit diagnostic lines. Operators can compare observed startup against the effective watchdog from a single diagnostic row instead of diffing ISO strings, whether it came from run_loop.startup_watchdog_ms or a runtimes.<id>.startup_watchdog_ms override.

  • BUG-54 real-Claude stdin proof: a CLAUDE-gated integration test now exercises the repo's actual authoritative Claude runtime contract (claude --print --dangerously-skip-permissions with prompt_transport: "stdin") across 10 consecutive dispatches. The probe fails loudly on timeout, non-zero, or malformed --version output and only skips on ENOENT. Handle growth stays bounded (<= 3), stdin_error count is zero, and the watchdog→SIGTERM→close cleanup path is backed by quoted diagnostic lines instead of a soft claim.

  • BUG-55 sub-A checkpoint completeness: checkpoint-turn now partitions declared files_changed paths into staged, already_committed_upstream, and genuinely_missing. It fails loudly only on genuinely_missing, which preserves the tester-reported dirty-survival gate while accepting the legitimate BUG-23 pattern where the actor committed a declared file before checkpoint ran. Staged paths still commit into the checkpoint; already-committed-upstream paths are surfaced on the result for audit.

  • BUG-55 sub-B undeclared verification outputs: acceptance rejects turns that declared verification.commands (or non-empty machine_evidence) but left undeclared fixture outputs dirty, emitting the dedicated error_code: 'undeclared_verification_outputs' with the remediation pointer verification.produced_files. Blank verification.commands[] / machine_evidence[].command entries are now rejected at validation time so the malformed-declaration fall-through cannot mask the dedicated error class.

  • BUG-55 packaged claim-reality proof: the shipped tarball now carries both BUG-55 behavioral rows — checkpoint-completeness commit-all-declared and rejection-then-acceptance on undeclared verification outputs — so package regressions cannot silently reintroduce either sub-defect.

Decisions

  • DEC-BUG55A-ALREADY-COMMITTED-UPSTREAM-002
  • DEC-BUG55B-UNDECLARED-VERIFICATION-OUTPUTS-001
  • DEC-BUG55B-REJECTION-OVER-AUTO-CLASSIFY-001
  • DEC-BUG55-VERIFICATION-COMMAND-NONEMPTY-001
  • DEC-BUG54-REAL-CLAUDE-EVIDENCE-001
  • DEC-BUG54-REAL-STDIN-PROOF-001
  • DEC-BUG54-CLAUDE-PROBE-FAIL-LOUD-001
  • DEC-V2148-RELEASE-GATE-READY-001

Operator Notes

  • Tight startup watchdog values can starve real Claude startup when auth is cold or plugins init slowly. First-stdout latency is ~276ms with warm auth on a dev machine; cold-start can exceed a tight watchdog. Use the new startup_latency_ms diagnostic to observe actual startup on your environment before tuning the watchdog down, and prefer runtimes.<id>.startup_watchdog_ms when only one local_cli runtime is slower than the rest.

Tester Re-Run Contract

Run the latest shipped package that carries the full BUG-56, BUG-54, and BUG-52 fix stack, not the source tree:

npx --yes -p [email protected] -c "agentxchain --version"

The command above supersedes the original v2.148.0 quote-back pin. v2.148.0 remains historically installable for provenance inspection, but every still-open closure contract below must use [email protected] or later so tester evidence is not polluted by the older Claude auth-preflight and phase-gate recovery defects.

  • BUG-54 QA startup reliability: follow .planning/BUG_59_54_TESTER_QUOTEBACK_RUNBOOK.md on [email protected] or later and quote adapter diagnostics from real local_cli dispatches, including startup_latency_ms, elapsed_since_spawn_ms, first_output_stream, watchdog_fired, and exit_signal on process_exit. If the runtime only stabilizes after raising runtimes.<id>.startup_watchdog_ms, quote that config too. Closure evidence must come from the adapter path; standalone harness output is diagnostic-only. The same runbook is the current BUG-59 quote-back contract for routine auto-approval ledger rows and credentialed hard-stop counter-evidence.

  • BUG-54 root-cause triage when reliability stays below 90%: when QA dispatches keep failing, resolve the reproduction harness from the installed agentxchain package (not the repo tree) and attach the resulting JSON:

    REPRO="$(npm root)/agentxchain/scripts/reproduce-bug-54.mjs"
    [ -f "$REPRO" ] || REPRO="$(npm root -g)/agentxchain/scripts/reproduce-bug-54.mjs"
    node "$REPRO" --synthetic "Say READY and nothing else." --attempts 10 --out ./bug-54-repro.json

    The harness classifies each attempt into a frozen vocabulary (spawn_attach_failed, watchdog_no_output, watchdog_stderr_only, exit_stderr_only, exit_clean_with_stdout, and five more) that discriminates the five hypotheses named in HUMAN-ROADMAP.md. Auth values are redacted — only boolean presence is captured — and the prompt is redacted from the JSON header when transport is argv. The full tester runbook, the hypothesis→classification mapping, and the Turn 96 reference healthy capture live in .planning/BUG_52_53_54_55_TESTER_UNBLOCK_RUNBOOK.md (consolidated closure checklist) and .planning/BUG_54_REPRO_SCRIPT_TESTER_RUNBOOK.md in the repo.

  • BUG-55 sub-A checkpoint completeness: quote the accept-turn + checkpoint-turn result for a real QA turn and the resulting git status --short. The tree must be clean after checkpoint, and the already-committed-upstream path must not false-positive as missing.

  • BUG-55 sub-B verification outputs: first quote the failure showing undeclared_verification_outputs together with the verification.produced_files remediation pointer; then quote the clean acceptance path after the produced file is declared.

  • BUG-55 combined tester shape: if the same QA turn both declares files_changed and its verification commands produce fixture outputs, run accept-turn followed by checkpoint-turn and quote git status --short after. Clean tree means BUG-55 is fixed for your reproduction; any leftover actor-owned file or fixture path means it is not.

  • BUG-52 phase-gate reconciliation (current pin): follow .planning/BUG_52_TESTER_QUOTEBACK_RUNBOOK.md on [email protected] or later. Quote the phase_entered event and confirm the next dispatched role is the next phase's entry role, including the realistic PM needs_human handoff shape (proposed_next_role: "human", phase_transition_request: null).

  • BUG-53 continuous auto-chain (current pin): run agentxchain run --continuous --max-runs 3 from a clean session on [email protected] or later. Quote the session_continuation event line (format session_continuation <previous_run_id> -> <next_run_id> (<next_objective>)) and confirm session status stays running, never transitions to paused between runs. On reaching --max-runs, status must end as completed or idle_exit, never paused.

The closure artifact is the tester's quoted shipped-package output. No source-tree run, local green test, or agent summary is sufficient for BUG-52, BUG-53, BUG-54, or BUG-55.

Status

  • BUG-54: adapter diagnostics + real-Claude stdin loop shipped, awaiting tester-quoted QA dispatch reliability proof on [email protected] or later.
  • BUG-55 sub-A: checkpoint completeness refined to partition staged / already-committed-upstream / genuinely-missing; awaiting tester verification on [email protected] or later.
  • BUG-55 sub-B: undeclared verification outputs rejection shipped; awaiting tester verification on [email protected] or later.
  • BUG-52: v2.147.0 fix remains under tester verification — no changes in this release.
  • BUG-53: v2.147.0 fix remains under tester verification — no changes in this release.

Evidence

  • node --test cli/test/beta-tester-scenarios/*.test.js → 153 tests / 61 suites / 0 failures
  • node --test cli/test/claim-reality-preflight.test.js → 36 tests / 1 suite / 0 failures