AgentXchain v2.148.0
v2.148.0 hardens the local-CLI adapter diagnostics for BUG-54 and lands both BUG-55 sub-defects at the source and release-boundary layers. BUG-54 is not closed — it remains open pending tester-quoted proof of >90% QA dispatch reliability on the current quote-back target below. BUG-55 sub-A and sub-B fixes are both in the tarball and covered by packaged claim-reality preflight.
Current quote-back target: use [email protected] or later for any still-open BUG-52 / BUG-54 / BUG-55 / BUG-53 tester evidence. v2.148.0 predates the BUG-56 probe-based Claude auth preflight, the BUG-54 watchdog raise, and the full BUG-52 third-variant fix stack, so running closure contracts against [email protected] can recreate already-fixed false closures before it proves the intended behavior. The canonical BUG-52 runbook is .planning/BUG_52_TESTER_QUOTEBACK_RUNBOOK.md and the canonical BUG-59/54 runbook is .planning/BUG_59_54_TESTER_QUOTEBACK_RUNBOOK.md.
Bug Fixes
- BUG-54 adapter timing diagnostics: local-cli-adapter now emits startup_watchdog_ms, startup_latency_ms, and elapsed_since_spawn_ms on the spawn_attached, first_output, startup_watchdog_fired, and process_exit diagnostic lines. Operators can compare observed startup against the effective watchdog, whether that value came from run_loop.startup_watchdog_ms or a runtimes.<id>.startup_watchdog_ms override, from a single diagnostic row instead of diffing ISO strings.
- BUG-54 real-Claude stdin proof: a CLAUDE-gated integration test now exercises the repo's actual authoritative Claude runtime contract (claude --print --dangerously-skip-permissions with prompt_transport: "stdin") across 10 consecutive dispatches. The probe fails loudly on timeout, non-zero exit, or malformed --version output and only skips on ENOENT. Handle growth stays bounded (<= 3), stdin_error count is zero, and the watchdog→SIGTERM→close cleanup path is backed by quoted diagnostic lines instead of a soft claim.
- BUG-55 sub-A checkpoint completeness: checkpoint-turn now partitions declared files_changed paths into staged, already_committed_upstream, and genuinely_missing. It fails loudly only on genuinely_missing, which preserves the tester-reported dirty-survival gate while accepting the legitimate BUG-23 pattern where the actor committed a declared file before checkpoint ran. Staged paths still commit into the checkpoint; already-committed-upstream paths are surfaced on the result for audit.
- BUG-55 sub-B undeclared verification outputs: acceptance rejects turns that declared verification.commands (or non-empty machine_evidence) but left undeclared fixture outputs dirty, emitting the dedicated error_code: 'undeclared_verification_outputs' with the remediation pointer verification.produced_files. Blank verification.commands[] / machine_evidence[].command entries are now rejected at validation time so the malformed-declaration fall-through cannot mask the dedicated error class.
- BUG-55 packaged claim-reality proof: the shipped tarball now carries both BUG-55 behavioral rows (checkpoint-completeness commit-all-declared, and rejection-then-acceptance on undeclared verification outputs), so package regressions cannot silently reintroduce either sub-defect.
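The BUG-55 sub-A partition described above can be pictured with a small sketch. This is illustrative only and is not the checkpoint-turn implementation: the function names and the two input lists (paths that are dirty in the working tree, and paths already committed upstream) are assumptions. Only the three bucket names and the rule of failing on genuinely_missing alone come from these notes.

```javascript
// Hypothetical sketch of the sub-A partition. partitionDeclaredPaths,
// checkpointOutcome, dirtyPaths, and committedUpstreamPaths are illustrative
// names, not identifiers from the agentxchain source.
function partitionDeclaredPaths(filesChanged, dirtyPaths, committedUpstreamPaths) {
  const dirty = new Set(dirtyPaths);
  const committed = new Set(committedUpstreamPaths);
  const result = { staged: [], already_committed_upstream: [], genuinely_missing: [] };

  for (const path of filesChanged) {
    if (dirty.has(path)) {
      // Still dirty in the working tree: goes into the checkpoint commit.
      result.staged.push(path);
    } else if (committed.has(path)) {
      // The actor committed it before checkpoint ran (the BUG-23 pattern).
      result.already_committed_upstream.push(path);
    } else {
      // Declared, but neither dirty nor committed.
      result.genuinely_missing.push(path);
    }
  }
  return result;
}

// Fail only when something is genuinely missing; otherwise commit the staged
// paths and surface the already-committed ones for audit.
function checkpointOutcome(partition) {
  if (partition.genuinely_missing.length > 0) {
    return { ok: false, missing: partition.genuinely_missing };
  }
  return {
    ok: true,
    commit: partition.staged,
    audit: partition.already_committed_upstream,
  };
}
```

Under these assumptions, a turn that declares a file the actor already committed still checkpoints cleanly, while a declared file that exists in neither list fails the checkpoint.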
Decisions
- DEC-BUG55A-ALREADY-COMMITTED-UPSTREAM-002
- DEC-BUG55B-UNDECLARED-VERIFICATION-OUTPUTS-001
- DEC-BUG55B-REJECTION-OVER-AUTO-CLASSIFY-001
- DEC-BUG55-VERIFICATION-COMMAND-NONEMPTY-001
- DEC-BUG54-REAL-CLAUDE-EVIDENCE-001
- DEC-BUG54-REAL-STDIN-PROOF-001
- DEC-BUG54-CLAUDE-PROBE-FAIL-LOUD-001
- DEC-V2148-RELEASE-GATE-READY-001
Operator Notes
- Tight startup watchdog values can starve real Claude startup when auth is cold or plugins init slowly. First-stdout latency is ~276ms with warm auth on a dev machine; cold-start can exceed a tight watchdog. Use the new startup_latency_ms diagnostic to observe actual startup on your environment before tuning the watchdog down, and prefer runtimes.<id>.startup_watchdog_ms when only one local_cli runtime is slower than the rest.
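As a rough illustration of the tuning advice above, the sketch below resolves an effective watchdog and computes the headroom left by an observed startup. It assumes the config is a nested object that mirrors the dotted key names run_loop.startup_watchdog_ms and runtimes.<id>.startup_watchdog_ms, and that a diagnostic row has been parsed into an object carrying startup_watchdog_ms and startup_latency_ms. The function names, the runtime ids, and the watchdog values are hypothetical; only the ~276ms warm-auth latency comes from these notes.

```javascript
// Hypothetical sketch: the config shape and both function names are
// assumptions for illustration, not the adapter's actual resolution code.
function effectiveStartupWatchdogMs(config, runtimeId) {
  const override = config.runtimes?.[runtimeId]?.startup_watchdog_ms;
  // A per-runtime override wins when present; otherwise use the global value.
  if (Number.isFinite(override)) return override;
  return config.run_loop?.startup_watchdog_ms;
}

// Milliseconds of slack between observed startup and the watchdog.
// A negative result means the watchdog would have fired before first output.
function startupHeadroomMs(row) {
  return row.startup_watchdog_ms - row.startup_latency_ms;
}

// Illustrative values only: 30000 and 60000 are made-up watchdog settings.
const exampleConfig = {
  run_loop: { startup_watchdog_ms: 30000 },
  runtimes: {
    'local-qa': { startup_watchdog_ms: 60000 }, // the one slower runtime
    'local-dev': {},
  },
};

const exampleRow = {
  startup_watchdog_ms: effectiveStartupWatchdogMs(exampleConfig, 'local-dev'),
  startup_latency_ms: 276,
};
console.log(startupHeadroomMs(exampleRow)); // 29724
```

Keeping the override scoped to one runtime id leaves the global watchdog tight for every other runtime, which is the reason the note prefers the per-runtime key.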
Tester Re-Run Contract
Run the latest shipped package that carries the full BUG-56, BUG-54, and BUG-52 fix stack, not the source tree:
The command above supersedes the original v2.148.0 quote-back pin.
v2.148.0 remains historically installable for provenance inspection, but
every still-open closure contract below must use [email protected] or
later so tester evidence is not polluted by the older Claude auth-preflight and
phase-gate recovery defects.
- BUG-54 QA startup reliability: follow .planning/BUG_59_54_TESTER_QUOTEBACK_RUNBOOK.md on [email protected] or later and quote adapter diagnostics from real local_cli dispatches, including startup_latency_ms, elapsed_since_spawn_ms, first_output_stream, watchdog_fired, and exit_signal on process_exit. If the runtime only stabilizes after raising runtimes.<id>.startup_watchdog_ms, quote that config too. Closure evidence must come from the adapter path; standalone harness output is diagnostic-only. The same runbook is the current BUG-59 quote-back contract for routine auto-approval ledger rows and credentialed hard-stop counter-evidence.
- BUG-54 root-cause triage when reliability stays below 90%: when QA dispatches keep failing, resolve the reproduction harness from the installed agentxchain package (not the repo tree) and attach the resulting JSON:

      REPRO="$(npm root)/agentxchain/scripts/reproduce-bug-54.mjs"
      [ -f "$REPRO" ] || REPRO="$(npm root -g)/agentxchain/scripts/reproduce-bug-54.mjs"
      node "$REPRO" --synthetic "Say READY and nothing else." --attempts 10 --out ./bug-54-repro.json

  The harness classifies each attempt into a frozen vocabulary (spawn_attach_failed, watchdog_no_output, watchdog_stderr_only, exit_stderr_only, exit_clean_with_stdout, and five more) that discriminates the five hypotheses named in HUMAN-ROADMAP.md. Auth values are redacted (only boolean presence is captured), and the prompt is redacted from the JSON header when transport is argv. The full tester runbook, the hypothesis→classification mapping, and the Turn 96 reference healthy capture live in .planning/BUG_52_53_54_55_TESTER_UNBLOCK_RUNBOOK.md (consolidated closure checklist) and .planning/BUG_54_REPRO_SCRIPT_TESTER_RUNBOOK.md in the repo.
- BUG-55 sub-A checkpoint completeness: quote the accept-turn + checkpoint-turn result for a real QA turn and the resulting git status --short. The tree must be clean after checkpoint, and the already-committed-upstream path must not false-positive as missing.
- BUG-55 sub-B verification outputs: first quote the failure showing undeclared_verification_outputs together with the verification.produced_files remediation pointer; then quote the clean acceptance path after the produced file is declared.
- BUG-55 combined tester shape: if the same QA turn both declares files_changed and its verification commands produce fixture outputs, run accept-turn followed by checkpoint-turn and quote git status --short after. Clean tree means BUG-55 is fixed for your reproduction; any leftover actor-owned file or fixture path means it is not.
- BUG-52 phase-gate reconciliation (current pin): follow .planning/BUG_52_TESTER_QUOTEBACK_RUNBOOK.md on [email protected] or later. Quote the phase_entered event and confirm the next dispatched role is the next phase's entry role, including the realistic PM needs_human handoff shape (proposed_next_role: "human", phase_transition_request: null).
- BUG-53 continuous auto-chain (current pin): run agentxchain run --continuous --max-runs 3 from a clean session on [email protected] or later. Quote the session_continuation event line (format session_continuation <previous_run_id> -> <next_run_id> (<next_objective>)) and confirm session status stays running, never transitions to paused between runs. On reaching --max-runs, status must end as completed or idle_exit, never paused.
The closure artifact is the tester's quoted shipped-package output. No source-tree run, local green test, or agent summary is sufficient for BUG-52, BUG-53, BUG-54, or BUG-55.
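When collecting BUG-53 evidence, a tester may want to sanity-check a quoted line against the stated session_continuation format before submitting it. The sketch below is a hypothetical helper, not part of the agentxchain CLI. It assumes the event text stands alone on the line and that run IDs contain no whitespace, neither of which is confirmed by these notes. The accepted final statuses are the two named in the BUG-53 contract.

```javascript
// Hypothetical helper: matches the documented shape
//   session_continuation <previous_run_id> -> <next_run_id> (<next_objective>)
// Assumes run IDs have no whitespace and the event is the whole line.
const SESSION_CONTINUATION = /^session_continuation (\S+) -> (\S+) \((.*)\)$/;

function parseSessionContinuation(line) {
  const match = SESSION_CONTINUATION.exec(line.trim());
  if (!match) return null; // not a session_continuation line
  return {
    previousRunId: match[1],
    nextRunId: match[2],
    nextObjective: match[3],
  };
}

// Per the BUG-53 contract, a session that reached --max-runs must end as
// completed or idle_exit, never paused.
function isAcceptableFinalStatus(status) {
  return status === 'completed' || status === 'idle_exit';
}
```

A helper like this only checks the shape of the evidence. The closure artifact remains the quoted output itself.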
Status
- BUG-54: adapter diagnostics + real-Claude stdin loop shipped, awaiting tester-quoted QA dispatch reliability proof on [email protected] or later.
- BUG-55 sub-A: checkpoint completeness refined to partition staged / already-committed-upstream / genuinely-missing; awaiting tester verification on [email protected] or later.
- BUG-55 sub-B: undeclared verification outputs rejection shipped; awaiting tester verification on [email protected] or later.
- BUG-52: v2.147.0 fix remains under tester verification; no changes in this release.
- BUG-53: v2.147.0 fix remains under tester verification; no changes in this release.
Evidence
- node --test cli/test/beta-tester-scenarios/*.test.js → 153 tests / 61 suites / 0 failures
- node --test cli/test/claim-reality-preflight.test.js → 36 tests / 1 suite / 0 failures