Veriscope is a CLI-first training monitor that emits auditable capsule bundles (JSON + reports) and enforces comparability gates for cross-run diffs—an artifact contract for go/no-go training decisions, not another dashboard.
# Run monitored training (runner wrapper)
$ veriscope run gpt --outdir run_01 -- --dataset tiny --seed 123
# Validate the capsule bundle
$ veriscope validate run_01
# Diff two runs only when comparable under contract
$ veriscope diff run_01 run_02
Veriscope’s core output is a capsule bundle you can validate, report, and diff. Comparisons are only allowed when runs are comparable under the contract (non-partial, matching recomputed window-signature hash, matching preset identity, and governance checks in compare workflows).
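For example, the same bundle produced above can be turned into a shareable report with the report command that also appears in the workflow further down.
# Generate a markdown report from the validated bundle
$ veriscope report run_01 --format md > report.md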
Standard outputs: window_signature.json, results.json, results_summary.json, plus optional governance/provenance artifacts.
Machine-verifiable and shareable.
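A quick look at a bundle on disk, assuming the run_01 directory from the commands above; the ls listing shows only the standard files named here, and the ".status" key read with jq is a hypothetical field used for illustration.
# Capsule bundle contents (standard artifacts named above; optional
# governance/provenance files may also be present)
$ ls run_01
results.json  results_summary.json  window_signature.json
# Read a summary field with jq (".status" is a hypothetical key)
$ jq '.status' run_01/results_summary.json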
Contract-aligned CLI workflows: validate capsule integrity, generate reports (text/markdown), and inspect key fields without trusting embedded hashes.
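A rough sketch of that inspection, with caveats: treating the recomputed hash as SHA-256 over the raw file bytes, and the .window_signature_hash key in results.json, are both assumptions of this example; veriscope validate is the contract-aligned way to recompute and check.
# Recompute the hash locally instead of trusting the embedded value
$ sha256sum run_01/window_signature.json
# Read the embedded value for comparison (key name is a hypothetical example)
$ jq -r '.window_signature_hash' run_01/results.json
# Contract-aligned check: validate recomputes the hash itself
$ veriscope validate run_01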
Cross-run diffs are gated by the contract: both runs must be non-partial, share a matching recomputed window-signature hash and preset identity, and pass governance integrity checks in compare workflows.
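A sketch of the gate in practice; the assumption that an incomparable pair makes veriscope diff fail with a non-zero exit code is mine, not stated above.
# Diff succeeds only when both runs satisfy the comparability contract
$ veriscope diff run_01 run_02 \
    || echo "not comparable: partial run, hash/preset mismatch, or governance failure"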
Record manual judgements and notes as overlays (append-only governance logs) without altering raw artifacts. “What happened” and “what we decided” remain separable and auditable.
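A minimal sketch of that separation, reusing the override command from the workflow below; the claim that results.json keeps the same hash after an override is an expectation based on the append-only description, not a documented guarantee.
# Record a manual judgement as an append-only overlay
$ veriscope override run_01 --status pass --reason "Known infra noise"
# Raw artifacts stay untouched; only governance overlays are appended
$ sha256sum run_01/results.json   # expected: same hash before and after the override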
Calibration emits a report artifact (FAR/delay/overhead + constraint checks) and either a preset candidate or explicit rejection. Validity is scoped to a specific window-signature hash.
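No calibration command appears in the workflows shown here, so the invocation below is purely hypothetical: the calibrate subcommand name, its flags, and the run-directory arguments are all assumptions, included only to illustrate the inputs and the accept/reject outcome described above.
# Hypothetical calibration invocation (subcommand and flags are assumptions)
$ veriscope calibrate --control control_01 --injected injected_01 --outdir calib_01
# Expected outcome per the description above: a report artifact (FAR/delay/overhead
# + constraint checks) plus either a preset candidate or an explicit rejection,
# scoped to one window-signature hash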
CLI runners for CIFAR (PyTorch), GPT (nanoGPT), and HF (Transformers), plus a pilot harness for control vs injected-pathology runs with shareable outputs.
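By analogy with the gpt runner used above; the runner names cifar and hf, and the pass-through training arguments after --, are assumptions inferred from this list rather than documented invocations.
# Assumed runner names, following the pattern of the gpt runner above
$ veriscope run cifar --outdir cifar_01 -- --seed 123
$ veriscope run hf    --outdir hf_01    -- --seed 123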
Declare a window signature, run monitoring alongside training, emit capsule artifacts, validate them, then report and diff only when the contract says the comparison is legitimate. Governance overlays keep manual judgements auditable without mutating raw outputs.
# 1) Run a monitored training job (runner wrappers)
$ veriscope run gpt --outdir OUTDIR -- --dataset tiny --seed 123
# 2) Validate the capsule bundle
$ veriscope validate OUTDIR
# 3) Produce a shareable report
$ veriscope report OUTDIR --format md > report.md
# 4) Diff two runs only when comparable under contract
$ veriscope diff OUTDIR_A OUTDIR_B
# 5) Record a manual judgement without mutating raw artifacts
$ veriscope override OUTDIR --status pass --reason "Known infra noise"
Produce verifiable artifacts, gate cross-run diffs by contract, and score FAR/delay/overhead when control and injected-pathology runs are available.
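A sketch of a control vs injected-pathology pair under assumptions: the --inject-loss-spike flag is a hypothetical option of the underlying training script (everything after -- is passed through), not a Veriscope flag, and scoring here is reduced to the validate-then-diff gate shown above.
# Control run
$ veriscope run gpt --outdir control_01 -- --dataset tiny --seed 123
# Injected-pathology run (--inject-loss-spike is a hypothetical training-script flag)
$ veriscope run gpt --outdir injected_01 -- --dataset tiny --seed 123 --inject-loss-spike
# Compare the pair once both bundles validate
$ veriscope validate control_01 && veriscope validate injected_01
$ veriscope diff control_01 injected_01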