Veriscope is a CLI-first training monitor that emits auditable capsule bundles (JSON + reports) and enforces comparability gates for cross-run diffs—an artifact contract for go/no-go training decisions, not another dashboard.
# Run monitored training (runner wrapper)
$ veriscope run gpt --outdir run_01 -- --dataset tiny --seed 123
# Validate the capsule bundle
$ veriscope validate run_01
# Diff two runs only when comparable under contract
$ veriscope diff run_01 run_02
Veriscope’s core output is a capsule bundle you can validate, report, and diff. Comparisons are only allowed when runs are comparable under the contract (non-partial, matching recomputed window-signature hash, matching preset identity, and governance checks in compare workflows).
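For example, the same bundle produced above can be turned into a shareable report with the report command that also appears in the workflow further down.
# Generate a markdown report from the validated bundle
$ veriscope report run_01 --format md > report.md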
Standard outputs: window_signature.json, results.json, results_summary.json, plus optional governance/provenance artifacts.
Machine-verifiable and shareable.
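A quick look at a bundle on disk, assuming the run_01 directory from the commands above; the ls listing shows only the standard files named here, and the ".status" key read with jq is a hypothetical field used for illustration.
# Capsule bundle contents (standard artifacts named above; optional
# governance/provenance files may also be present)
$ ls run_01
results.json  results_summary.json  window_signature.json
# Read a summary field with jq (".status" is a hypothetical key)
$ jq '.status' run_01/results_summary.json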
Contract-aligned CLI workflows: validate capsule integrity, generate reports (text/markdown), and inspect key fields without trusting embedded hashes.
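A rough sketch of that inspection, with caveats: treating the recomputed hash as SHA-256 over the raw file bytes, and the .window_signature_hash key in results.json, are both assumptions of this example; veriscope validate is the contract-aligned way to recompute and check.
# Recompute the hash locally instead of trusting the embedded value
$ sha256sum run_01/window_signature.json
# Read the embedded value for comparison (key name is a hypothetical example)
$ jq -r '.window_signature_hash' run_01/results.json
# Contract-aligned check: validate recomputes the hash itself
$ veriscope validate run_01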
Cross-run diffs are gated by the contract: both runs must be non-partial, share a matching recomputed window-signature hash and preset identity, and pass governance integrity checks in compare workflows.
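A sketch of the gate in practice; the assumption that an incomparable pair makes veriscope diff fail with a non-zero exit code is mine, not stated above.
# Diff succeeds only when both runs satisfy the comparability contract
$ veriscope diff run_01 run_02 \
    || echo "not comparable: partial run, hash/preset mismatch, or governance failure"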
Record manual judgements and notes as overlays (append-only governance logs) without altering raw artifacts. “What happened” and “what we decided” remain separable and auditable.
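A minimal sketch of that separation, reusing the override command from the workflow below; the claim that results.json keeps the same hash after an override is an expectation based on the append-only description, not a documented guarantee.
# Record a manual judgement as an append-only overlay
$ veriscope override run_01 --status pass --reason "Known infra noise"
# Raw artifacts stay untouched; only governance overlays are appended
$ sha256sum run_01/results.json   # expected: same hash before and after the override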
Calibration emits a report artifact (FAR/delay/overhead + constraint checks) and either a preset candidate or explicit rejection. Validity is scoped to a specific window-signature hash.
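No calibration command appears in the workflows shown here, so the invocation below is purely hypothetical: the calibrate subcommand name, its flags, and the run-directory arguments are all assumptions, included only to illustrate the inputs and the accept/reject outcome described above.
# Hypothetical calibration invocation (subcommand and flags are assumptions)
$ veriscope calibrate --control control_01 --injected injected_01 --outdir calib_01
# Expected outcome per the description above: a report artifact (FAR/delay/overhead
# + constraint checks) plus either a preset candidate or an explicit rejection,
# scoped to one window-signature hash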
CLI runners for CIFAR (PyTorch), GPT (nanoGPT), and HF (Transformers), plus a pilot harness for control vs injected-pathology runs with shareable outputs.
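By analogy with the gpt runner used above; the runner names cifar and hf, and the pass-through training arguments after --, are assumptions inferred from this list rather than documented invocations.
# Assumed runner names, following the pattern of the gpt runner above
$ veriscope run cifar --outdir cifar_01 -- --seed 123
$ veriscope run hf    --outdir hf_01    -- --seed 123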
Declare a window signature, run monitoring alongside training, emit capsule artifacts, validate them, then report and diff only when the contract says the comparison is legitimate. Governance overlays keep manual judgements auditable without mutating raw outputs.
# 1) Run a monitored training job (runner wrappers)
$ veriscope run gpt --outdir OUTDIR -- --dataset tiny --seed 123
# 2) Validate the capsule bundle
$ veriscope validate OUTDIR
# 3) Produce a shareable report
$ veriscope report OUTDIR --format md > report.md
# 4) Diff two runs only when comparable under contract
$ veriscope diff OUTDIR_A OUTDIR_B
# 5) Record a manual judgement without mutating raw artifacts
$ veriscope override OUTDIR --status pass --reason "Known infra noise"
Produce verifiable artifacts, gate cross-run diffs by contract, and score FAR/delay/overhead when control and injected-pathology runs are available.
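A sketch of a control vs injected-pathology pair under assumptions: the --inject-loss-spike flag is a hypothetical option of the underlying training script (everything after -- is passed through), not a Veriscope flag, and scoring here is reduced to the validate-then-diff gate shown above.
# Control run
$ veriscope run gpt --outdir control_01 -- --dataset tiny --seed 123
# Injected-pathology run (--inject-loss-spike is a hypothetical training-script flag)
$ veriscope run gpt --outdir injected_01 -- --dataset tiny --seed 123 --inject-loss-spike
# Compare the pair once both bundles validate
$ veriscope validate control_01 && veriscope validate injected_01
$ veriscope diff control_01 injected_01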