Stats

Coverage, discovery confidence, audit-flag distribution, and compute spend for the genome-wide run.

19,293

Genes annotated

99.98% of 19,296

270,143

Discoveries

avg 14.0 per gene

653,396

Source papers

distinct PMIDs in corpus

$2,505

Total spend

$0.130/gene

Discovery confidence

Tier × preponderance · 270,143 findings

High

113,996 · 42.2%

Medium

138,950 · 51.4%

Low

17,197 · 6.4%

Assigned per finding by the reading pass, based on method tier and weight of evidence. High = direct biochemical/structural/genetic proof; Low = single-paper observational claim.

Audit flags

Narratives the deterministic concordance detector held for revision

Identity — wrong gene	46
Grounding — ungrounded claim	133
Behavior — model fault	27
Total flagged	206 (1.1%)

Ten rules (R1–R10) collapse into three tiers by what each implies: identity means the narrative concerns the wrong gene (paralog/alias collision or alt product), grounding is a claim not supported by the corpus, and behavior is a model fault such as a synthesis-stage refusal.

Faithfulness vs corpus

Per-claim grade against the abstracts the model read · cross-family judge (Prometheus-8x7b)

Claims graded	97,111 (17,360 genes)
Supported	95,121 (97.95%)
Contradicted	28 (0.029%)
Flagged (unadjudicated upper bound)	1,990 (2.05%)

Each cited narrative claim is graded 1–5 against the abstract the model actually read. The 2.05% flagged share is an unadjudicated upper bound on error; outright contradictions are 0.029% of claims.
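The percentages in the faithfulness table can be re-derived from the raw counts. The sketch below is only a sanity check on the published figures; the variable names are illustrative. Because supported plus flagged already equals the number of claims graded, it treats the 28 contradicted claims as falling inside the flagged set, which is an inference from the counts and is not stated in the table.

```python
# Counts copied from the faithfulness table.
claims_graded = 97_111
supported = 95_121
contradicted = 28

# Assumption: every claim that is not "supported" is "flagged",
# and the contradicted claims are part of the flagged set.
flagged = claims_graded - supported

supported_rate = supported / claims_graded
flagged_rate = flagged / claims_graded
contradicted_rate = contradicted / claims_graded

print(f"Supported:    {supported:,} ({supported_rate:.2%})")
print(f"Flagged:      {flagged:,} ({flagged_rate:.2%})")
print(f"Contradicted: {contradicted:,} ({contradicted_rate:.3%})")
```

The flagged share is an upper bound in the sense that adjudication can only move flagged claims back to supported, not the other way around.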

Pairwise quality vs UniProt

Blind, position-swapped head-to-head · cross-family judge (Prometheus-8x7b)

Affinage wins	13,229 (90.7%)
Ties	1,243 (8.5%)
UniProt wins	118 (0.8%)
Win rate over decided pairs	99.1%

Across the 14,590 genes where both Affinage and UniProt carry substantive content. Each pair is judged blind to source and run in both orderings, so a verdict only counts when it survives the position swap.
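A minimal sketch of how a position-swapped comparison and the "win rate over decided pairs" figure might be computed. The function names are hypothetical, and recording a flipped preference as a tie is an assumption; the page only says that a verdict counts when it survives the swap.

```python
def swap_consistent_verdict(first_order: str, second_order: str) -> str:
    """Combine the judge's verdicts from the two presentation orders.

    Each argument is "affinage", "uniprot", or "tie", already mapped back
    from position (A/B) to source name.
    Assumption: a preference that flips between orderings becomes a tie.
    """
    if first_order == second_order:
        return first_order
    return "tie"


def win_rate_over_decided(wins: int, losses: int) -> float:
    """Share of wins among pairs that have a winner; ties are excluded."""
    decided = wins + losses
    if decided == 0:
        raise ValueError("no decided pairs")
    return wins / decided


# Counts copied from the pairwise table.
affinage_wins, ties, uniprot_wins = 13_229, 1_243, 118
total_pairs = affinage_wins + ties + uniprot_wins
rate = win_rate_over_decided(affinage_wins, uniprot_wins)

print(f"{total_pairs:,} pairs, win rate over decided: {rate:.1%}")
```

Excluding ties from the denominator is why the decided-pair win rate (99.1%) is higher than the raw share of Affinage wins (90.7%).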

Discovery year histogram

269,955 dated findings · 1980 → 2026 · 2020 onward in red

[Histogram of findings by discovery year; x-axis ticks at 1980, 2003, 2026]

≥ 2020: 88,534 findings (32.8% of dated).

Compute ledger

USD · batch rates · April 2026 run

Full-genome annotation	$2,504.80	19,293 genes · $0.130/gene

Total LLM spend	$2,504.80
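The per-gene figures quoted on this page follow directly from the ledger total and the headline counts. A short check, using illustrative variable names and `Decimal` to avoid floating-point drift on currency:

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures copied from the ledger and the headline tiles.
total_spend_usd = Decimal("2504.80")
genes_annotated = 19_293
genes_targeted = 19_296
discoveries = 270_143

# Round the unit cost to a tenth of a cent, as displayed on the page.
cost_per_gene = (total_spend_usd / genes_annotated).quantize(
    Decimal("0.001"), rounding=ROUND_HALF_UP
)
coverage = genes_annotated / genes_targeted
discoveries_per_gene = discoveries / genes_annotated

print(f"${cost_per_gene}/gene")
print(f"coverage {coverage:.2%}")
print(f"avg {discoveries_per_gene:.1f} discoveries per gene")
```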