Affinage

Stats

Coverage, discovery confidence, audit-flag distribution, and compute spend for the genome-wide run.

Genes annotated 19,293 (99.98% of 19,296)
Discoveries 270,143 (avg 14.0 per gene)
Source papers 653,396 (distinct PMIDs in corpus)
Total spend $2,505 ($0.130/gene)

Discovery confidence
Tier × preponderance · 270,143 findings
High 113,996 (42.2%)
Medium 138,950 (51.4%)
Low 17,197 (6.4%)

Confidence is assigned per finding by the reading pass, based on method tier and weight of evidence. High means direct biochemical, structural, or genetic proof; Low means a single-paper observational claim.

Audit flags
Narratives the deterministic concordance detector held for revision
Identity — wrong gene 46
Grounding — ungrounded claim 133
Behavior — model fault 27
Total flagged 206 (1.1%)

Ten rules (R1–R10) collapse into three tiers according to what each implies: identity means the narrative concerns the wrong gene (a paralog or alias collision, or an alternative product); grounding means a claim is not supported by the corpus; behavior means a model fault, such as a synthesis-stage refusal.
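As a rough sanity check on the panel above, the per-tier counts can be summed and turned into a rate. This is only a sketch: the source does not say what the 1.1% is measured against, so the denominator below (one narrative per annotated gene, 19,293 in total) is an assumption, and the variable names are illustrative.

```python
# Flag counts per tier, as reported in the audit-flags panel.
flags_by_tier = {"identity": 46, "grounding": 133, "behavior": 27}

# ASSUMPTION: the rate is taken over annotated genes (one narrative per
# gene). The source does not state the denominator explicitly.
narratives = 19_293

total_flagged = sum(flags_by_tier.values())
flag_rate = total_flagged / narratives

print(total_flagged)        # 206
print(f"{flag_rate:.1%}")   # 1.1%
```

Under that assumption the reported total and percentage are mutually consistent.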

Faithfulness vs corpus
Per-claim grade against the abstracts the model read · cross-family judge (Prometheus-8x7b)
Claims graded 97,111 (17,360 genes)
Supported 95,121 (97.95%)
Contradicted 28 (0.029%)
Flagged (unadjudicated upper bound) 1,990 (2.05%)

Each cited narrative claim is graded 1–5 against the abstract the model actually read. The 2.05% flagged figure is an unadjudicated upper bound on error; outright contradictions account for 0.029% of claims.
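The rates quoted here follow directly from the raw counts. The sketch below recomputes them; the helper `pct` is a hypothetical name introduced for illustration, and the grade threshold that separates "supported" from "flagged" is not given in the source, so only the published counts are used.

```python
# Counts as reported in the faithfulness panel.
graded = 97_111
supported = 95_121
flagged = 1_990
contradicted = 28

# Supported and flagged partition the graded claims.
assert supported + flagged == graded


def pct(part: int, whole: int, digits: int = 2) -> str:
    """Format part/whole as a percentage with a fixed number of decimals."""
    return f"{100 * part / whole:.{digits}f}%"


print(pct(supported, graded))         # 97.95%
print(pct(flagged, graded))           # 2.05%
print(pct(contradicted, graded, 3))   # 0.029%
```

Because supported and flagged already sum to the graded total, the 28 contradictions are presumably a subset of the flagged claims, which would place the adjudicated error rate somewhere between 0.029% and 2.05%.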

Pairwise quality vs UniProt
Blind, position-swapped head-to-head · cross-family judge (Prometheus-8x7b)
Affinage wins 13,229 (90.7%)
Ties 1,243 (8.5%)
UniProt wins 118 (0.8%)
Win rate over decided pairs 99.1%

Computed across the 14,590 genes where both Affinage and UniProt carry substantive content. Each pair is judged blind to source and run in both orderings, so a verdict only counts when it survives the position swap.
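One way to read the position-swap rule is sketched below. The source only says a verdict must survive the swap; scoring a disagreement between the two orderings as a tie is an assumption, and `combine` and `win_rate_decided` are hypothetical names, not part of any published pipeline.

```python
from typing import Literal

Verdict = Literal["A", "B", "tie"]


def combine(first_order: Verdict, swapped_order: Verdict) -> Verdict:
    """Merge the verdicts from both presentation orders.

    Both verdicts name the winning source (A or B), not its position.
    ASSUMPTION: if the two orderings disagree, the pair counts as a tie.
    """
    return first_order if first_order == swapped_order else "tie"


def win_rate_decided(wins: int, losses: int) -> float:
    """Win rate over decided pairs, i.e. with ties excluded."""
    return wins / (wins + losses)


print(combine("A", "A"))   # A
print(combine("A", "B"))   # tie
print(f"{win_rate_decided(13_229, 118):.1%}")   # 99.1%
```

Excluding the 1,243 ties from the denominator is what lifts the 90.7% overall win share to the 99.1% decided-pair rate.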

Discovery year histogram
269,955 dated findings · 1980 → 2026 · 2020 onward in red
[Histogram: dated findings per discovery year; x-axis ticks at 1980, 2003, 2026]

≥ 2020: 88,534 findings (32.8% of dated).

Compute ledger
USD · batch rates · April 2026 run
Full-genome annotation $2,504.80 (19,293 genes · $0.130/gene)
Total LLM spend $2,504.80
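The per-gene figure is the ledger total divided by the number of annotated genes. A minimal check, with illustrative variable names:

```python
# Values as reported in the compute ledger.
total_spend_usd = 2504.80
genes_annotated = 19_293

cost_per_gene = total_spend_usd / genes_annotated

print(f"${cost_per_gene:.3f}/gene")   # $0.130/gene
```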