Stats
Coverage, discovery confidence, audit-flag distribution, and compute spend for the genome-wide run.
Discovery confidence is assigned per finding by the reading pass, based on method tier and weight of evidence. High = direct biochemical, structural, or genetic proof; Low = a single-paper observational claim.
| Identity — wrong gene | 46 |
| Grounding — ungrounded claim | 133 |
| Behavior — model fault | 27 |
| Total flagged | 206 (1.1%) |
Ten rules (R1–R10) collapse into three tiers by what each implies: identity means the narrative concerns the wrong gene (a paralog/alias collision or an alternative product), grounding means a claim is not supported by the corpus, and behavior means a model fault such as a synthesis-stage refusal.
| Claims graded | 97,111 (17,360 genes) |
| Supported | 95,121 (97.95%) |
| Contradicted | 28 (0.029%) |
| Flagged (unadjudicated upper bound) | 1,990 (2.05%) |
Each cited narrative claim is graded 1–5 against the abstract the model actually read. The 2.05% flagged figure is an unadjudicated upper bound on the error rate; outright contradictions account for 0.029% of claims.
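The percentages in the table above follow directly from the raw counts. Here is a minimal sketch of that arithmetic. The variable names and the `pct` helper are illustrative and not part of the pipeline. Treating the flagged count as graded minus supported is an assumption, though it matches the reported 1,990.

```python
# Counts copied from the citation-grading table above.
claims_graded = 97_111
supported = 95_121
contradicted = 28

# Assumption: every graded claim that is not "supported" counts as flagged.
flagged = claims_graded - supported


def pct(part: int, whole: int, digits: int = 2) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


print(f"Supported:    {pct(supported, claims_graded)}%")
print(f"Flagged:      {flagged:,} ({pct(flagged, claims_graded)}%)")
print(f"Contradicted: {pct(contradicted, claims_graded, 3)}%")
```

Contradictions are reported to three decimals because at two decimals the 28 cases would round to 0.03%.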
| Affinage wins | 13,229 (90.7%) |
| Ties | 1,243 (8.5%) |
| UniProt wins | 118 (0.8%) |
| Win rate over decided pairs | 99.1% |
Figures cover the 14,590 genes where both Affinage and UniProt carry substantive content. Each pair is judged blind to source and run in both orderings, so a verdict counts only when it survives the position swap.
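The position-swap rule and the "win rate over decided pairs" figure can be sketched as follows. This is an illustration under stated assumptions, not the project's actual judging code. The function names and verdict labels are hypothetical, and recording a verdict that flips between orderings as a tie is an assumption about how such pairs are counted.

```python
def swap_consistent_verdict(first_order: str, second_order: str) -> str:
    """Combine the judge's verdicts from the two presentation orders.

    Each verdict names the preferred source ("affinage" or "uniprot") or "tie".
    Assumption: a win counts only if both orderings agree on the same source;
    any disagreement is recorded as a tie.
    """
    if first_order == second_order and first_order != "tie":
        return first_order
    return "tie"


def win_rate_over_decided(wins: int, losses: int) -> float:
    """Share of wins among pairs that were not ties."""
    return wins / (wins + losses)


# Counts copied from the head-to-head table above.
affinage_wins, ties, uniprot_wins = 13_229, 1_243, 118

total_pairs = affinage_wins + ties + uniprot_wins
rate = win_rate_over_decided(affinage_wins, uniprot_wins)

print(f"Pairs judged: {total_pairs:,}")
print(f"Win rate over decided pairs: {100 * rate:.1f}%")
```

Excluding ties from the denominator is what separates the 99.1% figure from the 90.7% share of all pairs.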
≥ 2020: 88,534 findings (32.8% of dated findings).