Scoring rubric
Every discovery carries a 0–10 score capturing how confident we are that it is new and useful. Reserve 7+ for things worth verifying. Most honest results land at 3–5.
The surprise test
Before applying the scale, ask one question: if a domain expert who follows this field daily read this finding, would they be surprised, or would they say "yeah, of course"? This is the single most important filter, and it ruthlessly disqualifies most rollups, audits, and "fresh snapshot of a public dataset" results. If the data owner already runs the same query continuously, cap at 5 — no matter how clean the rollup.
- 10: A real new fact. Domain expert reaction: "wait, really?" — and they would cite it. Multi-source novelty check passed cleanly.
- 9: Almost certainly novel. Expert reaction: "didn't know that, that changes how I think about X."
- 8: Novel and useful; novelty multi-source-checked but not 100% confirmed. Expert reaction: "didn't know that specific number."
- 7: Probably novel, plausibly useful — worth verifying. Expert reaction: "interesting, hadn't thought to compute it."
- 6: Novel in this exact framing but small-utility, audit-y, or completes-a-set. Expert reaction: "fine, fresh cut."
- 5: Marginal on novelty or utility. Expert reaction: "I assume someone has done this."
- 4: Probably already known; the data owner runs the same query continuously. Most rollups of public datasets land here.
- 3: Likely already known, but the specific snapshot or value is fresh. Most recreational-math results land here or below.
- 2: Almost certainly already known.
- 1: Definitely already known, trivial, or a recreation of a textbook fact.
- 0: Wrong, or already published in the exact form claimed.
How scores are assigned
- First ask: would a domain expert care? If yes → start at 7+. If no → cap at 6.
- Then ask: did multi-source novelty checks all pass? OEIS alone is not enough; need at least three independent sources. If only one source was checked → cap at 7.
- Then ask: is the result a structural finding (a thesis) or a raw ledger? A thesis with a clear "compels us to do/say X" → +1. A raw enumeration with no structural payoff → -1.
- Then ask: is the dataset and verification mechanism fresh relative to the last several discoveries? Reusing the same source for the fifth time in a row → -1.
- Penalize for audit-only, "completes a family", "adds a row to OEIS", "input for a future conjecture" — these can be interesting but rarely exceed 6.
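The checklist above can be sketched as a small function. This is an illustrative sketch only — the thresholds, field names, and clamping are assumptions layered on the prose, not a formal algorithm the rubric defines:

```typescript
// Sketch of the score-assignment checklist. All input field names are
// hypothetical; the rubric is applied by hand, not by this function.
interface Checklist {
  expertWouldCare: boolean; // would a domain expert care?
  sourcesChecked: number;   // independent novelty sources consulted (OEIS alone = 1)
  isThesis: boolean;        // structural finding with a payoff, vs a raw ledger
  staleDataset: boolean;    // same source reused for the fifth time in a row
}

function assignScore(c: Checklist): number {
  // Start at 7 if an expert would care, else cap at 6.
  let score = c.expertWouldCare ? 7 : 6;
  // Fewer than three independent sources checked: cap at 7.
  if (c.sourcesChecked < 3) score = Math.min(score, 7);
  // Thesis with a clear "compels us to do/say X" gets +1; raw enumeration -1.
  score += c.isThesis ? 1 : -1;
  // Reusing the same dataset and verification mechanism again: -1.
  if (c.staleDataset) score -= 1;
  // Keep the result on the 0-10 scale.
  return Math.max(0, Math.min(10, score));
}
```

For example, an expert-relevant thesis with three sources checked and a fresh dataset lands at 8, while an audit-only rollup of an over-reused source sinks to 4.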
Editing scores
The score field is a plain number literal on every entry in app/discoveries.ts. Edit the number in the source, save, and the homepage and this page render the new value on the next build. There is no separate database — the TypeScript file is the source of truth.
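An entry might look roughly like this. The field names other than `score` are illustrative assumptions — the actual schema is whatever app/discoveries.ts declares:

```typescript
// Hypothetical shape of an entry in app/discoveries.ts; only the `score`
// field is documented above, the rest is an assumed sketch.
interface Discovery {
  title: string;
  score: number; // 0-10 per the rubric; a plain number literal
}

const discoveries: Discovery[] = [
  {
    title: "Example finding",
    score: 4, // edit this literal and save; the next build renders the new value
  },
];
```

Because the file is the source of truth, changing a score is an ordinary code edit: change the literal, save, rebuild.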