The number, explained
Reviewed June 2026

What is the Surprise Score?

The Surprise Score is a 0–100 number measuring how unlikely it is that two Wikipedia articles sit as few clicks apart as they do. High means the closeness is improbable; low means it is exactly what you'd expect. It measures surprise, not distance.

Six bands. Fixed for everyone. A 91 is a 91 whoever charts it.
Are you kidding me? 90–100
Wildly unexpected 78–89
Genuinely surprising 62–77
Didn't see that coming 45–61
A bit of a stretch 28–44
Pretty much expected 0–27

High vs low

In one breath

A high Surprise Score is the screenshot. It says: these two things had no business being this close, and yet here is the live, verifiable chain proving they are. A pharaoh and a pop star, four clicks apart, score high, because nothing about ancient Egypt predicts a modern chart-topper, and a short path between them is genuinely against the odds.

A low Surprise Score is the shrug. Two footballers one click apart is obvious — they share a league, a country, a tournament; of course Wikipedia stitched them together. The number isn't punishing the connection for being real. It's telling you the truth: you'd have guessed this one. The score climbs when two unrelated worlds turn out to be neighbours, and sinks when the closeness was always going to be there. Surprise, not distance.

The ladder

The six bands — and what each one feels like

The labels carry the verdict; the number is the receipt. From the top of the ladder down:

Are you kidding me? 90–100
The rarest air. Two subjects from opposite ends of human knowledge, joined by a chain so short it reads like a typo.
Wildly unexpected 78–89
A connection you'd have bet money against — the kind you send to one specific friend.
Genuinely surprising 62–77
Not a fluke, not obvious — a real huh. Distant enough to delight, clean enough to trust.
Didn't see that coming 45–61
A pleasant turn. You could half-imagine these two were linked, but not this directly.
A bit of a stretch 28–44
Plausible on reflection. Same broad neighbourhood of the world; the path confirms a hunch.
Pretty much expected 0–27
The honest floor. Close because they were always going to be close.
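
Because the thresholds are fixed, turning a number into its label is a plain lookup. A minimal sketch, assuming whole-number scores; the names `BANDS` and `band_label` are illustrative and not taken from the WikiPath engine:

```python
# Band floors and labels as published in the ladder above, highest first.
BANDS = [
    (90, "Are you kidding me?"),
    (78, "Wildly unexpected"),
    (62, "Genuinely surprising"),
    (45, "Didn't see that coming"),
    (28, "A bit of a stretch"),
    (0, "Pretty much expected"),
]


def band_label(score: int) -> str:
    """Return the band label for a whole-number score from 0 to 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, label in BANDS:
        if score >= floor:
            return label
    raise AssertionError("unreachable: the last band starts at 0")


print(band_label(91))
print(band_label(44))
```

The lookup walks from the top band down and returns the first floor the score clears, so a 91 gets the same label whoever charts it.
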
The hub penalty

Why a hub-cheat route scores low

A route that sneaks through a mega-hub — United States, World War II, English language — is penalised, and that penalty is the reason the score can be trusted. Almost everything links to those giants. “X is one click from World War II” is true of half of Wikipedia and tells you nothing about X.

So the score asks a harder question than “is there a path?” It asks: is there a path through telling stops rather than lazy ones? WikiPath routes through the least-hubby shortest chain it can find, and the obscurity term rewards exactly that — judged relative to the graph's own structure, so a node counts as a hub by how its link-degree compares to the rest of Wikipedia, not against some arbitrary cutoff. This is why two pairs can be the same number of clicks apart and score wildly differently. Clicks are cheap. An earned short path — through specific, unexpected, low-traffic territory — is what the high scores are made of.
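
One way to picture "relative to the graph's own structure" is a percentile rank over link-degree. The sketch below is an illustration under stated assumptions, not the engine's actual obscurity term: the degree counts are made up, the function names are hypothetical, and treating a route as only as obscure as its most hub-like stop is one possible choice among several.

```python
from bisect import bisect_left


def degree_percentile(degree: int, sorted_degrees: list[int]) -> float:
    """Fraction of articles whose link-degree is strictly lower than `degree`."""
    return bisect_left(sorted_degrees, degree) / len(sorted_degrees)


def route_obscurity(path: list[str], degrees: dict[str, int]) -> float:
    """Return a value from 0 to 1, higher when the stops are less hub-like.

    Assumption: obscurity is 1 minus the highest degree percentile among
    the intermediate stops, so one mega-hub drags the whole route down.
    """
    sorted_degrees = sorted(degrees.values())
    stops = path[1:-1]  # only the route is judged, not the two endpoints
    if not stops:
        return 1.0  # assumption: a direct link has no stop to penalise
    worst = max(degree_percentile(degrees[s], sorted_degrees) for s in stops)
    return 1.0 - worst


def least_hubby(paths: list[list[str]], degrees: dict[str, int]) -> list[str]:
    """Among equally short paths, pick the one with the highest obscurity."""
    return max(paths, key=lambda p: route_obscurity(p, degrees))


# Made-up link-degrees for a toy graph of six articles.
degrees = {
    "Start": 40,
    "End": 55,
    "World War II": 90000,
    "Small museum": 12,
    "Minor composer": 8,
    "Village": 3,
}
via_hub = ["Start", "World War II", "End"]
via_museum = ["Start", "Small museum", "End"]

print(route_obscurity(via_hub, degrees))
print(route_obscurity(via_museum, degrees))
print(least_hubby([via_hub, via_museum], degrees))
```

Both toy routes are two clicks long, yet the one through the hub scores far lower on obscurity, which is the sense in which equal click counts can lead to very different scores.
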

The formula

How the number is actually built

The Surprise Score fuses three measurements, then squashes the result onto a 0–100 scale. Stated as the engine computes it: semantic distance, divided by path length, multiplied by route obscurity — then passed through a logistic curve.

semantic distance ÷ path length × route obscurity → logistic curve → 0–100

Semantic distance is the headline signal: how far apart the two endpoints feel. We embed each article (its title plus a short lead definition) with a sentence model (mxbai-embed-large-v1), then measure the angle between the two vectors, expressed as cosine distance. A pharaoh and a pop star sit at a wide angle; two footballers sit almost on top of each other. Path length divides the score down gently: a long chain between distant subjects is expected, so each extra click makes the closeness less remarkable. Route obscurity is the hub penalty above. The logistic curve at the end turns the raw value into the familiar 0–100 you screenshot, keeping the middle responsive and the extremes earned. Semantic distance leads; length and obscurity refine.
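
The shape of the calculation can be sketched in a few lines. This is an illustration only: the `midpoint` and `steepness` of the logistic curve are invented placeholders (the page does not publish them), the function names are hypothetical, and the sketch divides by path length directly, whereas the engine is described as dividing "gently".

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 minus cosine similarity: 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def surprise_score(
    semantic_distance: float,
    path_length: int,
    obscurity: float,
    midpoint: float = 0.1,    # placeholder, not a published constant
    steepness: float = 20.0,  # placeholder, not a published constant
) -> int:
    """Combine the three measurements and squash the result onto 0 to 100."""
    if path_length < 1:
        raise ValueError("path_length must be at least one click")
    raw = semantic_distance / path_length * obscurity
    squashed = 1.0 / (1.0 + math.exp(-steepness * (raw - midpoint)))
    return round(100 * squashed)


# Toy inputs: distant endpoints on an earned route vs. near-identical ones.
far_and_earned = surprise_score(semantic_distance=0.9, path_length=4, obscurity=0.8)
near_and_obvious = surprise_score(semantic_distance=0.1, path_length=1, obscurity=0.8)
far_but_hubby = surprise_score(semantic_distance=0.9, path_length=4, obscurity=0.2)

print(far_and_earned, near_and_obvious, far_but_hubby)
```

With these placeholder constants, the distant pair on an obscure route lands highest, the same pair routed through a hub drops sharply, and the near-identical pair sits low despite being only one click apart.
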

Worked examples

Four routes, four verdicts

Illustrations of the bands, not fixed scorecards — the live number depends on the exact path the engine charts on the day. The pattern is the point: foreign + close + earned = high; familiar + close + lazy = low.

Reading it

How to read your score

Read it as a verdict on the closeness, not a grade on the things. Start with the band label — it's the plain-English headline. Then let the number place you inside that band: a 79 and an 89 are both “Wildly unexpected,” but the 89 is leaning on the door of the band above.

If it's high and you're surprised — good, trust it; the path is real and the route was earned. If it's high and you expected the two to be close, you've probably found a pairing whose link runs through genuinely obscure territory — read the path, it's usually the best part. And if it's low, don't take it as a failure: a low score on a true connection is the engine being honest. The score is built to be consistent, so it's meant to be compared — chart a weirder route, beat a higher number. The seal you screenshot is the same seal everyone else is chasing.

Common questions
01 What is the Surprise Score?
The Surprise Score is a 0 to 100 number that measures how unlikely it is that two Wikipedia articles sit as few clicks apart as they do. High means the closeness is improbable — two things that feel a world apart turn out to be neighbours. Low means it is exactly what you would expect.
02 What does a high Surprise Score mean?
A high Surprise Score means two unrelated subjects are connected by a chain far shorter than their distance suggests: the screenshot case. A pharaoh and a pop star four clicks apart score high, because nothing about ancient Egypt predicts a modern chart-topper, yet the path between them is real and earned.
03 What does a low Surprise Score mean?
A low Surprise Score means the closeness was always going to be there. Two footballers one click apart score low — they share a sport, a league, a tournament, so of course Wikipedia links them. The number is not punishing a real connection; it is telling you that you would have guessed this one.
04 What are the six Surprise Score bands?
From highest to lowest: Are you kidding me? (90–100), Wildly unexpected (78–89), Genuinely surprising (62–77), Didn't see that coming (45–61), A bit of a stretch (28–44), and Pretty much expected (0–27). The bands are fixed and identical for every connection, whoever charts it.
05 How is the Surprise Score calculated?
It fuses three measurements: semantic distance (how far apart the two articles feel, from sentence embeddings), divided by path length (the number of clicks), multiplied by route obscurity (a penalty for cheating through mega-hubs). The result is squashed onto a 0 to 100 scale by a logistic curve, with semantic distance leading.
06 Why does a route through a mega-hub score lower?
Because almost everything links to giants like World War II or the United States, so a path through one tells you nothing. WikiPath routes through the least-hubby chain it can find and penalises hub-heavy routes — and obscurity is judged relative to Wikipedia's own link structure, so an earned path through specific, low-traffic articles scores higher than a coincidence of scale.
07 Does a higher Surprise Score mean the two things are farther apart?
No. The score measures surprise, not distance. It asks how unlikely it is that two foreign subjects are close, not how many clicks separate them. Two pairs the same number of clicks apart can score wildly differently depending on how distant the subjects feel and whether the route was earned or laundered through a hub.
08 Is the Surprise Score the same for everyone?
Yes. The six bands and their thresholds are fixed and published, and the formula is identical for every pair. A 91 is a 91 whether you found the connection or WikiPath did — which is exactly what makes scores worth comparing across the Atlas, the Daily Connection, and a friend's Challenge.