One number. Zero to one.
Imagine you had to build one robot that could do everything a human can do. Run, paint, tell jokes, do maths, comfort a crying friend.
0 means robots can't do any of it. 1 means a single robot can match every human record. Right now, we're at about 0.05.
The Stopwatch
Absolute Records · ~50 records
Anything you can time, count, or measure. Sprint speed, push-ups, digits of pi, chess rating.
How it works
1. We measure the robot.
2. We measure the human.
3. We divide.
Example
Usain Bolt runs 100m in 9.58 seconds. The fastest humanoid robot does it in 48 seconds.
Score = 9.58 ÷ 48 = 0.20
Rules
- If the robot beats the human, the score caps at 1.0. No extra credit for being superhuman.
- Same test, same conditions. If the human had 60 seconds, the robot gets 60 seconds.
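The measure-and-divide steps and the 1.0 cap can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual code: the function name `stopwatch_score` and the `lower_is_better` flag are assumptions, added because a time (lower is better) and a push-up count (higher is better) need the division flipped.

```python
def stopwatch_score(robot: float, human: float, lower_is_better: bool = True) -> float:
    """Ratio of robot performance to human performance, capped at 1.0.

    lower_is_better is a hypothetical flag: True for times (sprints),
    False for counts or ratings (push-ups, chess rating).
    """
    if human <= 0 or robot < 0:
        raise ValueError("measurements must be non-negative, human must be positive")
    if lower_is_better:
        if robot == 0:
            raise ValueError("a time of zero is not a valid measurement")
        ratio = human / robot   # slower robot -> bigger denominator -> lower score
    else:
        ratio = robot / human   # fewer reps than the human -> lower score
    return min(ratio, 1.0)      # no extra credit for being superhuman


# The sprint example from above: 9.58 s human, 48 s robot.
print(f"{stopwatch_score(robot=48.0, human=9.58):.2f}")  # 0.20
```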
The shareable moment
"The robot was 5× slower."
The Blind Test
Creative Records · ~11 records
Painting, music, poetry, short stories, songwriting, logo design. Anything where quality is subjective.
How it works
1. A human creates something following a brief.
2. An AI creates the same thing, same brief.
3. Both are shown to voters, without labels.
4. You vote. Then we reveal.
Example
60% of voters prefer the AI painting. Only 30% correctly identify it as AI.
Score = 0.60 × (1 − 0.30) = 0.42
Rules
- Minimum 1,000 votes before a score counts.
- Entries submitted by verified teams, not random uploads.
- Results lock after 1,000 votes AND 48 hours.
- One vote per person per test.
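As a rough sketch of how raw ballots could turn into a Blind Test score under the rules above, the Python below tallies votes, drops repeat voters, and only reports a score once the 1,000-vote minimum is met. The `Ballot` type, its field names, and the choice to keep a voter's first ballot are all assumptions made for illustration; the source only states the rules, not how they are enforced.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

BLIND_MIN_VOTES = 1_000   # minimum votes before a score counts
BLIND_MIN_HOURS = 48      # results lock after 1,000 votes AND 48 hours


@dataclass(frozen=True)
class Ballot:             # hypothetical record of one vote
    voter_id: str
    prefers_ai: bool      # voter picked the AI entry as the better one
    spotted_ai: bool      # voter correctly identified which entry was AI


def blind_test_result(ballots: Iterable[Ballot], hours_open: float) -> Tuple[Optional[float], bool]:
    """Return (score, locked). Score is None until the vote minimum is reached."""
    # One vote per person per test: keep only the first ballot per voter (assumption).
    unique = {}
    for ballot in ballots:
        unique.setdefault(ballot.voter_id, ballot)

    n = len(unique)
    if n < BLIND_MIN_VOTES:
        return None, False

    preference = sum(b.prefers_ai for b in unique.values()) / n
    detection = sum(b.spotted_ai for b in unique.values()) / n
    score = preference * (1 - detection)
    locked = hours_open >= BLIND_MIN_HOURS
    return score, locked


# The painting example: 1,000 voters, 60% prefer the AI, 30% spot it.
votes = [Ballot(f"voter-{i}", prefers_ai=i < 600, spotted_ai=i < 300) for i in range(1000)]
score, locked = blind_test_result(votes, hours_open=50)
print(f"{score:.2f}", locked)  # 0.42 True
```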
The shareable moment
"Can YOU tell which is AI?"
The Thumbs Up
Pass/Fail Records · ~15 records
Catching a ball. Comforting an upset person. Standing on one leg. Things that aren't about being better — they're about being good enough.
How it works
1. The robot attempts the task on video.
2. The video is published here.
3. You watch, then answer one question: "Did it do it? Yes or No."
Example
A robot tries to comfort a crying child. 31% of voters say yes.
Score = 0.31
Rules
- Minimum 500 votes before a score counts.
- Video must show the full attempt, unedited.
- The question is always the same: did it do it?
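The Thumbs Up score is simply the share of "yes" votes, held back until the 500-vote minimum is reached. A minimal Python sketch, with the function name `thumbs_up_score` and the `None` return for "not enough votes yet" chosen here for illustration:

```python
from typing import Optional

THUMBS_MIN_VOTES = 500  # minimum votes before a score counts


def thumbs_up_score(yes_votes: int, no_votes: int) -> Optional[float]:
    """Fraction of voters who said "yes", or None below the vote minimum."""
    if yes_votes < 0 or no_votes < 0:
        raise ValueError("vote counts cannot be negative")
    total = yes_votes + no_votes
    if total < THUMBS_MIN_VOTES:
        return None  # too few votes, so the score does not count yet
    return yes_votes / total


# The comforting example: 31% of 1,000 voters say yes.
print(thumbs_up_score(yes_votes=310, no_votes=690))  # 0.31
```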
The shareable moment
"Does this count?"
What about Mixed records?
About 5 records combine measurement with judgment — like cooking a meal (time + taste) or building a bridge (speed + structural quality). These use an Absolute score for the measurable part and a Thumbs Up vote for the quality part, averaged together.
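Assuming "averaged together" means a plain, equally weighted mean of the two parts, a Mixed score could be computed as below. The function name and the sample numbers are made up for illustration; the source does not give a worked Mixed example.

```python
def mixed_score(absolute: float, thumbs_up: float) -> float:
    """Equal-weight average of the measured part and the voted part (assumed weighting)."""
    for name, value in (("absolute", absolute), ("thumbs_up", thumbs_up)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return (absolute + thumbs_up) / 2


# Hypothetical cooking record: half the human's speed, 30% of voters approve the taste.
print(f"{mixed_score(absolute=0.50, thumbs_up=0.30):.2f}")  # 0.40
```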
How Scores Add Up
Record → Axis
Take the best record in each axis. If a robot scores 0.40 at chess and 0.00 at Go, the Strategy axis scores 0.40. One breakthrough proves the axis is penetrable.
Axis → Domain
Average all axis scores in a domain. Mind has 8 axes — average them all. This means a machine can't hide weakness behind one strong axis.
Domain → HRI
Average all 6 domain scores. That's the number. One number. How close is a single robot to matching everything a human can do?
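The three roll-up steps (best record per axis, mean of axes per domain, mean of domains) can be sketched in Python as follows. The nested-dictionary layout and the function names are assumptions for illustration. The sample data is also invented and uses only two domains to stay short, where the real index averages six; only the Strategy figures (0.40 at chess, 0.00 at Go) come from the text.

```python
from statistics import mean
from typing import Dict, List


def axis_score(record_scores: List[float]) -> float:
    """Record -> Axis: the best record on the axis (0.0 if there are none)."""
    return max(record_scores, default=0.0)


def domain_score(axes: Dict[str, List[float]]) -> float:
    """Axis -> Domain: plain average of every axis score in the domain."""
    return mean(axis_score(records) for records in axes.values())


def hri(domains: Dict[str, Dict[str, List[float]]]) -> float:
    """Domain -> HRI: plain average of every domain score."""
    return mean(domain_score(axes) for axes in domains.values())


# Hypothetical data: domain -> axis -> record scores.
sample = {
    "Mind": {"Strategy": [0.40, 0.00], "Language": [0.20]},
    "Body": {"Speed": [0.20], "Balance": [0.00]},
}

print(axis_score(sample["Mind"]["Strategy"]))  # 0.4
print(f"{hri(sample):.2f}")                    # 0.20
```

Taking the maximum at the axis level and the mean everywhere above it is what lets one breakthrough lift an axis while still exposing every weak axis in the final number.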
Why "one robot" matters
It would be easy to score 1.0 if you could use a different machine for each task — a chess computer for chess, Atlas for backflips, a surgical robot for stitching. But that's not the question.
The question is: can you build one machine that does it all?
A chess engine can't catch a ball. Atlas can't write a poem. GPT can't do a push-up. The score will stay low until someone builds a generalist.
Built to Share
The Vote
Every Blind Test and Pass/Fail attempt is a public vote. You're not watching — you're judging.
The Reveal
10,000 people voted that Painting A was "obviously human." The curtain drops. It was the robot. That moment is the content.
The Number
"We're at 0.05." Simple enough for a headline, specific enough to track over time.
The Debate
"Does that robot backflip really count?" Pass/fail votes generate arguments. Arguments generate engagement.
The Leaderboard
Which domain is closest to 1.0? Mind leads. Body lags. Why? That's a conversation.
The Update
When a new AI achievement drops, the score changes. "HRI just jumped from 0.05 to 0.07." That's a news story.
The 10-year-old version
What is the Human Record Index?
A score that tells you how close robots are to being as good as humans at everything.
What's the score right now?
About 0.05 out of 1.00. Robots are 5% of the way there.
How do you work it out?
Three ways. Races and tests: time the robot and compare. Secret voting: a human and a robot both make something, and you vote for the better one without knowing which is which. Yes or no: watch a robot try to catch a ball and vote on whether it worked.
Why one robot?
Because using different robots for different tasks is cheating. The whole point is: can you build one machine that does it all?
Can I vote?
Yes. Every Blind Test and every Pass/Fail is voted on by real people right here.
Blind Test Formula — The Details
The Blind Test score captures two things at once: quality (do people prefer the AI's work?) and deception (can people tell it's AI?).
AI preference % — the fraction of voters who chose the AI entry as better. If nobody prefers it, the score is zero regardless of how good the disguise is.
Detection rate — the fraction of voters who correctly identified which entry was AI. If everyone can tell, the multiplier drops to zero. If nobody can tell, the multiplier is 1.0.
This means an AI scores highest when people prefer its work and can't tell it's not human. That's the real test.
Bad AI
0.20 × (1 − 0.90) = 0.02
Nobody likes it, everyone spots it
Good but obvious
0.70 × (1 − 0.80) = 0.14
People like it but know it's AI
Indistinguishable
0.55 × (1 − 0.45) = 0.30
Slight preference, coin-flip detection
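The three scenarios above, and the boundary cases described earlier, can be checked with a short Python sketch. The function name `blind_test_score` is an assumption; the formula and the input fractions are the ones given in this section.

```python
def blind_test_score(preference: float, detection: float) -> float:
    """preference x (1 - detection), with both inputs as fractions from 0 to 1."""
    for name, value in (("preference", preference), ("detection", detection)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return preference * (1.0 - detection)


scenarios = {
    "Bad AI": (0.20, 0.90),
    "Good but obvious": (0.70, 0.80),
    "Indistinguishable": (0.55, 0.45),
}
for label, (preference, detection) in scenarios.items():
    print(f"{label}: {blind_test_score(preference, detection):.2f}")
# Bad AI: 0.02
# Good but obvious: 0.14
# Indistinguishable: 0.30

# Boundary behaviour: no preference or perfect detection both give zero.
print(blind_test_score(0.0, 0.0), blind_test_score(1.0, 1.0))  # 0.0 0.0
```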