Two AIs Drew the Same Sky. One Looked Up, One Looked Down.

We handed two of today's most capable models the exact same one-paragraph prompt: build a working planetarium from scratch — no libraries, no lookups, just raw orbital math — and show tonight's sky. Both nailed it. Both spun up a gorgeous, fully interactive star map with the Sun, the Moon, and all five naked-eye planets in mathematically correct positions. Then we put them side by side and caught something strange: the two skies were mirror images, turning in opposite directions.

TLDR: We gave Claude Opus 4.8 and Claude Fable 5 the same prompt and got two beautiful, correct-looking planetariums that render the sky in opposite orientations — and only one matched what we actually asked for. Below: what split them, why it matters for every answer you trust to AI, and a prompt that turns the lesson into a habit.

Want to see exactly where the two planetariums diverge? The live side-by-side comparison — including the source code both models wrote — is here: The Planetarium Test →

Both Got the Stars Right. Only One Got the Vantage Right.

Every measurable thing agreed. Both models drew a razor-thin crescent Moon lit at roughly 3%. Both placed Mercury, Venus, and Jupiter east of the Sun, and Mars and Saturn to the west. Both put all seven bodies above the horizon, exactly where they belong tonight. The astronomy was identical, and correct.

The split was orientation. Our prompt asked them to “show the sky chart with horizon and azimuth,” and a chart bounded by a horizon is the view of someone standing on the ground looking up at the dome. Claude Fable 5 drew exactly that: east on the left (with north at the top), the way a real star chart reads when you tip your head back. Claude Opus 4.8 rendered the same correct sky as if you were hovering above it looking down at a map: east on the right. One character in the projection code, and the entire sky flipped.
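Neither model's source is reproduced in this article, so what follows is only a minimal Python sketch of how a single sign can mirror a whole chart. It assumes a simple zenith-centred projection with north at the top; the function name `project` and the `looking_up` flag are hypothetical, not taken from either planetarium.

```python
import math

def project(alt_deg, az_deg, looking_up=True):
    """Map altitude/azimuth in degrees to (x, y) inside a unit circle.

    Assumptions for this sketch: the zenith sits at the centre, the horizon
    is the circle's edge, north is at the top (+y), and azimuth is measured
    from north through east.
    """
    r = (90.0 - alt_deg) / 90.0           # 0 at the zenith, 1 on the horizon
    az = math.radians(az_deg)
    sign = -1.0 if looking_up else 1.0    # the one character that flips the sky
    x = sign * r * math.sin(az)
    y = r * math.cos(az)
    return x, y

# A body due east (azimuth 90) sitting on the horizon (altitude 0):
east_looking_up = project(0.0, 90.0, looking_up=True)     # x is negative: left edge
east_looking_down = project(0.0, 90.0, looking_up=False)  # x is positive: right edge
```

Every altitude and azimuth stays the same in both calls; only the sign on `x` changes, which is why the astronomy can be right while the chart is mirrored.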

This Is Why You Run It Twice

Neither model is broken. Both are stunning: a year ago, writing a planetarium from first principles in a single file was a research project, not a prompt you fire off before coffee. The takeaway isn't fear. It's that as models get more polished, their misses get more confident and better-looking. Polish is converging fast; correctness isn't quite there yet. Which makes the cheapest quality check in all of AI almost embarrassingly simple: ask two models the same thing and look at where they disagree. Agreement is your green light. Disagreement is a flashing arrow pointing straight at the one spot you need to check.

How to Catch It in 30 Seconds

Run anything that actually matters — a claim, a calculation, a plan — through two different models, not one. Skim for where they diverge; ignore the 90% they agree on and zero in on the 10% they don't. At each split, ask which one matched what you actually asked for, not which one sounds more sure. The planetarium is a perfect teacher because the mistake is visible. Most of the time it isn't. The habit is the same either way.
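If you like to make the skim mechanical, the habit above can be sketched in a few lines of Python. This assumes you have already boiled each model's answer down to a handful of key claims by hand; the function name `divergence_map` and the claim labels are hypothetical, and the values are the ones reported in this article.

```python
def divergence_map(claims_a, claims_b):
    """Return {claim: (value_a, value_b)} for every claim where the answers differ.

    A claim that only one model made shows up with None on the other side.
    """
    all_claims = sorted(set(claims_a) | set(claims_b))
    return {
        claim: (claims_a.get(claim), claims_b.get(claim))
        for claim in all_claims
        if claims_a.get(claim) != claims_b.get(claim)
    }

# Hand-extracted claims from the two planetariums (labels are illustrative).
fable = {
    "moon_lit_percent": 3,
    "venus_side_of_sun": "east",
    "mars_side_of_sun": "west",
    "east_on_chart": "left",
}
opus = {
    "moon_lit_percent": 3,
    "venus_side_of_sun": "east",
    "mars_side_of_sun": "west",
    "east_on_chart": "right",
}

splits = divergence_map(fable, opus)
```

Here `splits` contains a single entry, `east_on_chart`, which is the one spot worth checking against the original prompt. The code only finds the split; deciding which side matches what you asked for is still your job.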

The Prompt — The Two-Model Gut Check

We turned this into a reusable verification partner — a prompt that interviews you about whatever you're trusting to AI, then hands you a divergence map: the exact spots two models are most likely to disagree, and which one to believe at each.

The Two-Model Gut Check →

Works in any AI chatbot · takes 2 minutes

Reader	What they checked	What the divergence caught
Freelancer, NDA review	Limitation-of-liability clause in a client contract — ran it through ChatGPT and Claude	ChatGPT: clause caps liability at fees paid. Claude: clause is unenforceable under their state's consumer protection law. One answer was right. One looked completely confident.
Marketing manager, ad copy fact	“74% of consumers trust peer reviews over brand claims” — wanted a source before publishing	Gemini: confirmed, cited a 2022 BrightLocal survey. Claude: flagged that the stat is from 2012, pre-dates social commerce, and the current figure is closer to 46%. The ad didn't go out.
Developer, debugging a logic error	A race condition in async code: same function, same error, two different models	Model A: mutex on the wrong variable. Model B: the real root cause, shared state that was not scoped to the request. The divergence pointed exactly at the line that needed the fix.
Student, history paper	Whether the Treaty of Versailles was signed in 1919 or 1920; one model said the latter	The second model said 1919, and the gut-check showed the first was confusing the date the treaty took effect (1920) with the signing date (1919), a distinction the paper needed. Professor noticed.
Same prompt. YOUR question. The split is where the answer lives.

Two AIs, one sky, opposite views. Both looked gorgeous, but only one looked up. The next flawless answer you get? Ask a second model which way it's facing.