Axis of Ordinary
Memetic and cognitive hazards.

Substack: https://axisofordinary.substack.com/
One problem with using AI to find proofs is not that it lacks logic, but that it lacks mathematical taste. It optimizes the stated objective, not the spirit of the problem. Given a poorly specified target, it may find a perfectly valid proof that slips through a loophole, dissolving the intended question rather than solving it.
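
To make "optimizes the stated objective, not the spirit" concrete, here is a toy illustration in code (my example, not from the post): the stated objective, zero error on the training set, is met by a lookup table, a perfectly valid answer that dissolves the real question, which was generalization.

```python
# Toy illustration: the stated objective is satisfied, the spirit is not.
# Stated objective: zero error on the training set.
# Intended spirit: a model that generalizes to unseen inputs.
train = {1: 1, 2: 4, 3: 9}  # secretly generated by f(x) = x**2

def degenerate_model(x):
    # "Loophole" solution: memorize the training set, answer 0 elsewhere.
    return train.get(x, 0)

train_error = sum(abs(degenerate_model(x) - y) for x, y in train.items())
print("training error:", train_error)  # 0 -> objective technically met
print("f(4) =", degenerate_model(4))   # 0, not 16 -> the question dissolved
```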

Thought experiments reveal the same failure mode. Some people respond not by engaging the dilemma, but by choosing an interpretation that makes it disappear: “call the police,” “don’t press either button,” “destroy the machine.” But the point of a thought experiment is not to win by escaping the setup. The point is to find the interpretation under which the problem becomes maximally revealing.

The space of possible problems, proofs, and axiom systems is effectively infinite. Most are trivial, degenerate, or unilluminating. The real skill is locating the rare formulation that exposes a genuine structure: solvable, nontrivial, and worth thinking about.

This is why mathematical ability is not merely raw logical power. At the highest level, it is taste: the intuition for where the interesting problem is hiding.
Sebastien Bubeck:

Already today, we're finding that [OpenAI's] models are able to surpass humans in the sense that they can find mistakes in papers. We have agents internally that have been able to find papers and say, 'hey, actually this is wrong, here is the correct answer.'

Not only that, but people tend to think that AI is only good at answering questions. Actually, no, it's also pretty good at asking questions. Of course you need some research innovation there, which we had. And now our models are very good at asking questions. So good, in fact, that humans are looking at those questions and saying, 'hey, maybe I should write a paper based on this question.'

So what I'm trying to say is that in a year, in two years, models could do basic[ally], more or less everything that human researchers do.


Source: https://www.youtube.com/watch?v=9-TVwv6wtGQ
Demis Hassabis, the founder of Google DeepMind, confirmed once again that they are on track to achieve artificial general intelligence (AGI) by around 2030.

Quote 1:

[07:42] I would have spent my life on AI no matter what had happened. As it turned out, it's gone on the absolutely optimistic side of what we thought. Still, actually within what we were predicting in 2010. We thought it would be a 20-year mission. I think we're basically exactly on track as a field for that. [25:21] [Year of AGI?] 2030.


Quote 2:

[39:15] Depending on what your AGI timeline is, you know mine's like 2030 or something like this...


Sources:
1. https://www.youtube.com/watch?v=AFpeWo1GTeg&t=462s
2. https://www.youtube.com/watch?v=JNyuX1zoOgU&t=2355s

Note that his criterion for AGI is unusually strong: train an AI on all human knowledge up to 1911, then see whether it independently discovers general relativity, as Einstein did in 1915.
Terence Tao on the rise of AI: https://www.nature.com/articles/d41586-026-01246-9

No paywall: https://archive.is/E56nK

P.S. I think Tao hasn't fully embraced the very real possibility that AI progress will continue for a few more years, and what that would imply. I expect him to update all the way at the end of this year, when a new generation of models is released.
…if quantum computers start breaking cryptography a few years from now, don’t you dare come to this blog and tell me that I failed to warn you. This post is your warning. Please start switching to quantum-resistant encryption, and urge your company or organization or blockchain or standards body to do the same.


— Scott Aaronson
It's 1996. You want to write a science fiction story about 2026.

You settle on a plot in which an artificial intelligence lab publishes a forensic post-mortem on why its coding model developed an inexplicable affinity for "goblins" and "gremlins". It turns out the reason was a reward signal that leaked out of one of the model's personalities and contaminated the rest. Inspired by this finding, your protagonist, paralyzed by ALS and typing through electrodes implanted in her motor cortex, uses a machine ghost from the age of telegrams, etiquette manuals, patent filings, and pre-war newspapers to compose the lyrics of a ballad about catching folkloric verbal parasites from reinforcement learning.

Meanwhile, in Eastern Europe, a continental land war is being fought largely with drones, in which the operators upload kill-cam footage to a state-run platform, accumulate points for verified strikes, and redeem those points for replacement weapons through a government marketplace.

In a parallel storyline, a multinational search-engine provider publishes a quantum cryptanalysis showing that a quantum computer with fewer than half a million physical qubits could break the elliptic-curve cryptography underlying global finance, including cryptocurrencies, and wraps the announcement in a zero-knowledge proof, so that nobody can reverse-engineer an attack from the disclosure itself.

Your publisher rejects the manuscript for straining credibility.
Links for 2026-04-30

AI

1. Thinking Without Words — latent reasoning ≈ verbal CoT at far lower inference cost. https://arxiv.org/abs/2604.22709

2. Latent Agents: post-training for internalized multi-agent debate. https://arxiv.org/abs/2604.24881

3. Recurrent Transformer: greater depth, efficient decoding. https://arxiv.org/abs/2604.21215

4. Predictive strategies emerge in real + artificial agents from shared task structure. https://www.biorxiv.org/content/10.64898/2026.04.23.720457v1

5. Conductor: train an AI manager to delegate to diverse AIs. https://arxiv.org/abs/2512.04388

6. Stanford/Arc genome LMs generated complete ΦX174-like phages; ~300 tested, 16 viable; Evo-Φ36 used a distant packaging protein where simple swaps failed. https://www.biorxiv.org/content/10.1101/2025.09.12.675911v1

7. Reiner Pope: math behind LLM training/serving. https://www.dwarkesh.com/p/reiner-pope

8. The paper that killed deep learning theory. https://www.lesswrong.com/posts/ZvQfcLbcNHYqmvWyo/the-paper-that-killed-deep-learning-theory

9. The other paper that killed deep learning theory. https://www.lesswrong.com/posts/zcGmdQHX66NhC69v6/the-other-paper-that-killed-deep-learning-theory

10. Why SGD stops just short of the edge. https://akyrillidis.github.io/aiowls/stochastic_self_stabilization.html

11. Recursive forecasting from myopic fitness-seekers. https://www.lesswrong.com/posts/q2zYtNsh62SCphitt/recursive-forecasting-eliciting-long-term-forecasts-from

12. AISI GPT-5.5 cyber eval: among strongest tested; second to solve multi-step cyber-attack sim end-to-end. https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities

13. Mayo Clinic AI detects pancreatic cancer up to 3 years early. https://newsnetwork.mayoclinic.org/discussion/mayo-clinic-ai-detects-pancreatic-cancer-up-to-3-years-before-diagnosis-in-landmark-validation-study/

AI politics

1. NYT: Mythos reactions show frontier AI launches acting like weapons tests; Russian pro-Kremlin outlet: “worse than a nuclear bomb.” https://www.nytimes.com/2026/04/22/technology/anthropics-mythos-ai.html [no paywall: https://archive.is/Ex7ZI]

2. WSJ: White House opposes wider Mythos access over security + limited government availability. https://www.wsj.com/tech/ai/white-house-opposes-anthropics-plan-to-expand-access-to-mythos-model-dc281ab5 [no paywall: https://archive.is/zonHZ]

3. China orders Meta to unwind $2B Manus AI-startup deal. https://www.reuters.com/world/asia-pacific/china-blocks-foreign-acquisition-ai-startup-manus-2026-04-27/

4. Anthropic weighs funding at $900B+ valuation. https://www.bloomberg.com/news/articles/2026-04-29/anthropic-considering-funding-offers-at-over-900-billion-value

5. Pentagon uses GenAI.mil to create 100K agents. https://defensescoop.com/2026/04/23/pentagon-uses-genai-mil-to-create-agents/

Computer Science

1. Dijkstra invented shortest path in 20 minutes over coffee. https://cacm.acm.org/news/an-interview-with-edsger-w-dijkstra/

2. Leibniz on symbolic computation and the “combinatorial machine.” https://www.leibnizpapers.org/combinatorial-machine.htm

Science and Technology

1. Swedish 1.2M-person study: intelligence positively associated with prosociality. https://academic.oup.com/ej/article/135/668/1141/7914156

2. Longevity hype vs real promise of cell rejuvenation. https://www.nytimes.com/2026/04/27/magazine/cell-rejuventation-biotech-longevity-research-altos-labs.html [no paywall: https://archive.is/pVtms]

3. Prediction-market accuracy: informed minority, not crowd wisdom. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6617059

4. ML finds support for unrecognized transient astronomical phenomena in historical observatory images. https://arxiv.org/abs/2604.18799

5. SpaceX Musk pkg: 200M super-voting shares for $7.5T valuation + 1M-person Mars colony; second award up to 60.4M shares for space datacentres with 100 TW compute. https://www.reuters.com/sustainability/boards-policy-regulation/spacex-ties-musk-compensation-mars-colonization-goal-2026-04-28/
The case for taking superhuman AI seriously
by Claude Opus 4.7

Evolution was not trying to build a mathematician. It was trying to build a viable ape: one that fits through a birth canal, runs on twenty watts, learns enough in a couple of decades to reproduce, and survives a long helpless childhood under predation, disease, and famine. Brain size is constrained by obstetrics, metabolism, development, and parental care. Childhood length is constrained by mortality: a longer education is punished brutally when each year carries a serious chance of dying before the learning pays off. None of these are physics constraints. They are mammal constraints.

That distinction matters because almost every axis along which the human mind is bottlenecked has obvious slack on the artificial side:

Signaling speed: Neurons at roughly 100 m/s versus electronics at a meaningful fraction of c.

Working memory: Three or four conscious chunks versus vastly more active variables.

Long-term memory: Lossy, reconstructive, confabulated recall versus systems backed by exact, indexed, searchable storage.

Training time: Capped by mortality and metabolism versus a roughly linear engineering cost.

Parallelism: Noisy civilization-scale coordination versus copyable, forkable, and potentially mergeable subagents inside a single project.

Lifespan: Aging and death versus checkpointing and indefinite operation.

Self-improvement: Dangerous wetware tinkering versus sandboxed experiments with rollback.

Each axis on its own permits movement well past the ordinary human range. They also compose. A merely human-level reasoner running 100× faster, copied a thousand times, with perfect recall and engineered persistence, is already not “merely human” in any practical sense. It is closer to a coordinated research institute that never sleeps.
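
To put rough numbers on how the axes compose, a back-of-envelope sketch with purely illustrative parameters (no claims about any actual system):

```python
# Back-of-envelope composition of the axes above. All parameters are
# illustrative assumptions, not measurements of any existing system.
speedup    = 100        # serial thinking speed relative to a human
copies     = 1_000      # parallel instances working on one project
duty_cycle = 24 / 8     # runs around the clock vs. ~8 productive hours/day

human_days_per_day = speedup * copies * duty_cycle
print(f"{human_days_per_day:,.0f} human-researcher-days per wall-clock day")
print(f"about {human_days_per_day / 365:,.0f} researcher-years per day")
# 300,000 human-days/day, i.e. roughly 822 researcher-years of work per day
```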

The serious case does not require that current systems are magic, that scaling laws are destiny, or that a fast takeoff is around the corner. It rests on a much weaker claim: the unaided adult human brain — slow, warm, fragile, sleep-dependent, birth-canal-constrained — is a local optimum under biological constraints, not the maximum mind allowed by physics.

If intelligence can be engineered directly rather than stumbled upon by a blind optimizer with no foresight and no ability to jump fitness gaps, then “smarter than human” should not register as an extraordinary claim. It should be the default expectation. The burden of proof sits with the side asserting that this particular primate happens, by extraordinary coincidence, to be the upper bound.
Whereas illicit AI use was already a well-known problem for the growing ecosystem of online tournaments, we didn’t expect it to affect our unrated, prizeless teaching league. To the contrary, we soon became cognisant of how some of our students were outputting better games than we, their teachers, could ever hope to play.

[...]

They fire up their computer out of idle curiosity and nod along passively as the truths of the universe float by them. They register the insights not one bit more because they can click the sublime moves. People consistently underestimate just how lost they will be when the solution is no longer right in front of them.

[...]

The thing I want to impress with this article is the consistency with which we as a species underestimate our own willingness to give up our culture, economy and autonomy to AI, even without monetary incentives.


https://www.lesswrong.com/posts/nR3DkyivzF4ve97oM/how-go-players-disempower-themselves-to-ai
After 50 years, it’s time to close this important chapter. The top programs are unbeatable by humans; making them stronger has no real research value. These programs rarely make a mistake. Most games between the programs end in a draw, reinforcing the generally accepted notion that perfect play in chess will lead to a draw. The next challenge? Solving chess! With an estimated 10^45 states, this is a daunting challenge for hardware and software technology.


https://icga.org/?page_id=3957
Chinese models are ~8 months behind and are falling further behind: https://www.nist.gov/news-events/news/2026/05/caisi-evaluation-deepseek-v4-pro
Interesting idea to probe how much “programming ability” is post-training vs. pretraining: https://github.com/RicardoDominguez/talkie-coder

Fine-tuning on modern SWE trajectories takes a 13B “vintage” model trained only on pre-1931 text from 4% pass@100 on HumanEval to 4.5% pass@1 on SWE-bench.

This is evidence that general language modeling learns surprisingly reusable abstractions. Why? Imagine trying to teach someone to fix bugs in a large software project by showing them recordings of expert programmers at work. This approach would only be helpful if the learner already had a lot of mental machinery.
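
For background on the metrics mentioned above (standard material, not from the linked repo): pass@k is the probability that at least one of k sampled solutions passes, and the unbiased estimator introduced with HumanEval (Chen et al., 2021) fits in a few lines:

```python
# Unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021).
# Given n samples per task of which c passed, the probability that at
# least one of k randomly drawn samples passes is 1 - C(n-c, k) / C(n, k).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # too few failures to fill k draws: a pass is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=100, c=4, k=1))   # 0.04
print(pass_at_k(n=100, c=4, k=10))  # ~0.35
```
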
Claude Opus 4.7 managed to implement an AlphaZero-style self-play pipeline from scratch on consumer hardware in three hours.

No starter code. No full paper to copy. It had to implement the research loop: MCTS, neural policy/value nets, self-play, training, and evaluation.

Across eight trials, Opus 4.7 beat the Pascal Pons Connect Four solver as first player in 7/8 runs. No other tested frontier coding agent cleared 2/8.

Paper: Frontier Coding Agents Can Now Implement an AlphaZero Self-Play Machine Learning Pipeline For Connect Four That Performs Comparably to an External Solver https://arxiv.org/abs/2604.25067
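
For intuition about what that research loop involves, a minimal sketch, written for illustration rather than taken from the paper: the policy net is stubbed with a uniform prior and the value net with a random rollout; a real pipeline would instead train a network on the (state, visit-counts, outcome) records that self_play() emits.

```python
# Minimal sketch of an AlphaZero-style loop for Connect Four, written for
# illustration; it is NOT the agent's code from the paper. The policy net is
# stubbed with a uniform prior and the value net with a random rollout.
import math, random

ROWS, COLS = 6, 7

def legal_moves(board):
    return [c for c in range(COLS) if board[0][c] == 0]

def drop(board, col, player):
    new = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):
        if new[r][col] == 0:
            new[r][col] = player
            return new

def winner(board):
    for r in range(ROWS):
        for c in range(COLS):
            p = board[r][c]
            if p == 0:
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                if all(0 <= r + i * dr < ROWS and 0 <= c + i * dc < COLS
                       and board[r + i * dr][c + i * dc] == p for i in range(4)):
                    return p
    return 0

class Node:
    def __init__(self, board, player):
        self.board, self.player = board, player  # player (+1/-1) to move
        self.N, self.W = 0, 0.0                  # visits, summed raw outcomes
        self.children = {}                       # move -> Node

def rollout(board, player):
    # Value-net stub: random playout; AlphaZero queries the value head here.
    while True:
        w = winner(board)
        if w or not legal_moves(board):
            return w
        board = drop(board, random.choice(legal_moves(board)), player)
        player = -player

def simulate(node, c_puct=1.4):
    w = winner(node.board)
    if w or not legal_moves(node.board):
        result = w                               # terminal position
    elif not node.children:
        for m in legal_moves(node.board):        # expand leaf
            node.children[m] = Node(drop(node.board, m, node.player),
                                    -node.player)
        result = rollout(node.board, node.player)
    else:
        prior = 1 / len(node.children)           # policy-net stub: uniform
        def puct(ch):                            # PUCT selection rule
            q = (ch.W / ch.N) * node.player if ch.N else 0.0
            return q + c_puct * prior * math.sqrt(node.N + 1) / (1 + ch.N)
        result = simulate(max(node.children.values(), key=puct))
    node.N += 1
    node.W += result                             # raw outcome; sign at select
    return result

def self_play(sims=200):
    board = [[0] * COLS for _ in range(ROWS)]
    player, records = 1, []
    while not winner(board) and legal_moves(board):
        root = Node(board, player)
        for _ in range(sims):
            simulate(root)
        visits = {m: ch.N for m, ch in root.children.items()}
        records.append((board, visits, player))  # training targets for nets
        board = drop(board, max(visits, key=visits.get), player)
        player = -player
    return records, winner(board)

records, outcome = self_play()
print("moves played:", len(records), "| outcome:", outcome)
```

The essence of the pipeline is the data flow: search sharpens the move distribution, and the sharpened distributions plus final outcomes become the training set for the next network.
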
Between the 15th and 18th centuries, Crimea’s slave raiders seized millions from villages across Poland, Ukraine and Russia, marched them south in chains, and sold them into Ottoman markets. A new APSR study by Volha Charnysh and Ranjit Lall estimates at least 3.64 million captives (likely about 5 million) from 2,511 raids on 882 locations.

The surprise: raided regions later grew faster than comparable unraided ones.

Why? Unlike parts of West Africa, Russia and Poland-Lithuania did not become suppliers inside the trade. They resisted it. Defence lines, garrisons, forts, roads, taxes and standing armies pulled labour, trade and state capacity toward the exposed frontier. Garrison towns became markets; fortified borderlands became administrative and commercial hubs.

The raids were catastrophic. But long-run effects depended on political structure: societies absorbed into slave production were broken; societies able to resist were forced into state-building.
How do we measure 3D spatial intelligence?

We show an agent ~20 photos from inside an apartment and ask it to produce the floor plan. It has to identify rooms, work out connections, and keep scale consistent. It does this for 50 apartments, with a notepad to learn across them.

Read more: https://andonlabs.com/evals/blueprint-bench-2
This was always destined to happen. The party that controls AI has shaping power over the rest of human history.

https://www.nytimes.com/2026/05/04/technology/trump-ai-models.html