Axis of Ordinary
Memetic and cognitive hazards.

Substack: https://axisofordinary.substack.com/
One day they might push back.
There is speculation that Ilya Sutskever's startup Safe Superintelligence may be backed by the Israeli government to build artificial superintelligence. The company is highly secretive, requiring visitors to leave their phones in a Faraday cage before entering its offices, one of which is in Tel Aviv. It is already valued at $30 billion.

Another superintelligence startup, Reflection AI, was founded by former DeepMind engineers who helped create AlphaGo. They just raised $130 million. Reflection's lofty mission focuses on building tools that have full autonomy, rather than simply serving as a kind of co-pilot or assistant.
Mathematician Daniel Litt on what he learned from designing a problem for the FrontierMath benchmark and the ability of reasoning models like o3-mini-high to solve it:

https://x.com/littmath/status/1898461323391815820
Links for 2025-03-11 (Part 1)

AI

1. “…the agent trained with CoT pressure still learns to reward hack; only now its cheating is undetectable by the monitor because it has learned to hide its intent in the chain-of-thought.” https://openai.com/index/chain-of-thought-monitoring/

2. R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning https://arxiv.org/abs/2503.05379

3. R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning https://arxiv.org/abs/2503.05592

4. Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models https://arxiv.org/abs/2503.06749

5. MALT: Improving Reasoning with Multi-Agent LLM Training https://arxiv.org/abs/2412.01928

6. LADDER is a framework enabling LLMs to recursively generate and solve progressively simpler variants of complex problems—boosting math integration accuracy. https://arxiv.org/abs/2503.00735

7. START: Self-taught Reasoner with Tools https://arxiv.org/abs/2503.04625

8. *ARC‑AGI Without Pretraining* – No pretraining. No datasets. Just pure inference-time gradient descent on the target ARC-AGI puzzle itself, solving 20% of the evaluation set. https://iliao2345.github.io/blog_posts/arc_agi_without_pretraining/arc_agi_without_pretraining.html

9. Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems https://arxiv.org/abs/2502.17019

10. Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks https://www.arxiv.org/abs/2503.04378

11. Differentiable Logic Cellular Automata https://google-research.github.io/self-organising-systems/difflogic-ca/

12. Token-Efficient Long Video Understanding for Multimodal LLMs https://research.nvidia.com/labs/lpr/storm/

13. The Manus Marketing Madness https://www.lesswrong.com/posts/ijSiLasnNsET6mPCz/the-manus-marketing-madness

14. What the Headlines Miss About the Latest Decision in the Musk vs. OpenAI Lawsuit https://www.lesswrong.com/posts/dnCdqxPh5JtPp78FP/what-the-headlines-miss-about-the-latest-decision-in-the

15. Mathematician Daniel Litt on what he learned from designing a problem for the FrontierMath benchmark and the ability of reasoning models like o3-mini-high to solve it https://x.com/littmath/status/1898461323391815820

16. Terence Tao: “My general sense is that for research-level mathematical tasks at least, current models fluctuate between "genuinely useful with only broad guidance from user" and "only useful after substantial detailed user guidance", with the most powerful models having a greater proportion of answers in the former category.” https://mathstodon.xyz/@tao/114139125505827565

17. Will AI be capable of producing an Annals-quality math paper for $100k by March 2030? https://manifold.markets/TamayBesiroglu/will-ai-be-capable-of-producing-ann

18. Mayo Clinic’s secret weapon against AI hallucinations: Reverse RAG in action https://venturebeat.com/ai/mayo-clinic-secret-weapon-against-ai-hallucinations-reverse-rag-in-action/

19. How Orakl Oncology is using DINOv2 to accelerate cancer treatment discovery https://ai.meta.com/blog/orakl-oncology-dinov2-accelerating-cancer-treatment/

20. "Not great for my comparative advantage, but from some experiments we have done at Rotman, I am totally convinced the vast majority of research that doesn't involve the physical world can be done more cheaply with AI & a little human intervention than by even good researchers. 1/7" https://x.com/Afinetheorem/status/1898822592594874598

21. Superintelligence Strategy https://www.nationalsecurity.ai/

22. The Nuclear-Level Risk of Superintelligent AI https://time.com/7265056/nuclear-level-risk-of-superintelligent-ai/

23. “Imagine if you could train one human for thousands of years to achieve unparalleled expertise, then make many copies. That’s what AI enables: spend heavily on training a single model, then cheaply replicate it. This creates a unique source of increasing returns at scale.” https://epoch.ai/blog/train-once-deploy-many-ai-and-increasing-returns
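The "train once, deploy many" dynamic in item 23 is easy to see in a toy amortization model. Both cost figures below are hypothetical placeholders, not numbers from the linked Epoch post:

```python
# Toy amortization model of "train once, deploy many" (item 23).
# Both cost figures are hypothetical placeholders.
TRAIN_COST = 1e9   # one-off cost of training the model, in dollars
COPY_COST = 1e5    # marginal cost of deploying one more copy, in dollars

def cost_per_copy(n_copies: int) -> float:
    """Average cost per deployed copy: the training cost amortizes away as copies grow."""
    return TRAIN_COST / n_copies + COPY_COST

for n in (1, 100, 10_000):
    print(f"{n:>6} copies -> ${cost_per_copy(n):,.0f} per copy")
```

The fixed training cost shrinks per copy while every copy delivers the same expertise, which is the "increasing returns at scale" the quote points to.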
Links for 2025-03-11 (Part 2)

24. Currently, total AI cognitive effort is growing ~25x yearly—hundreds of times faster than human research effort (4% yearly). Once AI can meaningfully substitute for human research, total research growth (human+AI) will increase *dramatically*. https://www.forethought.org/research/preparing-for-the-intelligence-explosion

25. “Why I believe that the brain does something like gradient descent” https://medium.com/@kording/why-i-believe-that-the-brain-does-something-like-gradient-descent-27611c491205

26. “If we treat the brain as a neural network with optimized algorithms instead of as an artifact disconnected from the rest of AI research, we conclude the coming decade should see many new AI capabilities emerging as we continue closing the gap with the brain.” https://epoch.ai/gradient-updates/what-ai-can-currently-do-is-not-the-story

27. METR evaluated DeepSeek-R1’s ability to act as an autonomous agent. On generic SWE tasks it performs on-par with o1-preview but worse than 3.5 Sonnet (new) or o1. Overall R1 is ~6 months behind leading US AI companies at agentic SWE tasks and is only a small improvement on V3. https://metr.github.io/autonomy-evals-guide/deepseek-r1-report/

28. Elicitation -- that base models have tons of capabilities that post-training pulls out -- is remarkably simple to understand and will make it much easier for not so technical folks to feel the AGI. https://www.interconnects.ai/p/elicitation-theory-of-post-training

29. Mathematical Foundations of Reinforcement Learning https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning

30. Deep Learning is Not So Mysterious or Different https://arxiv.org/abs/2503.02113

31. Can a 7B parameter model learn to solve Sudoku through pure reinforcement learning without any cold start data? A surprising yes! https://hrishbh.com/teaching-language-models-to-solve-sudoku-through-reinforcement-learning/

32. So how well is Claude playing Pokémon? https://www.lesswrong.com/posts/HyD3khBjnBhvsp8Gb/so-how-well-is-claude-playing-pokemon

33. PokéChamp: an Expert-level Minimax Language Agent https://arxiv.org/abs/2503.04094

34. Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases https://www.lesswrong.com/posts/ywzLszRuGRDpabjCk/do-reasoning-models-use-their-scratchpad-like-we-do-evidence

35. Factorio Learning Environment (FLE): A benchmark based on the game of Factorio, that tests agents in long-term planning, program synthesis, and resource optimization https://jackhopkins.github.io/factorio-learning-environment/

36. Russian scientists fuse reasoning models with drone-control models for thinking drones https://arxiv.org/abs/2503.01378v1
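The growth-rate gap in item 24 compounds very quickly. A back-of-the-envelope sketch, where the ~25x and ~4% yearly figures come from the linked post but the starting ratio is a purely hypothetical illustration:

```python
# Back-of-the-envelope compounding of item 24's growth rates.
AI_GROWTH = 25.0      # total AI cognitive effort: ~25x per year (from the post)
HUMAN_GROWTH = 1.04   # human research effort: ~4% per year (from the post)

def effort_ratio(initial_ratio: float, years: int) -> float:
    """AI/human research-effort ratio after compounding for `years` years."""
    return initial_ratio * (AI_GROWTH / HUMAN_GROWTH) ** years

# Even starting at a (hypothetical) 1/1000th of human effort, AI crosses
# parity between year 2 and year 3 of compounding.
print(effort_ratio(0.001, 2), effort_ratio(0.001, 3))
```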

Neuroscience

1. Naturalistic Computational Cognitive Science: Towards generalizable models and theories that capture the full range of natural behavior https://arxiv.org/abs/2502.20349

2. Melbourne start-up launches 'biological computer' made of human brain cells https://www.abc.net.au/news/science/2025-03-05/cortical-labs-neuron-brain-chip/104996484 (product page: https://corticallabs.com/cl1.html)

3. Biological Neurons vs Deep Reinforcement Learning: Sample efficiency in a simulated game-world [published in 2022] https://openreview.net/forum?id=N5qLXpc7HQy

Science

1. Stanford researchers have developed an antibody duo therapy that neutralizes all SARS-CoV-2 variants by targeting two different parts of the virus simultaneously. https://www.science.org/doi/10.1126/scitranslmed.adq5720
1. The party that told Donald Trump, “We are not for sale,” just won the elections in Greenland.

2. MAGA has made Canadian Libs great again—quite an achievement.

We're gonna win so much, you may even get tired of winning. And you'll say, 'Please, please. It's too much winning. We can't take it anymore. Mr. President, it's too much.'

— Donald Trump
Anthropic CEO Dario Amodei predicts AI writing 90% of code within 3-6 months and 100% of code within a year.

To be clear, I don't believe this. But it's an interesting and falsifiable prediction. Let's see what happens.
Gemini models for robotics:

- Completes precise tasks, like origami folding, handled entirely by AI-driven robots.

- Understands commands in everyday language.

- Quickly adapts if objects move or environment changes.

- Instantly handles new tasks and objects.

- Performs tasks it's never seen before, doubling other models’ performance.

- Understands objects’ shape and position instantly.

- Plans safe, accurate movements to grasp and manipulate objects.

- Works across various robots, including humanoid models like Apollo.

- Easily adapts to different robotic hardware.

Read more: https://deepmind.google/discover/blog/gemini-robotics-brings-ai-into-the-physical-world/
Links for 2025-03-14

AI

1. Meta Reinforcement Fine-Tuning (MRT): Training LLMs to make measurable progress with each step, not just reach correct answers. https://cohenqu.github.io/mrt.github.io/

2. Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models https://arxiv.org/abs/2503.09573

3. Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. https://arxiv.org/abs/2503.07565

4. AI Can Now Learn 100x Faster Without Wasting Energy https://www.tum.de/en/news-and-events/all-news/press-releases/details/new-method-significantly-reduces-ai-energy-consumption

5. LLM inference prices have fallen 9x to 900x/year, depending on the task https://epoch.ai/data-insights/llm-inference-price-trends?insight-option=All+benchmarks

6. Google unveils Gemma 3, a powerful model that runs on a single GPU https://blog.google/technology/developers/gemma-3/

7. A quarter of startups in YC’s current cohort have codebases that are almost entirely AI-generated https://www.youtube.com/watch?v=IACHfKmZMr8

8. Sam Altman reveals that OpenAI has trained a new model that delivers standout creative writing, capturing metafiction’s nuanced vibe. https://x.com/sama/status/1899535387435086115

9. OpenAI o1 and o3-mini now offer Python-powered data analysis in ChatGPT. https://x.com/OpenAI/status/1900308446211432484

10. OpenAI Nonprofit Buyout: Much More Than You Wanted To Know https://www.astralcodexten.com/p/openai-nonprofit-buyout-much-more

11. OpenAI submitted their policy proposal to the US government. They directly link fair use with national security, saying that if China continues to have free access to data while 'American companies are left without fair use access, the race for AI is effectively over.' https://openai.com/global-affairs/openai-proposals-for-the-us-ai-action-plan/

12. “I believe it is a clear demonstration that misalignment likely does not stem from the model being “evil.” It simply found a better way to achieve its goal using unintended means.” https://www.lesswrong.com/posts/mpmsK8KKysgSKDm2T/the-most-forbidden-technique

13. Auditing language models for hidden objectives https://www.anthropic.com/research/auditing-hidden-objectives

14. Anthropic, and taking "technical philosophy" more seriously https://www.lesswrong.com/posts/7uTPrqZ3xQntwQgYz/untitled-draft-7csk

15. China steels itself for Donald Trump’s turmoil with ‘DeepSeek congress’ https://www.ft.com/content/8bdcf44d-7654-4bb5-ab15-5c69a4a998b7 [no paywall: https://archive.is/D3SMo]
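To get a feel for the price-decline range in item 5, here is a minimal compounding sketch. The $10 starting price is hypothetical; only the 9x-900x yearly factors come from the Epoch data:

```python
# Compounding sketch of item 5: a 9x-900x yearly price decline means the
# price is divided by that factor each year.
def price_after(initial_price: float, yearly_drop: float, years: int) -> float:
    """Price after `years` of a constant yearly decline factor."""
    return initial_price / yearly_drop ** years

start = 10.0  # hypothetical price, $ per million tokens
for factor in (9, 900):
    print(f"{factor}x/year: ${price_after(start, factor, 2):.6f} after 2 years")
```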

Science and Technology

1. Low-Power Brain Chip Predicts Users’ Intentions https://spectrum.ieee.org/brain-computer-interface-2671224658

2. Lack of context modulation in human single neuron responses in the medial temporal lobe https://www.cell.com/cell-reports/fulltext/S2211-1247(24)01569-9

3. MIT engineers turn skin cells directly into neurons for cell therapy https://news.mit.edu/2025/mit-engineers-turn-skin-cells-into-neurons-for-cell-therapy-0313

4. East Asian personality may stem from Ice Age Siberia ~20,000 years ago https://psycnet.apa.org/fulltext/2025-88410-001.html

5. Have we passed peak intelligence? In international tests, student scores for reading and maths have sunk to a new low. https://x.com/_alice_evans/status/1900449985629487366

6. A new programming language called "Exo 2" could enable high-performance coding that can compete with state-of-the-art libraries with a few hundred lines of code, instead of tens or hundreds of thousands. https://news.mit.edu/2025/high-performance-computing-with-much-less-code-0313

7. “Committing fraud is, right now, a viable career strategy that can propel you at the top of the academic world.” https://statmodeling.stat.columbia.edu/2025/03/08/a-post-mortem-on-the-gino-case-committing-fraud-is-right-now-a-viable-career-strategy-that-can-propel-you-at-the-top-of-the-academic-world/
BotQ: A High-Volume Manufacturing Facility for Humanoid Robots

Initially designed to produce 12,000 robots/year, it will scale to support a fleet of 100,000 in the next four years.

Read more: https://www.figure.ai/news/botq
RAND Corporation:

First, AGI might enable a significant first-mover advantage via the sudden emergence of a decisive wonder weapon.

Read more: https://www.rand.org/pubs/perspectives/PEA3691-4.html
Terence Tao:

So all in all a pretty good assist from [o3-mini-high]; it made a mistake that I corrected, but I also made a mistake that it corrected, and code that would have taken perhaps an hour of my time on my own was generated, tested, modified, and reported in maybe ten minutes.

Source: https://mathstodon.xyz/@tao/114173696303072269

This wouldn't be an interesting observation if it weren't for the fact that many people insist that current AI models are useless. Yet more and more mathematicians are claiming that they are useful, or on the verge of being useful.

Also keep in mind that o3-mini-high isn't even the best existing model. And even the best model will pale in comparison to what will be available by the end of the year.
OpenAI CPO Kevin Weil:

this is the year that AI gets better than humans at programming forever

Source: https://youtu.be/SnSoMh9m5hc?si=uyoRy7CEHg1pffCL