Axis of Ordinary
Memetic and cognitive hazards.

Substack: https://axisofordinary.substack.com/
Gemini models for robotics:

- Completes precise tasks, such as origami folding, handled entirely by AI-driven robots.

- Understands commands in everyday language.

- Quickly adapts if objects move or environment changes.

- Instantly handles new tasks and objects.

- Performs tasks it's never seen before, doubling other models’ performance.

- Understands objects’ shape and position instantly.

- Plans safe, accurate movements to grasp and manipulate objects.

- Works across various robots, including humanoid models like Apollo.

- Easily adapts to different robotic hardware.

Read more: https://deepmind.google/discover/blog/gemini-robotics-brings-ai-into-the-physical-world/
Links for 2025-03-14

AI

1. Meta Reinforcement Fine-Tuning (MRT): Training LLMs to make measurable progress with each step, not just reach correct answers. https://cohenqu.github.io/mrt.github.io/

2. Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models https://arxiv.org/abs/2503.09573

3. Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. https://arxiv.org/abs/2503.07565

4. AI Can Now Learn 100x Faster Without Wasting Energy https://www.tum.de/en/news-and-events/all-news/press-releases/details/new-method-significantly-reduces-ai-energy-consumption

5. LLM inference prices have fallen 9x to 900x/year, depending on the task https://epoch.ai/data-insights/llm-inference-price-trends?insight-option=All+benchmarks

6. Google unveils Gemma 3, a powerful model that runs on a single GPU https://blog.google/technology/developers/gemma-3/

7. A quarter of startups in YC’s current cohort have codebases that are almost entirely AI-generated https://www.youtube.com/watch?v=IACHfKmZMr8

8. Sam Altman reveals that OpenAI has trained a new model that delivers standout creative writing, capturing metafiction’s nuanced vibe. https://x.com/sama/status/1899535387435086115

9. OpenAI o1 and o3-mini now offer Python-powered data analysis in ChatGPT. https://x.com/OpenAI/status/1900308446211432484

10. OpenAI Nonprofit Buyout: Much More Than You Wanted To Know https://www.astralcodexten.com/p/openai-nonprofit-buyout-much-more

11. OpenAI submitted its policy proposal to the US government. It directly links fair use with national security and says that if China continues to have free access to data while 'American companies are left without fair use access, the race for AI is effectively over.' https://openai.com/global-affairs/openai-proposals-for-the-us-ai-action-plan/

12. “I believe it is a clear demonstration that misalignment likely does not stem from the model being “evil.” It simply found a better way to achieve its goal using unintended means.” https://www.lesswrong.com/posts/mpmsK8KKysgSKDm2T/the-most-forbidden-technique

13. Auditing language models for hidden objectives https://www.anthropic.com/research/auditing-hidden-objectives

14. Anthropic, and taking "technical philosophy" more seriously https://www.lesswrong.com/posts/7uTPrqZ3xQntwQgYz/untitled-draft-7csk

15. China steels itself for Donald Trump’s turmoil with ‘DeepSeek congress’ https://www.ft.com/content/8bdcf44d-7654-4bb5-ab15-5c69a4a998b7 [no paywall: https://archive.is/D3SMo]

Science and Technology

1. Low-Power Brain Chip Predicts Users’ Intentions https://spectrum.ieee.org/brain-computer-interface-2671224658

2. Lack of context modulation in human single neuron responses in the medial temporal lobe https://www.cell.com/cell-reports/fulltext/S2211-1247(24)01569-9

3. MIT engineers turn skin cells directly into neurons for cell therapy https://news.mit.edu/2025/mit-engineers-turn-skin-cells-into-neurons-for-cell-therapy-0313

4. East Asian personality may stem from Ice Age Siberia ~20,000 years ago https://psycnet.apa.org/fulltext/2025-88410-001.html

5. Have we passed peak intelligence? In international tests, student scores for reading and maths sank to a new low. https://x.com/_alice_evans/status/1900449985629487366

6. A new programming language called "Exo 2" could enable high-performance coding that can compete with state-of-the-art libraries with a few hundred lines of code, instead of tens or hundreds of thousands. https://news.mit.edu/2025/high-performance-computing-with-much-less-code-0313

7. “Committing fraud is, right now, a viable career strategy that can propel you at the top of the academic world.” https://statmodeling.stat.columbia.edu/2025/03/08/a-post-mortem-on-the-gino-case-committing-fraud-is-right-now-a-viable-career-strategy-that-can-propel-you-at-the-top-of-the-academic-world/
BotQ: A High-Volume Manufacturing Facility for Humanoid Robots

Initially designed to produce 12,000 robots/year, it will scale to support a fleet of 100,000 in the next four years.

Read more: https://www.figure.ai/news/botq
RAND Corporation:

First, AGI might enable a significant first-mover advantage via the sudden emergence of a decisive wonder weapon.


Read more: https://www.rand.org/pubs/perspectives/PEA3691-4.html
Terence Tao:

So all in all a pretty good assist from [o3-mini-high]; it made a mistake that I corrected, but I also made a mistake that it corrected, and code that would have taken perhaps an hour of my time on my own was generated, tested, modified, and reported in maybe ten minutes.

Source: https://mathstodon.xyz/@tao/114173696303072269

This wouldn't be an interesting observation if it weren't for the fact that many people insist that current AI models are useless. Yet more and more mathematicians are claiming that they are useful, or on the verge of being useful.

Also keep in mind that o3-mini-high isn't even the best existing model. And even the best model will pale in comparison to what will be available by the end of the year.
OpenAI CPO, Kevin Weil:

this is the year that AI gets better than humans at programming forever


Source: https://youtu.be/SnSoMh9m5hc?si=uyoRy7CEHg1pffCL
Links for 2025-03-17

AI

1. Agents Play Thousands of 3D Video Games: Traditional game AI requires extensive training or hand-coding. PORTAL differs by using LLMs to design the policy structure as an architect, not to play directly as actors. The LLM "writes code" to structure tactics. https://zhongwen.one/projects/portal/

2. Microsoft has released this useful tool for performing R&D with LLM-based agents. https://github.com/microsoft/RD-Agent

3. A key step for making distributed training work at larger and larger models: Scaling Laws for DiLoCo. TL;DR: We can do LLM training across data centers in a way that scales incredibly well to larger and larger models! https://arxiv.org/abs/2503.09799

4. Compute-Optimal LLMs Provably Generalize Better with Scale https://openreview.net/forum?id=MF7ljU8xcf

5. Transformers without Normalization https://arxiv.org/abs/2503.10622

6. SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion https://arxiv.org/abs/2503.11576

7. Cerebras just announced 6 new AI datacenters that process 40M tokens per second — and it could be bad news for Nvidia https://venturebeat.com/ai/cerebras-just-announced-6-new-ai-datacenters-that-process-40m-tokens-per-second-and-it-could-be-bad-news-for-nvidia/

8. The tiny chips behind Amazon’s big AI investment https://www.semafor.com/article/03/14/2025/amazons-trainium-chips-to-be-tested-by-anthropic

9. AI Tools for Existential Security https://www.forethought.org/research/ai-tools-for-existential-security

10. System built by Google DeepMind team takes individual views and generates a set of group statements https://www.lesswrong.com/posts/j9K4Wu9XgmYAY3ztL/habermas-machine

11. Really powerful AI could wreck society by making governments too powerful https://arxiv.org/abs/2503.05710

12. Robin Hanson lost a bet that “Systems in GPT line will by 2025 make <$1B in customer revenue clearly tied to such systems.” https://x.com/robinhanson/status/1901329487532548511

Science and Technology

1. A socratic dialogue over the utility of DNA language models https://www.owlposting.com/p/a-socratic-dialogue-over-the-utility

2. A torpor-like state in mice slows blood epigenetic aging and prolongs healthspan https://www.nature.com/articles/s43587-025-00830-4

3. “Magpies and Crows Are Using “Anti-Bird Spikes” to Make Their Nests.” https://www.audubon.org/magazine/apparently-magpies-and-crows-are-using-anti-bird-spikes-make-their-nests

4. The Hypercuriosity Theory of ADHD https://epsig.substack.com/p/the-hypercuriosity-theory-of-adhd

5. “Metacognition Broke My Nail-Biting Habit” https://www.lesswrong.com/posts/RW3B4EcChkvAR6Ydv/metacognition-broke-my-nail-biting-habit

6. What Did We Learn From Torturing Babies? https://marginalrevolution.com/marginalrevolution/2025/03/what-do-we-learn-from-torturing-babies.html

7. Was our universe born inside a black hole? https://www.space.com/space-exploration/james-webb-space-telescope/is-our-universe-trapped-inside-a-black-hole-this-james-webb-space-telescope-discovery-might-blow-your-mind

Intelligence

1. Intelligence is unequally distributed between individuals and between countries and their inhabitants. This even affects economic growth. https://schweizermonat.ch/the-worldwide-distribution-of-intelligence/

2. The search for a test that produces no racial differences in performance and still predicts performance on the job is a quest to find the impossible. https://www.sciencedirect.com/science/article/abs/pii/S0160289624000862

Politics

1. Alignment is EASY and Roko's Basilisk is GOOD?! AI Doom Debate with Roko Mijic https://www.youtube.com/watch?v=AY4jD26RntE

2. Most Externalities are Solved with Technology, Not Coordination https://www.maximum-progress.com/p/most-externalities-are-solved-with

3. “We Were Badly Misled About the Event That Changed Our Lives” https://www.nytimes.com/2025/03/16/opinion/covid-pandemic-lab-leak.html [no paywall: https://archive.is/iweAg]
Random sampling works better than you think: Gemini 1.5 = o1. The secret? Self-verification magically gets easier with scale.


Thinking for longer (e.g. o1) is only one of many axes of test-time compute. In a new Google paper, the authors instead focus on scaling the search axis.

By just randomly sampling 200 responses and self-verifying, Gemini 1.5 (an ancient early 2024 model!) beats o1-Preview and approaches o1. This is without finetuning, RL, or ground-truth verifiers.

This was surprising: search is bottlenecked by verification, and models are notoriously bad at self-verifying (think hallucinations) and self-consistency doesn't scale. The magic is that self-verification naturally becomes easier at scale! You'd expect that picking out a correct solution becomes harder the larger your pool of solutions is, but the opposite is the case!


Read more: https://eric-zhao.com/blog/sampling
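The sample-then-verify loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `sample_and_self_verify`, `toy_generate`, and `toy_verify` are hypothetical names, and the toy verifier simply recomputes an arithmetic answer, whereas in the post the model itself judges its own candidates.

```python
import random
from typing import Callable, List, Tuple


def sample_and_self_verify(
    prompt: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], float],
    n_samples: int = 200,
) -> Tuple[str, float]:
    """Draw n_samples candidates, score each one, and return the top-scoring pair."""
    candidates: List[str] = [generate(prompt) for _ in range(n_samples)]
    scored = [(verify(prompt, c), c) for c in candidates]
    # max() keeps the first candidate that reaches the highest score.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Toy stand-ins (assumptions for illustration only, not a real model API).
rng = random.Random(0)


def toy_generate(prompt: str) -> str:
    # Pretend model: returns 17 * 23 with a random offset, so it is often wrong.
    return str(17 * 23 + rng.choice([-10, -1, 0, 0, 1, 10]))


def toy_verify(prompt: str, answer: str) -> float:
    # Pretend verification step: score 1.0 if the answer checks out, else 0.0.
    return 1.0 if int(answer) == 17 * 23 else 0.0


answer, score = sample_and_self_verify("17 * 23", toy_generate, toy_verify)
print(answer, score)
```

With a real model, `generate` would be a sampling call at non-zero temperature and `verify` a second prompt asking the same model to check a candidate. The point of the sketch is only that the search axis is a plain loop plus a selection step.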
In this video, Atlas is demonstrating policies developed using reinforcement learning with references from human motion capture and animation. This work was done as part of a research partnership between Boston Dynamics and the Robotics and AI Institute (RAI Institute).
“Moore’s Law for AI agents”: the length of tasks that AIs can do is doubling about every 7 months.

These results appear robust. The authors were able to retrodict back to GPT-2. They further ran experiments on SWE-bench Verified and found a similar trend.

Read more: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
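To get a feel for what a 7-month doubling time implies, here is a small back-of-the-envelope sketch. It assumes the trend continues at a constant rate, which the post does not guarantee. The function names and the unit starting horizon are illustrative assumptions.

```python
import math

DOUBLING_MONTHS = 7.0  # from the post: task length doubles about every 7 months


def horizon_after(months: float, start_horizon: float = 1.0) -> float:
    """Task length after `months`, relative to a hypothetical starting length."""
    return start_horizon * 2 ** (months / DOUBLING_MONTHS)


def months_until(factor: float) -> float:
    """Months needed for task length to grow by `factor` at a constant doubling rate."""
    return DOUBLING_MONTHS * math.log2(factor)


print(f"growth per year: {horizon_after(12):.2f}x")
print(f"months to reach 1000x: {months_until(1000):.1f}")
```

Under these assumptions, task length grows a bit more than threefold per year, and a thousandfold increase takes a little under six years.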
Links for 2025-03-19

AI

1. The first automated theorem-proving framework for (hyperbolic) PDE solvers: now you can build *formally verified* physics simulations, with provable mathematical and physical correctness properties. https://arxiv.org/abs/2503.13877

2. Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations https://www.lesswrong.com/posts/E3daBewppAiECN3Ao/claude-sonnet-3-7-often-knows-when-it-s-in-alignment

3. Anthropic: “some reflections from the past year of red teaming models in these domains” https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team

4. “I wouldn't be surprised if, in three to five years, language models are capable of performing most (all?) cognitive economically-useful tasks beyond the level of human experts.” https://nicholas.carlini.com/writing/2025/thoughts-on-future-ai.html

5. R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization https://arxiv.org/abs/2503.12937

6. DAPO: An Open-Source LLM Reinforcement Learning System at Scale https://arxiv.org/abs/2503.14476

7. Introducing the First End-to-End Platform for Reinforcement Fine-Tuning https://predibase.com/blog/introducing-reinforcement-fine-tuning-on-predibase

8. Cancermorphic Computing Toward Multilevel Machine Intelligence https://arxiv.org/abs/2503.12743

9. Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models https://arxiv.org/abs/2503.13551

10. DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding https://arxiv.org/abs/2503.12797

11. LG AI Research unveils EXAONE Deep, a reasoning model that it says can compete with industry-leading models. https://www.lgresearch.ai/blog/view?seq=543

12. An Open Foundation Model for Humanoid Robots https://research.nvidia.com/publication/2025-03_nvidia-isaac-gr00t-n1-open-foundation-model-humanoid-robots

13. MARLadona: Towards Cooperative Team Play Using Multi-Agent Reinforcement Learning https://www.youtube.com/watch?v=klETyDnWO2w

14. “AI has a profound ability to model more complex - and mysterious - systems, from the human body and global weather to Earth in its entirety.” https://www.ted.com/talks/raia_hadsell_the_ai_breakthroughs_we_ve_overlooked_and_how_they_re_transforming_science

15. Dreaming of daily life with superintelligent AI https://www.ted.com/talks/stephanie_zhan_dreaming_of_daily_life_with_superintelligent_ai

16. Synthetic Data Paves the Way for Self-Driving Cars https://spectrum.ieee.org/synthetic-data-self-driving

Neuroscience

1. A Neuralink Rival Says Its Eye Implant Restored Vision in Blind People https://www.wired.com/story/science-corporation-neuralink-eye-implant-restored-vision-blind-people/ [no paywall: https://archive.is/bkuLo]

2. To the brain, Esperanto and Klingon appear the same as English or Mandarin https://news.mit.edu/2025/esperanto-klingon-appear-same-english-mandarin-in-brain-0318

Tech & Science

1. Ripping the fabric of space. Engineered bioweapons. Uncontrolled AGI. Extinction of the human species: What could cause it and how likely is it to occur? https://www.cambridge.org/core/journals/cambridge-prisms-extinction/article/extinction-of-the-human-species-what-could-cause-it-and-how-likely-is-it-to-occur/D8816A79BEF5A4C30A3E44FD8D768622

2. A video on the feasibility of bug-resistant software, referencing DARPA's HACMS project, that introduces "flexHEG," a hardware security framework intended to ensure confidentiality, integrity, and availability without enabling covert surveillance, and highlights its adaptability and robust defense against various attacks. https://www.youtube.com/watch?v=4wgImjg9PPc

3. “The world will soon use human germline genomic engineering technology. The benefits will be enormous: Our children will be long-lived, will have strong and diverse capacities, and will be halfway to the end of all illness.” https://www.lesswrong.com/posts/rxcGvPrQsqoCHndwG/the-principle-of-genomic-liberty
When do experts expect AGI to arrive?

In four years, the mean estimate for when AGI will be developed has dropped from 50 years to five years.

Read more: https://80000hours.org/2025/03/when-do-experts-expect-agi-to-arrive/

See also: Why AGI could be here by 2028 https://80000hours.org/agi/guide/why-agi-could-be-here-by-2028/
John Conway: "It's impossible to tie a knot without letting go of the ends of the string."
New paper by Robert Plomin, the leading figure in behavioral genetics.

The effect of family environment on cognitive ability disappears in adulthood.

The exception is educational attainment (years of schooling), because of decisions made in adolescence to go to university.

...genetics is the systematic, stable force responsible for individual differences in cognitive abilities in adulthood after the impact of earlier shared environment has disappeared.


Read more: https://osf.io/preprints/psyarxiv/qndj6_v1
Singing a Jewish folk song in a Central Asian Muslim country in a Middle Eastern style 🤯

13-year-old Sofiya Fadeyeva on The Voice Uzbekistan.
Links for 2025-03-22

AI

1. Neo-1: the world’s most advanced atomistic foundation model, unifying structure prediction and all-atom de novo generation for the first time - to decode and design the structure of life https://www.vant.ai/neo-1

2. RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness https://arxiv.org/abs/2405.17220

3. The KoLMogorov Test: can CodeLMs compress data by code generation? https://arxiv.org/abs/2503.13992v1

4. Knowledge and reasoning exhibit different scaling behaviors https://arxiv.org/abs/2503.10061

5. "Human-Inspired Agent Design in Web Automation: From Principles to Practice," shows how our multi-agent approach surpasses OpenAI Operator and Claude Computer Use by a wide margin and hits a state-of-the-art 88% on Mind2Web. https://getinvisible.com/articles/human-inspired-agent-design-in-web-automation

6. The "think" tool: Enabling Claude to stop and think in complex tool use situations https://www.anthropic.com/engineering/claude-think-tool

7. TokenSet: A fundamentally new paradigm for image generation https://github.com/Gengzigang/TokenSet

8. What Makes a Reward Model a Good Teacher? An Optimization Perspective https://arxiv.org/abs/2503.15477

9. Google plans to release new ‘open’ AI models for drug discovery https://techcrunch.com/2025/03/18/google-plans-to-release-new-open-ai-models-for-drug-discovery/

10. Fully AI driven weather prediction system could start revolution in forecasting https://www.cam.ac.uk/research/news/fully-ai-driven-weather-prediction-system-could-start-revolution-in-forecasting

11. VLMs as GeoGuessr Masters: Exceptional Performance, Hidden Biases, and Privacy Risks https://arxiv.org/abs/2502.11163

12. RF-DETR, the current SOTA for real-time object detection, fully open source and Apache 2.0 for the community. https://blog.roboflow.com/rf-detr/

13. Three Types of Intelligence Explosion https://www.forethought.org/research/three-types-of-intelligence-explosion

14. AI takeoff will likely be diffuse and salient: AI-driven automation will occur widely and transform a large share of the economy; impacts will be highly visible to most people and highly disruptive https://epoch.ai/gradient-updates/most-ai-value-will-come-from-broad-automation-not-from-r-d

15. “This is a tiny niche story but it may actually be the most catastrophic thing to happen globally yesterday.” https://x.com/robertwiblin/status/1900498802403901549

Science and Technology

1. Deciphering language processing in the human brain through LLM representations https://research.google/blog/deciphering-language-processing-in-the-human-brain-through-llm-representations/

2. Is Dark Energy Getting Weaker? New Evidence Strengthens the Case. https://www.quantamagazine.org/is-dark-energy-getting-weaker-new-evidence-strengthens-the-case-20250319/

3. A Mysterious Startup Is Developing a New Form of Solar Geoengineering https://www.wired.com/story/a-mysterious-startup-is-developing-a-new-form-of-solar-geoengineering/ [no paywall: https://archive.is/7Iia7]

4. World’s tiniest LED display has 90nm wide pixels -- smaller than a virus! It enables a record-high pixel density of 127,000 pixels per inch (PPI) and can empower future near-eye displays in AR/VR. https://www.nature.com/articles/s41586-025-08685-w

5. “…as the costs of solar continue to fall, this drawback becomes less binding, and it becomes economically justifiable to “overbuild” solar infrastructure and use it to serve increasingly large fractions of our energy demand.” https://www.construction-physics.com/p/understanding-solar-energy

Miscellaneous

1. Towards a scale-free theory of intelligent agency https://www.lesswrong.com/posts/5tYTKX4pNpiG4vzYg/towards-a-scale-free-theory-of-intelligent-agency

2. Elite Coordination via the Consensus of Power https://www.lesswrong.com/posts/zqffB6gokoivwwn7X/elite-coordination-via-the-consensus-of-power

3. NASA's Coding Requirements Are Insane https://www.youtube.com/watch?v=JWKadu0ks20