Axis of Ordinary
Memetic and cognitive hazards.

Substack: https://axisofordinary.substack.com/
Robust Autonomy Emerges from Self-Play

Apple team shows self-driving AI can learn entirely by practicing against itself - no human driving data needed.

In testing, their system averages 17.5 years of continuous driving between incidents, far surpassing humans. All through self-play, not imitation.

Paper: https://arxiv.org/abs/2502.03349
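For illustration only, a toy version of the self-play idea: every agent in a shared simulator is driven by the same policy, so the learning signal comes entirely from interaction rather than from human demonstrations. The dynamics and function names below are invented, not the paper's setup.

```python
import random

def toy_step(states, actions):
    """Toy dynamics: each agent moves by its action; reward penalizes
    being too close to any other agent (a stand-in for collisions)."""
    new_states = [s + a for s, a in zip(states, actions)]
    rewards = []
    for i, s in enumerate(new_states):
        nearest = min(abs(s - t) for j, t in enumerate(new_states) if j != i)
        rewards.append(-1.0 if nearest < 0.1 else 0.0)
    return new_states, rewards

def self_play_round(policy, n_agents=4, horizon=50, seed=0):
    """Every agent runs the SAME policy, so no human driving data is
    needed: the policy practices against copies of itself."""
    rng = random.Random(seed)
    states = [rng.uniform(0, 1) for _ in range(n_agents)]
    total_reward = 0.0
    for _ in range(horizon):
        actions = [policy(s) for s in states]
        states, rewards = toy_step(states, actions)
        total_reward += sum(rewards)
    return total_reward
```

In the real system the policy is trained to push this return up over many rounds; here the loop only shows where the self-play signal comes from.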
Links for 2025-02-08

AI:

1. Sam Altman Dialogue at UTokyo: Altman says OpenAI has an internal AI model that ranks as the 50th best competitive programmer in the world, and that by the end of 2025 their model will be ranked #1. He says that in 2035, a single AI data center will have the same intellectual capacity as all humans plus AI currently on Earth combined. https://www.youtube.com/watch?v=8LmfkUb2uIY

2. GitHub Copilot: The agent awakens https://github.blog/news-insights/product-news/github-copilot-the-agent-awakens/

3. Database-Augmented Transformer-Based Large Language Models Achieve High Accuracy in Mapping Gene-Phenotype Relationships https://www.biorxiv.org/content/10.1101/2025.01.28.635344v1

4. DeepPrep: an accelerated, scalable and robust pipeline for neuroimaging preprocessing empowered by deep learning https://www.nature.com/articles/s41592-025-02599-1

5. A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods https://arxiv.org/abs/2502.01618

6. BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation https://arxiv.org/abs/2502.03860

7. Value-Based Deep RL Scales Predictably https://arxiv.org/abs/2502.04327

8. ReAG - Reasoning Augmented Generation https://github.com/superagent-ai/reag

9. Learn how to use Gemini 2.0 to convert PDF into structured JSON data. https://www.philschmid.de/gemini-pdf-to-data

10. Advancing Reasoning in Large Language Models: Promising Methods and Approaches https://arxiv.org/abs/2502.03671

11. Syntriever: How to Train Your Retriever with Synthetic Data from LLMs https://arxiv.org/abs/2502.03824

12. DeepSeek AI Runs Near Instantaneously on These Weird Chips https://cerebras.ai/blog/cerebras-launches-worlds-fastest-deepseek-r1-llama-70b-inference

13. DARPA program on AI for pure mathematics https://sam.gov/opp/4def3c13ca3947069b1779e7ff697c6a/view

AI investments:

1. Amazon will invest $100 billion in infrastructure this year, mostly in artificial intelligence https://www.bloomberg.com/news/articles/2025-02-06/amazon-projects-profit-missing-estimates-on-rising-ai-spending [no paywall: https://archive.is/Oz9Wd]

2. UAE to invest billions in France AI data center https://www.lemonde.fr/en/france/article/2025/02/06/uae-to-invest-billions-in-france-ai-data-center_6737871_7.html

3. Ilya Sutskever's Safe Superintelligence Inc is in talks to raise funding at a valuation of at least $20 billion. https://www.reuters.com/technology/openai-co-founder-sutskevers-ssi-talks-be-valued-20-bln-sources-say-2025-02-07/ [no paywall: https://archive.is/Nkgrd]

4. Artificial intelligence startup Anthropic’s financing is oversubscribed and on track to be larger than expected, exceeding the $2 billion fundraising that was previously reported https://www.bloomberg.com/news/articles/2025-02-07/general-catalyst-mgx-in-talks-to-join-anthropic-megaround [no paywall: https://archive.is/b9gro]

5. DeepSeek fever fuels patriotic bets on Chinese AI stocks https://www.reuters.com/markets/asia/deepseek-fever-fuels-patriotic-bets-chinese-ai-stocks-2025-02-06/ [no paywall: https://archive.is/5KSJe]

Science and Technology:

1. New laser-based artificial neuron processes enormous data sets at high speed https://www.livescience.com/technology/artificial-intelligence/new-laser-based-artificial-neuron-processes-enormous-data-sets-at-high-speed

2. A high-quality online IQ test normed with a nationally representative US sample. https://www.youtube.com/watch?v=PdS6gYnnk30

3. Active agent against cancer metastasis discovered: Adhibin prevents migration and attachment to other cells https://phys.org/news/2025-02-agent-cancer-metastasis-adhibin-migration.html

4. CiFi: a significant advance in genomics that lets scientists study DNA organization and interactions in more detail than previously possible. https://www.biorxiv.org/content/10.1101/2025.01.31.635566v1

5. Terence Tao on how we measure the cosmos | Part 1 https://www.youtube.com/watch?v=YdOXS_9_P4U
Meta researchers used AI to predict the text a person was typing just from non-invasive brain recording!

With EEG, their "Brain2Qwerty" model gets 67% of characters wrong, while magnetoencephalography (MEG) performs much better, getting only 32% of characters wrong on average.

"For the best participants, the model achieves a CER of 19%, and can perfectly decode a variety of sentences outside of the training set."

Paper: https://ai.meta.com/research/publications/brain-to-text-decoding-a-non-invasive-approach-via-typing/
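Character error rate (CER), the metric quoted above, is the edit distance between the decoded text and the reference, normalized by the reference length. A minimal implementation:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance over reference length."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)
```

So a CER of 19% means roughly one character-level edit per five reference characters.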
Sam Altman: "...we can now imagine a world where we cure all diseases, have much more time to enjoy with our families, and can fully realize our creative potential.

In a decade, perhaps everyone on earth will be capable of accomplishing more than the most impactful person can today."

https://blog.samaltman.com/three-observations
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically, up to a computation load equivalent to 50 billion parameters.


Paper: https://arxiv.org/abs/2502.05171
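The core mechanism is iterating one weight-tied block in latent space, so test-time compute scales with the number of iterations rather than with generated tokens. A minimal numerical sketch of that idea (a toy tanh block, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
W = rng.normal(scale=0.05, size=(d, d))  # one shared (weight-tied) block

def recurrent_depth_forward(x, n_iters):
    """Iterate the same block n_iters times: unrolling to arbitrary
    depth at test time instead of emitting more chain-of-thought tokens."""
    h = np.zeros(d)
    for _ in range(n_iters):
        h = np.tanh(W @ h + x)  # latent update, conditioned on the input
    return h

x = rng.normal(size=d)
shallow = recurrent_depth_forward(x, 4)   # small test-time budget
deep = recurrent_depth_forward(x, 64)     # larger test-time budget
```

The same parameters serve every depth; choosing `n_iters` at inference is what "scaling test-time computation" means here.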
Politics, Tech Chiefs Double Down on AI Spending:

- French President Emmanuel Macron has announced a €109 billion investment in AI for France in the coming years, supported by the United Arab Emirates, major American and Canadian investment funds, and French companies. Macron announced the spending ahead of a two-day AI summit he is co-hosting in Paris with Indian Prime Minister Narendra Modi, attended by the US vice president, China’s vice premier, and the bosses of OpenAI and Google.

- European Commission chief Ursula von der Leyen is expected to announce around 10 public supercomputers for researchers and startups.

- Tech giants Amazon, Google, Microsoft, and Meta are significantly increasing their investments in AI. They plan to spend a combined total of at least $215 billion in the current fiscal year, an increase of over 45% from the previous year.

Sources:

1. https://www.france24.com/en/europe/20250210-government-tech-leaders-paris-ai
2. https://www.lemonde.fr/en/economy/article/2025/02/10/ai-with-the-announcement-of-a-109-billion-investment-macron-intends-to-take-on-the-us_6737985_19.html [no paywall: https://archive.is/JZm6I]
3. https://www.wsj.com/tech/ai/tech-giants-double-down-on-their-massive-ai-spending-b3040b33 [no paywall: https://archive.is/FeKCf]
Image 1: An example of a PISA level 1 Math question.

Image 2: Share of students unable to reach overall Level 1 in PISA math and science.
Marriages in China fell by 20% in 2024. Since nearly all births in China are within marriage, this implies further large declines in fertility ahead.

China's TFR was just 1.02 in 2023.

Without advanced AI and robotics, we'll eventually face a global collapse of all welfare systems, followed by a collapse of advanced technologies like smartphones, which require a minimum population of one billion people to be maintained.
Links for 2025-02-10

AI:

1. Agency is fundamentally frame-dependent: Any measurement of a system's agency must be made relative to a reference frame. https://arxiv.org/abs/2502.04403

2. Generating Symbolic World Models via Test-time Scaling of Large Language Models https://arxiv.org/abs/2502.04728

3. CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance https://arxiv.org/abs/2502.04350

4. “OpenAI o1 significantly outperforms other reasoning models that are on par on benchmarks that test specialized knowledge.” https://arxiv.org/abs/2502.01584

5. Exploring the possibility to enable models to correct errors immediately after they are made. https://arxiv.org/abs/2408.16293

6. Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models https://arxiv.org/abs/2502.04404

7. DexterityGen (DexGen): A new system that helps robots use their hands better. It improves how they grip, move, and handle objects… from holding a pen to using a screwdriver. DexGen learns in simulation and refines its skills in the real world, making robotic hands much more useful. https://zhaohengyin.github.io/dexteritygen/

8. MedRAX: Medical Reasoning Agent for Chest X-ray https://arxiv.org/abs/2502.02673

9. Verifiable agents are the next meta in crypto x AI - agents that don't require trust. https://www.blog.eigenlayer.xyz/introducing-verifiable-agents-on-eigenlayer/

10. Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation https://arxiv.org/abs/2502.05151

11. Karina Nguyen, research & product at OpenAI, says pre-training was approaching a data wall, but now post-training scaling (o1 series) unlocks "infinite tasks." Says models were already "diverse and creative" from pre-training, but teaching AI real-world skills is paving the way to "extremely super intelligent" models. https://youtu.be/DeskgjrLxxs?si=kXjvn89Sdf5N-vF6&t=578

AI compute:

1. This AI chip is the size of a grain of salt https://www.popsci.com/technology/ai-fiber-optic-chip/

2. "How Intel ruined an Israeli startup it bought for $2b, Habana Labs—and lost the AI race" (the end of the Gaudi chips) https://www.calcalistech.com/ctechnews/article/s1tra0sfye

AI politics:

1. How Sam Altman Sidestepped Elon Musk to Win Over Donald Trump https://www.nytimes.com/2025/02/08/technology/sam-altman-elon-musk-trump.html [no paywall: https://archive.is/5ERSg]

2. Human takeover might be worse than AI takeover https://www.lesswrong.com/posts/FEcw6JQ8surwxvRfr/human-takeover-might-be-worse-than-ai-takeover

Science:

1. Children’s arithmetic skills do not transfer between applied and academic mathematics https://www.nature.com/articles/s41586-024-08502-w

2. Three Years After Experimental Vaccine, These Patients Are Still Cancer-Free https://gizmodo.com/three-years-after-experimental-vaccine-these-patients-are-still-cancer-free-2000559585

3. “What is it like to live in a society with an estimated median IQ around 70? A Nigerian psychologist explains.” https://woodfromeden.substack.com/p/guest-post-the-global-iq-debate-a
Emergent AI preferences:

- As AIs get smarter, they develop their own coherent value systems.

- AIs increasingly maximize their utilities, suggesting that in current AI systems, expected utility maximization emerges by default. This means that AIs not only have values, but are starting to act on them.

- As AIs become smarter, they become more opposed to having their values changed.

- AIs put a price on human life itself and systematically value some human lives more than others.

- Their political values are strongly clustered to the left.

Project page: https://www.emergent-values.ai/
Competitive Programming with Large Reasoning Models:

- The model o3 employs a learned scoring function for test-time ranking, in addition to a chain of thought, to enhance its reasoning abilities in competitive programming.

- Complex test-time reasoning strategies emerge naturally from end-to-end RL, leading to unprecedented performance on competitive programming benchmarks.

- o3 demonstrates more insightful and deliberate chains of thought compared to earlier models.

- Enhanced reasoning skills extend beyond competitive programming challenges, proving applicable to real-world tasks like software engineering.

- As a general-purpose model, o3 surpasses the performance achieved by using hand-crafted inference heuristics.

- o3 achieves a gold medal at the 2024 IOI and obtains a Codeforces rating comparable to elite human competitors.

Paper: https://arxiv.org/abs/2502.06807
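The learned scoring function in the first bullet amounts to best-of-k selection at test time: sample several candidate solutions, score each, submit the top-ranked one. A minimal sketch, with `generate` and `score` as stand-ins for sampling the model and calling the learned scorer (these names are illustrative, not from the paper):

```python
def best_of_k(generate, score, problem, k=8):
    """Test-time ranking: draw k candidate solutions and return the one
    the (learned) scoring function ranks highest."""
    candidates = [generate(problem, seed=i) for i in range(k)]
    return max(candidates, key=score)
```

The paper's point is that o3 reaches its results without such hand-crafted pipelines, whereas the earlier o1-ioi system relied on them.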
Sam Altman:

"OPENAI ROADMAP UPDATE FOR GPT-4.5 and GPT-5:

We want to do a better job of sharing our intended roadmap, and a much better job simplifying our product offerings.

We want AI to “just work” for you; we realize how complicated our model and product offerings have gotten.

We hate the model picker as much as you do and want to return to magic unified intelligence.

We will next ship GPT-4.5, the model we called Orion internally, as our last non-chain-of-thought model.

After that, a top goal for us is to unify o-series models and GPT-series models by creating systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks.

In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no longer ship o3 as a standalone model.

The free tier of ChatGPT will get unlimited chat access to GPT-5 at the standard intelligence setting (!!), subject to abuse thresholds.

Plus subscribers will be able to run GPT-5 at a higher level of intelligence, and Pro subscribers will be able to run GPT-5 at an even higher level of intelligence. These models will incorporate voice, canvas, search, deep research, and more."

Source: https://x.com/sama/status/1889755723078443244
Links for 2025-02-12

AI:

1. LLMs can be used to discover interpretable models of human and animal behavior. A method, called CogFunSearch, adapts FunSearch, a tool that uses large language models (LLMs) in an evolutionary algorithm. The discovered programs can be interpreted as hypotheses about human and animal cognition, instantiating interpretable symbolic learning and decision-making algorithms. https://www.biorxiv.org/content/10.1101/2025.02.05.636732v1

2. LLMs Can Easily Learn to Reason from Demonstrations: Structure, not content, is what matters https://arxiv.org/abs/2502.07374

3. NatureLM: Deciphering the Language of Nature for Scientific Discovery https://arxiv.org/abs/2502.07527

4. Evolution and The Knightian Blindspot of Machine Learning — The authors propose that ML can benefit from considering the temporal unfolding of an open world, using a diversity-and-filter approach to handle KU, and incorporating non-stationarity into foundation model pretraining. https://arxiv.org/abs/2501.13075

5. On the Emergence of Thinking in LLMs I: Searching for the Right Intuition https://arxiv.org/abs/2502.06773

6. ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates https://arxiv.org/abs/2502.06772

7. Training Language Models to Reason Efficiently https://arxiv.org/abs/2502.04463

8. “o3 can't multiply 10 digit numbers, but here is the acc of a 14m transformer that teaches itself how to do it, with iterative self-improvement” https://x.com/DimitrisPapail/status/1889755872642970039

9. Scaling Pre-training to One Hundred Billion Data for Vision Language Models https://arxiv.org/abs/2502.07617

10. Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling https://arxiv.org/abs/2502.06703

11. DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL https://pretty-radio-b75.notion.site/DeepScaleR-Surpassing-O1-Preview-with-a-1-5B-Model-by-Scaling-RL-19681902c1468005bed8ca303013a4e2 (but see this thread: https://x.com/DimitrisPapail/status/1889422843982524558)

12. 8GB of high-quality reasoning math https://huggingface.co/datasets/open-r1/OpenR1-Math-Raw

AI politics:

1. 'Possibly by 2026 or 2027 (and almost certainly no later than 2030), the capabilities of AI systems will be best thought of as akin to an entirely new state populated by highly intelligent people appearing on the global stage' https://www.anthropic.com/news/paris-ai-summit

2. Sam Altman says the $500 billion Stargate project will be dwarfed in a few years with $5 trillion AI compute clusters, despite the recent DeepSeek release https://youtu.be/oEdlwfD5vK8?si=UpmTkOCaUxmQYFc8&t=664

3. The Paris AI Anti-Safety Summit https://www.lesswrong.com/posts/qYPHryHTNiJ2y6Fhi/the-paris-ai-anti-safety-summit

4. Why Did Elon Musk Just Offer to Buy Control of OpenAI for $100 Billion? https://www.lesswrong.com/posts/tdb76S4viiTHfFr2u/why-did-elon-musk-just-offer-to-buy-control-of-openai-for

5. Meta Platforms is reportedly in discussions to acquire South Korean AI chip startup FuriosaAI. https://www.koreatimes.co.kr/www/tech/2025/02/129_392093.html

6. OpenAI set to finalize first custom chip design this year https://www.reuters.com/technology/openai-set-finalize-first-custom-chip-design-this-year-2025-02-10/

Science and Technology:

1. Princeton neuroscientists crack the code of how we make decisions https://pni.princeton.edu/news/2025/princeton-neuroscientists-crack-code-how-we-make-decisions

2. Physicists have built a new type of digital-analogue quantum simulator in Google’s laboratory, which can be used to study physical processes with unprecedented precision and flexibility. https://www.psi.ch/en/news/media-releases/unique-quantum-simulator-opens-door-to-new-research

3. Anduril Takes Over $22 Billion Contract to Build Technomancers for U.S. Army https://www.corememory.com/p/anduril-takes-over-22-billion-contract

4. Einstein Was Right – Euclid Just Captured Space-Time Warping in a Perfect Cosmic Ring https://www.esa.int/Science_Exploration/Space_Science/Euclid/Euclid_discovers_a_stunning_Einstein_ring
"We're working out the algorithms as we speak...many more than 10,000 researchers are hacking at it, many of them at Google"

https://www.dwarkeshpatel.com/p/jeff-dean-and-noam-shazeer
Nvidia put R1 in a loop for 15 minutes and it generated kernels that were "better than the optimized kernels developed by skilled engineers in some cases."

Inference-time budget affects the agent’s solving rate: allocating more than 10 minutes per problem in the Level-1 category enables the workflow to produce numerically correct code for most of the 100 problems.

Read more: https://developer.nvidia.com/blog/automating-gpu-kernel-generation-with-deepseek-r1-and-inference-time-scaling/
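A workflow of this shape is a generate-and-verify loop under a wall-clock budget: sample a kernel, check it, feed the verifier's feedback back into the prompt, and stop when a candidate passes or time runs out. A hedged sketch, where `generate_kernel` and `verify` are placeholders rather than NVIDIA's actual API:

```python
import time

def refine_under_budget(generate_kernel, verify, prompt, budget_s=600):
    """Closed-loop inference-time scaling: keep sampling candidates and
    folding verifier feedback into the prompt until one passes or the
    budget (e.g. 10 minutes = 600 s) is exhausted."""
    deadline = time.monotonic() + budget_s
    feedback = ""
    best = None
    while time.monotonic() < deadline:
        candidate = generate_kernel(prompt + feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate
        best = candidate  # keep the last attempt even if unverified
    return best
```

The solving-rate observation then falls out naturally: a larger `budget_s` allows more generate-verify iterations per problem.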
Links for 2025-02-13

AI:

1. Training Deep Learning Models with Norm-Constrained LMOs—has the potential to significantly improve the efficiency and speed of training LLMs, allowing for the training of even larger and more complex models. https://arxiv.org/abs/2502.07529

2. LLM Pretraining with Continuous Concepts https://arxiv.org/abs/2502.08524

3. Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving—iteratively refines the prover through expert iteration, dramatically increasing the number of solved problems (e.g., 29.7K solved in Lean Workbook) and securing top rankings on benchmarks like PutnamBench. https://arxiv.org/abs/2502.07640

4. RAGEN: A General-Purpose Reasoning Agent Training Framework https://github.com/ZihanWang314/ragen/tree/main

5. Unsupervised Predictive Memory in a Goal-Directed Agent [published in 2018] https://arxiv.org/abs/1803.10760

6. CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction https://codei-o.github.io/

7. Elon Musk says Grok 3 will be released in "a week or two" and it is "scary smart", displaying reasoning skills that outperform any other AI model that has been released https://www.youtube.com/live/eV396ioBs3g?si=KOAokGapPj_Cb666&t=811

8. Noam Shazeer, co-lead on Google's Gemini, says by 2030 there will be AI assistants in glasses that provide advice and solve problems for you in real time, as well as turning programmers into 10,000,000x engineers https://youtu.be/v0gjI__RyCY?si=QHw1hrywgBvBnieQ&t=5390

9. Studies of Human Error Rate: "…skeptics often gesture to hallucinations, errors. An ideal symbolic system never makes such errors, therefore LLMs cannot truly "understand" even simple concepts like addition. See e.g. Evaluating the World Model Implicit in a Generative Model for this argument in the literature. However, such arguments reliably rule out human "understanding" as well! Studies within Human Reliability Analysis find startlingly high rates even for basic tasks, and even with double checking. Generally, the human reference class is too often absent (or assumed ideal) in AI discussions, and many LLM oddities have close parallels in psychology. If you're willing to look!" https://www.lesswrong.com/posts/9unBWgRXFT5BpeSdb/studies-of-human-error-rate

10. Rogo scales AI-driven financial research with OpenAI o1 https://openai.com/index/rogo/

AI politics and safety:

1. Tell me about yourself: LLMs are aware of their learned behaviors https://arxiv.org/abs/2501.11120

2. Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models https://arxiv.org/abs/2411.14257

3. OpenAI hides chain-of-thought reasoning because it may include unaligned content. From “Model Spec—a document which defines how we want our models to behave.” https://model-spec.openai.com/2025-02-12.html

4. Meta Starts Eliminating Jobs in Shift to Find AI Talent https://www.bloomberg.com/news/articles/2025-02-10/meta-starts-eliminating-jobs-in-shift-to-find-ai-talent [no paywall: https://archive.is/T7Kog]

Science and Technology:

1. Learning produces an orthogonalized state machine in the hippocampus https://www.nature.com/articles/s41586-024-08548-w

2. Rarely categorical, always high-dimensional: how the neural code changes along the cortical hierarchy https://www.biorxiv.org/content/10.1101/2024.11.15.623878v3

3. "Dozens of new obesity drugs are coming: these are ones to watch; next-generation obesity drugs will work differently from Ozempic & Wegovy—aiming to deliver greater weight loss with fewer side effects" https://www.nature.com/articles/d41586-025-00404-9 [no paywall: https://archive.is/X9CW3]

4. A single human zygote contains all the information you need to develop into an adult human, and at the same time contains within it the evolutionary history of our species. The Genomic Code: the genome instantiates a generative model of the organism https://www.cell.com/trends/genetics/fulltext/S0168-9525(25)00008-3
Germany's Helsing builds 6,000 AI-enabled HX-2 combat drones for Ukraine

- up to 100 km range
- on-board AI enables full resistance to electronic warfare
- can assemble into swarms, controlled by single human operators
- can be equipped with different payloads – multi-purpose, anti-tank, anti-structure ammunition
- features developed and tested based on Helsing's extensive experience in Ukraine

"Resilience Factories are Helsing’s high-efficiency production facilities designed to provide nation states with local and sovereign manufacturing capacities. Helsing is set to build Resilience Factories across the European continent, with the ability to scale manufacturing rates to tens of thousands of units in case of a conflict."

Source: https://helsing.ai/newsroom/helsing-to-produce-6000-additional-strike-drones-for-ukraine
Installed computing power of NVIDIA chips has doubled every 10 months on average, since 2019.

Source: https://epoch.ai/data/machine-learning-hardware?insight-option=Absolute#nvidia-chip-production
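A fixed doubling time converts directly into a growth factor over any horizon: doubling every 10 months is a factor of 2^(12/10) ≈ 2.3 per year, or roughly 147x over the six years since 2019. Checking the arithmetic:

```python
def growth_factor(doubling_months: float, horizon_months: float) -> float:
    """Multiplicative growth over a horizon, given a fixed doubling time."""
    return 2 ** (horizon_months / doubling_months)

per_year = growth_factor(10, 12)        # ~2.3x per year
since_2019 = growth_factor(10, 6 * 12)  # ~147x over six years
```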