Links for 2026-02-16
AI
1. How can agents learn in long, open-ended tasks where success is rare and rewards are sparse? Enter ∆Belief-RL: a framework that uses the agent’s own internal belief changes as an intrinsic reward signal to provide dense, turn-level credit assignment. https://bethgelab.github.io/delta-belief-rl/
2. OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips https://arstechnica.com/ai/2026/02/openai-sidesteps-nvidia-with-unusually-fast-coding-model-on-plate-sized-chips/
3. Dario Amodei: We are “near the end of the exponential,” i.e., approaching a phase where systems become good enough to substitute for very high-end human cognitive labor in many settings. https://www.dwarkesh.com/p/dario-amodei-2
4. Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning https://arxiv.org/abs/2602.11748
5. Maximum Likelihood Reinforcement Learning https://zanette-labs.github.io/MaxRL/
6. InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery https://huggingface.co/papers/2602.08990
7. Pentagon’s use of Claude during Maduro raid sparks Anthropic feud https://www.axios.com/2026/02/13/anthropic-claude-maduro-raid-pentagon [no paywall: https://archive.is/EhDOQ]
8. How to target investments to develop new AI models that can uncover natural laws https://ifp.org/nlm/
9. Insights from senior engineers on how AI is changing their jobs. [PDF] https://www.thoughtworks.com/content/dam/thoughtworks/documents/report/tw_future%20_of_software_development_retreat_%20key_takeaways.pdf
10. lf-lean: The frontier of verified software engineering https://theorem.dev/blog/lf-lean/
11. LLM-powered program synthesis to automatically model and discover differences between human and LLM strategic behavior https://arxiv.org/abs/2602.10324
12. Human-like metacognitive skills will reduce LLM slop and aid alignment and capabilities https://www.lesswrong.com/posts/m5d4sYgHbTxBnFeat/human-like-metacognitive-skills-will-reduce-llm-slop-and-aid
13. “In 3 weeks, Gauss completed Terry Tao’s & Alex Kontorovich’s Strong Prime Number Theorem project—over the prior 18+ months of partial progress by human experts.” https://www.youtube.com/watch?v=AqUpIO8MGQU
14. Terry Tao - Machine assistance and the future of research mathematics - IPAM at UCLA https://www.youtube.com/watch?v=zJvuaRVc8Bg
15. AI fully pentested a vulnerable lab in 20 minutes and got root https://github.com/vitorallo/ai-pentest-poc
16. Soft Contamination Means Benchmarks Test Shallow Generalization https://www.arxiv.org/abs/2602.12413
17. Kunlun: Establishing Scaling Laws for Massive-Scale Recommendation Systems through Unified Architecture Design https://arxiv.org/abs/2602.10016
18. Someone built a wearable AI narrator that describes your life in real-time like a movie https://www.lampysecurity.com/post/the-infinite-audio-book
19. Claude Code can compose original music using only math https://www.josh.ing/blog/claude-composer
20. “I believe the brain may have something more to teach us about AI—and that, in the process, AI may have quite a bit to teach us about the brain.” https://asteriskmag.com/issues/13/the-sweet-lesson-of-neuroscience
21. US AI-Related Investment Keeps Breaking Records, Now Exceeding $1T Per Year https://www.apricitas.io/p/americas-1t-ai-gamble
Miscellaneous
1. This Snail’s Eyes Grow Back: Could They Help Humans do the Same? https://www.ucdavis.edu/news/snails-eyes-grow-back-could-they-help-humans-do-same
2. Polaris is now the first privately developed fusion energy machine to demonstrate measurable D‑T fusion and reach over 150 million °C. https://www.helionenergy.com/articles/helion-achieves-new-fusion-energy-milestones/
3. The thesis that government debt >90% of GDP leads to slowdown in growth was a product of an Excel error https://theconversation.com/the-reinhart-rogoff-error-or-how-not-to-excel-at-economics-13646
4. America needs way more tungsten than it can get and China controls supply https://www.noleary.com/blog/posts/1
“In this proof, it comes up with kind of a miraculous identity on trees that I have no idea where it’s coming from, and I have asked experts who spend a lot more time thinking about combinatorics of trees than me, and they also don’t know where it’s coming from.”
In this video, Sebastien Bubeck presents a hard combinatorics problem to benchmark the reasoning capabilities of modern AI models.
While students were able to prove one part of the conjecture, a second, specific inequality on tree shapes remained unproven for 10 years despite efforts by experts.
Bubeck notes that asking GPT-5 the hard question directly fails. Instead, they built a "scaffolding" system where different agents proposed, executed, and verified mathematical ideas.
To help the model, they first asked it to solve the easier, already-known inequality. Once it successfully reproduced that proof, they provided that context and asked it to attack the harder, unsolved generalization.
After two days of sequential computation, the system produced a proof. It discovered a "miraculous identity" on trees that human experts had not seen before.
Watch the talk: https://youtu.be/pNAlMBIPOnk
Examples of AI systems discovering novel out-of-distribution artifacts (or showing “above human” ingenuity):
1. DeepMind (AlphaEvolve): Discovered new, verifiably better algorithms, including a new algorithm for multiplying 4×4 complex matrices using 48 scalar multiplications (beating the prior best) and reclaimed ~0.7% of Google-wide compute via scheduling improvements.
2. AlphaTensor (matrix multiplication): Discovered new matrix-multiplication algorithms that improve best-known multiplication counts for specific tensor/matrix sizes (e.g., 4×4 over Z2 in the paper).
3. Gemini “AI-assisted research”: A large multi-author preprint reports examples where Gemini-based models helped refute conjectures, find subtle proof bugs (including in cryptography), transfer tools across fields, improve algorithms/bounds (e.g., faster methods for distance/streaming problems), and make partial progress on major conjectures.
4. AlphaDev (sorting): Discovered faster tiny sorting routines at assembly level; the fixed sort routines for 3/4/5 elements were integrated into the standard sort in the LLVM C++ library (libc++).
5. AlphaChip: RL-based macro placement that produced “superhuman” layouts used in multiple generations of TPU and other chips.
6. AlphaFold: AI-driven protein structure prediction at massive scale; recognized by the 2024 Nobel Prize in Chemistry (awarded to David Baker and to Demis Hassabis + John Jumper for protein structure prediction).
7. GNoME (materials discovery): Predicted ~2.2M candidate crystal structures (Nature paper); DeepMind also reports that hundreds were later realized experimentally (736 cited in their write-up).
8. GPT-5.2, theoretical physics: Proposed a new closed-form formula for a gluon scattering amplitude. The key formula was “first conjectured by GPT-5.2 Pro and then proved by a new internal OpenAI model,” and checked by the authors (Berends–Giele recursion + multiple nontrivial consistency identities).
9. Sébastien Bubeck / OpenAI (GPT-5 scaffolding, combinatorics): In an IPAM talk, Bubeck describes a proof where the scaffolded system finds a “miraculous identity on trees” that he (and tree-combinatorics experts he asked) didn’t recognize. This identity powers a proof of a hard tree-shape inequality that resisted experts for ~10 years.
10. AlphaGeometry2 (Olympiad geometry): The authors report that geometry experts / IMO medalists consider many solutions to exhibit “superhuman creativity” (especially via non-obvious auxiliary constructions).
11. AI found real security vulnerabilities in the most hardened, well-audited codebases on the planet: Anthropic reports finding/validating 500+ high-severity vulnerabilities in widely used open-source code. AISLE reports discovering 12 previously unknown OpenSSL vulnerabilities fixed in a coordinated security release.
12. Erdős problems registry: A community-maintained wiki tracking concrete AI-assisted progress on open Erdős problems, including fully formalized Lean proofs and writeups (e.g., the reported autonomous resolution of Erdős Problem #728).
13. Sakana AI's ALE-Agent took 1st place in a live AtCoder optimization contest (AHC058), beating 804 human participants. Organizers say it found an unexpected method.
14. DeepMind + mathematicians (knot theory): ML-guided analysis led to a genuinely new theorem linking the knot signature to the natural slope (“Advancing mathematics by guiding human intuition with AI”).
15. DeepMind et al. (Kazhdan–Lusztig polynomials): A graph neural network trained on Bruhat graphs helped inspire a new formula for KL polynomials for symmetric groups (progress on the “combinatorial invariance” direction).
16. ESA/NASA Hubble archive (AnomalyMatch): AI scanned ~99.6 million Hubble cutouts in ~2–3 days and surfaced ~1,400 anomalous objects (800+ previously undocumented), published in Astronomy & Astrophysics.
17. Halicin (antibiotic discovery): a molecule-GNN screened huge chemical libraries and identified “Halicin” as a structurally novel antibiotic candidate, validated in vitro and in mouse models.
Can humanoids perform agile, autonomous, long-horizon parkour—based on what they see in the world?
Perceptive Humanoid Parkour (PHP): a framework that chains dynamic human skills using onboard depth perception for long-horizon traversal.
Project page: https://php-parkour.github.io/
Experiential Reinforcement Learning: a step toward AI that truly learns from experience.
ERL process:
1. First Attempt: The model makes an initial attempt at a task and receives environmental feedback.
2. Self-Reflection: If the first attempt fails, the model generates a verbal self-reflection to analyze what went wrong and how to improve.
3. Second Attempt: The model uses this reflection as guidance to produce a refined second attempt.
4. Internalization: Successful second attempts are "internalized" into the base policy using self-distillation. This allows the model to reproduce the improved behavior in the future without needing the extra reflection step at deployment.
5. Cross-Episode Memory: Successful reflections are stored in a persistent memory, providing stable corrective patterns that the model can reuse across different tasks.
It moves LLM training toward a system grounded in experience, where agents continually adapt and learn from their own interactions.
https://arxiv.org/abs/2602.13949
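The five steps above can be sketched as a single episode function. This is a minimal illustration under stated assumptions, not the paper's implementation: `erl_episode`, `ERLState`, and the toy `policy`, `reflect`, and `check` callables are hypothetical names; passing stored reflections into `reflect` is one guess at how the memory is reused; and the self-distillation update is reduced to collecting (task, answer) pairs for later fine-tuning.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical interfaces, chosen only for this sketch.
Policy = Callable[[str, str], str]                    # (task, guidance) -> answer
Check = Callable[[str, str], Tuple[bool, str]]        # (task, answer) -> (ok, feedback)
Reflect = Callable[[str, str, str, List[str]], str]   # (task, answer, feedback, memory) -> reflection


@dataclass
class ERLState:
    memory: List[str] = field(default_factory=list)                   # cross-episode reflections
    distill_set: List[Tuple[str, str]] = field(default_factory=list)  # pairs for self-distillation


def erl_episode(task: str, policy: Policy, reflect: Reflect,
                check: Check, state: ERLState) -> str:
    # 1. First attempt, followed by environment feedback.
    first = policy(task, "")
    ok, feedback = check(task, first)
    if ok:
        return first

    # 2. Verbal self-reflection on the failed attempt.
    reflection = reflect(task, first, feedback, state.memory)

    # 3. Second attempt, guided by the reflection.
    second = policy(task, reflection)
    ok, _ = check(task, second)

    if ok:
        # 4. Internalization: store the answer WITHOUT the reflection, so that
        #    fine-tuning on it teaches the policy to answer correctly unaided.
        state.distill_set.append((task, second))
        # 5. Cross-episode memory: keep the reflection that worked.
        state.memory.append(reflection)
    return second


# Toy stand-ins: the "model" only adds correctly when told to recompute.
def toy_check(task: str, answer: str) -> Tuple[bool, str]:
    expected = str(sum(int(x) for x in task.split("+")))
    return answer == expected, f"{answer} is not the right sum"


def toy_policy(task: str, guidance: str) -> str:
    if "recompute" in guidance:
        return str(sum(int(x) for x in task.split("+")))
    return "5"


def toy_reflect(task: str, answer: str, feedback: str, memory: List[str]) -> str:
    return "recompute the sum digit by digit"


state = ERLState()
result = erl_episode("2+2", toy_policy, toy_reflect, toy_check, state)
print(result, state.memory, state.distill_set)
```

In the toy run, the first answer fails, the reflection leads to a correct second answer, and that answer is queued for distillation without the reflection text, which is what would let the policy reproduce it with no reflection step at deployment.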
Lyria 3: Google's latest generative music model
It can turn photos and text into dynamic tracks - complete with vocals and lyrics.
🔵 High-fidelity audio with crystal-clear 48kHz stereo tracks.
🔵 Realistic vocals that sound natural and expressive.
🔵 Lyrical clarity without missing or jumbled words.
🔵 Diverse range supporting multiple genres and languages.
🔵 Tempo settings
🔵 Specific vocal styles
🔵 Precise lyrics
Read more: https://blog.google/innovation-and-ai/products/gemini-app/lyria-3/
Japan’s largest toilet maker is an “undervalued and overlooked” AI play, according to a UK-based activist investor.
Palliser Capital sent a letter to the board of Toto last week exhorting it to make more of its advanced ceramics segment, saying it holds a crucial position in the semiconductor supply chain. The segment generates 40 per cent of Toto’s operating profit.
Ubiquitous in Japan and now famous across the world, Toto is best known for its heated toilet seats and “Washlet” bidet features. But the manufacturer “has quietly evolved from a traditional domestic sanitary ware champion into a rising powerhouse in advanced ceramics for semiconductor manufacturing”, Palliser said.
Source: https://www.ft.com/content/4252e45f-75fb-4dfc-aebe-72de48b7fb8e
Gemini 3.1 Pro is here:
The model is a step forward in reasoning, designed for workflows where a simple answer isn’t enough.
On ARC-AGI-2 – which tests for novel logic patterns – it more than doubles 3 Pro’s score.
This means it can help you visualize complex topics, organize scattered data, and bring creative projects to life.
Links for 2026-02-20 [Part 1]
AI
1. BEACONS: a framework for creating neural network solvers for partial differential equations (PDEs) that are formally verified and capable of reliable extrapolation beyond their training data. BEACONS offers a path toward neural foundation models for physics that are as reliable and rigorous as classical numerical methods. https://arxiv.org/abs/2602.14853
2. Unified Latents (UL): How to train your latents https://arxiv.org/abs/2602.17270
3. Taalas Etches AI Models Onto Transistors To Rocket Boost Inference https://www.nextplatform.com/2026/02/19/taalas-etches-ai-models-onto-transistors-to-rocket-boost-inference/
4. A data-efficient route to thermodynamically consistent, transferable protein coarse-grained models. https://rotskoff-group.github.io/transferable-cg/
5. British scientist raising $1bn for new AI lab in Europe’s biggest seed round https://www.ft.com/content/dffe72d0-4064-4412-8ebc-50198a30d40e [no paywall: https://archive.is/HWPZC]
6. Mistral AI buys Koyeb in first acquisition to back its cloud ambitions https://techcrunch.com/2026/02/17/mistral-ai-buys-koyeb-in-first-acquisition-to-back-its-cloud-ambitions/
7. When Models Manipulate Manifolds: The Geometry of a Counting Task https://arxiv.org/abs/2601.04480
8. ZUNA, a 380M-parameter BCI foundation model for EEG data, a significant milestone in the development of noninvasive thought-to-text. Fully open source, Apache 2.0. https://www.zyphra.com/post/zuna
9. How Well Did Superforecasters and Experts Predict Wet Lab Skill Uplift from LLMs? https://forecastingresearch.substack.com/p/how-well-did-superforecasters-and
10. Did GPT 5.2 make a breakthrough discovery in theoretical physics? https://huggingface.co/blog/dlouapre/gpt-single-minus-gluons
11. A compiler expert reviews the Claude C compiler. https://www.modular.com/blog/the-claude-c-compiler-what-it-reveals-about-the-future-of-software
12. Memorization vs. generalization in deep learning: implicit biases, benign overfitting, and more https://infinitefaculty.substack.com/p/memorization-vs-generalization-in
13. SLA2: Sparse-Linear Attention with Learnable Routing and QAT https://arxiv.org/abs/2602.12675
14. Cops Are Buying ‘GeoSpy’, an AI That Geolocates Photos in Seconds https://www.404media.co/cops-are-buying-geospy-ai-that-geolocates-photos-in-seconds/ [no paywall: https://archive.is/ISxjv]
15. Lyria 3: Google’s latest generative music model https://deepmind.google/models/lyria/
Links for 2026-02-20 [Part 2]
Agentic AI
1. Team of Thoughts: Efficient Test-time Scaling of Agentic Systems through Orchestrated Tool Calling https://arxiv.org/abs/2602.16485
2. The first AI that runs continuously, earns its own existence, and self-improves. It gets (1) its own crypto wallet and private keys, (2) the ability to pay for servers and AI models using stablecoins, (3) access to deploy products, register domains, and market services, and (4) permission to earn money and fund new copies of itself. If it runs out of money, it dies. If it earns enough, it replicates. https://web4.ai/
3. The Rise of RentAHuman, the Marketplace Where Bots Put People to Work https://www.wired.com/story/ai-agent-rentahuman-bots-hire-humans/ [no paywall: https://archive.is/AMZQr]
4. GLM-5: from Vibe Coding to Agentic Engineering https://arxiv.org/abs/2602.15763
5. Lossless Context Management (LCM) reframes how agents handle long contexts. It outperforms Claude Code on long-context tasks. [PDF] https://papers.voltropy.com/LCM
6. “centimators.model_estimators.KerasCortex introduces a novel approach to model development by automating aspects of architecture search. It wraps a Keras-based estimator and leverages a Large Language Model (LLM) to recursively self-reflect on its own architecture.” https://crowdcent.github.io/centimators/user-guide/keras-cortex/
7. Colosseum: Auditing Collusion in Cooperative Multi-Agent Systems https://arxiv.org/abs/2602.15198
8. Measuring AI agent autonomy in practice https://www.anthropic.com/research/measuring-agent-autonomy
9. Making smart contracts safer by evaluating AI agents’ ability to detect, patch, and exploit vulnerabilities in blockchain environments. https://openai.com/index/introducing-evmbench/
10. SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks https://arxiv.org/abs/2602.12670
11. A Guide to Which AI to Use in the Agentic Era https://www.oneusefulthing.org/p/a-guide-to-which-ai-to-use-in-the
Robotics
1. A neural blueprint for human-like intelligence in soft robots https://news.mit.edu/2026/neural-blueprint-human-intelligence-in-soft-robots-0219
2. SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control https://nvlabs.github.io/GEAR-SONIC/
Science and Technology
1. The Solar Power Unlock for SpaceX’s 100 kW/ton Compute Satellites https://research.33fg.com/analysis/the-solar-power-unlock-for-spacex-s-100-kw-ton-compute-satellites
2. A New Complexity Theory for the Quantum Age https://www.quantamagazine.org/a-new-complexity-theory-for-the-quantum-age-20260217/
3. 3D-printing platform rapidly produces complex electric machines https://news.mit.edu/2026/3d-printing-platform-rapidly-produces-complex-electric-machines-0218
4. Researchers develop 3D printing method to replicate structures as complex as human tissue https://thedailytexan.com/2026/02/12/researchers-develop-3d-printing-method-to-replicate-structures-as-complex-as-human-tissue/
5. Oxygen metabolism in descendants of the archaeal-eukaryotic ancestor https://www.nature.com/articles/s41586-026-10128-z
6. Scientists thought they understood global warming. Then the past three years happened. https://www.washingtonpost.com/climate-environment/interactive/2026/climate-change-temperature-rate-accelerating/ [no paywall: https://archive.is/vfhK7]
Something many people miss about AI progress is that there can be sudden jumps in usefulness despite only minor gains in a model's intelligence. Incremental gains can be exponentially valuable.
Increasing the single-step success rate of a model from 99% to 99.9% can seem irrelevant, but for a task that requires 50 steps, it makes the difference between little better than a coin flip (about 61% end-to-end success) and production-ready autonomy (about 95%), assuming errors are independent across steps. Reducing the error rate from 1% to 0.1% might require exponentially more compute, but the payoff might be a system that crosses the threshold from brittle copilot to agentic autonomy.
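The compounding arithmetic above is easy to check. Here is a minimal Python sketch, assuming every step succeeds independently with the same probability (a simplification; real agent errors are often correlated). The function name `end_to_end_success` and the step counts are illustrative choices, not taken from any particular benchmark.

```python
def end_to_end_success(per_step: float, steps: int) -> float:
    """Probability that all `steps` succeed, assuming independent steps."""
    return per_step ** steps


# Compare 99% and 99.9% per-step reliability over increasingly long tasks.
for steps in (10, 50, 200, 1000):
    low = end_to_end_success(0.99, steps)
    high = end_to_end_success(0.999, steps)
    print(f"{steps:>5} steps: 99% per step -> {low:.3f}, 99.9% per step -> {high:.3f}")
```

The gap widens as tasks get longer, which is why a small per-step improvement matters most for long-horizon work.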
We've seen this with Claude Opus 4.5. It was an inflection point for adoption despite not being vastly smarter than the previous version. It just crossed a critical threshold.
Something very similar is true for human evolution. For hundreds of thousands of years, archaic humans were working with the same stone tools. Then a threshold was crossed. We stopped compounding errors in long-horizon tasks and started compounding correctness.
The phase change between an average person and someone like John von Neumann does not require a dramatically new brain architecture or a vastly higher number of neurons. Yet this difference is what enables someone to contribute to the development of nuclear weapons instead of being a garbage collector.
Next time you wonder why AI labs would bother spending exponentially more compute on minimal absolute gains, remember that a small delta in per-step reliability could make the difference between a brittle tool and something that can recursively self-improve.
This stuff is just super exciting. Imagine if everyone had access to their own personal Terence Tao. Even more exciting, what if we can achieve superhuman mathematical abilities? What could we learn?
Anyway, we still have to wait for expert evaluation of the proofs. That takes time because very few people have the combination of intelligence, expertise, and motivation to do it.
Read more: https://openai.com/index/first-proof-submissions/
How Demis Hassabis would test if a model meets the criteria for AGI:
Train AI on all human knowledge. Cut it off at 1911. See if it independently discovers general relativity like Einstein did in 1915.
And now consider that he puts AGI in the ~5-10 year range.
Professor of mathematics Daniel Litt writes about the future of math and his evolving views of AI progress: https://www.daniellitt.com/blog/2026/2/20/mathematics-in-the-library-of-babel
Links for 2026-02-22
AI
1. Did Claude 3 Opus align itself via gradient hacking? https://www.lesswrong.com/posts/ioZxrP7BhS5ArK59w/did-claude-3-opus-align-itself-via-gradient-hacking
2. DreamDojo: The first robot world model of its kind that demonstrates strong generalization to diverse objects and environments after post-training. https://dreamdojo-world.github.io/
3. From a handful of comments, LLMs can infer where you live, what you do, and your interests; then search for you on the web. https://arxiv.org/abs/2602.16800
4. Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens https://arxiv.org/abs/2602.13517
5. Claude Code Security: It scans codebases for vulnerabilities and suggests targeted software patches for human review, allowing teams to find and fix issues that traditional tools often miss. https://www.anthropic.com/news/claude-code-security
6. AI/ML, multiscale modeling, and emergence https://nanoscale.blogspot.com/2026/02/aiml-multiscale-modeling-and-emergence.html
7. The Country That’s Madly in Love With AI https://www.politico.com/news/magazine/2026/02/21/south-korea-ai-popular-why-00789618
Science and Technology
1. Battery storage costs fell 25% in 2025. https://www.semafor.com/article/02/19/2026/battery-storage-prices-drop-to-record-low-report-finds
2. A fluid can store solar energy and then release it as heat months later https://arstechnica.com/science/2026/02/dna-inspired-molecule-breaks-records-for-storing-solar-heat/
3. Element Biosciences announced that its high-throughput benchtop sequencing device called VITARI can deliver a whole genome for $100. https://www.sandiegouniontribune.com/2026/02/19/scrappy-san-diego-startup-goes-toe-to-toe-with-gene-sequencing-giant-illumina/
4. Microsoft’s Glass Chip Holds Terabytes of Data for 10,000 Years https://gizmodo.com/microsofts-glass-chip-holds-terabytes-of-data-for-10000-years-2000723455
5. Bacteria Frozen Inside 5,000-Year-Old Ice Cave Is Crazy Resistant to Antibiotics https://gizmodo.com/bacteria-frozen-inside-5000-year-old-ice-cave-is-crazy-resistant-to-antibiotics-2000723002
A neat proof-of-concept paper showing that transformers can snap into a general algorithm, not just memorize examples.
A tiny transformer (~777 parameters) can learn 10-digit addition and then generalize to new numbers after a sudden “grokking” jump in performance.
Paper: https://github.com/yhavinga/gpt-acc-jax/blob/main/latex_report/report.pdf
A simple information-theoretic sanity check by GPT-5.2 Thinking:
Inputs: two 10-digit numbers → about 10^10 choices each → 10^20 possible pairs
Output: the sum is up to 11 digits → roughly ~34 bits of information (since log2(2*10^10) ≈ 34)
A full lookup table would need about
bits_needed ≈ 10^20 * 34 ≈ 3.4e21 bits
But the model has only ~777 weights. Even if you imagine 32-bit floats, that’s at most
bits_model ≤ 777 * 32 ≈ 2.5e4 bits
So: 3.4e21 / 2.5e4 ≈ 1e17 times more bits would be needed to store the full mapping.
Conclusion: it can’t be “memorize every input → output”. The only plausible route is compression: learn the rule (carry propagation) that generates the right answer for any input.
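The sanity check above can be reproduced in a few lines of Python. This is a sketch under the post's own assumptions (about 777 parameters, 32 bits per weight as a generous upper bound, inputs counted as 10^10 choices each). The `add_with_carry` helper is a hypothetical illustration of how compact the carry rule is when written out by hand; it is not the model's learned circuit and is not taken from the paper.

```python
import math

N_DIGITS = 10
N_PARAMS = 777        # approximate parameter count quoted in the post
BITS_PER_PARAM = 32   # assumption: float32 weights, a generous upper bound

pairs = (10 ** N_DIGITS) ** 2                   # 10^20 possible input pairs
bits_per_sum = math.log2(2 * 10 ** N_DIGITS)    # about 34 bits per answer
table_bits = pairs * bits_per_sum               # size of a full lookup table
model_bits = N_PARAMS * BITS_PER_PARAM          # raw capacity of the weights
shortfall = table_bits / model_bits             # how far memorization falls short

print(f"lookup table: {table_bits:.2e} bits")
print(f"model:        {model_bits:.2e} bits")
print(f"shortfall:    {shortfall:.1e}x")


def add_with_carry(a: str, b: str) -> str:
    """Schoolbook addition on decimal digit strings, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, out = 0, []
    for da, db in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(da) + int(db) + carry, 10)
        out.append(str(digit))
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))
```

The rule fits in a dozen lines and works for any input, whereas the lookup table exceeds the model's raw bit budget by roughly seventeen orders of magnitude.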
A crisp demonstration that the transformer machinery can represent and discover real algorithms under the right training setup.
Two European robotics startups are bringing AI-driven robotics into industrial production.
Sereact, a German robotics startup based in Stuttgart, released Cortex 2.0. It adds planning to manipulation by predicting future outcomes before committing the best one to motion.
Read more: https://cortex2.sereact.ai/
Mimic Robotics, a Swiss startup based in Zurich, focuses on “physical AI” for dexterous manipulation. They're collaborating with AUDI AG to deploy AI-driven robotic systems in industrial production. The work highlights an end-to-end “pixel-to-action” model on a bi-manual platform performing complex, long-horizon insertion tasks, a type of assembly operation that’s typically hard to automate robustly.
Read more: https://www.mimicrobotics.com/