DL in NLP links
@dlinnlp_links
1.05K subscribers, 5 photos, 1 file, 653 links
AI and deep learning news and article links I use for @dlinnlp posts
DL in NLP links
https://x.com/stasbekman/status/1843483262129492200?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://archive.is/2024.10.07-184310/https://www.theatlantic.com/technology/archive/2024/10/terence-tao-ai-interview/680153/
archive.is
We’re Entering Uncharted Territory for Math - The Atlantic
archived 7 Oct 2024 18:43:10 UTC
DL in NLP links
https://x.com/arankomatsuzaki/status/1844567821184872544?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://x.com/pronounced_kyle/status/1845451573608186103
DL in NLP links
https://arxiv.org/pdf/2406.14517
DL in NLP links
https://x.com/yoavgo/status/1845835419264442772?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://arxiv.org/abs/2410.05258
arXiv.org
Differential Transformer
Transformer tends to overallocate attention to irrelevant context. In this work, we introduce Diff Transformer, which amplifies attention to the relevant context while canceling noise....
DL in NLP links
https://x.com/lchoshen/status/1849060908242231329?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://arxiv.org/abs/2410.01104
arXiv.org
Softmax is not Enough (for Sharp Size Generalisation)
A key property of reasoning systems is the ability to make sharp decisions on their input data. For contemporary AI systems, a key carrier of sharp behaviour is the softmax function, with its...
DL in NLP links
https://x.com/svlevine/status/1856924796996784244?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://arxiv.org/abs/2412.01799
arXiv.org
HPRM: High-Performance Robotic Middleware for Intelligent...
The rise of intelligent autonomous systems, especially in robotics and autonomous agents, has created a critical need for robust communication middleware that can ensure real-time processing of...
DL in NLP links
https://x.com/thehumanoidhub/status/1868219800532771248?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
X (formerly Twitter)
The Humanoid Hub (@TheHumanoidHub) on X
Meta Motivo, an open-source behavioral foundation model designed to control virtual, physics-based humanoid agents.
It aims to significantly simplify the creation of general-purpose humanoid agents for robotics and virtual avatars.
Try the demo: https:…
DL in NLP links
https://x.com/gargighosh/status/1873522368301408749?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://r0bk.github.io/killedbyllm/
DL in NLP links
https://www.kscale.dev/zbot
DL in NLP links
https://x.com/_avichawla/status/1890288316110778777?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
X (formerly Twitter)
Avi Chawla (@_avichawla) on X
KV caching in LLMs, clearly explained (with visuals):
DL in NLP links
https://itcanthink.substack.com/p/paper-notes-scaling-laws-for-pre
It Can Think!
Paper Notes: Scaling Laws for Pre-Training Agents and World Models
To make big investments in scaling robotic learning, we need to understand what the scaling laws for robotics data actually are.
DL in NLP links
https://www.twitch.tv/claudeplayspokemon
Twitch
ClaudePlaysPokemon - Twitch
Claude Opus 4.1 Plays Pokemon!
DL in NLP links
https://arxiv.org/abs/2503.10622
arXiv.org
Transformers without Normalization
Normalization layers are ubiquitous in modern neural networks and have long been considered essential. This work demonstrates that Transformers without normalization can achieve the same or better...
DL in NLP links
https://x.com/dbahdanau/status/1915933162892652746?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
X (formerly Twitter)
🇺🇦
Dzmitry Bahdanau (@DBahdanau) on X
I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference!
Code: https://t.co/AgEyxXb7Xi
Blog: https://t.co/n4FRxiEcrr