DL in NLP links
@dlinnlp_links
1.06K subscribers · 5 photos · 1 file · 653 links
AI and deep learning news/article links I use for @dlinnlp posts
DL in NLP links
https://x.com/gargighosh/status/1873522368301408749?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://r0bk.github.io/killedbyllm/
DL in NLP links
https://www.kscale.dev/zbot
DL in NLP links
https://x.com/_avichawla/status/1890288316110778777?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
X (formerly Twitter)
Avi Chawla (@_avichawla) on X
KV caching in LLMs, clearly explained (with visuals):
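For quick reference, the idea the tweet explains: during autoregressive decoding, each step only needs the new token's query, so the keys and values of earlier tokens can be cached and reused instead of recomputed every step. A toy single-head sketch (all names here are illustrative, not from the tweet):

import torch
import torch.nn.functional as F

def decode_step(x_t, W_q, W_k, W_v, cache):
    # x_t: (batch, d_model) hidden state of the newest token only.
    q = x_t @ W_q   # (batch, d) query for the new token
    k = x_t @ W_k   # (batch, d) key for the new token
    v = x_t @ W_v   # (batch, d) value for the new token
    # Append to the cache instead of recomputing K/V for the whole prefix.
    cache["k"] = torch.cat([cache["k"], k.unsqueeze(1)], dim=1)  # (batch, t, d)
    cache["v"] = torch.cat([cache["v"], v.unsqueeze(1)], dim=1)  # (batch, t, d)
    scores = q.unsqueeze(1) @ cache["k"].transpose(1, 2) / k.shape[-1] ** 0.5
    attn = F.softmax(scores, dim=-1)  # attend over all cached positions
    return (attn @ cache["v"]).squeeze(1), cache  # (batch, d) output for the new token

# Start decoding with an empty cache:
# cache = {"k": torch.empty(B, 0, d), "v": torch.empty(B, 0, d)}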
DL in NLP links
https://itcanthink.substack.com/p/paper-notes-scaling-laws-for-pre
It Can Think!
Paper Notes: Scaling Laws for Pre-Training Agents and World Models
To make big investments in scaling robotic learning, we need to understand what the scaling laws for robotics data actually are.
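For context, "scaling laws" here means the power-law fits from the LLM literature; the post asks what the analogous fit looks like for agents and world models trained on robotics data. The standard Chinchilla-style parametric form (background from LLM scaling work, not a claim made by the post itself) is

L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

where N is parameter count, D is dataset size (tokens, or frames/trajectories for robotics), and E is the irreducible loss.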
DL in NLP links
https://www.twitch.tv/claudeplayspokemon
Twitch
ClaudePlaysPokemon - Twitch
Claude Opus 4.1 Plays Pokemon!
DL in NLP links
https://arxiv.org/abs/2503.10622
arXiv.org
Transformers without Normalization
Normalization layers are ubiquitous in modern neural networks and have long been considered essential. This work demonstrates that Transformers without normalization can achieve the same or better...
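The replacement the paper proposes (detail from the arXiv paper itself, not from the truncated preview above) is Dynamic Tanh (DyT): an element-wise tanh with a learnable scale, used as a drop-in for LayerNorm. A minimal sketch:

import torch
import torch.nn as nn

class DyT(nn.Module):
    # Dynamic Tanh: replaces LayerNorm with an element-wise squashing,
    # so no per-token statistics are computed.
    def __init__(self, dim: int, alpha_init: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.full((1,), alpha_init))  # learnable scalar
        self.gamma = nn.Parameter(torch.ones(dim))   # per-channel scale (like LN weight)
        self.beta = nn.Parameter(torch.zeros(dim))   # per-channel shift (like LN bias)

    def forward(self, x):
        return self.gamma * torch.tanh(self.alpha * x) + self.beta

# Usage: swap nn.LayerNorm(dim) for DyT(dim) inside a Transformer block.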
DL in NLP links
https://x.com/dbahdanau/status/1915933162892652746?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
X (formerly Twitter)
🇺🇦 Dzmitry Bahdanau (@DBahdanau) on X
I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference!
Code: https://t.co/AgEyxXb7Xi
Blog: https://t.co/n4FRxiEcrr
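The core trick, as the tweet describes it: instead of blocking inference until every rollout finishes, the trainer pushes fresh weights and generation continues mid-sequence. A toy illustration of that loop (purely hypothetical names: model.next_token and the queue protocol are mine, not PipelineRL's API):

import queue

def generate_with_inflight_updates(model, prompts, weight_queue: queue.Queue,
                                   max_new_tokens: int):
    seqs = [list(p) for p in prompts]
    for _ in range(max_new_tokens):
        # Between decode steps, swap in any freshly trained weights
        # rather than waiting for all sequences to finish.
        if not weight_queue.empty():
            model.load_state_dict(weight_queue.get())
        for s in seqs:
            s.append(model.next_token(s))  # hypothetical single-token decode
    return seqs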