DL in NLP links
@dlinnlp_links
1.06K subscribers · 5 photos · 1 file · 653 links
AI and DeepLearning news/articles links I use for @dlinnlp posts
DL in NLP links
https://huggingface.co/papers/2408.02622
huggingface.co
Paper page - Language Model Can Listen While Speaking
DL in NLP links
https://github.com/EleutherAI/cookbook
GitHub
EleutherAI/cookbook: Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
DL in NLP links
https://arxiv.org/abs/2405.09999
arXiv.org
Reward Centering
We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical...
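The idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the truncated phrase refers to a running average of the rewards, and the function name `centered_td0`, the tabular setting, and the step sizes `alpha` and `beta` are all hypothetical choices made for the example.

```python
def centered_td0(transitions, n_states, gamma=0.9, alpha=0.1, beta=0.01):
    """Tabular TD(0) that learns from centered rewards (illustrative sketch).

    transitions: iterable of (state, reward, next_state) tuples.
    Returns the learned state values and the running reward average.
    """
    values = [0.0] * n_states
    r_bar = 0.0  # running estimate of the average reward (assumed centering term)
    for s, r, s_next in transitions:
        # Track the empirical average reward with a simple exponential average.
        r_bar += beta * (r - r_bar)
        # Standard discounted TD error, but computed on the centered reward.
        td_error = (r - r_bar) + gamma * values[s_next] - values[s]
        values[s] += alpha * td_error
    return values, r_bar
```

With a constant reward, the running average absorbs the offset and the learned values stay near zero instead of growing toward reward / (1 - gamma).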
DL in NLP links
https://huggingface.co/papers/2408.04619
huggingface.co
Paper page - Transformer Explainer: Interactive Learning of Text-Generative Models
DL in NLP links
https://github.com/jndean/LossRider
GitHub
jndean/LossRider: A plotting tool that outputs Line Rider maps, so you can watch a man on a sled scoot down your loss curves. 🎿
DL in NLP links
https://arxiv.org/abs/2109.00137
arXiv.org
Implicit Behavioral Cloning
We find that across a wide range of robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used explicit...
DL in NLP links
https://icrt.dev/
icrt.dev
In-Context Imitation Learning via Next-Token Prediction
DL in NLP links
https://x.com/spikedoanz/status/1831127711856935273?s=12&t=757tdnLa___vKX7ZeJax5A
DL in NLP links
https://discuss.pytorch.org/t/distributed-w-torchtitan-introducing-async-tensor-parallelism-in-pytorch/209487
PyTorch Forums
[Distributed w/ TorchTitan] Introducing Async Tensor Parallelism in PyTorch
with Horace He, Less Wright, Luca Wehrstedt, Tianyu Liu, Wanchao Liang. TL;DR: We implemented experimental async tensor parallelism support in PyTorch. We integrated it in TorchTitan and observed: up to ~29% forward pass speedup and ~8% E2E speedup in Llama3…
DL in NLP links
https://arxiv.org/abs/2409.12917
arXiv.org
Training Language Models to Self-Correct via Reinforcement Learning
Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Current methods for training...
DL in NLP links
https://arxiv.org/pdf/2405.08007
DL in NLP links
TL;DR: act dumb
DL in NLP links
https://x.com/kellerjordan0/status/1842300916864844014?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://x.com/stasbekman/status/1843483262129492200?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://archive.is/2024.10.07-184310/https://www.theatlantic.com/technology/archive/2024/10/terence-tao-ai-interview/680153/
archive.is
We’re Entering Uncharted Territory for Math - The Atlantic
archived 7 Oct 2024 18:43:10 UTC
DL in NLP links
https://x.com/arankomatsuzaki/status/1844567821184872544?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://x.com/pronounced_kyle/status/1845451573608186103
DL in NLP links
https://arxiv.org/pdf/2406.14517
DL in NLP links
https://x.com/yoavgo/status/1845835419264442772?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
DL in NLP links
https://arxiv.org/abs/2410.05258
arXiv.org
Differential Transformer
Transformer tends to overallocate attention to irrelevant context. In this work, we introduce Diff Transformer, which amplifies attention to the relevant context while canceling noise....
DL in NLP links
https://x.com/lchoshen/status/1849060908242231329?s=12&t=QgBLS4SmhE8cqdYBmhrqJA