DL in NLP links – Telegram

1.06K subscribers

AI and DeepLearning news/articles links I use for @dlinnlp posts

https://huggingface.co/papers/2408.02622

Paper page - Language Model Can Listen While Speaking

https://github.com/EleutherAI/cookbook

GitHub - EleutherAI/cookbook: Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

https://arxiv.org/abs/2405.09999

Reward Centering

We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical...

https://huggingface.co/papers/2408.04619

Paper page - Transformer Explainer: Interactive Learning of Text-Generative Models

https://github.com/jndean/LossRider

GitHub - jndean/LossRider: A plotting tool that outputs Line Rider maps, so you can watch a man on a sled scoot down your loss curves. 🎿

https://arxiv.org/abs/2109.00137

Implicit Behavioral Cloning

We find that across a wide range of robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used explicit...

https://icrt.dev/

In-Context Imitation Learning via Next-Token Prediction

https://x.com/spikedoanz/status/1831127711856935273?s=12&t=757tdnLa___vKX7ZeJax5A

https://discuss.pytorch.org/t/distributed-w-torchtitan-introducing-async-tensor-parallelism-in-pytorch/209487

[Distributed w/ TorchTitan] Introducing Async Tensor Parallelism in PyTorch

With Horace He, Less Wright, Luca Wehrstedt, Tianyu Liu, Wanchao Liang.

TL;DR: We implemented experimental async tensor parallelism support in PyTorch. We integrated it in TorchTitan and observed up to ~29% forward pass speedup and ~8% E2E speedup in Llama3…

https://arxiv.org/abs/2409.12917

Training Language Models to Self-Correct via Reinforcement Learning

Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Current methods for training...

https://arxiv.org/pdf/2405.08007

TL;DR: act dumb

https://x.com/kellerjordan0/status/1842300916864844014?s=12&t=QgBLS4SmhE8cqdYBmhrqJA

https://x.com/stasbekman/status/1843483262129492200?s=12&t=QgBLS4SmhE8cqdYBmhrqJA

https://archive.is/2024.10.07-184310/https://www.theatlantic.com/technology/archive/2024/10/terence-tao-ai-interview/680153/

We’re Entering Uncharted Territory for Math - The Atlantic

archived 7 Oct 2024 18:43:10 UTC

https://x.com/arankomatsuzaki/status/1844567821184872544?s=12&t=QgBLS4SmhE8cqdYBmhrqJA

https://x.com/pronounced_kyle/status/1845451573608186103

https://arxiv.org/pdf/2406.14517

https://x.com/yoavgo/status/1845835419264442772?s=12&t=QgBLS4SmhE8cqdYBmhrqJA

https://arxiv.org/abs/2410.05258

Differential Transformer

Transformer tends to overallocate attention to irrelevant context. In this work, we introduce Diff Transformer, which amplifies attention to the relevant context while canceling noise....

https://x.com/lchoshen/status/1849060908242231329?s=12&t=QgBLS4SmhE8cqdYBmhrqJA
