Tempat Uploadan
Quick file mirror for everyone
(´・ω・`)
Forwarded from Hacker News
Show HN: Llama 3.1 70B on a single RTX 3090 via NVMe-to-GPU bypassing the CPU (Score: 152+ in 8 hours)

Link: https://readhacker.news/s/6NeCM
Comments: https://readhacker.news/c/6NeCM

Hi everyone, I'm somewhat involved in retrogaming, and during some experiments I ran into the following question: "Would it be possible to run transformer models while bypassing the CPU/RAM, connecting the GPU directly to the NVMe?"
This is the result of that question and some weekend vibecoding (the readme links the library repository as well). It seems to work, even on consumer GPUs, though it should work better on professional ones.
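For context on why streaming weights from storage matters here, a rough size check may help. The sketch below is not taken from the linked project. It only estimates the weight footprint of a nominal 70B-parameter model at a few bit widths and compares it with the 24 GB of VRAM on an RTX 3090. The helper names (`weight_bytes`, `fits_in_vram`) and the chosen bit widths are illustrative assumptions, and the post does not say which precision the project actually uses.

```python
# Back-of-the-envelope check: do the weights of a 70B-parameter model
# fit in the VRAM of a single RTX 3090? Weights only; the KV cache and
# activations would need additional memory on top of this.

GIB = 1024 ** 3


def weight_bytes(n_params: int, bits_per_param: int) -> int:
    """Bytes needed to store the weights alone at the given bit width."""
    return n_params * bits_per_param // 8


def fits_in_vram(n_params: int, bits_per_param: int, vram_bytes: int) -> bool:
    """True if the weights alone fit in the given amount of VRAM."""
    return weight_bytes(n_params, bits_per_param) <= vram_bytes


N_PARAMS = 70_000_000_000  # nominal "70B" (assumption: exact count differs slightly)
VRAM = 24 * GIB            # RTX 3090 memory size

if __name__ == "__main__":
    for bits in (16, 8, 4):  # illustrative precisions, not the project's settings
        size_gib = weight_bytes(N_PARAMS, bits) / GIB
        verdict = fits_in_vram(N_PARAMS, bits, VRAM)
        print(f"{bits:>2}-bit weights: {size_gib:6.1f} GiB, fits in VRAM: {verdict}")
```

Under these assumptions the weights do not fit on the card at any of the listed precisions, which is why some of them have to be read from somewhere else (system RAM or, as in the post, NVMe) while the model runs.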
Forwarded from Hacker News (yahnc_bot)
How Taalas “prints” LLM onto a chip? https://www.anuragk.com/blog/posts/Taalas.html