ML Research Hub
32.6K subscribers
3.72K photos
180 videos
23 files
3.99K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
940+ FPS Multi-Person Pose Estimation

RTMW (Real-Time Multi-person Whole-body pose estimation) is a series of high-performance models for 2D/3D whole-body pose estimation, reaching over 940 FPS on #GPU. Code & models are available.

Review: https://t.ly/XkBmg

Paper: arxiv.org/pdf/2407.08634

Repo: github.com/open-mmlab/mmpose/tree/main/projects/rtmpose

https://t.iss.one/DataScienceT ๐Ÿ†
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

7 Apr 2025 · Zonghang Li, Tao Li, Wenjiao Feng, Mohsen Guizani, Hongfang Yu

The emergence of DeepSeek R1 and QwQ 32B has broken through performance barriers for running frontier large language models (#LLMs) on home devices. While consumer hardware is getting stronger and model quantization is improving, existing end-side solutions still demand #GPU clusters, large RAM/VRAM, and high bandwidth, far beyond what a common home cluster can handle. This paper introduces prima.cpp, a distributed inference system that runs 70B-scale models on everyday home devices using a mix of CPU/GPU, low RAM/VRAM, Wi-Fi, and cross-platform support. It uses mmap to manage model weights and introduces piped-ring parallelism with prefetching to hide disk loading. By modeling heterogeneity in computation, communication, disk, memory (and its management behavior), and OS, it optimally assigns model layers to each device's #CPU and GPU, further reducing token latency. An elegant algorithm named Halda is proposed to solve this NP-hard assignment problem. We evaluate prima.cpp on a common four-node home cluster. It outperforms llama.cpp, #exo, and #dllama on 30B+ models while keeping memory pressure below 6%. This brings frontier 30B-70B models, such as #Llama 3, #DeepSeek R1, #Qwen 2.5, and #QwQ, to home assistants, making advanced AI truly accessible to individuals. The code is open source and available at https://github.com/Lizonghang/prima.cpp.
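The abstract notes that prima.cpp uses mmap to manage model weights, which is how memory pressure can stay low even for 70B-scale checkpoints: pages are faulted in from disk only when a layer is actually touched. The following is a minimal, self-contained sketch of that idea in Python (not prima.cpp's actual implementation, which is C/C++); the file and function names here are illustrative only.

```python
import mmap
import os
import struct
import tempfile

def map_weights(path):
    """Map a weights file read-only; the OS pages data in lazily on access."""
    fd = os.open(path, os.O_RDONLY)
    size = os.fstat(fd).st_size
    mm = mmap.mmap(fd, size, access=mmap.ACCESS_READ)
    os.close(fd)  # CPython's mmap duplicates the fd, so this is safe
    return mm

# Demo: a fake "checkpoint" of 1024 little-endian float32 values.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(struct.pack("<1024f", *range(1024)))
tmp.close()

weights = map_weights(tmp.name)

# Touching only one "layer" slice faults in only those pages, so resident
# memory stays far below the full checkpoint size.
layer0 = struct.unpack("<4f", weights[:16])
print(layer0)  # (0.0, 1.0, 2.0, 3.0)
```

The same mechanism also lets the kernel evict cold weight pages under memory pressure and reload them from disk later, which is what prima.cpp's piped-ring prefetching is designed to overlap with computation.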


Paper: https://arxiv.org/pdf/2504.08791v1.pdf

Code: https://github.com/lizonghang/prima.cpp

https://t.iss.one/DataScienceT