✨LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs
📝 Summary:
LUT-LLM is an FPGA accelerator for LLM inference that uses on-chip memory to shift computation from arithmetic units to memory-based table lookups. This memory-centric approach achieves 1.66x lower latency than an AMD MI210 GPU and 1.72x higher energy efficiency than an NVIDIA A100 GPU when serving a 1.7B-parameter LLM.
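The core idea is that a matrix-vector product over low-bit weights can be served from precomputed tables instead of multiply-accumulate units. Below is a minimal NumPy sketch of that lookup trick, not the paper's kernel: binary {-1,+1} weights and a group size of 4 are illustrative assumptions, and all names here are hypothetical.

```python
import numpy as np

G = 4  # activations per lookup group (illustrative assumption)

def build_tables(x):
    """For each group of G activations, precompute dot products with all
    2**G sign patterns, so a weight group becomes a single table lookup."""
    n_groups = len(x) // G
    patterns = np.array(
        [[1 if (p >> i) & 1 else -1 for i in range(G)] for p in range(2**G)],
        dtype=x.dtype,
    )  # shape (16, G): every possible {-1,+1} weight group
    groups = x[: n_groups * G].reshape(n_groups, G)
    return groups @ patterns.T  # shape (n_groups, 16): one table per group

def lut_gemv(w_idx, tables):
    """w_idx[r, g] is the 4-bit pattern index of row r's g-th weight group.
    The matvec reduces to gathering table entries and summing them."""
    n_groups = tables.shape[0]
    return tables[np.arange(n_groups), w_idx].sum(axis=1)

# Tiny usage check against a dense matvec.
rng = np.random.default_rng(0)
x = rng.standard_normal(8).astype(np.float32)   # 2 groups of 4 activations
w_idx = rng.integers(0, 16, size=(3, 2))        # 3 output rows, 2 groups each
y = lut_gemv(w_idx, build_tables(x))

# Reference: expand indices back to {-1,+1} weights and multiply.
signs = np.array([[1 if (p >> i) & 1 else -1 for i in range(G)]
                  for p in range(16)], dtype=np.float32)
W = signs[w_idx].reshape(3, -1)
assert np.allclose(y, W @ x, atol=1e-5)
```

On an FPGA, the per-group tables can live in on-chip memory, which is what shifts the cost from arithmetic to lookups.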
🔹 Publication Date: Published on Nov 9, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06174
• PDF: https://arxiv.org/pdf/2511.06174
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #FPGA #AI #DeepLearning #AIHardware
✨AutoNeural: Co-Designing Vision-Language Models for NPU Inference
📝 Summary:
AutoNeural is an NPU-native vision-language model (VLM) co-designed for efficient edge inference. It pairs a MobileNetV5-style vision backbone, chosen for stable integer quantization, with a hybrid SSM-Transformer language backbone. This design reduces quantization error and latency, improving real-time performance on edge devices.
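To see why a quantization-friendly backbone matters, here is a minimal sketch of symmetric per-tensor int8 quantization on synthetic data; this is not AutoNeural's quantizer, and the injected outlier is an illustrative stand-in for the activation spikes common in Transformer layers.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization (illustrative, not the paper's)."""
    scale = np.abs(x).max() / 127.0  # one outlier inflates this scale
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
bounded = rng.standard_normal(4096).astype(np.float32)  # well-behaved activations
outlier = bounded.copy()
outlier[0] = 80.0  # a single activation spike

for name, x in [("bounded", bounded), ("outlier", outlier)]:
    q, s = quantize_int8(x)
    err = np.abs(dequantize(q, s) - x).mean()
    print(f"{name}: scale={s:.4f} mean abs error={err:.5f}")
```

A single outlier stretches the quantization scale and degrades precision for every other value, which is the failure mode a quantization-stable backbone is meant to avoid.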
🔹 Publication Date: Published on Dec 2, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02924
• PDF: https://arxiv.org/pdf/2512.02924
🔹 Models citing this paper:
• https://huggingface.co/NexaAI/AutoNeural
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AutoNeural #VisionLanguageModels #EdgeAI #AIHardware #EfficientAI