✨OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation
📝 Summary:
OmniRoam generates long-horizon panoramic videos using a two-stage approach for improved scene completeness and consistency. It first previews a trajectory-controlled video, then refines and extends it to high-resolution, long-range panoramas, enabling high-fidelity world wandering.
🔹 Publication Date: Published on Mar 31
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.30045
• PDF: https://arxiv.org/pdf/2603.30045
• Project Page: https://yuheng.ink/project-page/omniroam/
• Github: https://github.com/yuhengliu02/OmniRoam
🔹 Models citing this paper:
• https://huggingface.co/Yuheng02/OmniRoam
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model
📝 Summary:
CALM is a unified model bridging the gap between multi-turn conversations and tool use in language agents. Trained on a new multi-task dataset CALM-IT, it integrates both capabilities. CALM outperforms specialized models, including GPT-4o, across various benchmarks.
🔹 Publication Date: Published on Feb 12, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.08820
• PDF: https://arxiv.org/pdf/2502.08820
• Project Page: https://emrecanacikgoz.github.io/CoALM/
• Github: https://github.com/oumi-ai/oumi
🔹 Models citing this paper:
• https://huggingface.co/uiuc-convai/CoALM-8B
• https://huggingface.co/uiuc-convai/CoALM-405B
• https://huggingface.co/uiuc-convai/CoALM-70B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/uiuc-convai/CoALM-IT
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Terminal Agents Suffice for Enterprise Automation
📝 Summary:
Simple terminal-based coding agents interacting directly with platform APIs, powered by foundation models, are highly effective for enterprise automation. These low-level agents match or outperform complex tool-augmented systems, demonstrating that elaborate agent architectures are often unnecess...
🔹 Publication Date: Published on Mar 31
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00073
• PDF: https://arxiv.org/pdf/2604.00073
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding
📝 Summary:
A dynamic agentic framework called TAB addresses 3D visual grounding by decoupling spatial semantics resolution from 3D structure instantiation through 2D VLMs and multi-view geometry, achieving super...
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00528
• PDF: https://arxiv.org/pdf/2604.00528
• Github: https://github.com/WHB139426/TAB-Agent
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation
📝 Summary:
GaussianGPT uses a transformer-based autoregressive approach with 3D rotary positional embeddings to generate 3D scenes by predicting Gaussian primitives, offering advantages over diffusion methods in...
🔹 Publication Date: Published on Mar 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26661
• PDF: https://arxiv.org/pdf/2603.26661
• Project Page: https://nicolasvonluetzow.github.io/GaussianGPT/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Universal YOCO for Efficient Depth Scaling
📝 Summary:
Universal YOCO (YOCO-U) merges the YOCO architecture with recursive computation for efficient LLM depth scaling. It uses iterative processing in shallow attention layers, offering a constant KV cache and better token utility.
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01220
• PDF: https://arxiv.org/pdf/2604.01220
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning
📝 Summary:
PerceptionComp is a new video benchmark for complex, long-horizon perception-centric reasoning. Its questions require multiple pieces of temporal visual evidence and compositional logic. Current AI models struggle significantly, highlighting a major bottleneck in perceptual video reasoning.
🔹 Publication Date: Published on Mar 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26653
• PDF: https://arxiv.org/pdf/2603.26653
• Project Page: https://perceptioncomp.github.io/
• Github: https://github.com/hrinnnn/PerceptionComp
✨ Datasets citing this paper:
• https://huggingface.co/datasets/hrinnnn/PerceptionComp
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?
📝 Summary:
ViGoR benchmark addresses limitations in current AIGC evaluation by introducing a comprehensive framework for assessing visual generative reasoning across multiple modalities and cognitive dimensions....
🔹 Publication Date: Published on Mar 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25823
• PDF: https://arxiv.org/pdf/2603.25823
• Project Page: https://vincenthancoder.github.io/ViGoR-Bench/
• Github: https://github.com/VincentHancoder/ViGoR-Bench-Eval
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Embarrassingly Simple Self-Distillation Improves Code Generation
📝 Summary:
Simple self-distillation improves code generation in large language models by fine-tuning on model-generated samples, effectively addressing precision-exploration trade-offs in decoding. AI-generated ...
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01193
• PDF: https://arxiv.org/pdf/2604.01193
• Github: https://github.com/apple/ml-ssd
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines
📝 Summary:
Multi-LLM revision pipelines' effectiveness varies by task structure and draft quality, with gains decomposing into re-solving, scaffold, and content components rather than representing uniform error ...
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01029
• PDF: https://arxiv.org/pdf/2604.01029
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation
📝 Summary:
A native discrete diffusion framework unifies multi-modal understanding and generation for robotic manipulation, enabling parallel action and visual outcome prediction with improved long-horizon consi...
🔹 Publication Date: Published on Mar 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25406
• PDF: https://arxiv.org/pdf/2603.25406
• Project Page: https://yliu-cs.github.io/MMaDA-VLA
• Github: https://github.com/yliu-cs/MMaDA-VLA
🔹 Models citing this paper:
• https://huggingface.co/yliu-cs/MMaDA-VLA
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨HippoCamp: Benchmarking Contextual Agents on Personal Computers
📝 Summary:
HippoCamp is a new multimodal benchmark evaluating agents on massive personal file management. It exposes significant performance gaps in current models for long-horizon retrieval and cross-modal reasoning in user-centric environments, revealing bottlenecks in multimodal perception.
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01221
• PDF: https://arxiv.org/pdf/2604.01221
• Project Page: https://hippocamp-ai.github.io/
• Github: https://github.com/Savannah-yz/HippoCamp
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome
📝 Summary:
MiroEval is a new benchmark for deep research systems, addressing limitations of existing evaluations. It assesses adaptive synthesis, factuality, and process quality across real-user text and multimodal tasks, showing that process quality predicts outcomes and that multimodal tasks are very challenging.
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28407
• PDF: https://arxiv.org/pdf/2603.28407
• Project Page: https://miroeval-ai.github.io/website/
• Github: https://github.com/MiroMindAI/MiroEval
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨QuitoBench: A High-Quality Open Time Series Forecasting Benchmark
📝 Summary:
QuitoBench addresses the lack of large-scale time series benchmarks by introducing a regime-balanced dataset with eight TSF regimes, revealing that foundation models outperform deep learning at long c...
🔹 Publication Date: Published on Mar 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26017
• PDF: https://arxiv.org/pdf/2603.26017
✨ Datasets citing this paper:
• https://huggingface.co/datasets/hq-bench/quitobench
• https://huggingface.co/datasets/hq-bench/quito-corpus
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#TimeSeriesForecasting #DataScience #MachineLearning #AI #QuitoBench
✨Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification
📝 Summary:
Vision2Web presents a comprehensive benchmark for visual website development tasks and evaluates coding agents across static UI generation, interactive frontend reproduction, and full-stack developmen...
🔹 Publication Date: Published on Mar 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26648
• PDF: https://arxiv.org/pdf/2603.26648
• Project Page: https://vision2web-bench.github.io/
• Github: https://github.com/zai-org/Vision2Web
✨ Datasets citing this paper:
• https://huggingface.co/datasets/zai-org/Vision2Web
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers
📝 Summary:
A systematic evaluation framework called PaperRecon is proposed to assess AI-generated papers by separating quality assessment into presentation and hallucination dimensions using a benchmark of 51 re...
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01128
• PDF: https://arxiv.org/pdf/2604.01128
• Project Page: https://agent4science-utokyo.github.io/PaperRecon_HP/
• Github: https://github.com/Agent4Science-UTokyo/PaperRecon
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants
📝 Summary:
A framework for proactive agent research is introduced that models applications as finite state machines to enable realistic user simulation and task execution across multiple digital environments. AI...
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00842
• PDF: https://arxiv.org/pdf/2604.00842
• Github: https://github.com/deepakn97/pare
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
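The idea of modelling an application as a finite state machine, mentioned in the summary above, can be illustrated with a minimal sketch. This is not the paper's code: the `AppFSM` class, the email-app screens, and the action names are all hypothetical, chosen only to show how states and transitions constrain what a simulated user can do.

```python
from dataclasses import dataclass, field


@dataclass
class AppFSM:
    """Toy app model (hypothetical): screens are states, user actions are transitions."""
    state: str
    transitions: dict  # maps (state, action) -> next state
    history: list = field(default_factory=list)

    def available_actions(self):
        # Only actions defined for the current state are valid.
        return sorted(a for (s, a) in self.transitions if s == self.state)

    def step(self, action):
        key = (self.state, action)
        if key not in self.transitions:
            raise ValueError(f"action {action!r} is not valid in state {self.state!r}")
        self.history.append(key)
        self.state = self.transitions[key]
        return self.state


# Hypothetical email app used purely for illustration.
EMAIL_TRANSITIONS = {
    ("inbox", "open_message"): "reading",
    ("inbox", "compose"): "drafting",
    ("reading", "reply"): "drafting",
    ("reading", "back"): "inbox",
    ("drafting", "send"): "inbox",
    ("drafting", "discard"): "inbox",
}


def make_email_app():
    return AppFSM(state="inbox", transitions=dict(EMAIL_TRANSITIONS))


if __name__ == "__main__":
    app = make_email_app()
    for action in ["open_message", "reply", "send"]:
        app.step(action)
    print(app.state, app.available_actions())
```

A simulated user would pick from `available_actions()` at each step, so invalid action sequences (for example, `send` from the inbox) are rejected by construction rather than checked after the fact.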
✨Benchmarking and Mechanistic Analysis of Vision-Language Models for Cross-Depiction Assembly Instruction Alignment
📝 Summary:
Vision Language Models struggle with aligning assembly diagrams and video feeds due to a depiction gap, with findings indicating visual encoding as the primary target for improving cross-depiction rob...
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00913
• PDF: https://arxiv.org/pdf/2604.00913
• Project Page: https://ryenhails.github.io/IKEA-Bench/
• Github: https://ryenhails.github.io/IKEA-Bench/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Understand and Accelerate Memory Processing Pipeline for Disaggregated LLM Inference
📝 Summary:
LLM inference faces significant memory processing overhead. This paper proposes using heterogeneous GPU-FPGA systems to accelerate these operations by offloading memory-bound tasks to FPGAs. This achieves 1.04-2.2x speedup and 1.11-4.7x energy savings over GPU baselines, proving heterogeneous s...
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29002
• PDF: https://arxiv.org/pdf/2603.29002
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMInference #FPGA #HeterogeneousComputing #HardwareAcceleration #SystemArchitecture
✨UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems
📝 Summary:
UniMixer is a unified architecture for recommendation systems that improves scaling efficiency. It uses a generalized parameterized token mixing module to optimize mixing patterns and connect attention, TokenMixer, and factorization-machine methods. A lightweight version boosts performance further.
🔹 Publication Date: Published on Apr 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00590
• PDF: https://arxiv.org/pdf/2604.00590
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research