✨Protein Autoregressive Modeling via Multiscale Structure Generation
📝 Summary:
PAR is a multi-scale autoregressive framework for protein backbone generation that uses hierarchical structure modeling, autoregressive transformers, and flow-based decoding to produce high-quality protein backbones.
🔹 Publication Date: Published on Feb 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04883
• PDF: https://arxiv.org/pdf/2602.04883
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ACE-Step: A Step Towards Music Generation Foundation Model
📝 Summary:
ACE-Step is an open-source music generation model that integrates diffusion generation with a lightweight transformer and a deep compression autoencoder, achieving fast inference, high coherence, and fine-grained control.
🔹 Publication Date: Published on May 28, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.00045
• PDF: https://arxiv.org/pdf/2506.00045
• Github: https://github.com/ace-step/ACE-Step
✨ Spaces citing this paper:
• https://huggingface.co/spaces/DengLi1208/ACE-Step-1.5
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ERNIE 5.0 Technical Report
📝 Summary:
ERNIE 5.0 is a production-scale trillion-parameter autoregressive model that unifies multimodal understanding and generation through a sparse MoE architecture and elastic training.
🔹 Publication Date: Published on Feb 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04705
• PDF: https://arxiv.org/pdf/2602.04705
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨From Data to Behavior: Predicting Unintended Model Behaviors Before Training
📝 Summary:
Data2Behavior predicts unintended model behaviors before training using MDF, a lightweight method that analyzes data features to reveal potential biases without parameter updates.
🔹 Publication Date: Published on Feb 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04735
• PDF: https://arxiv.org/pdf/2602.04735
• Github: https://github.com/zjunlp/Data2Behavior
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
📝 Summary:
OmniSIFT is a modality-asymmetric token compression framework for Omni-LLMs that reduces computational overhead through spatio-temporal video pruning and vision-guided audio selection while maintaining performance.
🔹 Publication Date: Published on Feb 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04804
• PDF: https://arxiv.org/pdf/2602.04804
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨VLS: Steering Pretrained Robot Policies via Vision-Language Models
📝 Summary:
Pretrained diffusion and flow-matching policies fail under test-time shifts due to tight coupling with training configurations, prompting the development of Vision-Language Steering (VLS) for training-free policy steering.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03973
• PDF: https://arxiv.org/pdf/2602.03973
• Project Page: https://vision-language-steering.github.io/webpage/
• Github: https://github.com/Vision-Language-Steering/code
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Quantifying the Gap between Understanding and Generation within Unified Multimodal Models
📝 Summary:
Unified multimodal models exhibit a persistent gap between understanding and generation capabilities, indicating only surface-level integration rather than deep cognitive convergence.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02140
• PDF: https://arxiv.org/pdf/2602.02140
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Likelihood-Based Reward Designs for General LLM Reasoning
📝 Summary:
Log-probability rewards derived from the reference answer's likelihood outperform binary rewards in chain-of-thought fine-tuning across both verifiable and non-verifiable reasoning benchmarks.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03979
• PDF: https://arxiv.org/pdf/2602.03979
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
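The core reward idea above is simple enough to sketch in a few lines of Python (a toy illustration of the premise; the function names and the length normalization are assumptions here, not the authors' exact design):

```python
import math

def logprob_reward(token_logprobs):
    """Dense reward: length-normalized log-probability the policy
    assigns to the tokens of the reference answer."""
    return sum(token_logprobs) / len(token_logprobs)

def binary_reward(predicted, reference):
    """Sparse baseline: 1 for an exact match, 0 otherwise."""
    return 1.0 if predicted == reference else 0.0

# A near-miss answer earns partial credit under the dense reward,
# while the binary reward gives it a flat 0.
confident = [math.log(0.9), math.log(0.8), math.log(0.95)]
uncertain = [math.log(0.2), math.log(0.1), math.log(0.3)]
assert logprob_reward(confident) > logprob_reward(uncertain)
```

The smoother signal is what lets this style of reward work on non-verifiable tasks, where an exact-match check is unavailable.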
✨Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers
📝 Summary:
This paper introduces depth-wise semantic routing to fuse multi-layer LLM hidden states, enhancing text conditioning in DiT models. It significantly improves text-image alignment and compositional generation. Time-wise fusion can degrade results due to trajectory mismatch.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03510
• PDF: https://arxiv.org/pdf/2602.03510
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering
📝 Summary:
MEnvAgent is a multi-language framework that automates environment construction for software engineering tasks using a planning-execution-verification architecture and an environment reuse mechanism.
🔹 Publication Date: Published on Jan 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.22859
• PDF: https://arxiv.org/pdf/2601.22859
✨ Datasets citing this paper:
• https://huggingface.co/datasets/ernie-research/MEnvBench
• https://huggingface.co/datasets/ernie-research/MEnvData-SWE
• https://huggingface.co/datasets/ernie-research/MEnvData-SWE-Trajectory
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Efficient Autoregressive Video Diffusion with Dummy Head
📝 Summary:
Autoregressive video diffusion models underutilize historical frames. Dummy Forcing improves efficiency through heterogeneous memory allocation and dynamic head programming. This method achieves up to 2.0x speedup with less than 0.5% quality drop, enabling faster video generation.
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20499
• PDF: https://arxiv.org/pdf/2601.20499
• Project Page: https://csguoh.github.io/project/DummyForcing/
• Github: https://github.com/csguoh/DummyForcing
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VideoDiffusion #AutoregressiveModels #GenerativeAI #DeepLearning #AI
✨OmniRad: A Radiological Foundation Model for Multi-Task Medical Image Analysis
📝 Summary:
OmniRad is a self-supervised radiological foundation model pretrained on 1.2 million medical images. It improves classification F1 by 2.05 percent and achieves better segmentation through representation reuse and cross-task transferability.
🔹 Publication Date: Published on Feb 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04547
• PDF: https://arxiv.org/pdf/2602.04547
• Github: https://github.com/unica-visual-intelligence-lab/OmniRad
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MedicalAI #FoundationModels #Radiology #SelfSupervisedLearning #MedicalImaging
✨LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
📝 Summary:
LongVPO is a two-stage DPO framework for short-context VLMs to understand long videos. It uses synthetic preference data from anchored clips and recursive captioning for multi-segment reasoning. LongVPO achieves state-of-the-art with minimal human annotation.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02341
• PDF: https://arxiv.org/pdf/2602.02341
• Github: https://github.com/MCG-NJU/LongVPO
🔹 Models citing this paper:
• https://huggingface.co/MCG-NJU/LongVPO-Stage2-InternVL3-8B
• https://huggingface.co/MCG-NJU/LongVPO-Stage1-InternVL3-8B
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VideoUnderstanding #MachineLearning #VLMs #DeepLearning #AIResearch
✨SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild?
📝 Summary:
SpatiaLab introduces a comprehensive benchmark for evaluating vision-language models' spatial reasoning in realistic scenarios. Experiments show a significant performance gap between current models and humans, revealing major limitations in tasks such as depth estimation and 3D geometry.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03916
• PDF: https://arxiv.org/pdf/2602.03916
• Project Page: https://spatialab-reasoning.github.io/
• Github: https://github.com/SpatiaLab-Reasoning/SpatiaLab
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VisionLanguageModels #SpatialReasoning #ComputerVision #AIResearch #DeepLearning
✨Self-Rewarding Sequential Monte Carlo for Masked Diffusion Language Models
📝 Summary:
This work introduces Self-Rewarding Sequential Monte Carlo (SMC) to improve sampling in masked diffusion language models. The method runs multiple parallel diffusion processes and uses trajectory-level confidence as a self-rewarding signal, guiding generation toward high-quality samples and boosting performance.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01849
• PDF: https://arxiv.org/pdf/2602.01849
• Project Page: https://algolzw.github.io/sr-smc/
• Github: https://github.com/Algolzw/self-rewarding-smc
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#DiffusionModels #SequentialMonteCarlo #LanguageModels #GenerativeAI #MachineLearning
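The resampling step at the heart of SMC-style sampling can be sketched as follows (a minimal toy, assuming multinomial resampling over scalar confidence weights; the string "particles" stand in for partially-unmasked sequences and are purely illustrative):

```python
import random

def resample(particles, weights, rng):
    """Multinomial resampling: high-confidence particles tend to be
    duplicated, low-confidence ones dropped; population size is kept."""
    return [rng.choices(particles, weights=weights)[0] for _ in particles]

rng = random.Random(0)
# Toy particles: candidate sequences with trajectory-level confidence
# acting as the self-reward signal.
particles = ["seq_a", "seq_b", "seq_c"]
weights = [0.9, 0.05, 0.05]
survivors = resample(particles, weights, rng)
assert len(survivors) == len(particles)
```

After resampling, each surviving particle continues its own diffusion process, concentrating compute on promising trajectories without any extra training.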
✨CL-bench: A Benchmark for Context Learning
📝 Summary:
Current LMs struggle with context learning: tasks that require acquiring new knowledge and reasoning beyond what pre-training provides. CL-bench, a new real-world benchmark, shows models solve only 17.2 percent of its tasks, exposing a critical bottleneck for complex real-world applications.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03587
• PDF: https://arxiv.org/pdf/2602.03587
• Project Page: https://www.clbench.com
• Github: https://github.com/Tencent-Hunyuan/CL-bench
✨ Datasets citing this paper:
• https://huggingface.co/datasets/tencent/CL-bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ContextLearning #LanguageModels #AIBenchmark #NLP #AIResearch
✨Proxy Compression for Language Modeling
📝 Summary:
Proxy compression trains language models on both raw bytes and compressed views. This enables efficient training on compressed inputs while supporting robust, end-to-end raw-byte inference. It improves training efficiency and ultimately matches tokenizer-based performance.
🔹 Publication Date: Published on Feb 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04289
• PDF: https://arxiv.org/pdf/2602.04289
• Github: https://github.com/LZhengisme/proxy-compression
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LanguageModels #Compression #MachineLearning #AI #Efficiency
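The two training views can be illustrated with a standard compressor (zlib is a stand-in assumption here, not necessarily the compressor the paper uses):

```python
import zlib

def training_views(text: str):
    """Return the two views proxy compression trains on: raw UTF-8
    bytes and a compressed byte stream of the same text."""
    raw = text.encode("utf-8")
    compressed = zlib.compress(raw)
    return raw, compressed

raw, comp = training_views("the quick brown fox jumps over the lazy dog " * 8)
ratio = len(comp) / len(raw)  # compressed view is shorter: fewer steps per text
assert zlib.decompress(comp) == raw  # raw bytes are always recoverable
```

The shorter compressed sequence is what makes training cheaper, while the lossless mapping back to raw bytes is what keeps byte-level inference available.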
✨3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
📝 Summary:
3DiMo enables view-agnostic human motion control in video generation by training a motion encoder alongside a pretrained video generator to distill driving frames into compact motion tokens.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03796
• PDF: https://arxiv.org/pdf/2602.03796
• Project Page: https://hjrphoebus.github.io/3DiMo/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging
📝 Summary:
Generative 3D models face challenges in animation rigging, which this work addresses by introducing SkinTokens, a learned discrete representation for skinning weights, and TokenRig, a unified autoregressive rigging framework.
🔹 Publication Date: Published on Feb 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04805
• PDF: https://arxiv.org/pdf/2602.04805
• Project Page: https://zjp-shadow.github.io/works/SkinTokens/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨FASA: Frequency-aware Sparse Attention
📝 Summary:
FASA addresses LLM KV cache memory growth for long contexts by dynamically predicting token importance. It leverages functional sparsity across RoPE's frequency chunks to identify critical tokens for focused attention, significantly reducing memory and computation while maintaining high performance.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03152
• PDF: https://arxiv.org/pdf/2602.03152
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #SparseAttention #MemoryEfficiency #DeepLearning #NLP
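The chunk-wise token selection can be sketched in plain Python (a toy illustration; the dot-product scoring, contiguous chunking, and union-of-top-k policy are simplifying assumptions, not the paper's exact algorithm):

```python
def topk_tokens_per_chunk(query, keys, chunk_size, k):
    """Score each cached token with the query restricted to one
    frequency chunk (a contiguous slice of the head dimension), and
    keep the union of the top-k tokens across chunks."""
    dim = len(query)
    keep = set()
    for start in range(0, dim, chunk_size):
        end = start + chunk_size
        # Dot product of query and key within this chunk only.
        scores = [
            sum(q * kv for q, kv in zip(query[start:end], key[start:end]))
            for key in keys
        ]
        ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
        keep.update(ranked[:k])
    return sorted(keep)

# Each chunk surfaces a different critical token; attention then runs
# only over the union instead of the whole KV cache.
kept = topk_tokens_per_chunk(
    [1, 0, 0, 1],
    [[1, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0]],
    chunk_size=2, k=1,
)
assert kept == [0, 1]
```

Because different frequency bands attend to different tokens, scoring per chunk can recover important tokens that a single whole-vector score would miss.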
✨AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
📝 Summary:
AutoFigure is an agentic AI framework that automatically generates publication-ready scientific illustrations from long-form text, using extensive thinking and validation to ensure structural soundness and aesthetic appeal. Supported by FigureBench, a large new benchmark, AutoFigure surpasses baseline methods.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03828
• PDF: https://arxiv.org/pdf/2602.03828
• Github: https://github.com/ResearAI/AutoFigure-Edit
✨ Datasets citing this paper:
• https://huggingface.co/datasets/WestlakeNLP/FigureBench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #GenerativeAI #ScientificIllustrations #ResearchTools #AcademicPublishing