ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
VAREX: A Benchmark for Multi-Modal Structured Extraction from Documents

📝 Summary:
VAREX is a multimodal benchmark for structured data extraction from government forms. It provides four input modalities per document to systematically assess how input format affects extraction accuracy. Key findings show layout-preserving text significantly boosts accuracy and output compliance ...

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15118
• PDF: https://arxiv.org/pdf/2603.15118
• Project Page: https://udibarzi.github.io/varex-bench/
• Github: https://github.com/udibarzi/varex-bench

🔹 Datasets citing this paper:
https://huggingface.co/datasets/ibm-research/VAREX

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#DataExtraction #MultimodalAI #DocumentAI #AIbenchmark #NLP
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

📝 Summary:
Qianfan-OCR is a 4B vision-language model that unifies document parsing, layout analysis, and understanding. It features Layout-as-Thought to improve accuracy on complex layouts and achieves state-of-the-art performance across multiple OCR and document intelligence benchmarks.

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13398
• PDF: https://arxiv.org/pdf/2603.13398
• Project Page: https://github.com/baidubce/Qianfan-VL
• Github: https://github.com/baidubce/Qianfan-VL

🔹 Models citing this paper:
https://huggingface.co/baidu/Qianfan-OCR

==================================

#OCR #DocumentIntelligence #VisionLanguageModel #AI #MachineLearning
WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation

📝 Summary:
Waypoint Diffusion Transformers (WiT) address trajectory conflicts in pixel-space flow matching using semantic waypoints from pre-trained vision models. WiT disentangles generation paths into segments, accelerating training convergence. It outperforms pixel-space baselines and speeds up JiT trainin...

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15132
• PDF: https://arxiv.org/pdf/2603.15132
• Project Page: https://hainuo-wang.github.io/WiT/
• Github: https://github.com/hainuo-wang/WiT

==================================

#DiffusionModels #Transformers #ComputerVision #DeepLearning #AI
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

📝 Summary:
GradMem writes LLM context into memory efficiently via test-time gradient descent on memory tokens. It optimizes a reconstruction loss, outperforming forward-only methods in capacity and efficiency on synthetic and natural language tasks.
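
To make the idea of "writing" context by gradient descent concrete, here is a minimal toy sketch. It is not the GradMem implementation: the fixed linear `decoder`, the squared-error reconstruction loss, the step count, and the learning rate are all assumptions chosen for illustration, and a real system would backpropagate through a frozen LLM instead.

```python
def reconstruct(decoder, memory):
    """Linear stand-in for 'read the context back out of memory'."""
    return [sum(w * m for w, m in zip(row, memory)) for row in decoder]


def reconstruction_loss(context, decoder, memory):
    """0.5 * squared error between the reconstruction and the context."""
    recon = reconstruct(decoder, memory)
    return 0.5 * sum((r - c) ** 2 for r, c in zip(recon, context))


def write_to_memory(context, decoder, steps=200, lr=0.1):
    """Optimize only the memory vector; the decoder stays frozen."""
    memory = [0.0] * len(decoder[0])
    for _ in range(steps):
        residual = [r - c for r, c in zip(reconstruct(decoder, memory), context)]
        # Gradient of the loss w.r.t. memory is decoder^T @ residual.
        grad = [
            sum(decoder[i][j] * residual[i] for i in range(len(decoder)))
            for j in range(len(memory))
        ]
        memory = [m - lr * g for m, g in zip(memory, grad)]
    return memory


# Hypothetical toy data: a 3-value "context" compressed into 2 memory slots.
decoder = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [1.0, 2.0, 3.0]
memory = write_to_memory(context, decoder)
```

The point of the sketch is only that the model weights are untouched at test time; the memory slots are the sole trainable parameters.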

🔹 Publication Date: Published on Mar 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13875
• PDF: https://arxiv.org/pdf/2603.13875
• Github: https://github.com/yurakuratov/gradmem

==================================

#LLM #GradientDescent #MachineLearning #NLP #AIResearch
SuperLocalMemory V3: Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory

📝 Summary:
This paper establishes information-geometric foundations for AI agent memory. It introduces a new retrieval metric, principled lifecycle management, and formal contradiction detection, improving performance on benchmarks with a zero-LLM architecture.

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14588
• PDF: https://arxiv.org/pdf/2603.14588
• Github: https://github.com/qualixar/superlocalmemory

==================================

#AIAgents #AgentMemory #InformationGeometry #ZeroLLM #EnterpriseAI
Theoretical Foundations of Latent Posterior Factors: Formal Guarantees for Multi-Evidence Reasoning

📝 Summary:
Latent Posterior Factors (LPF) is a theoretical framework for trustworthy AI that combines heterogeneous evidence in probabilistic prediction tasks. It offers formal guarantees for key desiderata such as calibration, error decay, and graceful degradation under corruption, all empirically validated.

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15674
• PDF: https://arxiv.org/pdf/2603.15674

==================================

#TrustworthyAI #AIResearch #ProbabilisticAI #MachineLearning #FormalGuarantees
I Know What I Don't Know: Latent Posterior Factor Models for Multi-Evidence Probabilistic Reasoning

📝 Summary:
This paper introduces Latent Posterior Factors (LPF), a framework combining VAE latent posteriors with Sum-Product Network inference. LPF enables tractable probabilistic reasoning over unstructured evidence while maintaining calibrated uncertainty. It achieves high accuracy and low calibration erro...

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15670
• PDF: https://arxiv.org/pdf/2603.15670
• Github: https://github.com/aaaEpalea/epalea

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Chain-of-Trajectories: Unlocking the Intrinsic Generative Optimality of Diffusion Models via Graph-Theoretic Planning

📝 Summary:
The Chain-of-Trajectories framework enables deliberative planning for diffusion models by using Diffusion DNA to dynamically allocate computational resources based on denoising difficulty.

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14704
• PDF: https://github.com/UnicomAI/CoTj/blob/main/CoTj_v20260305.pdf
• Github: https://github.com/UnicomAI/CoTj

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Mixture of Style Experts for Diverse Image Stylization

📝 Summary:
StyleExpert introduces a Mixture of Experts architecture for image stylization. It uses a unified style encoder and gating mechanism to handle diverse styles across semantic levels. This preserves semantics and material details better than existing methods.
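
For readers unfamiliar with expert gating, a generic top-k Mixture of Experts forward pass can be sketched as below. This is the textbook pattern, not StyleExpert's architecture: the function names, the linear gate, the toy experts, and `top_k=2` are all hypothetical choices for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route x to the top_k experts and mix their outputs by gate weight."""
    # One gate logit per expert (assumed linear gate: gate_weights @ x).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize over the selected experts only.
    probs = softmax([logits[i] for i in chosen])
    output = [0.0] * len(x)
    for prob, idx in zip(probs, chosen):
        expert_out = experts[idx](x)
        output = [o + prob * y for o, y in zip(output, expert_out)]
    return output, dict(zip(chosen, probs))


# Hypothetical toy experts standing in for style-specific sub-networks.
experts = [
    lambda v: [2.0 * a for a in v],
    lambda v: [-a for a in v],
    lambda v: [a + 1.0 for a in v],
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
mixed, routing = moe_forward([1.0, 0.0], gate_weights, experts)
```

In a stylization setting the gate input would presumably be a style embedding rather than the raw feature vector; that detail is not given in the summary above.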

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16649
• PDF: https://arxiv.org/pdf/2603.16649
• Project Page: https://hh-lg.github.io/StyleExpert-Page/
• Github: https://github.com/HVision-NKU/StyleExpert

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Omnilingual MT: Machine Translation for 1,600 Languages

📝 Summary:
Omnilingual MT (OMT) is the first system to support over 1,600 languages. It uses specialized smaller LLMs (1B-8B) to outperform 70B baselines, achieving high-quality translation and coherent generation in low-compute settings.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16309
• PDF: https://arxiv.org/pdf/2603.16309

🔹 Datasets citing this paper:
https://huggingface.co/datasets/facebook/bouquet

🔹 Spaces citing this paper:
https://huggingface.co/spaces/facebook/bouquet

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration

📝 Summary:
Idea-Catalyst is a framework that supports interdisciplinary research by identifying insights across domains to enhance creative reasoning in scientific discovery.

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.12226
• PDF: https://arxiv.org/pdf/2603.12226
• Project Page: https://pkargupta.github.io/idea_catalyst.html
• Github: https://pkargupta.github.io/idea_catalyst.html

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HistoAtlas: A Pan-Cancer Morphology Atlas Linking Histomics to Molecular Programs and Clinical Outcomes

📝 Summary:
HistoAtlas is a pan-cancer computational map linking 38 H&E histomic features to patient outcomes and molecular profiles across 21 cancer types. It reveals new biology and allows biomarker discovery from routine slides.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16587
• PDF: https://arxiv.org/pdf/2603.16587
• Project Page: https://histoatlas.com
• Github: https://github.com/HistoAtlas/HistoAtlas

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SparkVSR: Interactive Video Super-Resolution via Sparse Keyframe Propagation

📝 Summary:
SparkVSR offers interactive video super-resolution using sparse keyframes as user control. It propagates high-resolution keyframe information through the video, guided by motion, enhancing temporal consistency and restoration quality.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16864
• PDF: https://arxiv.org/pdf/2603.16864
• Project Page: https://sparkvsr.github.io/
• Github: https://github.com/taco-group/SparkVSR

🔹 Models citing this paper:
https://huggingface.co/JiongzeYu/SparkVSR

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MEMO: Memory-Augmented Model Context Optimization for Robust Multi-Turn Multi-Agent LLM Games

📝 Summary:
MEMO, a memory-augmented model context optimization framework, improves multi-agent LLM game performance and stability through retained insights and exploratory prompt evolution with uncertainty-aware...

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09022
• PDF: https://arxiv.org/pdf/2603.09022
• Project Page: https://yunfeixie233.github.io/MEMO/
• Github: https://github.com/openverse-ai/MEMO

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
BERTology of Molecular Property Prediction

📝 Summary:
Researchers systematically investigate how dataset size, model size, and standardization impact chemical language model performance in molecular property prediction. This study provides numerical evidence to understand mechanisms affecting performance and resolve inconsistent literature results.

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13627
• PDF: https://arxiv.org/pdf/2603.13627
• Github: https://github.com/molssi-ai/bertology

==================================

#MolecularPropertyPrediction #ChemicalLanguageModels #BERT #DeepLearning #Cheminformatics
V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising

📝 Summary:
Pixel-space diffusion models can be enhanced through visual co-denoising techniques that incorporate pretrained visual features, with systematic analysis revealing key architectural and training compo...

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16792
• PDF: https://arxiv.org/pdf/2603.16792

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ECG-Reasoning-Benchmark: A Benchmark for Evaluating Clinical Reasoning Capabilities in ECG Interpretation

📝 Summary:
While Multimodal Large Language Models (MLLMs) show promising performance in automated electrocardiogram interpr...

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14326
• PDF: https://arxiv.org/pdf/2603.14326

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Jwoo5/ECG-Reasoning-Benchmark

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Residual Stream Duality in Modern Transformer Architectures

📝 Summary:
The residual stream in Transformers can be viewed through a two-axis framework where sequence position and layer depth provide different pathways for information flow, with causal depth-wise residual ...

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16039
• PDF: https://arxiv.org/pdf/2603.16039
• Project Page: https://github.com/yifanzhang-pro/residual-stream-duality
• Github: https://github.com/yifanzhang-pro/residual-stream-duality

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ARISE: Agent Reasoning with Intrinsic Skill Evolution in Hierarchical Reinforcement Learning

📝 Summary:
A hierarchical reinforcement learning framework named ARISE employs a skill management system to improve mathematical reasoning in language models through reusable strategies and structured skill libr...

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16060
• PDF: https://arxiv.org/pdf/2603.16060
• Github: https://github.com/Skylanding/ARISE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal Scaling of Diffusion Language Models

📝 Summary:
MDM-Prime-v2 enhances masked diffusion language models with Binary Encoding and Index Shuffling. It is 21.8 times more compute-efficient than autoregressive models, achieving significantly better perplexity and zero-shot accuracy.
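
As a rough illustration of what binary encoding with index shuffling could look like at the tokenizer level, consider the sketch below. The summary does not say how the paper defines either step, so this assumes the simplest reading: a fixed random permutation of vocabulary indices followed by a fixed-width bit representation. All function names and the seed are hypothetical.

```python
import random


def make_shuffle(vocab_size, seed=0):
    """Build a fixed random permutation of token indices and its inverse."""
    rng = random.Random(seed)
    perm = list(range(vocab_size))
    rng.shuffle(perm)
    inverse = [0] * vocab_size
    for original, shuffled in enumerate(perm):
        inverse[shuffled] = original
    return perm, inverse


def encode(token_id, perm, num_bits):
    """Shuffle the index, then emit its bits, most significant first."""
    code = perm[token_id]
    return [(code >> shift) & 1 for shift in reversed(range(num_bits))]


def decode(bits, inverse):
    """Rebuild the shuffled index from its bits, then undo the shuffle."""
    code = 0
    for bit in bits:
        code = (code << 1) | bit
    return inverse[code]


# Toy vocabulary of 16 tokens, so 4 bits per token are enough.
perm, inverse = make_shuffle(vocab_size=16)
bits = encode(5, perm, num_bits=4)
restored = decode(bits, inverse)
```

Under this reading, each token becomes several binary sub-tokens that a masked diffusion model can unmask independently, and the shuffle breaks any accidental structure in the original index ordering.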

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16077
• PDF: https://arxiv.org/pdf/2603.16077
• Project Page: https://chen-hao-chao.github.io/mdm-prime-v2/
• Github: https://github.com/chen-hao-chao/mdm-prime-v2

🔹 Models citing this paper:
https://huggingface.co/chen-hao-chao/mdm-prime-v2-c4
https://huggingface.co/chen-hao-chao/mdm-prime-v2-slimpajama

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research