ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video

📝 Summary:
ReconPhys is the first feedforward framework to jointly learn physical attribute estimation and 3D Gaussian Splatting reconstruction from a single video. It offers significantly faster inference and superior reconstruction quality for non-rigid objects compared to prior optimization-based methods...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07882
• PDF: https://arxiv.org/pdf/2604.07882
• Project Page: https://chuanshuogushi.github.io/ReconPhys/
• Github: https://chuanshuogushi.github.io/ReconPhys/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#ComputerVision #3DReconstruction #GaussianSplatting #DeepLearning #AIResearch
SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering

📝 Summary:
SemaClaw is an open-source multi-agent framework addressing the need for robust infrastructure for personal AI agents. It ensures control and trustworthiness through novel orchestration, safety, and context management components, advancing general-purpose personal AI via harness engineering.

🔹 Publication Date: Published on Apr 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11548
• PDF: https://arxiv.org/pdf/2604.11548
• Github: https://github.com/midea-ai/sema-code-core

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Mobile GUI Agents under Real-world Threats: Are We There Yet?

📝 Summary:
Mobile GUI agents powered by large language models show significant performance degradation when exposed to real-world third-party content in commercial applications.

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.04227
• PDF: https://arxiv.org/pdf/2507.04227
• Project Page: https://agenthazard.github.io
• Github: https://github.com/Zsbyqx20/AgentHazard

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Target Policy Optimization

📝 Summary:
Target Policy Optimization separates policy update decisions from probability assignment in reinforcement learning, improving performance over standard policy gradient methods in sparse reward scenari...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06159
• PDF: https://arxiv.org/pdf/2604.06159
• Github: https://github.com/JeanKaddour/tpo

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective

📝 Summary:
This survey focuses on generalizable feed-forward 3D reconstruction, which efficiently maps images to 3D representations. It proposes a novel taxonomy centered on model design strategies, addressing key problems like feature enhancement and model efficiency, rather than output format differences.

🔹 Publication Date: Published on Apr 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14025
• PDF: https://arxiv.org/pdf/2604.14025
• Project Page: https://ff3d-survey.github.io
• Github: https://github.com/ziplab/Awesome-Feed-Forward-3D

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SkVM: Compiling Skills for Efficient Execution Everywhere

📝 Summary:
SkVM is a compilation and runtime system that enables portable and efficient execution of LLM skills across different models and platforms by treating skills as code and analyzing capability requireme...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03088
• PDF: https://arxiv.org/pdf/2604.03088
• Project Page: https://skillvm.ai/index.html
• Github: https://github.com/SJTU-IPADS/SkVM

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Geometric Context Transformer for Streaming 3D Reconstruction

📝 Summary:
LingBot-Map is a feed-forward 3D foundation model that reconstructs scenes from video streams using a geometric context transformer architecture with specialized attention mechanisms for coordinate gr...

🔹 Publication Date: Published on Apr 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14141
• PDF: https://arxiv.org/pdf/2604.14141
• Project Page: https://technology.robbyant.com/lingbot-map
• Github: https://github.com/robbyant/lingbot-map

🔹 Models citing this paper:
https://huggingface.co/robbyant/lingbot-map

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Narrative-Driven Paper-to-Slide Generation via ArcDeck

📝 Summary:
ArcDeck is a multi-agent framework for paper-to-slide generation that models a paper's logical flow through discourse trees. It uses an iterative refinement process to ensure narrative coherence and improve presentations over direct summarization methods.

🔹 Publication Date: Published on Apr 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11969
• PDF: https://arxiv.org/pdf/2604.11969
• Project Page: https://arcdeck.org/
• Github: https://github.com/RehgLab/ArcDeck

🔹 Datasets citing this paper:
https://huggingface.co/datasets/ArcDeck/ArcBench

==================================

#AIPresentations #NLP #GenerativeAI #ResearchTools #AcademicPublishing
HDR Video Generation via Latent Alignment with Logarithmic Encoding

📝 Summary:
This work enables high dynamic range (HDR) video generation by leveraging pretrained generative models. It combines logarithmic encoding, which aligns HDR imagery with the models' latent spaces, with camera-mimicking degradation training, achieving strong results without architectural redesign or complex retraining.
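
To make the idea of logarithmic encoding concrete, here is a minimal sketch of a generic log curve that compresses linear HDR radiance into the [0, 1] range a pretrained model typically expects. The summary does not state the paper's exact curve, so the `log1p` form, the function names, and the `MAX_LINEAR` ceiling are all assumptions chosen for illustration.

```python
import math

# Hypothetical ceiling on linear radiance; not taken from the paper.
MAX_LINEAR = 1000.0


def log_encode(x, max_linear=MAX_LINEAR):
    """Map linear HDR radiance in [0, max_linear] to [0, 1] with a log curve."""
    x = min(max(x, 0.0), max_linear)  # clamp out-of-range input
    return math.log1p(x) / math.log1p(max_linear)


def log_decode(y, max_linear=MAX_LINEAR):
    """Invert log_encode: map a value in [0, 1] back to linear radiance."""
    return math.expm1(y * math.log1p(max_linear))


if __name__ == "__main__":
    # Print radiance, encoded value, and the round-trip result.
    for radiance in (0.0, 0.18, 1.0, 100.0, 1000.0):
        encoded = log_encode(radiance)
        print(f"{radiance:8.2f} -> {encoded:.4f} -> {log_decode(encoded):8.2f}")
```

A curve like this spends most of the output range on dark and mid tones, which is one plausible reason log-encoded HDR frames sit closer to the value distribution of standard footage.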

🔹 Publication Date: Published on Apr 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11788
• PDF: https://arxiv.org/pdf/2604.11788
• Project Page: https://hdr-lumivid.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling

📝 Summary:
LangFlow demonstrates that continuous diffusion models can match discrete counterparts in language modeling by leveraging embedding-space flow matching with novel training techniques and noise schedul...

🔹 Publication Date: Published on Apr 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11748
• PDF: https://arxiv.org/pdf/2604.11748
• Project Page: https://caradryanl.github.io/blog/2026/langflow/
• Github: https://github.com/nealchen2003/LangFlow

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision

📝 Summary:
Self-Distillation Zero trains a model to transform binary rewards into dense token-level self-supervision through dual-role training and on-policy self-distillation, achieving superior performance in ...

🔹 Publication Date: Published on Apr 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12002
• PDF: https://arxiv.org/pdf/2604.12002

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

📝 Summary:
Vision-language models can be adapted for regional contexts through Anthropogenic Regional Adaptation and the GG-EZ method while maintaining global performance and improving cultural relevance.

🔹 Publication Date: Published on Apr 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11490
• PDF: https://arxiv.org/pdf/2604.11490
• Project Page: https://huggingface.co/collections/SEACrowd/sea-vl-phase-2-multimodal-vision-language-models-for-sea

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
What do Language Models Learn and When? The Implicit Curriculum Hypothesis

📝 Summary:
LLM pretraining follows an Implicit Curriculum Hypothesis, showing a compositional and predictable skill emergence. Capabilities emerge consistently across models, with composite tasks appearing after their components. This order is encoded in representations, allowing prediction of training traj...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08510
• PDF: https://arxiv.org/pdf/2604.08510
• Github: https://github.com/KaiserWhoLearns/ElementalTask

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ROSE: An Intent-Centered Evaluation Metric for NL2SQL

📝 Summary:
ROSE is a new NL2SQL metric that addresses the unreliability of Execution Accuracy. It evaluates whether the predicted SQL answers the user's intent via a Prover-Refuter cascade, showing superior agreement with human experts.
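
For readers unfamiliar with why Execution Accuracy can mislead, the sketch below shows the standard idea of comparing result sets, and how a query with the wrong intent can still match on one particular database. It does not implement ROSE or its Prover-Refuter cascade; the table, the question, and both queries are hypothetical.

```python
import sqlite3


def execution_match(conn, gold_sql, predicted_sql):
    """Return True when both queries yield the same multiset of rows."""
    gold_rows = sorted(conn.execute(gold_sql).fetchall())
    predicted_rows = sorted(conn.execute(predicted_sql).fetchall())
    return gold_rows == predicted_rows


def build_demo_db():
    """Create a tiny in-memory table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [("Ana", "eng", 120), ("Bo", "eng", 90), ("Cy", "ops", 70)],
    )
    return conn


# Hypothetical question: "Who earns more than 100?"
GOLD = "SELECT name FROM employees WHERE salary > 100"
# Different intent (top earner only) that coincides with GOLD on the demo data.
PREDICTED = "SELECT name FROM employees ORDER BY salary DESC LIMIT 1"

if __name__ == "__main__":
    conn = build_demo_db()
    print(execution_match(conn, GOLD, PREDICTED))  # match on the initial data
    conn.execute("INSERT INTO employees VALUES ('Di', 'ops', 150)")
    print(execution_match(conn, GOLD, PREDICTED))  # mismatch after one new row
```

The verdict flips once a single row is added, which illustrates how a result-set comparison depends on the database instance rather than on the question being asked.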

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12988
• PDF: https://arxiv.org/pdf/2604.12988

==================================

#NL2SQL #NLP #EvaluationMetrics #AIResearch #DataScience
Cross-Tokenizer LLM Distillation through a Byte-Level Interface

📝 Summary:
Byte-Level Distillation (BLD) is a simple new method for cross-tokenizer LLM knowledge transfer. It uses a shared byte-level interface, converting teacher outputs to byte probabilities for student distillation. BLD performs competitively with complex approaches, suggesting the byte level is a natur...
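
As a rough illustration of what "converting teacher outputs to byte probabilities" could look like, the sketch below marginalizes a next-token distribution into a next-byte distribution by summing the probability of every token that starts with the same UTF-8 byte. The summary does not describe the paper's actual conversion, so this first-byte marginalization, the function name, and the toy vocabulary are assumptions.

```python
from collections import defaultdict


def next_byte_distribution(token_probs):
    """Collapse {token_string: probability} into {first_byte: probability}.

    Simplifying assumption: only the first byte of each token is considered,
    and empty tokens are skipped.
    """
    byte_probs = defaultdict(float)
    for token, prob in token_probs.items():
        encoded = token.encode("utf-8")
        if not encoded:
            continue
        byte_probs[encoded[0]] += prob
    total = sum(byte_probs.values())
    if total == 0:
        raise ValueError("no probability mass on non-empty tokens")
    # Renormalize in case empty tokens carried some of the mass.
    return {byte: prob / total for byte, prob in byte_probs.items()}


if __name__ == "__main__":
    # Toy teacher distribution over a hypothetical vocabulary.
    teacher = {"the": 0.5, "then": 0.2, "a": 0.3}
    for byte, prob in sorted(next_byte_distribution(teacher).items()):
        print(chr(byte), round(prob, 3))
```

Because every tokenizer ultimately emits bytes, a distribution in this form can be compared against a student with a different vocabulary, which is the appeal of a byte-level interface.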

🔹 Publication Date: Published on Apr 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07466
• PDF: https://arxiv.org/pdf/2604.07466

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

📝 Summary:
MM-WebAgent is a hierarchical agentic framework that coordinates AIGC-based element generation for coherent and visually consistent webpage design through joint optimization of layout and multimodal c...

🔹 Publication Date: Published on Apr 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.15309
• PDF: https://arxiv.org/pdf/2604.15309
• Github: https://github.com/microsoft/MM-webagent

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack

📝 Summary:
Activation-Scaling Guard (ASGuard) mitigates brittle refusal behaviors in large language models by identifying and recalibrating specific attention heads vulnerable to tense-based jailbreaking attacks...

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.25843
• PDF: https://arxiv.org/pdf/2509.25843
• Github: https://github.com/dmis-lab/ASGuard

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

📝 Summary:
HY-World 2.0 is a multi-modal world model framework that generates high-fidelity 3D Gaussian Splatting scenes from diverse inputs using specialized modules for panorama generation, trajectory planning...

🔹 Publication Date: Published on Apr 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14268
• PDF: https://arxiv.org/pdf/2604.14268
• Project Page: https://3d-models.hunyuan.tencent.com/world/
• Github: https://github.com/Tencent-Hunyuan/HY-World-2.0

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

📝 Summary:
A teacher-student cooperation framework for data synthesis addresses stylistic divergence in synthetic data, improving model fine-tuning performance.

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14164
• PDF: https://arxiv.org/pdf/2604.14164
• Github: https://github.com/CoopReason/TESSY

🔹 Datasets citing this paper:
https://huggingface.co/datasets/CoopReason/TESSY-Code-80K

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning

📝 Summary:
LongAct improves long-context reasoning in LLMs by implementing saliency-guided sparse updates based on high-magnitude activation patterns in query and key vectors.

🔹 Publication Date: Published on Apr 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14922
• PDF: https://arxiv.org/pdf/2604.14922

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

📝 Summary:
UniDoc-RL introduces a reinforcement learning framework for LVLMs that jointly optimizes retrieval, reranking, visual perception, and reasoning through hierarchical decision-making and dense multi-rew...

🔹 Publication Date: Published on Apr 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14967
• PDF: https://arxiv.org/pdf/2604.14967
• Github: https://github.com/deepglint/UniDoc-RL

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research