ML Research Hub
32.3K subscribers
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
EXAONE 4.5 Technical Report

📝 Summary:
EXAONE 4.5 is LG AI Research's first open-weight vision language model, integrating a visual encoder into EXAONE 4.0. It enhances document understanding and general language capabilities through targeted data and extended context, outperforming similar models in document tasks.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08644
• PDF: https://arxiv.org/pdf/2604.08644
• Github: https://github.com/LG-AI-EXAONE/EXAONE-4.5

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#VisionLanguageModel #AI #DocumentUnderstanding #MultimodalAI #OpenSourceAI
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

📝 Summary:
FORGE introduces a multimodal manufacturing dataset, revealing that MLLM performance is limited by domain-specific knowledge, not visual grounding. Fine-tuning on FORGE's annotations significantly improves accuracy, offering a path for domain-adapted MLLMs.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07413
• PDF: https://arxiv.org/pdf/2604.07413
• Project Page: https://ai4manufacturing.github.io/forge-web/
• Github: https://github.com/AI4Manufacturing/FORGE

Datasets citing this paper:
https://huggingface.co/datasets/AI4Manufacturing/forge

==================================

#FORGE #MLLM #ManufacturingAI #MultimodalAI #DomainAdaptation
Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video

📝 Summary:
A novel cross-modal emotion transfer approach generates expressive talking face videos by modeling emotion semantic vectors between speech and visual feature spaces, achieving superior emotion accurac...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07786
• PDF: https://arxiv.org/pdf/2604.07786
• Project Page: https://chanhyeok-choi.github.io/C-MET/
• Github: https://github.com/ChanHyeok-Choi/C-MET

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
WildDet3D: Scaling Promptable 3D Detection in the Wild

📝 Summary:
WildDet3D is a unified architecture for open-world 3D object detection, accepting multiple prompt types and integrating geometric cues. It leverages WildDet3D-Data, the largest 3D dataset, to achieve state-of-the-art performance across benchmarks, with significant gains from incorporating depth i...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08626
• PDF: https://arxiv.org/pdf/2604.08626
• Project Page: https://allenai.github.io/WildDet3D/
• Github: https://github.com/allenai/WildDet3D

==================================

#3DObjectDetection #ComputerVision #DeepLearning #AI #Datasets
Structured Causal Video Reasoning via Multi-Objective Alignment

📝 Summary:
This paper introduces Structured Event Facts for explicit causal video reasoning, moving beyond unstructured methods. It uses a multi-objective reinforcement learning pipeline to balance training goals, leading to Factum-4B. This model achieves reliable, stronger performance on complex temporal v...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04415
• PDF: https://arxiv.org/pdf/2604.04415

==================================

#CausalAI #VideoReasoning #ReinforcementLearning #ComputerVision #AIResearch
ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion

📝 Summary:
ECHO is an efficient diffusion model for chest X-ray report generation. It achieves fast one-step-per-block inference using Direct Conditional Distillation and Response-Asymmetric Diffusion. ECHO delivers an 8x speedup and improved accuracy over state-of-the-art methods.

🔹 Publication Date: Published on Apr 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09450
• PDF: https://arxiv.org/pdf/2604.09450
• Project Page: https://echo-midea-airc.github.io/
• Github: https://github.com/clf28/ECHO

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism

📝 Summary:
LLMs use a distinct, compact internal mechanism for generating harmful content, separate from benign functions. This compressed structure explains why fine-tuning can cause broad emergent misalignment, offering new ways to improve AI safety.

🔹 Publication Date: Published on Apr 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09544
• PDF: https://arxiv.org/pdf/2604.09544

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Exploring the Future of AI: Neutrosophic Graph Neural Networks (NGNN)

Neutrosophic Graph Neural Networks (NGNN) are an emerging line of research that combines graph neural networks with neutrosophic logic. The following overview details the concept and its implications.

Most artificial intelligence models assume that their input data is clean and reliable, yet real-world data is frequently imperfect. NGNN is aimed at that gap.

The foundational inquiry addresses the following:
How does artificial intelligence manage data characterized by uncertainty, incompleteness, or contradiction?

Traditional models exhibit limitations in this regard, often assuming certainty where none exists.

The Foundation: Neutrosophic Logic
In the late 1990s, mathematician Florentin Smarandache introduced a framework extending beyond binary true/false dichotomies. He proposed three dimensions of truth:
T — What is true
I — What is indeterminate
F — What is false

Between 2000 and 2015, this framework evolved into neutrosophic sets and neutrosophic graphs, mathematical tools capable of encoding uncertainty within data and relationships.
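
As a minimal sketch of the idea, a single neutrosophic value can be modeled as a (T, I, F) triple. The class name `NeutrosophicValue` and its method are hypothetical, chosen only for illustration. The sketch assumes the single-valued convention, in which each degree lies in [0, 1] and the three degrees are independent, so their sum may range from 0 to 3 rather than being forced to 1. The complement shown is one common convention, not the only one in the literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutrosophicValue:
    """Hypothetical container for one (T, I, F) triple."""

    t: float  # degree of truth
    i: float  # degree of indeterminacy
    f: float  # degree of falsity

    def __post_init__(self):
        # Each degree is bounded on its own; the sum is NOT constrained to 1.
        for name, value in (("t", self.t), ("i", self.i), ("f", self.f)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")

    def complement(self) -> "NeutrosophicValue":
        # Assumed convention: swap truth and falsity, invert indeterminacy.
        return NeutrosophicValue(self.f, 1.0 - self.i, self.t)


# A reading that is mostly true, fairly uncertain, and slightly false.
reading = NeutrosophicValue(t=0.7, i=0.4, f=0.2)
total = reading.t + reading.i + reading.f  # exceeds 1, which is allowed here
```

The point of the example is that the three degrees do not compete for a fixed budget the way probabilities do, which is what lets a value express contradiction (high T and high F) or ignorance (high I).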

The Parallel Rise of Graph Neural Networks
Around 2016, the artificial intelligence sector adopted Graph Neural Networks (GNNs), models designed to learn from nodes (data points) and edges (relationships). These models became foundational in social networks, healthcare, fraud detection, and bioinformatics.

However, standard GNNs have a critical limitation: they typically treat node features and edges as certain, whereas real-world data is often uncertain, incomplete, or contradictory.

The Convergence: NGNN
From 2020 onwards, researchers began integrating these two domains. In an NGNN, rather than carrying only features, a node encapsulates:
— T: What is likely true
— I: What remains uncertain
— F: What may be false
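
To make the node description above concrete, here is a minimal sketch of one message-passing step in which each node carries a (T, I, F) triple next to its features. The function names and the confidence formula `t * (1 - i) * (1 - f)` are assumptions made for illustration only; they are not taken from any published NGNN architecture, and there are no learned weights.

```python
def confidence(t, i, f):
    # Hypothetical scoring rule: trust a node more when it is likely true,
    # and less when it is indeterminate or likely false.
    return t * (1.0 - i) * (1.0 - f)


def aggregate(features, tif, neighbors):
    """Replace each node's features with a confidence-weighted neighbor mean."""
    updated = []
    for node, nbrs in enumerate(neighbors):
        dim = len(features[node])
        weighted_sum = [0.0] * dim
        weight_total = 0.0
        for n in nbrs:
            w = confidence(*tif[n])
            weight_total += w
            for k in range(dim):
                weighted_sum[k] += w * features[n][k]
        if weight_total == 0.0:
            # No usable neighbors: keep the node's own features.
            updated.append(list(features[node]))
        else:
            updated.append([x / weight_total for x in weighted_sum])
    return updated


# Three nodes; node 0 listens to nodes 1 and 2.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
tif = [(1.0, 0.0, 0.0), (0.9, 0.0, 0.0), (0.9, 0.5, 0.0)]
neighbors = [[1, 2], [0], [0]]

result = aggregate(features, tif, neighbors)
```

In this toy graph, node 2 has the same truth degree as node 1 but higher indeterminacy, so node 0's update leans toward node 1. A real model would learn how to combine the three degrees rather than use a fixed product.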

This constitutes not a minor upgrade, but a fundamental shift in how artificial intelligence models perceive and process reality.

Key Application Areas:
Healthcare — Navigating uncertain or conflicting diagnoses
Fraud detection — Identifying ambiguous behavioral patterns
Social networks — Modeling unclear or evolving relationships
Bioinformatics — Managing the complexity of biological interactions

Is NGNN advanced machine learning?
Yes. It sits at the intersection of:
Graph theory · Deep learning · Mathematical logic · Uncertainty modeling

NGNN remains research-stage work and is not yet widely deployed in industry, which makes it an area worth following early.

The Broader Context
NGNN is not merely another model; it signifies a philosophical shift in artificial intelligence from systems assuming certainty to systems reasoning through uncertainty. Real-world problems are rarely perfect; therefore, models should not presume perfection.

This is less an incremental improvement than a possible direction for the field.

——

#ArtificialIntelligence #MachineLearning #DeepLearning #GraphNeuralNetworks #AIResearch #DataScience #FutureOfAI #Innovation #EmergingTech #NGNN #AIHealthcare #Bioinformatics
AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents

📝 Summary:
AgentSwing adaptively manages context for long-horizon web agents using parallel branching and lookahead routing. This state-aware framework outperforms static methods, reducing interactions while improving search efficiency and terminal precision.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27490
• PDF: https://arxiv.org/pdf/2603.27490

==================================

#WebAgents #AI #ContextManagement #ParallelComputing #AgentAI
ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery

📝 Summary:
ScheMatiQ uses large language models to automatically generate annotation schemas and structured databases from research questions and document collections. Its interactive web interface allows users to steer the extraction, supporting real-world analysis in law and biology.

🔹 Publication Date: Published on Apr 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09237
• PDF: https://arxiv.org/pdf/2604.09237
• Project Page: https://www.schematiq-ai.com/
• Github: https://github.com/shaharl6000/ScheMatiQ

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Large Language Models Align with the Human Brain during Creative Thinking

📝 Summary:
Large language models show varying alignment with brain activity during creative thinking tasks, with model size and post-training objectives influencing how well their representations match neural re...

🔹 Publication Date: Published on Apr 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03480
• PDF: https://arxiv.org/pdf/2604.03480

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
On Semiotic-Grounded Interpretive Evaluation of Generative Art

📝 Summary:
Generative art evaluation framework based on Peircean semiotics assesses symbolic and indexical meaning through hierarchical semiosis graphs, improving alignment with human artistic interpretation.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08641
• PDF: https://arxiv.org/pdf/2604.08641
• Github: https://github.com/songrise/SemJudge

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Initialisation Determines the Basin: Efficient Codebook Optimisation for Extreme LLM Quantization

📝 Summary:
Additive quantization for LLM compression faces challenges at 2-bit precision due to codebook initialization issues, which OA-EM addresses through output-aware EM initialization based on Hessian-weigh...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08118
• PDF: https://arxiv.org/pdf/2604.08118
• Github: https://github.com/kenno94-IK/aqlm-oaem

🔹 Models citing this paper:
https://huggingface.co/kennedyian94/Llama-3.2-3B-AQLM-OA-EM-2Bit-2x8
https://huggingface.co/kennedyian94/Qwen-2.5-3B-AQLM-OA-EM-2Bit-2x8
https://huggingface.co/kennedyian94/Llama-3.1-8B-AQLM-OA-EM-2Bit-2x8

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation

📝 Summary:
AVGen-Bench presents a comprehensive benchmark for text-to-audio-video generation with multi-granular evaluation, revealing gaps between aesthetic quality and semantic accuracy.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08540
• PDF: https://arxiv.org/pdf/2604.08540
• Project Page: https://microsoft.github.io/AVGen-Bench/
• Github: https://github.com/microsoft/AVGen-Bench

Datasets citing this paper:
https://huggingface.co/datasets/microsoft/AVGen-Bench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Backdoor Attacks on Decentralised Post-Training

📝 Summary:
This paper introduces the first backdoor attack on pipeline parallelism in decentralized LLM post-training. An adversary controlling an intermediate stage can significantly misalign the model, reducing alignment from 80% to 6% with a trigger word, even resisting safety training.

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02372
• PDF: https://arxiv.org/pdf/2604.02372

==================================

#BackdoorAttack #LLM #DecentralizedAI #AISecurity #MachineLearning
Multi-User Large Language Model Agents

📝 Summary:
Multi-user LLM agents struggle with conflicting objectives, privacy, and coordination. This study formalizes the problem and reveals systematic gaps in current LLMs. They fail to prioritize instructions, violate privacy, and suffer coordination bottlenecks.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08567
• PDF: https://arxiv.org/pdf/2604.08567
• Project Page: https://korde-ai.github.io/Multi-User-LLM-Agent/
• Github: https://github.com/Korde-AI/Multi-User-LLM-Agent

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
p1: Better Prompt Optimization with Fewer Prompts

📝 Summary:
Research reveals that prompt optimization effectiveness depends on the balance between response stochasticity and system prompt quality variance, leading to the development of a filtering method that ...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08801
• PDF: https://arxiv.org/pdf/2604.08801

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers

📝 Summary:
EquiformerV3 advances SE(3)-equivariant graph neural networks through enhanced efficiency, expressivity, and generality via optimized implementation, improved architectural components, and novel activ...

🔹 Publication Date: Published on Apr 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09130
• PDF: https://arxiv.org/pdf/2604.09130
• Github: https://github.com/atomicarchitects/equiformer_v3

🔹 Models citing this paper:
https://huggingface.co/yilunliao/equiformer_v3
https://huggingface.co/mirror-physics/equiformer_v3

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Semantic Richness or Geometric Reasoning? The Fragility of VLM's Visual Invariance

📝 Summary:
Vision-Language Models show significant vulnerabilities under geometric transformations, lacking robust spatial invariance and equivariance despite strong semantic capabilities.

🔹 Publication Date: Published on Apr 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01848
• PDF: https://arxiv.org/pdf/2604.01848
• Project Page: https://xthomasbu.github.io/visual_invariance/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Envisioning the Future, One Step at a Time

📝 Summary:
Autoregressive diffusion models predict open-set future scene dynamics by modeling sparse point trajectories, enabling fast and scalable multi-modal motion prediction with physical plausibility.

🔹 Publication Date: Published on Apr 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09527
• PDF: https://arxiv.org/pdf/2604.09527
• Project Page: https://compvis.github.io/myriad
• Github: https://github.com/compvis/myriad

🔹 Models citing this paper:
https://huggingface.co/CompVis/myriad

Datasets citing this paper:
https://huggingface.co/datasets/CompVis/owm-95
https://huggingface.co/datasets/CompVis/myriad-physics

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research