✨Token Reduction via Local and Global Contexts Optimization for Efficient Video Large Language Models
📝 Summary:
AOT framework reduces video token redundancy through local-global optimal transport to preserve informative contexts while achieving efficient spatiotemporal compression in video large language models...
🔹 Publication Date: Published on Mar 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.01400
• PDF: https://arxiv.org/pdf/2603.01400
• Project Page: https://tyroneli.github.io/AOT/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
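As a rough illustration of the transport idea behind this kind of token compression, here is a minimal sketch using generic entropic optimal transport (Sinkhorn) between many video tokens and a few anchor tokens. The shapes, anchors, and uniform marginals are hypothetical; AOT's actual local-global formulation follows the paper.

```python
import numpy as np

def sinkhorn(cost, iters=50, eps=0.1):
    """Entropic OT plan between uniform marginals (generic Sinkhorn
    sketch, not AOT's exact objective)."""
    n, m = cost.shape
    K = np.exp(-cost / eps)
    v = np.ones(m) / m
    for _ in range(iters):
        u = (1.0 / n) / (K @ v)
        v = (1.0 / m) / (K.T @ u)
    return u[:, None] * K * v[None, :]

def compress_tokens(tokens, anchors):
    """Merge many tokens into few anchors, weighting each token by the
    transport mass it sends to each anchor."""
    cost = np.linalg.norm(tokens[:, None, :] - anchors[None, :, :], axis=-1)
    plan = sinkhorn(cost)                       # n_tokens x n_anchors
    w = plan / plan.sum(axis=0, keepdims=True)  # normalize per anchor
    return w.T @ tokens                         # n_anchors x dim

rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 8))  # 64 patch tokens, dim 8
anchors = tokens[::16]                 # 4 anchors (e.g. keyframe tokens)
print(compress_tokens(tokens, anchors).shape)  # (4, 8)
```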
✨Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels
📝 Summary:
A feedforward model called Track4World enables efficient holistic 3D tracking of every pixel in a video by utilizing a global 3D scene representation and a novel 3D correlation scheme for dense flow estimation.
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.02573
• PDF: https://arxiv.org/pdf/2603.02573
• Project Page: https://jiah-cloud.github.io/Track4World.github.io/
• Github: https://github.com/TencentARC/Track4World
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Beyond Language Modeling: An Exploration of Multimodal Pretraining
📝 Summary:
Controlled multimodal pretraining experiments reveal key insights about unified visual representations, data complementarity, world modeling emergence, and efficient scaling through mixture-of-experts...
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03276
• PDF: https://arxiv.org/pdf/2603.03276
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?
📝 Summary:
Current code agent benchmarks fail to capture real-world complexity, prompting the creation of BeyondSWE to evaluate broader reasoning and knowledge scopes, alongside SearchSWE to study external knowledge use.
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03194
• PDF: https://arxiv.org/pdf/2603.03194
• Project Page: https://aweai-team.github.io/BeyondSWE/
• Github: https://github.com/AweAI-Team/BeyondSWE
✨ Datasets citing this paper:
• https://huggingface.co/datasets/AweAI-Team/BeyondSWE
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨APRES: An Agentic Paper Revision and Evaluation System
📝 Summary:
Large language models are used to automatically revise scientific papers based on citation-predictive rubrics while preserving core content, achieving improved citation predictions and human evaluator...
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03142
• PDF: https://arxiv.org/pdf/2603.03142
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?
📝 Summary:
Code agents can autonomously generate more complex mathematical problems by evolving existing ones, providing a scalable solution for creating high-difficulty reasoning problems.
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03202
• PDF: https://arxiv.org/pdf/2603.03202
• Github: https://github.com/TarferSoul/Code2Math
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Fast Matrix Multiplication in Small Formats: Discovering New Schemes with an Open-Source Flip Graph Framework
📝 Summary:
A new open-source C++ framework discovers fast matrix multiplication schemes, improving the best known ranks for 79 formats. It found a 4x4x10 scheme with 115 multiplications, beating Strassen's exponent for that size, and rewrites many schemes with simpler coefficients. The tools are public.
🔹 Publication Date: Published on Mar 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.02398
• PDF: https://arxiv.org/pdf/2603.02398
• Project Page: https://github.com/dronperminov/FastMatrixMultiplication
• Github: https://github.com/dronperminov/ternary_flip_graph
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
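For context on what a "fast scheme" looks like, here is the classic example such searches generalize: Strassen's 2x2 scheme, which trades 8 scalar multiplications for 7. This is only an illustration of the rank-reduction idea, not the paper's 4x4x10 result.

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications
    instead of the naive 8 (Strassen's classic scheme)."""
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    e, f, g, h = B[0, 0], B[0, 1], B[1, 0], B[1, 1]
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(strassen_2x2(A, B))  # [[19 22] [43 50]], same as A @ B
```

A flip-graph search explores the space of such low-rank decompositions by local moves, which is how schemes like the 4x4x10 one above are found.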
✨AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation
📝 Summary:
AgentConductor uses reinforcement learning-optimized multi-agent systems with an LLM-based orchestrator to dynamically generate interaction topologies for code generation, improving accuracy while reducing overhead.
🔹 Publication Date: Published on Feb 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.17100
• PDF: https://arxiv.org/pdf/2602.17100
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Qwen2 Technical Report
📝 Summary:
The Qwen2 series, comprising 0.5 to 72 billion parameter models, surpasses prior open models across language understanding, generation, multilingualism, coding, math, and reasoning, with exceptional performance.
🔹 Publication Date: Published on Jul 15, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2407.10671
• PDF: https://arxiv.org/pdf/2407.10671
• Github: https://github.com/qwenlm/qwen2
🔹 Models citing this paper:
• https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct
• https://huggingface.co/Qwen/QwQ-32B-Preview
• https://huggingface.co/Qwen/Qwen2.5-7B-Instruct
✨ Datasets citing this paper:
• https://huggingface.co/datasets/thunder-research-group/SNU_Thunder-synthetic-instruction-following
• https://huggingface.co/datasets/thunder-research-group/SNU_Thunder-synthetic-coding
✨ Spaces citing this paper:
• https://huggingface.co/spaces/pliny-the-prompter/obliteratus
• https://huggingface.co/spaces/multimodalart/kugelaudio
• https://huggingface.co/spaces/agents-course/First_agent_template
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Utonia: Toward One Encoder for All Point Clouds
📝 Summary:
Utonia introduces a unified self-supervised transformer encoder for diverse point cloud domains. It enhances perception and aids embodied and multimodal reasoning, aiming for foundation models in sparse 3D data.
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03283
• PDF: https://arxiv.org/pdf/2603.03283
• Project Page: https://pointcept.github.io/Utonia/
• Github: https://github.com/Pointcept/Utonia
🔹 Models citing this paper:
• https://huggingface.co/Pointcept/Utonia
✨ Spaces citing this paper:
• https://huggingface.co/spaces/pointcept-bot/Utonia
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Qwen3-Coder-Next Technical Report
📝 Summary:
Qwen3-Coder-Next is an 80-billion-parameter language model that activates only 3 billion parameters during inference, achieving strong coding capabilities through agentic training with verifiable tasks.
🔹 Publication Date: Published on Feb 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00729
• PDF: https://arxiv.org/pdf/2603.00729
• Project Page: https://github.com/QwenLM/Qwen3-Coder
• Github: https://github.com/QwenLM/Qwen3-Coder
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
📝 Summary:
AReaL, a fully asynchronous reinforcement learning system, decouples generation and training to achieve higher GPU utilization and up to 2.57x training speedup for large language models on reasoning tasks.
🔹 Publication Date: Published on May 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.24298
• PDF: https://arxiv.org/pdf/2505.24298
• Github: https://github.com/inclusionAI/AReaL
🔹 Models citing this paper:
• https://huggingface.co/inclusionAI/AReaL-boba-2-8B
• https://huggingface.co/inclusionAI/AReaL-boba-2-14B
• https://huggingface.co/inclusionAI/AReaL-boba-2-8B-Open
✨ Datasets citing this paper:
• https://huggingface.co/datasets/inclusionAI/AReaL-tau2-data
✨ Spaces citing this paper:
• https://huggingface.co/spaces/rzvn/Medieval-Village-AI
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
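The decoupling can be pictured as a producer/consumer pipeline: rollout generation keeps filling a buffer while the trainer consumes whatever is ready. A toy sketch with threads and a bounded queue (names are hypothetical stand-ins for AReaL's rollout workers and trainer, not its API):

```python
import queue
import threading

def generator(rollouts, n):
    """Stands in for rollout workers: keeps producing trajectories
    without waiting for the trainer."""
    for i in range(n):
        rollouts.put(f"trajectory-{i}")
    rollouts.put(None)  # sentinel: no more rollouts

def trainer(rollouts, seen):
    """Consumes whatever rollouts are ready, so training GPUs never
    idle waiting for generation to finish."""
    while (item := rollouts.get()) is not None:
        seen.append(item)

rollouts = queue.Queue(maxsize=8)  # bounds staleness of off-policy data
seen = []
t1 = threading.Thread(target=generator, args=(rollouts, 5))
t2 = threading.Thread(target=trainer, args=(rollouts, seen))
t1.start(); t2.start(); t1.join(); t2.join()
print(len(seen))  # 5
```

The bounded queue is the key design choice: it lets generation run ahead of training, but only so far, limiting how stale (off-policy) the consumed rollouts can get.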
✨InfoPO: Information-Driven Policy Optimization for User-Centric Agents
📝 Summary:
InfoPO optimizes agent-user collaboration for underspecified requests. It uses an information-gain reward to credit valuable turns that reduce uncertainty, improving decision-making and outperforming multi-turn RL baselines.
🔹 Publication Date: Published on Feb 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00656
• PDF: https://arxiv.org/pdf/2603.00656
• Github: https://github.com/kfq20/InfoPO
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ReinforcementLearning #AI #HumanComputerInteraction #InformationTheory #AIagents
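An information-gain reward of this flavor can be sketched as the entropy drop in the agent's belief over user intents before and after a turn. The belief vectors here are hypothetical, and the paper's actual reward formulation may differ.

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)

def info_gain_reward(belief_before, belief_after):
    """Credit a turn by how much it reduces uncertainty over
    possible user intents."""
    return entropy(belief_before) - entropy(belief_after)

# A clarifying question collapses four equally likely intents to two.
before = [0.25, 0.25, 0.25, 0.25]
after = [0.5, 0.5, 0.0, 0.0]
print(round(info_gain_reward(before, after), 4))  # ln 4 - ln 2 = ln 2 ~ 0.6931
```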
✨Chain of World: World Model Thinking in Latent Motion
📝 Summary:
CoWVLA unifies world-model temporal reasoning with disentangled latent motion representation to improve visuomotor learning efficiency. This new approach overcomes limitations of existing VLA models and outperforms them on robotic simulation benchmarks.
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03195
• PDF: https://arxiv.org/pdf/2603.03195
• Project Page: https://fx-hit.github.io/cowvla-io/
• Github: https://fx-hit.github.io/cowvla-io/
🔹 Models citing this paper:
• https://huggingface.co/hitfx/CoWVLA
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#WorldModels #Robotics #MachineLearning #VisuomotorLearning #DeepLearning
✨Surgical Post-Training: Cutting Errors, Keeping Knowledge
📝 Summary:
Surgical Post-Training (SPoT) efficiently improves LLM reasoning while preventing catastrophic forgetting. It employs data rectification with an Oracle and a novel binary cross-entropy objective. SPoT improved Qwen3-8B accuracy by 6.2 percent using minimal data and training time.
🔹 Publication Date: Published on Mar 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.01683
• PDF: https://arxiv.org/pdf/2603.01683
• Github: https://github.com/Visual-AI/SPoT
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #CatastrophicForgetting #MachineLearning #AI #DeepLearning
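As a reminder of the shape of a BCE-style objective, here is the standard binary cross-entropy between a predicted probability of correctness and a 0/1 label. What SPoT actually scores and labels follows the paper; this is only the generic loss.

```python
import math

def bce(p, y):
    """Binary cross-entropy between predicted probability p in (0, 1)
    and binary label y; eps guards against log(0)."""
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

# Confident-and-right is cheap; confident-and-wrong is heavily penalized.
print(round(bce(0.9, 1), 4))  # 0.1054
print(round(bce(0.9, 0), 4))  # 2.3026
```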
✨Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics
📝 Summary:
The Whisper-RIR-Mega dataset evaluates ASR model robustness to reverberation by pairing clean and reverberant speech samples with stratified splits based on RT60 and DRR metrics.
🔹 Publication Date: Published on Feb 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.02252
• PDF: https://arxiv.org/pdf/2603.02252
• Project Page: https://huggingface.co/datasets/mandipgoswami/whisper-rirmega-bench
• Github: https://github.com/mandip42/whisper-rirmega-bench
✨ Datasets citing this paper:
• https://huggingface.co/datasets/mandipgoswami/whisper-rirmega-bench
✨ Spaces citing this paper:
• https://huggingface.co/spaces/mandipgoswami/whisper-rirmega-benchmark
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
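DRR (direct-to-reverberant ratio) is commonly computed from a room impulse response as the energy in a short window around the direct-path peak versus everything after it. A sketch under that common definition; the benchmark's exact windowing may differ.

```python
import numpy as np

def drr_db(rir, fs, direct_ms=2.5):
    """Direct-to-reverberant ratio in dB: energy near the direct-path
    peak vs. the remaining (reverberant) tail of the RIR."""
    peak = int(np.argmax(np.abs(rir)))
    w = int(direct_ms * 1e-3 * fs)
    direct = np.sum(rir[max(0, peak - w):peak + w] ** 2)
    reverb = np.sum(rir[peak + w:] ** 2)
    return 10 * np.log10(direct / reverb)

# Toy RIR: a strong direct impulse followed by a weak decaying tail.
fs = 16000
t = np.arange(fs // 2)
rir = 0.01 * np.exp(-t / 2000.0) * np.random.default_rng(0).standard_normal(t.size)
rir[0] = 1.0
print(drr_db(rir, fs) > 0)  # direct path dominates, so DRR is positive
```

RT60 is the complementary metric: the time for the tail's energy decay curve to drop by 60 dB.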
✨Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
📝 Summary:
MOSAIC is a framework aligning agentic models for safe multi-step tool use, employing explicit safety reasoning and refusal. It significantly reduces harmful actions, increases refusal for unsafe tasks, cuts privacy leakage, and preserves benign performance.
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03205
• PDF: https://arxiv.org/pdf/2603.03205
• Project Page: https://aradhye2002.github.io/mosaic-agent-safety/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AISafety #AIAgents #ResponsibleAI #LLMs #AIAlignment
✨Spilled Energy in Large Language Models
📝 Summary:
Reinterpreting LLM softmax as an Energy-Based Model enables training-free hallucination detection. New energy metrics from output logits identify errors and biases without training overhead, demonstrating robust cross-task generalization.
🔹 Publication Date: Published on Feb 21
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.18671
• PDF: https://arxiv.org/pdf/2602.18671
• Github: https://github.com/OmnAI-Lab/spilled-energy
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #EnergyBasedModels #HallucinationDetection #AISafety #ArtificialIntelligence
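The EBM view gives a free-energy score directly from the output logits: E = -logsumexp(logits), with no extra training. A minimal sketch of that general idea (the paper's exact energy metrics may differ):

```python
import numpy as np

def energy_score(logits):
    """Free energy of a softmax head viewed as an EBM:
    E = -logsumexp(logits), computed with the max-shift trick for
    numerical stability. Higher energy means less total unnormalized
    mass on the vocabulary, i.e. more model uncertainty."""
    m = np.max(logits)
    return -(m + np.log(np.sum(np.exp(logits - m))))

confident = np.array([10.0, 0.0, 0.0])  # one dominant token
uncertain = np.array([1.0, 1.0, 1.0])   # flat, low-confidence logits
print(energy_score(confident) < energy_score(uncertain))  # True
```

Thresholding such a score per token or per answer is one way a training-free hallucination detector can be built on top of an existing model.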
✨Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction
📝 Summary:
CCP (Conditioned Comment Prediction) evaluates LLMs simulating social media users. Supervised fine-tuning improves text structure but degrades semantic accuracy, as models infer from behavioral histories without explicit conditioning. The authors recommend prioritizing authentic behavioral traces.
🔹 Publication Date: Published on Feb 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.22752
• PDF: https://arxiv.org/pdf/2602.22752
• Project Page: https://nsschw.github.io/Turing-TWONy/
• Github: https://github.com/nsschw/Conditioned-Comment-Prediction
🔹 Models citing this paper:
• https://huggingface.co/nsschw/echo-Llama-3.1-8B-Instruct-eng
• https://huggingface.co/nsschw/echo-Llama-3.1-8B-Instruct-ger
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMs #SocialMedia #AISimulation #NLP #AIResearch