✨SketchDynamics: Exploring Free-Form Sketches for Dynamic Intent Expression in Animation Generation
📝 Summary:
Free-form sketching enables intuitive dynamic intent communication for automated content creation, bridging human intention and digital output in animation workflows.
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20622
• PDF: https://arxiv.org/pdf/2601.20622
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DeepSeek-OCR 2: Visual Causal Flow
📝 Summary:
DeepSeek-OCR 2 introduces DeepEncoder V2 that dynamically reorders visual tokens based on semantic content, enabling more human-like causal reasoning in 2D image understanding through cascaded 1D caus...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20552
• PDF: https://arxiv.org/pdf/2601.20552
• Github: https://github.com/deepseek-ai/DeepSeek-OCR-2
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning
📝 Summary:
Spark is a reinforcement learning framework that strategically allocates computational resources by branching at critical decision states, improving sample efficiency and generalization for long-horiz...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20209
• PDF: https://arxiv.org/pdf/2601.20209
🔹 Models citing this paper:
• https://huggingface.co/Jinyang23/Spark-1.5B-ALFWorld
• https://huggingface.co/Jinyang23/Spark-1.5B-ScienceWorld
• https://huggingface.co/Jinyang23/Spark-1.5B-WebShop
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Linear representations in language models can change dramatically over a conversation
📝 Summary:
Linear representation directions in language models dynamically shift during conversations, affecting how factual information is encoded while preserving generic content, with implications for interpr...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20834
• PDF: https://arxiv.org/pdf/2601.20834
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SERA: Soft-Verified Efficient Repository Agents
📝 Summary:
Soft-Verified Efficient Repository Agents (SERA) enables cost-effective training of coding agents through supervised fine-tuning, achieving state-of-the-art performance while enabling specialization t...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20789
• PDF: https://arxiv.org/pdf/2601.20789
• Github: https://github.com/allenai/SERA
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Innovator-VL: A Multimodal Large Language Model for Scientific Discovery
📝 Summary:
Innovator-VL demonstrates that principled training design and transparent methodology can achieve strong scientific intelligence with reduced data requirements while maintaining general vision perform...
🔹 Publication Date: Published on Jan 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.19325
• PDF: https://arxiv.org/pdf/2601.19325
• Project Page: https://innovatorlm.github.io/Innovator-VL
• Github: https://github.com/InnovatorLM/Innovator-VL
🔹 Models citing this paper:
• https://huggingface.co/InnovatorLab/Innovator-VL-8B-Instruct
• https://huggingface.co/InnovatorLab/Innovator-VL-8B-Thinking
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/InnovatorLab/Innovator-VL-Instruct-46M
• https://huggingface.co/datasets/InnovatorLab/EMVista
• https://huggingface.co/datasets/InnovatorLab/MolParse
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution
📝 Summary:
OmegaUse is a general-purpose GUI agent model that achieves state-of-the-art performance on mobile and desktop platforms through a combination of high-quality data construction, decoupled training met...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20380
• PDF: https://arxiv.org/pdf/2601.20380
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SE-DiCoW: Self-Enrolled Diarization-Conditioned Whisper
📝 Summary:
SE-DiCoW improves speaker-attributed ASR by using diarization output to identify an enrollment segment for each speaker. This segment provides fixed conditioning in cross-attention layers, resolving ambiguities and significantly reducing transcription error rates compared to DiCoW.
🔹 Publication Date: Published on Jan 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.19194
• PDF: https://arxiv.org/pdf/2601.19194
#AI #DataScience #MachineLearning #HuggingFace #Research
✨UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders
📝 Summary:
UPLiFT is an efficient iterative upsampling architecture with a Local Attender operator that creates dense features from visual backbones. It achieves state-of-the-art performance with lower inference costs than cross-attention methods, overcoming prior limitations.
🔹 Publication Date: Published on Jan 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17950
• PDF: https://arxiv.org/pdf/2601.17950
• Project Page: https://www.cs.umd.edu/~mwalmer/uplift/
• Github: https://github.com/mwalmer-umd/UPLiFT/
🔹 Models citing this paper:
• https://huggingface.co/UPLiFT-upsampler/uplift_dinov2-s14
• https://huggingface.co/UPLiFT-upsampler/uplift_dinov3-splus16
• https://huggingface.co/UPLiFT-upsampler/uplift_sd1.5vae
#ComputerVision #DeepLearning #FeatureUpsampling #AttentionMechanisms #EfficientAI
✨Shallow-π: Knowledge Distillation for Flow-based VLAs
📝 Summary:
Shallow-π is a knowledge distillation framework that reduces transformer depth in vision-language-action (VLA) models. It achieves more than 2x faster inference with less than a 1% performance drop, enabling efficient real-world robotic deployment.
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20262
• PDF: https://arxiv.org/pdf/2601.20262
• Project Page: https://icsl-jeon.github.io/shallow-pi/
• Github: https://icsl-jeon.github.io/shallow-pi/
#KnowledgeDistillation #Robotics #VLAModels #EfficientAI #DeepLearning
✨Reinforcement Learning via Self-Distillation
📝 Summary:
Self-Distillation Policy Optimization (SDPO) leverages rich textual feedback to address the credit-assignment bottleneck in reinforcement learning. SDPO treats the model as a self-teacher, distilling feedback-informed predictions to improve sample efficiency and accuracy. It significantly enhances ...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20802
• PDF: https://arxiv.org/pdf/2601.20802
#ReinforcementLearning #SelfDistillation #MachineLearning #AI #PolicyOptimization
✨Training Reasoning Models on Saturated Problems via Failure-Prefix Conditioning
📝 Summary:
Reinforcement learning training stalls on saturated problems as informative failures are hard to find. Failure-prefix conditioning addresses this by training on prefixes from rare incorrect reasoning paths, exposing models to failures. This boosts performance, maintains efficiency, and improves r...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20829
• PDF: https://arxiv.org/pdf/2601.20829
• Github: https://github.com/minwukim/training-on-saturated-problems
#ReinforcementLearning #MachineLearning #ArtificialIntelligence #DeepLearning #AIResearch
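A rough illustration of the prefix idea in the summary above, not the authors' implementation: take rollouts that ended in a wrong answer, cut each one at a chosen fraction of its length, and append that partial reasoning to the question so that training starts from a state where failure is likely. The function name, the (tokens, is_correct) rollout format, and the cut fractions are all assumptions made for this sketch.

```python
def build_failure_prefix_prompts(question, rollouts, fractions=(0.25, 0.5, 0.75)):
    """Build prompts that start from prefixes of incorrect reasoning paths.

    rollouts: list of (tokens, is_correct) pairs (hypothetical format).
    fractions: where to cut each failed rollout (assumed values).
    """
    prompts = []
    for tokens, is_correct in rollouts:
        # Only failed, non-empty rollouts are a source of failure prefixes.
        if is_correct or not tokens:
            continue
        for frac in fractions:
            cut = max(1, int(len(tokens) * frac))
            prefix = " ".join(tokens[:cut])
            prompts.append(f"{question}\n{prefix}")
    return prompts


# Toy data: one correct rollout (skipped) and one incorrect rollout (used).
rollouts = [
    (["2", "+", "2", "=", "4"], True),
    (["2", "+", "2", "=", "5", "so", "answer", "5"], False),
]
prompts = build_failure_prefix_prompts("What is 2 + 2?", rollouts, fractions=(0.5,))
```

In this toy case the prefix stops before the wrong step, so the model would have to continue from a state that previously led to an error. How prefixes are actually selected and weighted is described in the paper and repository linked above.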
✨MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem
📝 Summary:
MM-Agent is an expert-inspired framework that enables LLMs to excel in real-world mathematical modeling by decomposing the task into four stages. It significantly outperforms human experts and baseline agents on a new benchmark, proving its practical effectiveness as a modeling copilot.
🔹 Publication Date: Published on May 20, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.14148
• PDF: https://arxiv.org/pdf/2505.14148
• Github: https://github.com/usail-hkust/llm-mm-agent
#LLM #MathematicalModeling #AIAgents #ArtificialIntelligence #DataScience
✨VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning
📝 Summary:
VERGE is a neurosymbolic framework that combines LLMs with SMT solvers for verification-guided iterative refinement of reasoning. It enhances logical correctness through formal semantic checking, semantic routing, and precise error localization, achieving an 18.7% performance uplift on reasoning ...
🔹 Publication Date: Published on Jan 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20055
• PDF: https://arxiv.org/pdf/2601.20055
#LLM #NeurosymbolicAI #FormalVerification #AIReasoning #SMTSolvers
✨Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning
📝 Summary:
This paper introduces Multi-Adversary GDRO to improve LLM reasoning. It dynamically adapts training distributions by classifying prompt difficulty and reallocating resources. This boosts accuracy by over 10% compared to GRPO, focusing compute on hard problems.
🔹 Publication Date: Published on Jan 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.19280
• PDF: https://arxiv.org/pdf/2601.19280
#LLMReasoning #ReinforcementLearning #Optimization #MachineLearning #AI
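For readers unfamiliar with group distributionally robust optimization, here is a minimal sketch of the classic single-adversary weight update: each group's weight is multiplied by the exponential of its loss and then renormalized, so harder groups receive more of the training budget. This is the textbook GDRO step, not the Multi-Adversary variant the paper introduces. The three difficulty groups, their losses, and the step size are made-up values.

```python
import math


def gdro_update(weights, group_losses, step_size=0.5):
    """One exponentiated-gradient step on the adversary's group weights."""
    scaled = [w * math.exp(step_size * loss) for w, loss in zip(weights, group_losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def robust_loss(weights, group_losses):
    """Weighted loss that the model would minimize under the current weights."""
    return sum(w * loss for w, loss in zip(weights, group_losses))


# Hypothetical groups: easy, medium, hard prompts with fixed per-group losses.
weights = [1 / 3, 1 / 3, 1 / 3]
losses = [0.2, 0.6, 1.4]
uniform_loss = robust_loss(weights, losses)
for _ in range(3):
    weights = gdro_update(weights, losses)
reweighted_loss = robust_loss(weights, losses)
```

After a few steps the weight on the hard group dominates, which matches the summary's point about focusing compute on hard problems. In real training the per-group losses would change as the model updates.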
✨Persona Prompting as a Lens on LLM Social Reasoning
📝 Summary:
Persona prompting improves LLM classification on subjective tasks like hate speech but degrades explanation quality. It fails to mitigate demographic biases and align with real-world personas, as models remain resistant to significant steering and over-flag content as harmful. This reveals a crit...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20757
• PDF: https://arxiv.org/pdf/2601.20757
• Github: https://github.com/jingyng/PP-social-reasoning
#LLM #PersonaPrompting #BiasInAI #AIethics #NLP
✨How AI Impacts Skill Formation
📝 Summary:
AI assistance impairs skill acquisition for novice workers, hindering conceptual understanding and debugging. Heavy AI reliance is not a shortcut to competence. Careful AI adoption is crucial to preserve skill formation.
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20245
• PDF: https://arxiv.org/pdf/2601.20245
• Project Page: https://www.anthropic.com/research/AI-assistance-coding-skills
#AI #SkillFormation #WorkforceDevelopment #LearningScience #HumanAICollaboration
✨FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning
📝 Summary:
FP8-RL presents a practical FP8 rollout stack for LLM reinforcement learning, addressing computational and memory bottlenecks. It employs blockwise FP8, KV-cache recalibration, and importance sampling to mitigate train-inference mismatch. This achieves up to 44% throughput gains while preserving ...
🔹 Publication Date: Published on Jan 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18150
• PDF: https://arxiv.org/pdf/2601.18150
#LLM #ReinforcementLearning #FP8 #MachineLearning #AIResearch
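The summary names importance sampling as one way to mitigate train-inference mismatch but does not say which form is used. One common choice, assumed here, is a truncated per-token importance weight: the ratio of the training policy's probability to the low-precision rollout policy's probability, clipped at a cap. The function name, the cap value, and the toy log-probabilities below are illustrative only.

```python
import math


def truncated_importance_weights(train_logprobs, rollout_logprobs, cap=2.0):
    """Per-token weights min(pi_train / pi_rollout, cap) computed from log-probs.

    cap is an assumed hyperparameter that bounds the variance of the correction.
    """
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    weights = []
    for lp_train, lp_rollout in zip(train_logprobs, rollout_logprobs):
        ratio = math.exp(lp_train - lp_rollout)
        weights.append(min(ratio, cap))
    return weights


# Toy log-probs: token 0 matches, token 1 is under-sampled by the rollout
# engine (ratio above the cap), token 2 is over-sampled (ratio below 1).
example_weights = truncated_importance_weights(
    train_logprobs=[-1.0, -0.5, -2.0],
    rollout_logprobs=[-1.0, -1.5, -1.0],
)
```

These weights would multiply the per-token policy-gradient terms, so tokens the FP8 rollout engine sampled differently from the higher-precision trainer are reweighted accordingly.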
✨Language-based Trial and Error Falls Behind in the Era of Experience
📝 Summary:
LLMs struggle in nonlinguistic tasks due to costly exploration. SCOUT uses lightweight scouts for efficient exploration, then fine-tunes LLMs via SFT and RL. This boosts performance and saves GPU hours, outperforming proprietary models.
🔹 Publication Date: Published on Jan 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.21754
• PDF: https://arxiv.org/pdf/2601.21754
• Project Page: https://scout-cs.github.io/
• Github: https://github.com/Harry-mic/SCOUT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report
📝 Summary:
A two-stage trained cybersecurity reasoning model achieves competitive performance on specialized tasks while maintaining general capabilities through supervised fine-tuning and reinforcement learning...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.21051
• PDF: https://arxiv.org/pdf/2601.21051
• Project Page: https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning
#AI #DataScience #MachineLearning #HuggingFace #Research