✨Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion
📝 Summary:
A fusion framework called FINCH combines audio and spatiotemporal predictors for bioacoustic classification by adaptively weighting evidence based on reliability estimates, outperforming fixed-weight ...
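The blurb doesn't give FINCH's exact weighting rule, but the core idea of reliability-weighted late fusion can be sketched in a few lines. Everything below (function name, scalar reliability weights, the example numbers) is illustrative, not the paper's method:

```python
import numpy as np

def fuse(audio_probs, spatio_probs, audio_rel, spatio_rel):
    """Fuse two per-class probability vectors, weighting each source by a
    scalar reliability estimate (a minimal sketch of adaptive evidence
    weighting; FINCH's reliability model is richer than two scalars)."""
    w_a = audio_rel / (audio_rel + spatio_rel)
    w_s = spatio_rel / (audio_rel + spatio_rel)
    fused = w_a * np.asarray(audio_probs) + w_s * np.asarray(spatio_probs)
    return fused / fused.sum()  # renormalize to a valid distribution

# Example: the audio model is deemed reliable, the spatiotemporal prior weak,
# so the fused prediction stays close to the audio distribution.
audio = [0.7, 0.2, 0.1]
spatio = [0.3, 0.4, 0.3]
print(fuse(audio, spatio, audio_rel=0.9, spatio_rel=0.3))  # [0.6, 0.25, 0.15]
```

A fixed-weight baseline would use constant `w_a`/`w_s`; the adaptive version lets per-sample reliability shift the balance.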
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03817
• PDF: https://arxiv.org/pdf/2602.03817
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration
📝 Summary:
Search-R2 framework improves language agent reasoning through Actor-Refiner collaboration with targeted interventions and fine-grained reward supervision for better credit assignment in reinforcement ...
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03647
• PDF: https://arxiv.org/pdf/2602.03647
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MARS: Modular Agent with Reflective Search for Automated AI Research
📝 Summary:
MARS is a modular AI research automation framework that uses budget-aware planning, modular construction, and reflective memory to achieve state-of-the-art performance in autonomous machine learning r...
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02660
• PDF: https://arxiv.org/pdf/2602.02660
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently
📝 Summary:
daVinci-Agency addresses LLM limitations in long-horizon tasks by extracting structured training data from software pull request sequences. It uses progressive decomposition, consistency enforcement, and bug-fix refinement. This method offers data-efficient supervision, boosting LLM performance o...
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02619
• PDF: https://arxiv.org/pdf/2602.02619
• Github: https://github.com/GAIR-NLP/daVinci-Agency
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation
📝 Summary:
Offline reinforcement learning method combines contextual bandit learning with partial trajectories to improve multi-turn code generation performance while reducing training costs. AI-generated summar...
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03806
• PDF: https://arxiv.org/pdf/2602.03806
• Github: https://github.com/OSU-NLP-Group/cobalt
✨ Datasets citing this paper:
• https://huggingface.co/datasets/osunlp/TACO-Cobalt
• https://huggingface.co/datasets/osunlp/TACO-Cobalt-PTB
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SWE-World: Building Software Engineering Agents in Docker-Free Environments
📝 Summary:
A Docker-free framework replaces physical execution environments with learned surrogates for training software engineering agents, enabling efficient training and test-time scaling without costly cont...
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03419
• PDF: https://arxiv.org/pdf/2602.03419
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training
📝 Summary:
SWE-Master presents a reproducible framework for developing software engineering agents through systematic optimization across multiple stages of agent development, achieving superior performance on s...
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03411
• PDF: https://arxiv.org/pdf/2602.03411
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration
📝 Summary:
AOrchestra is a framework-agnostic agentic system that uses a tuple-based abstraction to dynamically create specialized task executors, achieving improved performance on complex benchmarks through aut...
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03786
• PDF: https://arxiv.org/pdf/2602.03786
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis
📝 Summary:
A novel distillation framework called DP-DMD is introduced that preserves sample diversity in text-to-image generation by separating the roles of distilled steps, using v-prediction for diversity and ...
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03139
• PDF: https://arxiv.org/pdf/2602.03139
• Github: https://github.com/Multimedia-Analytics-Laboratory/dpdmd
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SafeGround: Know When to Trust GUI Grounding Models via Uncertainty Calibration
📝 Summary:
SafeGround is an uncertainty-aware framework for GUI grounding models that uses distribution-aware uncertainty quantification and calibration to enable risk-aware predictions with controlled false disc...
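The general pattern behind risk-controlled prediction is to calibrate a confidence threshold on held-out data so that accepted predictions keep empirical error below a target. The function below is a simplified split-calibration sketch under that assumption, not SafeGround's actual procedure; all names and numbers are made up:

```python
import numpy as np

def calibrate_threshold(conf_cal, correct_cal, target_risk=0.1):
    """Pick the loosest confidence cutoff whose accepted calibration
    predictions keep empirical error <= target_risk (a toy sketch of
    risk-aware selective prediction, not SafeGround's exact method)."""
    order = np.argsort(-conf_cal)                  # most confident first
    errs = np.cumsum(~correct_cal[order])          # errors among accepted
    risk = errs / np.arange(1, len(order) + 1)     # running error rate
    ok = np.where(risk <= target_risk)[0]
    if len(ok) == 0:
        return np.inf                              # abstain on everything
    return conf_cal[order][ok[-1]]                 # loosest admissible cutoff

conf = np.array([0.95, 0.9, 0.8, 0.6, 0.5])
correct = np.array([True, True, True, False, True])
t = calibrate_threshold(conf, correct, target_risk=0.1)
print(t)  # 0.8 -- accepting conf >= 0.8 keeps empirical risk at 0/3
```

At deployment, predictions below the calibrated cutoff are deferred rather than trusted, which is what makes the error rate controllable.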
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02419
• PDF: https://arxiv.org/pdf/2602.02419
• Github: https://github.com/Cece1031/SAFEGROUND
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation
📝 Summary:
FullStack-Agent is a unified AI system assisting non-experts in full-stack web development. It uses a multi-agent framework and a self-improving method, demonstrating significant performance gains over prior state-of-the-art across all web functionalities.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03798
• PDF: https://arxiv.org/pdf/2602.03798
• Github: https://github.com/mnluzimu/FullStack-Agent
🔹 Models citing this paper:
• https://huggingface.co/luzimu/FullStack-Learn-LM-30B-A3B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/luzimu/FullStack-Bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
📝 Summary:
Token Sparse Attention enables efficient long-context inference by dynamically compressing and decompressing attention tensors at the token level, achieving significant speedup with minimal accuracy l...
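For intuition, token-level sparsity can be illustrated with a single-query attention step that keeps only the top-k scoring key tokens. This is a generic sketch of the family of methods; the paper's interleaved compress/decompress scheme is more involved, and all shapes and names here are illustrative:

```python
import numpy as np

def token_sparse_attention(q, K, V, k=4):
    """Single-query attention restricted to the top-k key tokens;
    all other tokens are masked out before the softmax."""
    scores = K @ q / np.sqrt(q.shape[0])       # (n,) similarity scores
    keep = np.argsort(scores)[-k:]             # indices of top-k tokens
    masked = np.full_like(scores, -np.inf)
    masked[keep] = scores[keep]                # only top-k survive
    w = np.exp(masked - masked[keep].max())    # stable softmax; exp(-inf)=0
    w /= w.sum()
    return w @ V                               # (d,) attended value

rng = np.random.default_rng(0)
q = rng.normal(size=8)
K = rng.normal(size=(16, 8))
V = rng.normal(size=(16, 8))
out = token_sparse_attention(q, K, V, k=4)
print(out.shape)  # (8,)
```

The speedup in practice comes from never materializing the masked rows at all, rather than masking after a dense score computation as this toy version does.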
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03216
• PDF: https://arxiv.org/pdf/2602.03216
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents
📝 Summary:
LRAgent is a KV cache sharing framework for multi-LoRA agents that decomposes cache into shared and adapter-dependent components, reducing memory and compute overhead while maintaining accuracy. AI-ge...
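The decomposition rests on a simple linear-algebra identity: with a LoRA-adapted projection `W + B @ A`, the keys split exactly into a base term every adapter can share and a cheap low-rank, adapter-specific delta. The sketch below verifies that identity with toy shapes (one way such sharing can work; not necessarily LRAgent's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, n = 16, 2, 8                     # hidden dim, LoRA rank, seq length
X = rng.normal(size=(n, d))            # shared prompt activations
W_k = rng.normal(size=(d, d))          # base-model key projection
A = rng.normal(size=(r, d)) * 0.1      # LoRA factors for one adapter
B = rng.normal(size=(d, r)) * 0.1

K_shared = X @ W_k                     # computed once, reused by every adapter
K_delta = (X @ B) @ A                  # low-rank, per-adapter correction
K_full = X @ (W_k + B @ A)             # reference: monolithic per-adapter cache

assert np.allclose(K_shared + K_delta, K_full)
```

Because `K_shared` is adapter-independent, N adapters need one full cache plus N rank-r deltas instead of N full caches, which is where the memory saving comes from.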
🔹 Publication Date: Published on Feb 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01053
• PDF: https://arxiv.org/pdf/2602.01053
• Github: https://github.com/hjeon2k/LRAgent
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Evaluating and Aligning CodeLLMs on Human Preference
📝 Summary:
A human-curated benchmark (CodeArena) and a large synthetic instruction corpus (SynCode-Instruct) are introduced to evaluate code LLMs based on human preference alignment, revealing performance differ...
🔹 Publication Date: Published on Dec 6, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.05210
• PDF: https://arxiv.org/pdf/2412.05210
• Project Page: https://codearenaeval.github.io/
• Github: https://github.com/QwenLM/Qwen2.5-Coder/tree/main/qwencoder-eval/instruct/CodeArena
✨ Datasets citing this paper:
• https://huggingface.co/datasets/CSJianYang/CodeArena
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨The Necessity of a Unified Framework for LLM-Based Agent Evaluation
📝 Summary:
Current LLM agent evaluations are hindered by confounding factors like prompts, toolsets, and environments, alongside a lack of standardization, leading to unfair and irreproducible results. A unified evaluation framework is essential to ensure rigorous and fair assessment of these advanced agents.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03238
• PDF: https://arxiv.org/pdf/2602.03238
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMAgents #AIEvaluation #Standardization #AIResearch #MachineLearning
✨SimpleGPT: Improving GPT via A Simple Normalization Strategy
📝 Summary:
SimpleNorm is a new normalization strategy for Transformers that stabilizes activation scales and reduces the Hessian spectral norm. This allows for significantly larger stable learning rates, leading to improved training performance and lower loss in large GPT models.
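The summary doesn't spell out SimpleNorm's formula, so as a familiar reference point for what "stabilizing activation scales" means, here is plain RMS normalization, which pins the root-mean-square of each activation vector to 1 regardless of input magnitude (SimpleNorm itself differs in its details):

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    """Scale each row of activations to unit RMS. Shown only as a
    reference point; the paper's SimpleNorm strategy is not this."""
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

x = np.array([[3.0, 4.0]])         # RMS = sqrt((9 + 16) / 2) = sqrt(12.5)
y = rms_norm(x)
print(np.sqrt(np.mean(y * y)))     # ~1.0: activation scale is pinned
```

Keeping activation scale bounded in this way is what lets normalization schemes tolerate larger learning rates without divergence, the effect the paper attributes to SimpleNorm.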
🔹 Publication Date: Published on Feb 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01212
• PDF: https://arxiv.org/pdf/2602.01212
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#GPT #Normalization #Transformers #DeepLearning #AIResearch
✨No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding
📝 Summary:
This paper introduces ID-MoCQA, the first large-scale multi-hop question answering dataset for assessing cultural understanding in LLMs, using Indonesian traditions. It transforms single-hop questions into complex reasoning chains across diverse clue types. Evaluations reveal significant gaps in ...
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03709
• PDF: https://arxiv.org/pdf/2602.03709
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MultiHopQA #LLMs #CulturalAI #IndonesianCulture #NLP
✨Instruction Anchors: Dissecting the Causal Dynamics of Modality Arbitration
📝 Summary:
Instruction tokens act as anchors for modality arbitration in MLLMs, guiding multimodal context use. This involves shallow layers gathering cues and deep layers resolving competition. Manipulating a few specialized attention heads significantly impacts this process.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03677
• PDF: https://arxiv.org/pdf/2602.03677
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MLLMs #MultimodalAI #AttentionMechanisms #DeepLearning #AIResearch
✨RecGOAT: Graph Optimal Adaptive Transport for LLM-Enhanced Multimodal Recommendation with Dual Semantic Alignment
📝 Summary:
RecGOAT bridges the representational gap between LLMs and recommendation systems. It uses graph attention networks and a dual-granularity semantic alignment framework combining cross-modal contrastive learning and optimal adaptive transport for superior performance.
🔹 Publication Date: Published on Jan 31
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.00682
• PDF: https://arxiv.org/pdf/2602.00682
• Github: https://github.com/6lyc/RecGOAT-LLM4Rec
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#RecGOAT #LLM #RecommendationSystems #MultimodalAI #GraphNeuralNetworks
✨POP: Prefill-Only Pruning for Efficient Large Model Inference
📝 Summary:
POP is a new stage-aware pruning method for large models. It omits deep layers during the computationally intensive prefill stage while using the full model for decoding. This achieves up to 1.37 times prefill speedup with minimal accuracy loss, overcoming limitations of prior pruning methods.
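Stage-aware execution reduces to running different layer subsets per inference stage. The toy sketch below uses a fixed depth cutoff for clarity; POP's actual choice of which deep layers to omit comes from the paper's analysis, not a hard-coded index:

```python
def run_layers(x, layers, stage, prefill_depth):
    """Stage-aware layer execution: prefill runs only the first
    `prefill_depth` layers; decoding runs the full stack."""
    active = layers[:prefill_depth] if stage == "prefill" else layers
    for f in active:
        x = f(x)
    return x

layers = [lambda x: x + 1 for _ in range(12)]             # 12 dummy "layers"
print(run_layers(0, layers, "prefill", prefill_depth=9))  # 9  -> deep layers skipped
print(run_layers(0, layers, "decode", prefill_depth=9))   # 12 -> full model
```

Since prefill cost scales with prompt length times depth, dropping the deepest layers only in that stage is where the reported ~1.37x speedup would come from, while decoding accuracy is preserved by the full stack.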
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03295
• PDF: https://arxiv.org/pdf/2602.03295
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #MachineLearning #LLM #ModelPruning #InferenceOptimization
✨MEG-XL: Data-Efficient Brain-to-Text via Long-Context Pre-Training
📝 Summary:
MEG-XL improves brain-to-text decoding by pre-training with 2.5 minutes of MEG context, far exceeding prior methods. This long-context approach dramatically boosts data efficiency, achieving supervised performance with only a fraction of the data and outperforming other models.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02494
• PDF: https://arxiv.org/pdf/2602.02494
• Github: https://github.com/neural-processing-lab/MEG-XL
🔹 Models citing this paper:
• https://huggingface.co/pnpl/MEG-XL
✨ Datasets citing this paper:
• https://huggingface.co/datasets/pnpl/LibriBrain
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#BrainToText #MEG #Neuroscience #DeepLearning #AI