✨LoopViT: Scaling Visual ARC with Looped Transformers
📝 Summary:
LoopViT introduces a recursive vision transformer architecture that decouples reasoning depth from model capacity through weight-tied recurrence and dynamic exit mechanisms, achieving superior visual...
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02156
• PDF: https://arxiv.org/pdf/2602.02156
• Github: https://github.com/WenjieShu/LoopViT
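💡 Code sketch: a minimal, hedged rendering of the looped idea — one weight-tied block applied repeatedly, with a learned exit head for dynamic halting. Dimensions, the halting rule, and the threshold are illustrative assumptions, not the paper's implementation.
```python
# Minimal sketch of weight-tied recurrence with a dynamic exit (illustrative assumptions, not the paper's code).
import torch
import torch.nn as nn

class LoopedViTBlock(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.exit_head = nn.Linear(dim, 1)   # halting score read from the class token

    def forward(self, tokens, max_loops=12, threshold=0.9):
        exited_at = None
        for step in range(max_loops):
            tokens = self.block(tokens)                             # the same weights every iteration
            p_halt = torch.sigmoid(self.exit_head(tokens[:, 0]))    # per-sample exit probability
            if bool((p_halt > threshold).all()):                    # simple batch-level exit rule (assumption)
                exited_at = step + 1
                break
        return tokens, exited_at

x = torch.randn(2, 65, 256)               # 2 images, 64 patch tokens + 1 class token
out, steps = LoopedViTBlock()(x)
print(out.shape, steps)
```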
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Alternating Reinforcement Learning for Rubric-Based Reward Modeling in Non-Verifiable LLM Post-Training
📝 Summary:
The Rubric-ARM framework jointly optimizes rubric generation and judging through reinforcement learning to improve response-quality assessment in creative and open-ended tasks.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01511
• PDF: https://arxiv.org/pdf/2602.01511
🔹 Models citing this paper:
• https://huggingface.co/OpenRubrics/RubricARM-8B-Rubric
• https://huggingface.co/OpenRubrics/RubricARM-8B-Judge
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨PISA: Piecewise Sparse Attention Is Wiser for Efficient Diffusion Transformers
📝 Summary:
PISA is a novel sparse attention method that improves diffusion transformer efficiency by approximating non-critical attention blocks instead of discarding them, achieving faster processing with maintained quality.
🔹 Publication Date: Published on Feb 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01077
• PDF: https://arxiv.org/pdf/2602.01077
• Github: https://github.com/xie-lab-ml/piecewise-sparse-attention
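💡 Code sketch: the "approximate rather than discard" idea in miniature — exact attention over the highest-scoring key/value blocks, plus one mean token standing in for each remaining block. The block-scoring rule and the mean-pooling approximation are assumptions for illustration, not PISA's exact algorithm.
```python
# Block attention that approximates non-critical blocks instead of dropping them (assumptions, not PISA's exact method).
import torch

def piecewise_block_attention(q, k, v, block=64, keep_ratio=0.25):
    """q, k, v: (L, d). Tail tokens beyond a full block are ignored for brevity."""
    L, d = k.shape
    kb = k[: L - L % block].reshape(-1, block, d)         # (num_blocks, block, d)
    vb = v[: L - L % block].reshape(-1, block, d)
    # Score each key block by its affinity to the queries; keep the top blocks exactly.
    block_scores = (q @ kb.mean(dim=1).T).mean(dim=0)      # (num_blocks,)
    n_keep = max(1, int(keep_ratio * kb.shape[0]))
    keep = torch.zeros(kb.shape[0], dtype=torch.bool)
    keep[torch.topk(block_scores, n_keep).indices] = True
    # Exact tokens for critical blocks; one mean key/value token stands in for each remaining block.
    k_eff = torch.cat([kb[keep].reshape(-1, d), kb[~keep].mean(dim=1)])
    v_eff = torch.cat([vb[keep].reshape(-1, d), vb[~keep].mean(dim=1)])
    attn = torch.softmax(q @ k_eff.T / d ** 0.5, dim=-1)
    return attn @ v_eff

q, k, v = torch.randn(256, 64), torch.randn(256, 64), torch.randn(256, 64)
print(piecewise_block_attention(q, k, v).shape)            # torch.Size([256, 64])
```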
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OVD: On-policy Verbal Distillation
📝 Summary:
On-policy Verbal Distillation (OVD) enables efficient knowledge transfer from teacher to student models by replacing token-level probability matching with trajectory matching using discrete verbal scores.
🔹 Publication Date: Published on Jan 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.21968
• PDF: https://arxiv.org/pdf/2601.21968
• Project Page: https://OVD.github.io
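💡 Code sketch: the shape of an on-policy loop in which a teacher returns a discrete verbal score for each student rollout and that score weights a likelihood update. The scoring callable and the Reinforce-style surrogate are placeholders/assumptions, not the paper's loss.
```python
# Shape of an on-policy verbal-distillation step (scoring callable and score-weighted surrogate are assumptions).
import torch

def ovd_step(student, tokenizer, teacher_score, prompts, optimizer):
    """student: a causal LM; teacher_score: callable (prompt, response) -> discrete score in {1..5}."""
    losses = []
    for prompt in prompts:
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        rollout = student.generate(ids, max_new_tokens=128, do_sample=True)    # on-policy trajectory
        response = tokenizer.decode(rollout[0, ids.shape[1]:], skip_special_tokens=True)
        score = teacher_score(prompt, response)                                 # teacher's verbal score
        weight = (score - 3) / 2.0                                              # map {1..5} -> [-1, 1]
        out = student(rollout, labels=rollout)      # NLL of the rollout (prompt masking omitted for brevity)
        losses.append(weight * out.loss)            # raise likelihood of well-rated rollouts, lower poorly-rated ones
    loss = torch.stack(losses).mean()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```
In practice `student` could be any Hugging Face causal LM and `teacher_score` a call to a stronger model with a grading prompt; both names here are stand-ins.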
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space
📝 Summary:
FSVideo is a fast transformer-based image-to-video diffusion framework that uses a compressed video autoencoder, a diffusion transformer architecture with enhanced layer memory, and multi-resolution generation.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02092
• PDF: https://arxiv.org/pdf/2602.02092
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SPARKLING: Balancing Signal Preservation and Symmetry Breaking for Width-Progressive Learning
📝 Summary:
SPARKLING is a framework for mid-stage width expansion in deep learning models that preserves signal and breaks symmetry to stabilize training and reduce computational costs.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02472
• PDF: https://arxiv.org/pdf/2602.02472
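💡 Code sketch: a Net2Net-style function-preserving width expansion with small noise to break symmetry between duplicated units — the two ingredients the summary names. The duplication and rescaling rule here is illustrative, not SPARKLING's exact scheme.
```python
# Function-preserving width expansion with symmetry-breaking noise (illustrative, not the paper's exact rule).
import torch
import torch.nn as nn

def widen_pair(fc1: nn.Linear, fc2: nn.Linear, new_width: int, noise=1e-3):
    """Expand fc1's output width (and fc2's input width) from fc1.out_features to new_width."""
    old = fc1.out_features
    idx = torch.randint(0, old, (new_width - old,))            # units to duplicate
    # fc1: copy the duplicated rows, perturbing them slightly so the copies can diverge during training.
    w1 = torch.cat([fc1.weight.data, fc1.weight.data[idx] + noise * torch.randn(len(idx), fc1.in_features)])
    b1 = torch.cat([fc1.bias.data, fc1.bias.data[idx]])
    # fc2: divide each incoming column by its copy count so the layer's output is (almost) unchanged.
    counts = torch.ones(old)
    counts.index_add_(0, idx, torch.ones(len(idx)))
    w2_old = fc2.weight.data / counts
    w2 = torch.cat([w2_old, w2_old[:, idx]], dim=1)
    new1, new2 = nn.Linear(fc1.in_features, new_width), nn.Linear(new_width, fc2.out_features)
    new1.weight.data, new1.bias.data = w1, b1
    new2.weight.data, new2.bias.data = w2, fc2.bias.data.clone()
    return new1, new2

fc1, fc2 = nn.Linear(16, 32), nn.Linear(32, 8)
x = torch.randn(4, 16)
y_before = fc2(torch.relu(fc1(x)))
wf1, wf2 = widen_pair(fc1, fc2, 48)
y_after = wf2(torch.relu(wf1(x)))
print((y_before - y_after).abs().max())   # small, up to the symmetry-breaking noise
```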
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨An Empirical Study of World Model Quantization
📝 Summary:
Post-training quantization effects in world models reveal unique failure modes and trade-offs between accuracy, bit-width, and planning performance, particularly in encoder-predictor module asymmetries.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02110
• PDF: https://arxiv.org/pdf/2602.02110
• Github: https://github.com/huawei-noah/noah-research/tree/master/QuantWM
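💡 Code sketch: a minimal per-module post-training weight quantizer, applied at different bit-widths to a toy encoder and predictor to mirror the kind of asymmetry the study probes. The quantizer and the 8-bit/4-bit split are illustrative assumptions, not the paper's protocol.
```python
# Minimal uniform post-training weight quantization, applied per module (illustrative).
import torch
import torch.nn as nn

def quantize_weights_(module: nn.Module, bits: int):
    """Symmetric per-tensor uniform quantization of all Linear/Conv2d weights in `module`, in place."""
    qmax = 2 ** (bits - 1) - 1
    for m in module.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            w = m.weight.data
            scale = w.abs().max() / qmax
            m.weight.data = torch.clamp((w / scale).round(), -qmax - 1, qmax) * scale

class ToyWorldModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
        self.predictor = nn.Sequential(nn.Linear(32 + 4, 64), nn.ReLU(), nn.Linear(64, 32))

    def forward(self, obs, act):
        z = self.encoder(obs)
        return self.predictor(torch.cat([z, act], dim=-1))

wm = ToyWorldModel()
quantize_weights_(wm.encoder, bits=8)     # e.g., keep the encoder at higher precision
quantize_weights_(wm.predictor, bits=4)   # and probe how a low-bit predictor affects rollouts
print(wm(torch.randn(2, 64), torch.randn(2, 4)).shape)
```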
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Evolving from Tool User to Creator via Training-Free Experience Reuse in Multimodal Reasoning
📝 Summary:
A training-free framework enables language model agents to automatically create and optimize tools during inference, improving their reasoning capabilities through self-evolution and memory consolidation.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01983
• PDF: https://arxiv.org/pdf/2602.01983
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Influence Guided Sampling for Domain Adaptation of Text Retrievers
📝 Summary:
A reinforcement learning-based sampling framework adaptively reweights training datasets to improve embedding model performance while reducing GPU costs.
🔹 Publication Date: Published on Jan 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.21759
• PDF: https://arxiv.org/pdf/2601.21759
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨INDIBATOR: Diverse and Fact-Grounded Individuality for Multi-Agent Debate in Molecular Discovery
📝 Summary:
Multi-agent systems for molecular discovery that use individualized scientist profiles based on publication and molecular history outperform traditional role-based approaches.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01815
• PDF: https://arxiv.org/pdf/2602.01815
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios
📝 Summary:
AgentIF-OneDay is a new benchmark evaluating AI agents on diverse daily tasks using natural language instructions. It assesses problem-solving, attachment understanding, and file-based outputs across three user-centric categories. Benchmarking shows leading agent products and LLM APIs excel in th...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20613
• PDF: https://arxiv.org/pdf/2601.20613
✨ Datasets citing this paper:
• https://huggingface.co/datasets/xbench/AgentIF-OneDay
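💡 Code sketch: the benchmark data linked above can be pulled straight from the Hub for inspection; whether a config name is needed and what the record fields are should be taken from the dataset card, not from this snippet.
```python
# Quick look at the benchmark data (default config assumed; see the dataset card for the actual schema).
from datasets import load_dataset

ds = load_dataset("xbench/AgentIF-OneDay")
print(ds)                        # available splits and columns
first_split = next(iter(ds))
print(ds[first_split][0])        # one task record
```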
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AIAgents #LLMs #Benchmark #InstructionFollowing #GeneralAI
✨VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration
📝 Summary:
VisionTrim accelerates MLLMs by selecting dominant visual tokens and merging them with text guidance. This training-free framework improves efficiency without performance loss, addressing high computational costs from excessive visual data.
🔹 Publication Date: Published on Jan 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.22674
• PDF: https://arxiv.org/pdf/2601.22674
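💡 Code sketch: a generic "keep the text-relevant tokens, merge the rest" routine in the spirit of the summary. The relevance score, keep budget, and merging rule are assumptions, not VisionTrim's exact procedure.
```python
# Text-guided "select dominant, merge the rest" token compression (illustrative assumptions).
import torch
import torch.nn.functional as F

def compress_visual_tokens(vis, txt, keep=64):
    """vis: (Nv, d) visual tokens; txt: (Nt, d) text tokens. Returns (keep, d)."""
    scores = (F.normalize(vis, dim=-1) @ F.normalize(txt, dim=-1).T).max(dim=1).values  # text relevance
    keep_idx = torch.topk(scores, keep).indices
    drop_mask = torch.ones(vis.shape[0], dtype=torch.bool)
    drop_mask[keep_idx] = False
    kept, dropped = vis[keep_idx], vis[drop_mask]
    # Merge each dropped token into its most similar kept token (averaged over everything merged there).
    assign = (F.normalize(dropped, dim=-1) @ F.normalize(kept, dim=-1).T).argmax(dim=1)
    merged, counts = kept.clone(), torch.ones(keep)
    merged.index_add_(0, assign, dropped)
    counts.index_add_(0, assign, torch.ones(len(assign)))
    return merged / counts.unsqueeze(1)

vis, txt = torch.randn(576, 512), torch.randn(32, 512)
print(compress_visual_tokens(vis, txt).shape)   # torch.Size([64, 512])
```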
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MLLM #VisionTokenCompression #ModelAcceleration #DeepLearning #TrainingFree
✨On the Limits of Layer Pruning for Generative Reasoning in LLMs
📝 Summary:
Layer pruning degrades LLM generative reasoning tasks, unlike classification which recovers well. While finetuning helps, generative reasoning recovery remains fundamentally limited, especially at higher pruning ratios.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01997
• PDF: https://arxiv.org/pdf/2602.01997
• Github: https://github.com/safal312/on-the-limits-of-layer-pruning
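💡 Code sketch: the experimental operation itself — removing a contiguous span of transformer layers — demoed on a toy stack. For a real LLM the same edit targets the model's layer list, whose attribute path varies by family (the comment below names a Llama-style path as an assumption).
```python
# Dropping a contiguous block of layers (the setup the paper studies), demoed on a toy stack.
import torch
import torch.nn as nn

def prune_layers(layers: nn.ModuleList, start: int, n_drop: int) -> nn.ModuleList:
    """Return a new ModuleList with layers[start : start + n_drop] removed."""
    kept = [l for i, l in enumerate(layers) if not (start <= i < start + n_drop)]
    return nn.ModuleList(kept)

blocks = nn.ModuleList(nn.TransformerEncoderLayer(128, 4, 256, batch_first=True) for _ in range(12))
blocks = prune_layers(blocks, start=6, n_drop=4)        # e.g., a 33% pruning ratio
x = torch.randn(2, 16, 128)
for blk in blocks:
    x = blk(x)
print(len(blocks), x.shape)
# For a Hugging Face Llama-style model the same idea applies to model.model.layers (attribute path is an assumption).
```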
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMs #ModelPruning #AIResearch #GenerativeAI #DeepLearning
✨Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models
📝 Summary:
Generalizable Predictive Prompt Selection (GPS) efficiently selects informative prompts for RL-enhanced language models using Bayesian inference and a lightweight generative model. This method significantly improves training efficiency, final performance, and test-time efficiency.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01970
• PDF: https://arxiv.org/pdf/2602.01970
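💡 Code sketch: one way to realize "a lightweight predictive model steering prompt selection" — score prompts by predicted solve rate and pick those near 0.5, where rollouts carry the most gradient signal. The predictor architecture and the p(1 - p) criterion are illustrative assumptions, not the paper's Bayesian procedure.
```python
# Selecting informative prompts via a lightweight success-rate predictor (illustrative heuristic).
import torch
import torch.nn as nn

class PromptValuePredictor(nn.Module):
    """Tiny MLP over prompt embeddings that predicts the policy's solve probability."""
    def __init__(self, dim=384):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, emb):
        return torch.sigmoid(self.net(emb)).squeeze(-1)

def select_prompts(predictor, prompt_embs, k=32):
    with torch.no_grad():
        p = predictor(prompt_embs)
    # Prompts the policy solves about half the time yield the largest advantage variance -> most signal.
    informativeness = p * (1 - p)
    return torch.topk(informativeness, k).indices

predictor = PromptValuePredictor()
embs = torch.randn(1000, 384)            # precomputed prompt embeddings (assumption)
batch_idx = select_prompts(predictor, embs, k=32)
print(batch_idx.shape)                    # the 32 prompts to roll out this RL step
```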
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #ReinforcementLearning #PromptEngineering #AI #MachineLearning
✨Implicit neural representation of textures
📝 Summary:
This work designs new texture implicit neural representations that operate continuously over UV coordinate space. Experiments show they achieve good image quality while balancing memory and rendering time, useful for real-time rendering and downstream tasks.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02354
• PDF: https://arxiv.org/pdf/2602.02354
• Project Page: https://peterhuistyping.github.io/INR-Tex/
• Github: https://github.com/PeterHUistyping/INR-Tex
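💡 Code sketch: the basic object — a coordinate MLP with Fourier features mapping continuous (u, v) to RGB, queryable at any resolution from the same weights. Layer sizes and the encoding are arbitrary illustrative choices, not the paper's architectures.
```python
# A coordinate-MLP texture: continuous (u, v) in [0, 1]^2 -> RGB (illustrative sizes).
import torch
import torch.nn as nn

class TextureINR(nn.Module):
    def __init__(self, n_freq=8, hidden=128):
        super().__init__()
        self.register_buffer("freqs", 2.0 ** torch.arange(n_freq) * torch.pi)
        self.mlp = nn.Sequential(
            nn.Linear(4 * n_freq, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, uv):                        # uv: (..., 2)
        x = uv.unsqueeze(-1) * self.freqs         # (..., 2, n_freq)
        feats = torch.cat([x.sin(), x.cos()], dim=-1).flatten(-2)   # Fourier features, (..., 4 * n_freq)
        return self.mlp(feats)

tex = TextureINR()
u, v = torch.meshgrid(torch.linspace(0, 1, 256), torch.linspace(0, 1, 256), indexing="ij")
rgb = tex(torch.stack([u, v], dim=-1))            # query any resolution from the same weights
print(rgb.shape)                                   # torch.Size([256, 256, 3])
# Training would minimize MSE between rgb and samples of the reference texture at those UVs.
```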
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ImplicitNeuralRepresentations #ComputerGraphics #DeepLearning #TextureModeling #RealTimeRendering
✨Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics
📝 Summary:
This paper unifies LLM control methods as dynamic weight updates, revealing a consistent preference-utility trade-off. It introduces SPLIT, a new steering method that enhances preference while better preserving utility.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02343
• PDF: https://arxiv.org/pdf/2602.02343
• Github: https://github.com/zjunlp/EasyEdit/blob/main/examples/SPLIT.md
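💡 Code sketch: the baseline operation this unified view builds on — steering a model by shifting one layer's hidden states along a direction vector via a forward hook. This is generic activation steering for illustration, not SPLIT itself; the layer choice, direction, and strength are assumptions.
```python
# Generic activation steering via a forward hook (the baseline operation; not SPLIT itself).
import torch
import torch.nn as nn

def add_steering_hook(layer: nn.Module, direction: torch.Tensor, alpha: float = 4.0):
    """Shift the layer's output hidden states along `direction` at every forward pass."""
    unit = direction / direction.norm()

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + alpha * unit.to(hidden.dtype)
        return (steered, *output[1:]) if isinstance(output, tuple) else steered

    return layer.register_forward_hook(hook)

# Toy demo: steer the 3rd block of a small encoder stack.
blocks = nn.ModuleList(nn.TransformerEncoderLayer(64, 4, 128, batch_first=True) for _ in range(4))
direction = torch.randn(64)                       # e.g., a difference-of-means "preference" direction
handle = add_steering_hook(blocks[2], direction)
x = torch.randn(1, 10, 64)
for blk in blocks:
    x = blk(x)
handle.remove()                                    # removing the hook restores the unsteered model
print(x.shape)
```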
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #AI #MachineLearning #LLMSteering #DeepLearning
✨SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization
📝 Summary:
SLIME is a new objective for aligning large language models, addressing 'unlearning' and 'formatting collapse' issues in prior methods. It maximizes preferred response likelihood, stabilizes rejected token probabilities, and uses dual-margin constraints, achieving superior performance and stable ...
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02383
• PDF: https://arxiv.org/pdf/2602.02383
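💡 Code sketch: to make the margin language concrete, an illustrative preference loss that combines a DPO-style margin term with an anchor that penalizes the chosen response's likelihood falling below the reference (the "unlearning" failure). This is a composite written for illustration, not SLIME's published objective.
```python
# An illustrative margin-based preference loss with a chosen-likelihood anchor (not SLIME's exact objective).
import torch
import torch.nn.functional as F

def margin_preference_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected,
                           beta=0.1, margin=1.0, anchor=1.0):
    """All inputs: per-example sequence log-likelihoods under the policy / a frozen reference."""
    chosen_adv = beta * (logp_chosen - ref_chosen)
    rejected_adv = beta * (logp_rejected - ref_rejected)
    # Margin term: prefer the chosen response by at least `margin` in implicit reward.
    pref = -F.logsigmoid(chosen_adv - rejected_adv - margin)
    # Anchor term: discourage the chosen likelihood from falling below the reference ("unlearning").
    keep_chosen = F.relu(ref_chosen - logp_chosen)
    return (pref + anchor * keep_chosen).mean()

logp_c, logp_r = torch.tensor([-12.0, -30.0]), torch.tensor([-15.0, -28.0])
ref_c, ref_r = torch.tensor([-13.0, -25.0]), torch.tensor([-14.0, -27.0])
print(margin_preference_loss(logp_c, logp_r, ref_c, ref_r))
```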
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #AIAlignment #MachineLearning #NLP #DeepLearning
✨TRIP-Bench: A Benchmark for Long-Horizon Interactive Agents in Real-World Scenarios
📝 Summary:
TRIP-Bench introduces a challenging long-horizon benchmark for evaluating LLM agents in complex, real-world travel planning. Existing models struggle significantly on this benchmark. To improve performance, the authors propose GTPO, an online reinforcement learning method that enhances constraint...
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01675
• PDF: https://arxiv.org/pdf/2602.01675
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMAgents #ReinforcementLearning #AI #NLP #Benchmarking
✨AI-Generated Image Detectors Overrely on Global Artifacts: Evidence from Inpainting Exchange
📝 Summary:
AI image detectors for inpainting overrely on global spectral shifts from VAEs, not local content. Inpainting Exchange (INP-X) reveals this weakness, dramatically reducing detector accuracy. This calls for content-aware detection methods.
🔹 Publication Date: Published on Jan 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.00192
• PDF: https://arxiv.org/pdf/2602.00192
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #ImageDetection #Inpainting #ComputerVision #DeepfakeDetection
✨Enhancing Multi-Image Understanding through Delimiter Token Scaling
📝 Summary:
Scaling delimiter token hidden states in vision-language models reduces cross-image information leakage, improving multi-image reasoning. This enhances image distinction and performance on multi-image benchmarks. The method also aids multi-document understanding without extra training or inference overhead.
🔹 Publication Date: Published on Feb 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01984
• PDF: https://arxiv.org/pdf/2602.01984
• Github: https://github.com/MYMY-young/DelimScaling
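💡 Code sketch: the intervention reduces to multiplying the hidden states at image-delimiter positions by a scalar. Where the delimiter mask comes from, which layer to apply it at, and the scale value are assumptions; see the repo for the actual method.
```python
# Scaling hidden states at delimiter positions (illustrative; layer choice and scale are assumptions).
import torch

def scale_delimiters(hidden, delimiter_mask, scale=1.5):
    """hidden: (B, T, d); delimiter_mask: (B, T) bool marking image-separator tokens."""
    out = hidden.clone()
    out[delimiter_mask] = out[delimiter_mask] * scale
    return out

B, T, d = 1, 12, 16
hidden = torch.randn(B, T, d)
mask = torch.zeros(B, T, dtype=torch.bool)
mask[0, [3, 7, 11]] = True            # e.g., positions of <image_end>-style separator tokens
print(scale_delimiters(hidden, mask).shape)
```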
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VisionLanguageModels #MultiModalAI #TokenScaling #DeepLearning #AIResearch
✨PolySAE: Modeling Feature Interactions in Sparse Autoencoders via Polynomial Decoding
📝 Summary:
PolySAE enhances sparse autoencoders with polynomial decoding to model complex feature interactions and compositional structure. It improves probing F1 by 8% and captures relationships independent of feature co-occurrence while maintaining interpretability.
🔹 Publication Date: Published on Feb 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01322
• PDF: https://arxiv.org/pdf/2602.01322
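💡 Code sketch: one rendering of "polynomial decoding" — a sparse autoencoder whose decoder adds a low-rank second-order (pairwise) term over the codes, computed via the factorization-machine identity. Sizes and the low-rank parameterization are illustrative assumptions, not the paper's architecture.
```python
# Sparse autoencoder with a second-order (pairwise) decoding term (illustrative rendering of the idea).
import torch
import torch.nn as nn

class PolySAE(nn.Module):
    def __init__(self, d_model=256, n_features=1024, rank=16):
        super().__init__()
        self.enc = nn.Linear(d_model, n_features)
        self.dec_linear = nn.Linear(n_features, d_model)
        # Low-rank factorization of the pairwise-interaction decoder keeps it tractable.
        self.U = nn.Parameter(torch.randn(n_features, rank) * 0.01)
        self.dec_pair = nn.Linear(rank, d_model, bias=False)

    def forward(self, x):
        z = torch.relu(self.enc(x))                           # sparse codes (ReLU + L1 during training)
        pair = (z @ self.U) ** 2 - (z ** 2) @ (self.U ** 2)    # = 2 * sum_{i<j} z_i z_j u_i u_j, per rank component
        return self.dec_linear(z) + self.dec_pair(0.5 * pair), z

sae = PolySAE()
x = torch.randn(8, 256)
recon, codes = sae(x)
loss = (recon - x).pow(2).mean() + 1e-3 * codes.abs().mean()   # reconstruction + sparsity
print(recon.shape, float(loss))
```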
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research