✨s2n-bignum-bench: A practical benchmark for evaluating low-level code reasoning of LLMs
📝 Summary:
s2n-bignum-bench is a new benchmark evaluating LLMs on formal proof synthesis for industrial cryptographic assembly routines. It bridges the gap between competition math and real-world verification by requiring LLMs to generate HOL Light proofs for AWS s2n-bignum library code.
🔹 Publication Date: Published on Mar 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14628
• PDF: https://arxiv.org/pdf/2603.14628
• Project Page: https://kings-crown.github.io/s2n-bignum-leaderboard/
• Github: https://github.com/kings-crown/s2n-bignum-bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders
📝 Summary:
State space models demonstrate competitive performance as vision backbones for vision-language models, matching or exceeding transformer-based architectures while operating at smaller scales.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19209
• PDF: https://arxiv.org/pdf/2603.19209
• Project Page: https://lab-spell.github.io/vlm-ssm-vision-encoders/
• Github: https://github.com/raykuo18/vlm-ssm-vision-encoders
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos
📝 Summary:
TAPESTRY generates high-fidelity 360-degree turntable videos conditioned on 3D geometry, enabling consistent texture synthesis and neural rendering for complete 3D asset creation.
🔹 Publication Date: Published on Mar 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17735
• PDF: https://arxiv.org/pdf/2603.17735
• Project Page: https://zerone182.github.io/TAPESTRY/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck
📝 Summary:
This paper reformulates efficient LLM reasoning as a lossy compression problem using the Conditional Information Bottleneck, modeling reasoning as a computational bridge that carries only essential information: maximizing task reward while compressing completions.
🔹 Publication Date: Published on Mar 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.08462
• PDF: https://arxiv.org/pdf/2603.08462
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Probing Cultural Signals in Large Language Models through Author Profiling
📝 Summary:
Large language models exhibit systematic cultural biases when performing author profiling from song lyrics, with varying degrees of ethnic alignment across different models.
🔹 Publication Date: Published on Mar 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16749
• PDF: https://arxiv.org/pdf/2603.16749
• Github: https://github.com/ValentinLafargue/CulturalProbingLLM
✨ Datasets citing this paper:
• https://huggingface.co/datasets/ValentinLAFARGUE/AuthorProfilingResults
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ReLi3D: Relightable Multi-view 3D Reconstruction with Disentangled Illumination
📝 Summary:
ReLi3D is a unified pipeline that reconstructs 3D geometry, materials, and illumination from multi-view images. It uses a transformer and two-path prediction to disentangle these elements, enabling near-instantaneous generation of relightable 3D assets.
🔹 Publication Date: Published on Mar 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19753
• PDF: https://arxiv.org/pdf/2603.19753
• Project Page: https://reli3d.jdihlmann.com/
• Github: https://github.com/Stability-AI/ReLi3D
🔹 Models citing this paper:
• https://huggingface.co/StabilityLabs/ReLi3D
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DROID-SLAM in the Wild
📝 Summary:
A real-time RGB SLAM system handles dynamic and cluttered environments. It estimates per-pixel uncertainty from multi-view visual features via differentiable bundle adjustment. This enables state-of-the-art performance at real-time speeds.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19076
• PDF: https://arxiv.org/pdf/2603.19076
• Project Page: https://moyangli00.github.io/droid-w/
• Github: https://github.com/MoyangLi00/DROID-W
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ReLMXEL: Adaptive RL-Based Memory Controller with Explainable Energy and Latency Optimization
📝 Summary:
Reducing latency and energy consumption is critical to improving the efficiency of memory systems in modern computing...
🔹 Publication Date: Published on Mar 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17309
• PDF: https://arxiv.org/pdf/2603.17309
• Project Page: https://github.com/Chirag-Sai-Panuganti/ReLMXEL
• Github: https://github.com/Chirag-Sai-Panuganti/ReLMXEL
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Human-AI Synergy in Agentic Code Review
📝 Summary:
Code review is a critical software engineering practice where developers review code changes before integration...
🔹 Publication Date: Published on Mar 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15911
• PDF: https://arxiv.org/pdf/2603.15911
• Github: https://github.com/Software-Evolution-Analytics-Lab-SEAL/AI_Vs_Human_Codereview
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Multiscale Switch for Semi-Supervised and Contrastive Learning in Medical Ultrasound Image Segmentation
📝 Summary:
This paper introduces Switch, a semi-supervised learning framework for medical ultrasound segmentation. It uses multiscale patch mixing and frequency-domain contrastive learning for robust features. Switch outperforms state-of-the-art methods and even fully supervised baselines while using very little labeled data.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18655
• PDF: https://arxiv.org/pdf/2603.18655
• Github: https://github.com/jinggqu/Switch
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MedicalImaging #ImageSegmentation #SemiSupervisedLearning #ContrastiveLearning #DeepLearning
✨Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States
📝 Summary:
LLM post-training hits a capability ceiling by using expanding action histories instead of compact Markov states. This work reintroduces explicit Markov states, significantly reducing sample complexity and breaking performance boundaries to unlock new reasoning capabilities.
🔹 Publication Date: Published on Mar 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19987
• PDF: https://arxiv.org/pdf/2603.19987
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Adaptive Layerwise Perturbation: Unifying Off-Policy Corrections for LLM RL
📝 Summary:
Adaptive Layerwise Perturbation (ALP) addresses policy staleness and training-inference mismatch in large language model reinforcement learning by injecting learnable perturbations into hidden states.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19470
• PDF: https://arxiv.org/pdf/2603.19470
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMRL #ReinforcementLearning #LargeLanguageModels #DeepLearning #AI
Forwarded from Machine Learning with Python
𝐕𝐢𝐬𝐮𝐚𝐥 𝐛𝐥𝐨𝐠 on Vision Transformers is live.
https://vizuaranewsletter.com/p/vision-transformers?r=5b5pyd&utm_campaign=post&utm_medium=web
Learn how ViT works from the ground up, and fine-tune one on a real classification dataset.
𝐒𝐨𝐦𝐞 𝐑𝐞𝐬𝐨𝐮𝐫𝐜𝐞𝐬
ViT paper dissection
https://youtube.com/watch?v=U_sdodhcBC4
Build ViT from Scratch
https://youtube.com/watch?v=ZRo74xnN2SI
Original Paper
https://arxiv.org/abs/2010.11929
https://t.iss.one/CodeProgrammer
CNNs process images through small sliding filters. Each filter only sees a tiny local region, and the model has to stack many layers before distant parts of an image can even talk to each other.
Vision Transformers threw that whole approach out.
ViT chops an image into patches, treats each patch like a token, and runs self-attention across the full sequence.
Every patch can attend to every other patch from the very first layer. No stacking required.
That global view from layer one is what made ViT surpass CNNs on large-scale benchmarks.
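To make the patch-as-token idea concrete, here is a minimal NumPy sketch of the "chop an image into patches" step, not taken from the blog; the shapes assume ViT's standard 16×16 patches on a 224×224 RGB image:

```python
import numpy as np

def patchify(image, patch_size):
    """Split an image (H, W, C) into non-overlapping flattened patches.

    Returns (num_patches, patch_size * patch_size * C): one "token" per
    patch, ready for a linear embedding layer.
    """
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    rows, cols = h // patch_size, w // patch_size
    patches = (
        image.reshape(rows, patch_size, cols, patch_size, c)
             .transpose(0, 2, 1, 3, 4)   # (rows, cols, p, p, c)
             .reshape(rows * cols, patch_size * patch_size * c)
    )
    return patches

# Toy 224x224 RGB image, 16x16 patches as in ViT-Base/16
img = np.random.rand(224, 224, 3)
tokens = patchify(img, 16)
print(tokens.shape)  # (196, 768): 14*14 patches, each 16*16*3 values
```

Each of the 196 rows then plays the role a word embedding plays in an NLP transformer.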
𝐖𝐡𝐚𝐭 𝐭𝐡𝐞 𝐛𝐥𝐨𝐠 𝐜𝐨𝐯𝐞𝐫𝐬:
- Introduction to Vision Transformers and comparison with CNNs
- Adapting transformers to images: patch embeddings and flattening
- Positional encodings in Vision Transformers
- Encoder-only structure for classification
- Benefits and drawbacks of ViT
- Real-world applications of Vision Transformers
- Hands-on: fine-tuning ViT for image classification
The image below shows how self-attention connects every pixel to every other pixel at once, while convolution only sees a small local window. That's why ViT captures things CNNs miss, like the optical illusion painting where distant patches form a hidden face.
The architecture is simple. Split image into patches, flatten them into embeddings (like words in a sentence), run them through a Transformer encoder, and the class token collects info from all patches for the final prediction. Patch in, class out.
Inside attention: each patch (query) compares itself to all other patches (keys), softmax gives attention weights, and the weighted sum of values produces a new representation aware of the full image. The blog also visualizes what the CLS token actually attends to through attention heatmaps.
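The query/key/softmax/value sequence above can be sketched in a few lines of NumPy. This is a single attention head over a CLS token plus 196 patch tokens (the ViT-Base/16 sequence length); the random weights are illustrative, not trained:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head self-attention: every token attends to every token."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (N, N) token-to-token scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
d = 64
tokens = rng.standard_normal((197, d))       # 1 CLS + 196 patch tokens
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out, weights = self_attention(tokens, Wq, Wk, Wv)
print(out.shape)  # (197, 64)
```

Row 0 of `weights` is exactly the CLS-token attention distribution that the blog's heatmaps visualize.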
The second half of the blog is hands-on code. I fine-tuned ViT-Base from Google (86M params) on the Oxford-IIIT Pet dataset: 37 breeds, ~7,400 images.
𝐁𝐥𝐨𝐠 𝐋𝐢𝐧𝐤
https://vizuaranewsletter.com/p/vision-transformers?r=5b5pyd&utm_campaign=post&utm_medium=web
✨Automatic detection of Gen-AI texts: A comparative framework of neural models
📝 Summary:
This paper compares neural models for detecting AI-generated text. It found that supervised machine learning detectors achieved more stable and robust performance than commercial tools across different languages and domains.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18750
• PDF: https://arxiv.org/pdf/2603.18750
• Project Page: https://huggingface.co/datasets/cristian03/ARTandMH
• Github: https://github.com/cristian03git/DETECTION_GENAI
✨ Datasets citing this paper:
• https://huggingface.co/datasets/cristian03/ARTandMH
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#GenAI #AIDetection #MachineLearning #NeuralNetworks #NLP
✨From Masks to Pixels and Meaning: A New Taxonomy, Benchmark, and Metrics for VLM Image Tampering
📝 Summary:
This paper shifts VLM image tampering detection from coarse object masks to pixel-level analysis with semantic understanding. It introduces a new taxonomy, benchmark, and metrics to evaluate both localization accuracy and the meaning of image modifications. This offers a more rigorous standard fo...
🔹 Publication Date: Published on Mar 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20193
• PDF: https://arxiv.org/pdf/2603.20193
• Github: https://github.com/VILA-Lab/PIXAR
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VLM #ImageTampering #DeepfakeDetection #ComputerVision #AIResearch
Forwarded from Machine Learning with Python
Follow the Machine Learning with Python channel on WhatsApp: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
✨LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
📝 Summary:
LongCat-Flash-Prover is a 560B MoE model advancing Lean4 formal reasoning using agentic tool integration. It employs a hybrid framework and hierarchical policy optimization for stable training. It achieves state-of-the-art results, including 97.1% on MiniF2F-Test and improved performance on Prove...
🔹 Publication Date: Published on Mar 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.21065
• PDF: https://arxiv.org/pdf/2603.21065
• Project Page: https://github.com/meituan-longcat/LongCat-Flash-Prover
• Github: https://github.com/meituan-longcat/LongCat-Flash-Prover
🔹 Models citing this paper:
• https://huggingface.co/meituan-longcat/LongCat-Flash-Prover
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT
📝 Summary:
Multi-task supervised fine-tuning with heterogeneous learning dynamics benefits from an iterative overfitting-aware search algorithm that improves performance across diverse datasets and compute budgets.
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.21606
• PDF: https://arxiv.org/pdf/2603.21606
• Github: https://github.com/reiss-koh/msft
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ToolRosetta: Bridging Open-Source Repositories and Large Language Model Agents through Automated Tool Standardization
📝 Summary:
Reusing and invoking existing code remains costly and unreliable, as most practical tools are embedded in heterogeneous...
🔹 Publication Date: Published on Mar 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09290
• PDF: https://arxiv.org/pdf/2603.09290
• Project Page: https://sdiaa.tech/projects
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMAgents #OpenSource #ToolStandardization #AIResearch #DataScience
✨PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
📝 Summary:
PowerInfer, a high-speed LLM inference engine for personal computers, enhances efficiency using hotspot neuron analysis, GPU-CPU hybrid computation, adaptive predictors, and neuron-aware sparse operators.
🔹 Publication Date: Published on Dec 16, 2023
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2312.12456
• PDF: https://arxiv.org/pdf/2312.12456
• Github: https://github.com/sjtu-ipads/powerinfer
🔹 Models citing this paper:
• https://huggingface.co/SparseLLM/prosparse-llama-2-7b
• https://huggingface.co/openbmb/MiniCPM-S-1B-sft
• https://huggingface.co/openbmb/MiniCPM-S-1B-sft-gguf
✨ Spaces citing this paper:
• https://huggingface.co/spaces/FallnAI/Quantize-HF-Models
• https://huggingface.co/spaces/openfree/LLM_Quantization
• https://huggingface.co/spaces/seawolf2357/LLM_Quantization
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research