AI & ML Papers
33K subscribers
7.11K photos
532 videos
24 files
7.77K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho

🔥 GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)

💡 The paper introduces GenericAgent, a self-evolving LLM agent system designed for long-horizon interactions. As interactions grow longer, accumulated tool descriptions, memories, and environmental feedback crowd out the information needed for decision-making, and performance degrades. The authors argue that long-horizon performance depends less on context length than on how much decision-relevant information is kept within a finite context budget.

To address this problem, the GenericAgent system is built around the principle of context information density maximization. The system consists of four main components: a minimal atomic tool set, a hierarchical on-demand memory, a self-evolution mechanism, and a context truncation and compression layer. The minimal atomic tool set keeps the interface simple, while the hierarchical on-demand memory only shows a small high-level view by default. The self-evolution mechanism turns verified past trajectories into reusable standard operating procedures and executable code, allowing the agent to learn from its experiences. The context truncation and compression layer maintains information density during long executions by removing unnecessary information.
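
The context truncation and compression layer can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Message` type, the whitespace-based `count_tokens`, and the `keep_recent` and `stub_words` parameters are all assumptions made for illustration. The only idea shown is that old, bulky tool outputs are shortened first so that recent, decision-relevant messages stay inside a fixed budget.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "system", "user", "assistant", or "tool"
    text: str


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def compress_context(messages, budget, keep_recent=2, stub_words=8):
    """Shorten old tool outputs, oldest first, until the context fits the budget."""
    out = [Message(m.role, m.text) for m in messages]  # copy; the input is left untouched
    total = sum(count_tokens(m.text) for m in out)
    older = out[: max(0, len(out) - keep_recent)]  # the newest messages are never shortened
    for m in older:
        if total <= budget:
            break
        words = m.text.split()
        if m.role != "tool" or len(words) <= stub_words:
            continue
        stub = " ".join(words[:stub_words]) + " [truncated]"
        total -= count_tokens(m.text) - count_tokens(stub)
        m.text = stub
    return out, total
```

The function returns the new token total alongside the messages, so a caller can tell whether the budget was actually met or whether a stronger step (for example, summarization) is still needed.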

The results show that GenericAgent consistently outperforms leading agent systems in terms of task completion, tool use efficiency, memory effectiveness, self-evolution, and web browsing. Moreover, GenericAgent achieves these results while using significantly fewer tokens and interactions, demonstrating its efficiency. The system also continues to evolve over time, allowing it to adapt to new situations and improve its performance. Overall, the paper presents a novel approach to building self-evolving large language model agents that can effectively handle long-horizon interactions and maximize context information density.


📅 Published on Apr 18

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2604.17091
• PDF: https://arxiv.org/pdf/2604.17091

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#TokenEfficientLLMs #SelfEvolvingAgents #ContextualInformationDensity #LargeLanguageModelAgents #LongHorizonInteractions

🔥 An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU

💡 The paper addresses fine-tuning large language models on a single GPU, where memory capacity is the main constraint. The authors propose SlideFormer, a system designed for single-GPU environments. Its key components are a lightweight asynchronous engine that overlaps GPU computation with CPU updates and multi-tier I/O, a heterogeneous memory management scheme that reduces peak memory usage, and optimized kernels that address key bottlenecks and integrate advanced I/O.

The asynchronous engine treats the GPU as a sliding window, allowing for efficient processing. The heterogeneous memory management scheme significantly reduces memory usage, making it possible to fine-tune larger models. The optimized kernels improve performance by solving key bottlenecks and integrating advanced I/O.
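
The sliding-window idea can be sketched in plain Python as a toy, under heavy assumptions: a background thread stands in for the asynchronous copy engine, `load_layer` stands in for a host-to-device transfer, and each "layer" is just a `(scale, shift)` pair. None of these names come from SlideFormer. The sketch shows only the overlap pattern, where layer `i + 1` is fetched while layer `i` is being computed, so at most two layers are resident at a time.

```python
from concurrent.futures import ThreadPoolExecutor


def load_layer(cpu_weights, i):
    # Stand-in for copying layer i from host memory to the device.
    return tuple(cpu_weights[i])


def apply_layer(weights, x):
    # Toy "layer": an affine map instead of a transformer block.
    scale, shift = weights
    return x * scale + shift


def sliding_window_forward(cpu_weights, x):
    """Run layers in order while prefetching the next one in the background."""
    if not cpu_weights:
        return x
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(load_layer, cpu_weights, 0)
        for i in range(len(cpu_weights)):
            current = pending.result()  # wait until layer i has arrived
            if i + 1 < len(cpu_weights):
                # Start fetching layer i + 1 before computing layer i.
                pending = pool.submit(load_layer, cpu_weights, i + 1)
            x = apply_layer(current, x)  # compute overlaps with the prefetch
    return x
```

A real system would also stream gradients and optimizer state in the other direction during the backward pass; that part is omitted here.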

The results show that SlideFormer achieves higher throughput and reduced memory usage compared to baselines. Specifically, it supports up to 8 times larger batch sizes and 6 times larger models, and achieves 1.40 to 6.27 times higher throughput while roughly halving CPU and GPU memory usage. The system sustains over 95 percent peak performance on both NVIDIA and AMD GPUs, demonstrating its effectiveness and efficiency. Overall, SlideFormer enables the fine-tuning of large language models on single GPUs, making it a significant contribution to the field of natural language processing.


📅 Published on Mar 17

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2603.16428
• PDF: https://arxiv.org/pdf/2603.16428

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#HeterogeneousCoDesign #GPUMemoryOptimization #LanguageModelFineTuning #SingleGPUComputing #AsynchronousProcessingTechniques

🔥 BlockPilot: Instance-Adaptive Policy Learning for Diffusion-based Speculative Decoding

💡 The paper introduces BlockPilot, a method for making speculative decoding more efficient. In speculative decoding, a lightweight model drafts candidate tokens in parallel and a target model then verifies them. Existing methods decode with a fixed block size, which is suboptimal because the best block size varies from one input sample to another. The authors show that block size is critical to speculative decoding performance and that the optimal value has local structure: it tends to concentrate around the block size used in training.

To address this issue, the authors propose a sample-adaptive policy that predicts the optimal block size from the prefilling representation. This is done by formulating block size selection as a lightweight policy learning problem, where the optimal block size is predicted based on the representation of the prefilling stage. The prediction is performed only once after prefilling, allowing for seamless integration with existing models.
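
As a rough illustration of a one-shot block size policy, the sketch below scores a small set of candidate block sizes with a linear head over a prefill feature vector and picks the best one. The linear head, the feature dimension, and the candidate list are assumptions for illustration; the summary does not specify the architecture BlockPilot uses.

```python
def choose_block_size(prefill_feature, weights, biases, candidates):
    """Pick one block size per sample from a prefill representation.

    weights[k] and biases[k] score candidates[k]; all values are hypothetical.
    """
    scores = [
        sum(w * f for w, f in zip(row, prefill_feature)) + b
        for row, b in zip(weights, biases)
    ]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]


# Hypothetical setup: three candidate sizes near an assumed training block size of 16.
CANDIDATES = [8, 16, 32]
WEIGHTS = [[0.1, 0.9], [0.5, 0.2], [0.3, 0.3]]
BIASES = [0.0, 0.0, 0.0]
```

Because the function runs once per sample after prefilling, its cost is a single small matrix-vector product, which is consistent with the minimal overhead described above.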

The authors evaluate their method on several benchmarks and demonstrate that it is plug-and-play, introduces minimal overhead, and consistently improves efficiency. The results show that BlockPilot achieves an acceptance length of 5.92 and a 4.20 times speedup on a specific model, indicating that it can significantly accelerate inference while maintaining accuracy. Overall, the paper contributes to the development of more efficient and adaptive speculative decoding methods, which can be useful for a wide range of natural language processing applications.


📅 Published on Jun 30

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.31315
• PDF: https://arxiv.org/pdf/2606.31315

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#InstanceAdaptivePolicyLearning #DiffusionBasedSpeculativeDecoding #NaturalLanguageProcessing #SpeculativeDecodingTechniques #BlockPilotMethod

🔥 GEAR: Guided End-to-End AutoRegression for Image Synthesis

💡 The paper introduces GEAR, a method for jointly training a vector-quantized tokenizer and an autoregressive generator end-to-end for image synthesis. These models are typically trained in two stages: the tokenizer is trained first and frozen, and the generator is then trained on its output. The limitation of this approach is that the tokenizer has no signal about what the generator finds easy to model.

GEAR overcomes this limitation by training the tokenizer and generator jointly, guided by representation alignment. The key challenge is that the output of the tokenizer is non-differentiable, making it difficult to train the tokenizer and generator jointly. To address this, GEAR uses a dual read-out approach, where the tokenizer output is used in two different ways. A hard, one-hot branch is used to train the autoregressive generator, while a differentiable soft branch is used to carry a representation-alignment loss that guides the tokenizer.
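
The dual read-out can be sketched for a single feature vector in plain Python. This is one common way to obtain a hard and a soft view of the same quantizer (nearest-code argmin versus a softmax over negative distances); the exact formulation in GEAR may differ, and the `temperature` parameter is an assumption.

```python
import math


def dual_readout(z, codebook, temperature=1.0):
    """Return a hard (one-hot) and a soft (softmax) read-out of a VQ lookup."""
    # Squared Euclidean distance from the feature to every codebook entry.
    dists = [sum((a - b) ** 2 for a, b in zip(z, code)) for code in codebook]

    # Hard branch: nearest code as an index and one-hot vector.
    # The argmin has no useful gradient with respect to z.
    hard_index = min(range(len(codebook)), key=dists.__getitem__)
    one_hot = [1.0 if i == hard_index else 0.0 for i in range(len(codebook))]

    # Soft branch: softmax over negative distances, smooth in z.
    logits = [-d / temperature for d in dists]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    norm = sum(exps)
    soft = [e / norm for e in exps]

    # Soft embedding: probability-weighted mix of codebook entries.
    soft_embedding = [
        sum(p * code[j] for p, code in zip(soft, codebook)) for j in range(len(z))
    ]
    return hard_index, one_hot, soft, soft_embedding
```

In a training loop the one-hot indices would feed the autoregressive generator's next-token loss, while a loss on the soft branch would send gradients back into the tokenizer.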

This approach allows the autoregressive generator to steer the tokenizer towards an index distribution that it can predict more easily. As a result, the tokenizer's features become less complex, while the autoregressive generator's features become more complex and semantic. The paper demonstrates that GEAR speeds up convergence by up to 10 times relative to a strong baseline, and learns better patch-level and spatially-coherent features. Additionally, GEAR generalizes across different quantizers and can be applied to text-to-image generation. Overall, GEAR provides a new approach for training visual generative models, and achieves state-of-the-art results in image synthesis.


📅 Published on Jun 30

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.32039
• PDF: https://arxiv.org/pdf/2606.32039
• Project Page: https://linb203.github.io/gear

🤖 Models citing this paper:
• https://huggingface.co/BinLin203/Warmup-LFQ
• https://huggingface.co/BinLin203/Warmup-IBQ
• https://huggingface.co/BinLin203/GEAR-VQ

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#ImageSynthesis #AutoRegression #VectorQuantization #EndToEndLearning #AutoregressiveGenerators

🔥 Fast and Faithful: Real-Time Verification for Long-Document Retrieval-Augmented Generation Systems

💡 The paper presents a real-time verification system for retrieval-augmented generation (RAG) that processes long documents while balancing latency constraints against comprehensive answer validation. Verifying generated answers in RAG systems is difficult because source materials are large and interactive services must respond quickly. Large language models can check long contexts but are too slow and costly, while lightweight classifiers operate within strict context limits and frequently miss evidence outside truncated passages.

The method proposed is a real-time verification component integrated into a production retrieval-augmented generation pipeline that enables full-document grounding under latency constraints. The system can process documents up to 32K tokens and employs adaptive inference strategies to balance response time and verification coverage across workloads.

The results show that full-context verification substantially improves detection of unsupported responses compared with truncated validation. The evaluation methodology used to deploy the verifier highlights the importance of long-context verification, the limitations of chunk-based checking in real documents, and the impact of latency budgets on model design. The findings provide practical guidance for practitioners building reliable large-scale retrieval-augmented applications, demonstrating that the proposed system can effectively verify generated answers in real-time while maintaining comprehensive coverage of the source materials.


📅 Published on Mar 4

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2603.23508
• PDF: https://arxiv.org/pdf/2603.23508

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#RealTimeVerification #RetrievalAugmentedGeneration #LongDocumentProcessing #AnswerValidationSystems #LatencyConstrainedVerification

🔥 AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

💡 The paper presents AReaL, a large-scale asynchronous reinforcement learning system for training large language models on reasoning tasks. Existing synchronous systems alternate between generation and training in a batch setting, which causes severe system-level inefficiency and GPU underutilization: generation must wait for the longest output in the batch to finish before the model can be updated.

To address this issue, AReaL decouples generation from training, allowing rollout workers to continuously generate new outputs without waiting, while training workers update the model whenever a batch of data is collected. This asynchronous approach leads to substantially higher GPU utilization. To stabilize reinforcement learning training, AReaL balances the workload of rollout and training workers to control data staleness and adopts a staleness-enhanced PPO variant to better handle outdated training samples.
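
One plausible way to bound data staleness is an admission gate on new rollouts, sketched below. The class name, the counting rule, and the `max_staleness` parameter are assumptions for illustration and are not taken from the AReaL code. A rollout may start only if the training batch it would join is at most `max_staleness` updates ahead of the current policy version.

```python
class StalenessGate:
    """Admission control that keeps rollouts within a staleness bound (sketch)."""

    def __init__(self, batch_size, max_staleness):
        self.batch_size = batch_size
        self.max_staleness = max_staleness
        self.version = 0    # number of completed training updates
        self.submitted = 0  # rollouts started so far

    def try_start_rollout(self):
        # Index of the training batch this rollout would end up in.
        target_batch = self.submitted // self.batch_size
        if target_batch > self.version + self.max_staleness:
            return False  # too far ahead of the trainer; the worker should wait
        self.submitted += 1
        return True

    def finish_update(self):
        # Called by the trainer after each model update.
        self.version += 1
```

With `max_staleness=0` the gate degenerates to synchronous training, while larger values let rollout workers run further ahead at the cost of older samples.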

The results show that AReaL achieves up to 2.57 times training speedup compared to the best synchronous systems with the same number of GPUs, while matching or even improving final performance. The system was tested on math and code reasoning benchmarks, demonstrating the effectiveness of the asynchronous approach. The code for AReaL is made available, allowing others to build upon and utilize the system. Overall, AReaL provides a more efficient and scalable solution for training large language models on reasoning tasks using reinforcement learning.


📅 Published on May 30, 2025

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2505.24298
• PDF: https://arxiv.org/pdf/2505.24298

🤖 Models citing this paper:
• https://huggingface.co/inclusionAI/AReaL-boba-2-8B
• https://huggingface.co/inclusionAI/AReaL-boba-2-14B
• https://huggingface.co/inclusionAI/AReaL-boba-2-8B-Open

📊 Datasets citing this paper:
• https://huggingface.co/datasets/inclusionAI/AReaL-tau2-data

🚀 Spaces citing this paper:
• https://huggingface.co/spaces/rzvn/Medieval-Village-AI

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#AsynchronousReinforcementLearning #LanguageReasoningTasks #LargeScaleLanguageModels #ReinforcementLearningSystems #DeepLearningForNaturalLanguageProcessing

🔥 Perceive-to-Reason: Decoupling Perception and Reasoning for Fine-Grained Visual Reasoning

💡 The paper introduces Perceive-to-Reason, a unified framework that improves fine-grained visual reasoning on high-resolution images. The task is challenging for vision-language models, especially when small but critical visual cues are buried in a high-resolution image. Existing approaches typically do not explicitly distinguish perception from reasoning and instead rely on repeated cropping or test-time visual search to introduce local evidence.

The Perceive-to-Reason framework addresses this limitation by formulating fine-grained visual reasoning as a two-stage process. In the first stage, the model localizes question-relevant evidence as a Perceiver, and in the second stage, it answers the question as a Reasoner based on the annotated image and cropped regions. To train the model, the authors introduce a role-aware reinforcement learning strategy called Perception-Reasoning Alternating GRPO, which alternates between perception-focused and reasoning-focused updates using only final-answer supervision.
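
The two ingredients named above can be sketched separately. The group-relative advantage is the standard GRPO normalization of rewards within a group of sampled responses; the even/odd alternation in `role_for_step` is only an assumed schedule, since the summary does not say how the alternation is arranged.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize final-answer rewards within one group of sampled responses."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; some implementations differ
    return [(r - mean) / (std + eps) for r in rewards]


def role_for_step(step):
    # Assumed schedule: even steps update the Perceiver, odd steps the Reasoner.
    return "perceiver" if step % 2 == 0 else "reasoner"
```

Because the reward is computed from the final answer only, the same advantages can drive either role's update; the schedule just decides which role's tokens receive the gradient on a given step.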

The Perceive-to-Reason framework is built on top of existing vision-language models, and it consistently improves performance across model scales. The results show that the Perceive-to-Reason framework achieves state-of-the-art performance on several benchmarks, including V-Star, HR-Bench-4K, and HR-Bench-8K. Specifically, the P2R-4B model achieves 93.2 percent on V-Star, 81.9 percent on HR-Bench-4K, and 80.5 percent on HR-Bench-8K, substantially outperforming its corresponding backbone.

The benefits of the Perceive-to-Reason framework extend beyond high-resolution benchmarks to broader multimodal reasoning tasks. The results suggest that explicitly decoupling perception from reasoning provides an effective framework for fine-grained visual reasoning. Overall, the paper contributes a novel framework for fine-grained visual reasoning that improves performance on high-resolution images and has broader implications for multimodal reasoning tasks.


📅 Published on Jul 1

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2607.01191
• PDF: https://arxiv.org/pdf/2607.01191

🤖 Models citing this paper:
• https://huggingface.co/hongxingli/P2R-4B
• https://huggingface.co/hongxingli/P2R-2B
• https://huggingface.co/hongxingli/P2R-8B

📊 Datasets citing this paper:
• https://huggingface.co/datasets/hongxingli/P2R-10k

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#FineGrainedVisualReasoning #VisualReasoningModels #PerceptionAndReasoning #HighResolutionImageAnalysis #VisionLanguageModels

🔥 Representation Distribution Matching for One-Step Visual Generation

💡 The paper introduces Representation Distribution Matching, a method for one-step visual generation that matches feature distributions under pretrained encoders. The generator is trained by comparing the distributions of generated and reference features. The authors identify two key design axes: how the distributions are compared, and which representations they are compared in. Controlled studies along these axes yield three main findings.

First, they show that the Maximum Mean Discrepancy, a classical method that was previously ineffective, becomes a strong and scalable objective when estimated correctly. Second, they find that the batch size of the generated images has a significant impact on performance, with an optimum batch size above 2048, which is much larger than typical batch sizes. Third, they demonstrate that using a single representation can be gamed, resulting in low scores despite visibly fake images, and instead propose using a balanced set of encoders and evaluating with a Sliced-Wasserstein distance over 14 encoders.
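
For reference, the textbook unbiased estimator of squared Maximum Mean Discrepancy with an RBF kernel can be written in a few lines of plain Python. This is the standard estimator, not necessarily the one the authors mean by "estimated correctly", and the kernel choice and `bandwidth` value are assumptions.

```python
import math


def rbf(a, b, bandwidth=1.0):
    # Gaussian (RBF) kernel between two feature vectors.
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD between two samples (needs >= 2 points each)."""
    m, n = len(xs), len(ys)
    # Within-sample terms skip i == j, which is what removes the bias.
    kxx = sum(
        rbf(xs[i], xs[j], bandwidth) for i in range(m) for j in range(m) if i != j
    ) / (m * (m - 1))
    kyy = sum(
        rbf(ys[i], ys[j], bandwidth) for i in range(n) for j in range(n) if i != j
    ) / (n * (n - 1))
    kxy = sum(rbf(x, y, bandwidth) for x in xs for y in ys) / (m * n)
    return kxx + kyy - 2.0 * kxy
```

The estimate can be slightly negative when the two samples come from nearly the same distribution, and its variance shrinks as the sample grows, which is one reason large generated batches can matter for an objective of this kind.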

The authors combine these findings to develop an improved Representation Distribution Matching method, which they call iRDM. They evaluate iRDM on the ImageNet dataset and achieve state-of-the-art results, with a Sliced-Wasserstein distance of 1.30. Additionally, they use a human-preference proxy, called PickScore, which shows that iRDM is preferred over the previous best one-step generator on 71.2% of matched samples. They also apply the same method to post-train a four-step generator, called FLUX.2, and achieve better results than the original four-step version, with improved performance on GenEval and PickScore, and requiring only 90 GPU-hours. Overall, the paper presents a new method for one-step visual generation that achieves state-of-the-art results and can be used to improve existing generators.


📅 Published on Jul 2

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2607.02375
• PDF: https://arxiv.org/pdf/2607.02375
• Project Page: https://alan-lanfeng.github.io/rdm/

🤖 Models citing this paper:
• https://huggingface.co/epfl-vita/flux2-klein-1step-rdm

🚀 Spaces citing this paper:
• https://huggingface.co/spaces/epfl-vita/flux2-klein-1step-demo

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#VisualGeneration #RepresentationLearning #DistributionMatching #ImageSynthesis #DeepLearning

🔥 AgenticSTS: A Bounded-Memory Testbed for Long-Horizon LLM Agents

💡 The paper introduces AgenticSTS, a testbed for studying long-horizon LLM agents. Current agents typically append past observations and reflections to every prompt, which makes it hard to isolate the effect of a single memory component. The authors instead propose a bounded contract: every decision is made from a fresh user message assembled by typed retrieval, with no raw cross-decision transcript appended. This allows memory components to be analyzed in isolation.

The method involves instantiating this contract in a closed-rule stochastic deck-building game, where runs require hundreds of tactical and strategic decisions. The authors create a testbed, called AgenticSTS, which includes a reproducible environment, frozen memory and skill snapshots, prompt records, and analysis scripts. This testbed allows for the study of how explicit memory layers shape long-horizon LLM-agent decisions.
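
The bounded contract can be sketched as a function that builds each decision's prompt from scratch. Everything here is hypothetical: the `MemoryEntry` fields, the keyword-based trigger, and the layer names `"skill"` and `"episode"` are illustrative stand-ins, not the testbed's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    kind: str     # memory layer this entry belongs to (hypothetical names)
    trigger: str  # keyword that makes the entry relevant to a state
    text: str


def build_user_message(state, store, enabled_kinds, per_kind_limit=2):
    """Assemble a fresh prompt from the current state and typed retrieval only."""
    lines = [f"STATE: {state}"]
    for kind in enabled_kinds:
        # Retrieve entries of this type whose trigger appears in the state.
        hits = [e for e in store if e.kind == kind and e.trigger in state]
        for entry in hits[:per_kind_limit]:
            lines.append(f"{kind.upper()}: {entry.text}")
    # No earlier prompts or responses are appended, so the size stays bounded.
    return "\n".join(lines)
```

Dropping a layer from `enabled_kinds` removes exactly that layer's contribution and nothing else, which is what makes single-component ablations clean under this contract.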

The results show a directional benefit from the skill layer: in a fixed-A0 ablation, the largest observed difference appears when triggered strategic skills are enabled. The no-store baseline wins 3 out of 10 games, while adding the skill layer wins 6 out of 10. The comparison is directional rather than statistically decisive. The authors also release a public online benchmark of frontier LLMs on the same game, which reports zero wins at the lowest difficulty across five configurations, highlighting the difficulty of the task. Overall, the paper contributes a methodology for studying how explicit memory layers affect long-horizon LLM agents.


📅 Published on Jul 2

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2607.02255
• PDF: https://arxiv.org/pdf/2607.02255
• Project Page: https://alayalab.github.io/AgenticSTS/

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.iss.one/PaperNexus

#AgenticSTS #LongHorizonLLMAgents #BoundedMemoryTestbed #LargeLanguageModelAgents #LLMMemoryComponents