✨Accelerating Speculative Decoding with Block Diffusion Draft Trees
📝 Summary:
DDTree enhances speculative decoding by constructing draft trees from block diffusion drafter distributions. It efficiently verifies multiple trajectories in parallel in a single target model pass, improving performance.
🔹 Publication Date: Published on Apr 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12989
• PDF: https://arxiv.org/pdf/2604.12989
• Project Page: https://liranringel.github.io/ddtree
• Github: https://github.com/liranringel/ddtree
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#SpeculativeDecoding #BlockDiffusion #LLMAcceleration #DeepLearning #AIResearch
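To make the idea of verifying a draft tree concrete, here is a minimal sketch of greedy tree verification in general. It is not DDTree's algorithm: the nested-dict `Tree` layout, the `verify_tree` helper, and the toy target model are all hypothetical, and a real system would score every tree node in one batched target pass rather than calling the target once per step as done here for readability.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical draft-tree layout: each key is a drafted token,
# each value is the subtree of tokens drafted after it.
Tree = Dict[int, "Tree"]


def verify_tree(
    prefix: Sequence[int],
    tree: Tree,
    target_next: Callable[[List[int]], int],
) -> List[int]:
    """Greedy verification: follow the branch the target model agrees with.

    At each depth the target's own next token is kept. If a drafted child
    matches it, we descend into that child; otherwise we stop. The result
    therefore always contains at least one token.
    """
    accepted: List[int] = []
    node = tree
    while True:
        want = target_next(list(prefix) + accepted)
        accepted.append(want)      # the target's token is always valid output
        if want not in node:       # no drafted branch matched (or tree ended)
            return accepted
        node = node[want]          # descend into the matching branch


def toy_target(tokens: List[int]) -> int:
    """Stand-in for a target model: next token is (last token + 1) mod 10."""
    return (tokens[-1] + 1) % 10


# Two drafted trajectories share the first token 2; a third starts with 7.
draft = {2: {3: {9: {}}, 5: {}}, 7: {}}
result = verify_tree([1], draft, toy_target)
print(result)
```

With this toy target, the branch 2 → 3 is accepted and the target then supplies one further token, so three tokens are produced for what would be a single verification round in a batched implementation.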
✨When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation
📝 Summary:
Reasoning-enhanced LLMs can over-optimize, making them better problem solvers but poor simulators of diverse, boundedly rational behavior. This solver-sampler mismatch means high model capability hurts simulation fidelity. Bounded reflection improves realism.
🔹 Publication Date: Published on Apr 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11840
• PDF: https://arxiv.org/pdf/2604.11840
• Project Page: https://www.sandric.co
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #MultiAgentSystems #BehavioralSimulation #AI #AgentBasedModeling
✨Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
📝 Summary:
This paper introduces the Turing Test on Screen to address the detectability of GUI agents by digital platforms. It proposes a benchmark and methods to humanize agent behavior, balancing imitability with task performance, enabling seamless coexistence in adversarial digital environments.
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09574
• PDF: https://arxiv.org/pdf/2604.09574
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#TuringTest #GUIAgents #AIHumanization #MobileAI #AISecurity
✨SpotSound: Enhancing Large Audio-Language Models with Fine-Grained Temporal Grounding
📝 Summary:
SpotSound improves audio language models for precise temporal grounding in long, noisy audio. It uses a novel training objective to suppress false timestamps, addressing sparse events in challenging backgrounds. SpotSound achieves state-of-the-art performance on temporal grounding benchmarks.
🔹 Publication Date: Published on Apr 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.13023
• PDF: https://arxiv.org/pdf/2604.13023
• Project Page: https://loiesun.github.io/spotsound/
• Github: https://github.com/LoieSun/SpotSound
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AudioLanguageModels #TemporalGrounding #AIResearch #MachineLearning #AudioProcessing
✨Domain-Specific Latent Representations Improve the Fidelity of Diffusion-Based Medical Image Super-Resolution
📝 Summary:
Domain-specific autoencoders significantly enhance medical image super-resolution. Replacing generic VAEs improves fidelity, showing autoencoder choice is key, not the diffusion architecture. Autoencoder performance predicts overall SR quality.
🔹 Publication Date: Published on Apr 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12152
• PDF: https://arxiv.org/pdf/2604.12152
• Github: https://github.com/sebasmos/latent-sr
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MedicalImaging #SuperResolution #DiffusionModels #DeepLearning #Autoencoders
✨3DTV: A Feedforward Interpolation Network for Real-Time View Synthesis
📝 Summary:
3DTV is a feedforward network combining lightweight geometry and learning for real-time, robust sparse-view interpolation. It generates novel views efficiently without scene-specific optimization, making it practical for interactive applications.
🔹 Publication Date: Published on Apr 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11211
• PDF: https://arxiv.org/pdf/2604.11211
• Project Page: https://stefanmschulz.github.io/3DTV_webpage/
• Github: https://github.com/StefanMSchulz/3DTV
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ViewSynthesis #DeepLearning #ComputerVision #NeuralNetworks #RealTimeAI
✨BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation
📝 Summary:
Lexical LLM evaluation is rigid and inaccurate, while LLM-as-a-Judge is expensive. This paper introduces BERT-as-a-Judge, a robust, scalable encoder-driven method for reference-based LLM evaluation. It performs like larger LLM judges but with lower cost.
🔹 Publication Date: Published on Apr 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09497
• PDF: https://arxiv.org/pdf/2604.09497
• Github: https://github.com/artefactory/BERT-as-a-Judge
🔹 Models citing this paper:
• https://huggingface.co/artefactory/BERTJudge
• https://huggingface.co/artefactory/BERTJudge-Formatted-QCR-500k
• https://huggingface.co/artefactory/BERTJudge-Formatted-QCR-OOD
✨ Datasets citing this paper:
• https://huggingface.co/datasets/artefactory/BERTJudge-Dataset
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMEvaluation #BERT #NLP #AIResearch #MachineLearning
✨Spatial Competence Benchmark
📝 Summary:
Three frontier models show declining accuracy on a new spatial competence benchmark, with performance saturating quickly under token budget constraints.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09594
• PDF: https://arxiv.org/pdf/2604.09594
• Github: https://github.com/ashleyharris-maptek-com-au/SpatialCompetenceBenchmark
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts
📝 Summary:
Current OCR models generalize poorly across diverse scripts. GlotOCR Bench, a new benchmark covering over 100 Unicode scripts, reveals that most models perform well on fewer than ten scripts. Generalization is limited and strongly depends on pretraining coverage.
🔹 Publication Date: Published on Apr 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12978
• PDF: https://arxiv.org/pdf/2604.12978
• Project Page: https://huggingface.co/datasets/cis-lmu/GlotOCR-bench
• Github: https://github.com/cisnlp/glotocr-bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#OCR #NLP #MultilingualAI #Benchmarking #AIResearch
✨LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
📝 Summary:
Language-Agnostic Semantic Alignment (LASA) addresses LLM safety gaps across languages by targeting semantic bottlenecks where representations are primarily driven by shared semantics rather than lang...
🔹 Publication Date: Published on Apr 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12710
• PDF: https://arxiv.org/pdf/2604.12710
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨PokeRL: Reinforcement Learning for Pokemon Red
📝 Summary:
PokeRL presents a modular reinforcement learning system with environment wrapping, anti-loop mechanisms, and hierarchical rewards to train agents for early-game Pokemon Red tasks.
🔹 Publication Date: Published on Apr 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.10812
• PDF: https://arxiv.org/pdf/2604.10812
• Github: https://github.com/reddheeraj/PokemonRL
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Learning Versatile Humanoid Manipulation with Touch Dreaming
📝 Summary:
A multimodal Transformer architecture that integrates tactile sensing with visual and proprioceptive data enables high-dexterity humanoid manipulation through contact-aware learning and predictive mod...
🔹 Publication Date: Published on Apr 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.13015
• PDF: https://arxiv.org/pdf/2604.13015
• Project Page: https://humanoid-touch-dream.github.io/
• Github: https://github.com/chrisyrniu/humanoid-touch-dream
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Do Thought Streams Matter? Evaluating Reasoning in Gemini Vision-Language Models for Video Scene Understanding
📝 Summary:
This research examines how internal reasoning traces affect video scene understanding in Gemini models. Quality improvements from extended reasoning plateau quickly, with Flash Lite offering the best balance. Under tight reasoning budgets, outputs can include content that was never reasoned about.
🔹 Publication Date: Published on Apr 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11177
• PDF: https://arxiv.org/pdf/2604.11177
• Project Page: https://github.com/video-db/gemini-reasoning-eval
• Github: https://github.com/video-db/gemini-reasoning-eval
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Parcae: Scaling Laws For Stable Looped Language Models
📝 Summary:
Looped architectures can improve model quality but suffer from instability. Parcae, a new stable looped architecture, addresses this by constraining spectral norms. It achieves up to 6.3% lower perplexity and shows superior scaling properties, matching the quality of much larger Transformers.
🔹 Publication Date: Published on Apr 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12946
• PDF: https://arxiv.org/pdf/2604.12946
• Project Page: https://sandyresearch.github.io/parcae/
• Github: https://github.com/sandyresearch/parcae/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
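As a generic illustration of what constraining a spectral norm means, the sketch below estimates the largest singular value of a weight matrix by power iteration and rescales the matrix when that value exceeds a cap. This is not Parcae's mechanism, which the summary does not describe; the function names and the cap of 1.0 are assumptions chosen for the example. The intuition is that a looped model applies the same map repeatedly, and a map whose spectral norm is at most 1 cannot amplify its input from one loop to the next.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(a: Matrix, x: List[float]) -> List[float]:
    """Multiply matrix a by vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def spectral_norm(a: Matrix, iters: int = 100) -> float:
    """Estimate the largest singular value of a via power iteration on a^T a."""
    at = [list(col) for col in zip(*a)]      # transpose of a
    v = [1.0] * len(a[0])                    # arbitrary nonzero start vector
    for _ in range(iters):
        w = matvec(at, matvec(a, v))
        n = math.sqrt(sum(x * x for x in w))
        if n == 0.0:                         # a maps v to zero; norm estimate is 0
            return 0.0
        v = [x / n for x in w]
    av = matvec(a, v)
    return math.sqrt(sum(x * x for x in av))


def clip_spectral_norm(a: Matrix, max_norm: float = 1.0) -> Matrix:
    """Return a copy of a, rescaled so its spectral norm is at most max_norm."""
    s = spectral_norm(a)
    if s <= max_norm:
        return [row[:] for row in a]
    scale = max_norm / s
    return [[x * scale for x in row] for row in a]


weights = [[3.0, 0.0], [0.0, 1.0]]           # largest singular value is 3
clipped = clip_spectral_norm(weights, max_norm=1.0)
print(spectral_norm(weights), spectral_norm(clipped))
```

Rescaling the whole matrix is the simplest possible constraint; it shrinks every singular value by the same factor, which is a design choice made here for brevity rather than something taken from the paper.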
✨The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents
📝 Summary:
Computer-use agents face significant safety vulnerabilities under unintended attack conditions where benign instructions lead to harmful outcomes through contextual or execution-based risks, with atta...
🔹 Publication Date: Published on Apr 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.10577
• PDF: https://arxiv.org/pdf/2604.10577
• Project Page: https://limenlp.github.io/OS_Blind/
• Github: https://github.com/limenlp/OS_Blind
✨ Datasets citing this paper:
• https://huggingface.co/datasets/lime-nlp/OS-Blind
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Seedance 2.0: Advancing Video Generation for World Complexity
📝 Summary:
Seedance 2.0 is a new multi-modal audio-video generation model supporting text, image, audio, and video inputs. It offers improved generation quality and speed through a unified architecture, performing on par with leading models. It generates 4-15 second content at 480p/720p.
🔹 Publication Date: Published on Apr 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14148
• PDF: https://arxiv.org/pdf/2604.14148
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
📝 Summary:
A multi-agent system automates the complete lifecycle of large language model training by coordinating research and execution modules through iterative planning and experimentation.
🔹 Publication Date: Published on Apr 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14116
• PDF: https://arxiv.org/pdf/2604.14116
• Project Page: https://github.com/trex-project
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models
📝 Summary:
OccuBench presents a comprehensive benchmark for evaluating AI agents across 100 professional domains using Language World Models to simulate real-world environments with controlled fault injection.
🔹 Publication Date: Published on Apr 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.10866
• PDF: https://arxiv.org/pdf/2604.10866
• Project Page: https://gregxmhu.github.io/OccuBench-website/
• Github: https://github.com/GregxmHu/OccuBench
✨ Datasets citing this paper:
• https://huggingface.co/datasets/gregH/OccuBench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding
📝 Summary:
UI-Zoomer is a training-free adaptive zoom-in framework for GUI grounding that improves localization accuracy by selectively triggering zoom-in based on prediction uncertainty quantification.
🔹 Publication Date: Published on Apr 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14113
• PDF: https://arxiv.org/pdf/2604.14113
• Project Page: https://zju-real.github.io/UI-Zoomer/
• Github: https://github.com/ZJU-REAL/UI-Zoomer
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ROSE: Retrieval-Oriented Segmentation Enhancement
📝 Summary:
A new segmentation task focusing on novel and emerging entities is introduced along with a retrieval-augmented framework that enhances multimodal language models with real-time information and visual ...
🔹 Publication Date: Published on Apr 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14147
• PDF: https://arxiv.org/pdf/2604.14147
• Project Page: https://henghuiding.com/ROSE/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research