✨HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam
📝 Summary:
HLE-Verified systematically validates and revises the HLE benchmark, resolving noisy items through expert review and model-based checks. This improves language model evaluation accuracy by 7-10 percentage points, especially on erroneous items, enabling more reliable measurement of model capabilit...
🔹 Publication Date: Published on Feb 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.13964
• PDF: https://arxiv.org/pdf/2602.13964
✨ Datasets citing this paper:
• https://huggingface.co/datasets/skylenage/HLE-Verified
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMEvaluation #Benchmarking #LanguageModels #AIResearch #NLP
✨Panini: Continual Learning in Token Space via Structured Memory
📝 Summary:
Panini is a continual learning framework storing knowledge in generative semantic workspaces to improve language model reasoning. It achieves 5-7 percent better performance using far fewer tokens and reduces unsupported answers for efficient, accurate retrieval.
🔹 Publication Date: Published on Feb 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15156
• PDF: https://arxiv.org/pdf/2602.15156
• Github: https://github.com/roychowdhuryresearch/gsw-memory
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨The Vision Wormhole: Latent-Space Communication in Heterogeneous Multi-Agent Systems
📝 Summary:
A Vision Wormhole framework enables efficient, model-agnostic communication in multi-agent systems by using visual-language models to transfer reasoning states through a shared latent space, reducing ...
🔹 Publication Date: Published on Feb 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15382
• PDF: https://arxiv.org/pdf/2602.15382
• Github: https://github.com/xz-liu/heterogeneous-latent-mas
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨How Much Reasoning Do Retrieval-Augmented Models Add beyond LLMs? A Benchmarking Framework for Multi-Hop Inference over Hybrid Knowledge
📝 Summary:
HybridRAG-Bench evaluates models' multi-hop reasoning over hybrid knowledge. It uses recent scientific literature to build contamination-aware benchmarks that distinguish genuine retrieval and reasoning from parametric recall.
🔹 Publication Date: Published on Feb 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10210
• PDF: https://arxiv.org/pdf/2602.10210
• Project Page: https://junhongmit.github.io/HybridRAG-Bench/
• Github: https://github.com/junhongmit/HybridRAG-Bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing
📝 Summary:
ARTEMIS, a multi-agent framework, outperforms human cybersecurity professionals in vulnerability discovery and submission quality in an enterprise environment.
🔹 Publication Date: Published on Dec 10, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.09882
• PDF: https://arxiv.org/pdf/2512.09882
• Project Page: https://trinity.cs.stanford.edu
• Github: https://github.com/Stanford-Trinity/ARTEMIS
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨A decoder-only foundation model for time-series forecasting
📝 Summary:
A decoder-only foundation model is developed for time-series forecasting. Pretrained on a large corpus, this patched-decoder attention model delivers near state-of-the-art zero-shot performance across diverse datasets, time scales, and granularities.
🔹 Publication Date: Published on Oct 14, 2023
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2310.10688
• PDF: https://arxiv.org/pdf/2310.10688
• Github: https://github.com/google-research/timesfm
🔹 Models citing this paper:
• https://huggingface.co/google/timesfm-1.0-200m
• https://huggingface.co/google/timesfm-2.0-500m-pytorch
• https://huggingface.co/google/timesfm-2.5-200m-pytorch
✨ Spaces citing this paper:
• https://huggingface.co/spaces/autogluon/fev-bench
• https://huggingface.co/spaces/JayLacoma/Trader_Technical_Indicators
• https://huggingface.co/spaces/pavel321/huggingface-cli-completion
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SAM 3D Body: Robust Full-Body Human Mesh Recovery
📝 Summary:
A promptable 3D human mesh recovery model using a novel parametric representation and encoder-decoder architecture achieves state-of-the-art performance with strong generalization across diverse condi...
🔹 Publication Date: Published on Feb 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15989
• PDF: https://arxiv.org/pdf/2602.15989
• Project Page: https://ai.meta.com/research/sam3d/
• Github: https://github.com/facebookresearch/sam-3d-body
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation
📝 Summary:
HERO enables humanoid robots to perform object manipulation in diverse real-world environments. It combines accurate end-effector control, trained in simulation, with open-vocabulary vision for generalization, reducing tracking error by 3.2x.
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16705
• PDF: https://arxiv.org/pdf/2602.16705
• Project Page: https://hero-humanoid.github.io/
• Github: https://hero-humanoid.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Towards a Science of AI Agent Reliability
📝 Summary:
Traditional benchmark evaluations of AI agents fail to capture critical reliability issues, prompting the development of comprehensive metrics that assess consistency, robustness, predictability, and ...
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16666
• PDF: https://arxiv.org/pdf/2602.16666
• Project Page: https://hal.cs.princeton.edu/reliability
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨World Action Models are Zero-shot Policies
📝 Summary:
DreamZero is a World Action Model that leverages video diffusion to enable better generalization of physical motions across novel environments and embodiments compared to vision-language-action models...
🔹 Publication Date: Published on Feb 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15922
• PDF: https://arxiv.org/pdf/2602.15922
• Project Page: https://dreamzero0.github.io/
• Github: https://github.com/dreamzero0/dreamzero
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Multi-agent cooperation through in-context co-player inference
📝 Summary:
Sequence models enable cooperative behavior to emerge in multi-agent reinforcement learning through in-context learning, without hardcoded assumptions or timescale separation.
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16301
• PDF: https://arxiv.org/pdf/2602.16301
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SLA2: Sparse-Linear Attention with Learnable Routing and QAT
📝 Summary:
SLA2 improves sparse-linear attention in diffusion models by introducing a learnable router, direct attention formulation, and quantization-aware fine-tuning for enhanced efficiency and quality.
🔹 Publication Date: Published on Feb 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12675
• PDF: https://arxiv.org/pdf/2602.12675
• Project Page: https://github.com/thu-ml/SLA
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Learning Situated Awareness in the Real World
📝 Summary:
SAW-Bench presents a new benchmark for evaluating egocentric situated awareness in multimodal foundation models through real-world video datasets with human-annotated question-answer pairs, focusing o...
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16682
• PDF: https://arxiv.org/pdf/2602.16682
• Project Page: https://sawbench.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MMA: Multimodal Memory Agent
📝 Summary:
Multimodal Memory Agent (MMA) improves long-horizon agent performance by dynamically scoring memory reliability and handling visual biases in retrieval-augmented systems.
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16493
• PDF: https://arxiv.org/pdf/2602.16493
• Github: https://github.com/AIGeeksGroup/MMA
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨RynnBrain: Open Embodied Foundation Models
📝 Summary:
RynnBrain is an open-source spatiotemporal foundation model for embodied intelligence that unifies perception, reasoning, and planning capabilities across multiple scales and task-specific variants.
🔹 Publication Date: Published on Feb 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.14979
• PDF: https://arxiv.org/pdf/2602.14979
• Project Page: https://alibaba-damo-academy.github.io/RynnBrain.github.io/
• Github: https://github.com/alibaba-damo-academy/RynnBrain
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality
📝 Summary:
LLMs demonstrate near-complete factual encoding but struggle with retrieval accessibility, where errors stem from access limitations rather than knowledge gaps, with reasoning improving recall of enco...
🔹 Publication Date: Published on Feb 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.14080
• PDF: https://arxiv.org/pdf/2602.14080
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Optimizing Few-Step Generation with Adaptive Matching Distillation
📝 Summary:
Adaptive Matching Distillation (AMD) improves generative model training by detecting and escaping unstable optimization regions. It uses reward proxies to correct trajectories, boosting sample fidelity and training robustness across generation tasks.
🔹 Publication Date: Published on Feb 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07345
• PDF: https://arxiv.org/pdf/2602.07345
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨BiManiBench: A Hierarchical Benchmark for Evaluating Bimanual Coordination of Multimodal Large Language Models
📝 Summary:
BiManiBench evaluates multimodal large language models on bimanual robotic tasks, revealing limitations in spatial grounding and control despite strong high-level reasoning capabilities.
🔹 Publication Date: Published on Feb 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08392
• PDF: https://arxiv.org/pdf/2602.08392
• Project Page: https://bimanibench.github.io/
• Github: https://github.com/bimanibench/BiManiBench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Visual Memory Injection Attacks for Multi-Turn Conversations
📝 Summary:
Visual Memory Injection (VMI) covertly manipulates generative vision-language models using images. These images trigger specific manipulative responses only when certain prompts appear in multi-turn conversations, showing that large-scale user manipulation is feasible.
🔹 Publication Date: Published on Feb 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15927
• PDF: https://arxiv.org/pdf/2602.15927
• Github: https://github.com/chs20/visual-memory-injection
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VMI #VisionLanguageModels #AISecurity #AIManipulation #GenerativeAI
✨Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents
📝 Summary:
Agent S2 is a new compositional framework for computer use agents. It uses Mixture-of-Grounding and Proactive Hierarchical Planning to achieve state-of-the-art performance across various benchmarks and operating systems, significantly improving automation.
🔹 Publication Date: Published on Apr 1, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.00906
• PDF: https://arxiv.org/pdf/2504.00906
• Project Page: https://www.simular.ai/articles/agent-s2-technical-review
• Github: https://github.com/simular-ai/Agent-S
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #AIagents #Automation #MachineLearning #ComputerScience