✨Agent S: An Open Agentic Framework that Uses Computers Like a Human
📝 Summary:
Agent S is an open agentic framework enabling autonomous GUI interaction to automate complex tasks. It employs experience-augmented hierarchical planning and an Agent-Computer Interface with MLLMs for enhanced reasoning. Agent S achieves state-of-the-art performance on OSWorld and demonstrates br...
🔹 Publication Date: Published on Oct 10, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2410.08164
• PDF: https://arxiv.org/pdf/2410.08164
• HF Collection: https://huggingface.co/collections/ranpox/awesome-computer-use-agents
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AgenticAI #MultimodalAI #HumanComputerInteraction #Automation #AIResearch
✨Towards Seamless Interaction: Causal Turn-Level Modeling of Interactive 3D Conversational Head Dynamics
📝 Summary:
TIMAR is a new causal framework for 3D conversational head generation. It models dialogue using interleaved audio-visual contexts to predict continuous head dynamics, improving coherence and expressive variability. Experiments show TIMAR significantly reduces errors and improves performance.
🔹 Publication Date: Published on Dec 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15340
• PDF: https://arxiv.org/pdf/2512.15340
• Project Page: https://github.com/CoderChen01/towards-seamleass-interaction/blob/main/README.md
• Github: https://github.com/CoderChen01/towards-seamleass-interaction
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ConversationalAI #3DAnimation #HumanComputerInteraction #CausalModeling #AI
✨Continual GUI Agents
📝 Summary:
The Continual GUI Agents framework addresses performance degradation in dynamic UI environments. It introduces GUI-Anchoring in Flux GUI-AiF, a reinforcement fine-tuning method with novel anchoring rewards that stabilize learning across shifting UI domains and resolutions, outperforming existing ...
🔹 Publication Date: Published on Jan 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20732
• PDF: https://arxiv.org/pdf/2601.20732
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ContinualLearning #ReinforcementLearning #AIAgents #HumanComputerInteraction #MachineLearning
✨Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control
📝 Summary:
This paper introduces a human-centric video world model for extended reality, using tracked head and hand poses for dexterous interaction. This system generates egocentric virtual environments, significantly improving user task performance and perceived control.
🔹 Publication Date: Published on Feb 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.18422
• PDF: https://arxiv.org/pdf/2602.18422
• Project Page: https://codeysun.github.io/generated-reality/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ExtendedReality #VideoGeneration #HumanComputerInteraction #VirtualEnvironments #AIResearch
✨How to Take a Memorable Picture? Empowering Users with Actionable Feedback
📝 Summary:
This paper introduces Memorability Feedback MemFeed, a new task providing actionable natural language guidance to improve photo memorability. Their method, MemCoach, uses MLLMs and a teacher-student strategy, demonstrating that memorability can be taught and instructed.
🔹 Publication Date: Published on Feb 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21877
• PDF: https://arxiv.org/pdf/2602.21877
• Project Page: https://laitifranz.github.io/MemCoach/
• Github: https://laitifranz.github.io/MemCoach/
✨ Datasets citing this paper:
• https://huggingface.co/datasets/laitifranz/MemBench-InternVL3.5-Eval
• https://huggingface.co/datasets/laitifranz/MemBench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#PhotoMemorability #MLLMs #ComputerVision #AIResearch #HumanComputerInteraction
✨InfoPO: Information-Driven Policy Optimization for User-Centric Agents
📝 Summary:
InfoPO optimizes agent-user collaboration for underspecified requests. It uses an information-gain reward to credit valuable turns that reduce uncertainty, improving decision-making and outperforming multi-turn RL baselines.
🔹 Publication Date: Published on Feb 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00656
• PDF: https://arxiv.org/pdf/2603.00656
• Github: https://github.com/kfq20/InfoPO
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ReinforcementLearning #AI #HumanComputerInteraction #InformationTheory #AIagents
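The summary above describes InfoPO's core idea: rewarding dialogue turns by how much they reduce the agent's uncertainty about the user's intent. A toy sketch of such an information-gain reward, assuming a discrete belief distribution over candidate intents (the distributions and function names here are illustrative, not the paper's implementation):

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def info_gain_reward(belief_before, belief_after):
    """Credit a dialogue turn by the entropy it removes from the
    agent's belief over possible user intents (bits of uncertainty
    resolved). Uninformative turns earn ~0 reward."""
    return entropy(belief_before) - entropy(belief_after)

# Before a clarifying question: four intents equally likely (2 bits).
before = [0.25, 0.25, 0.25, 0.25]
# After the user's answer: one intent dominates.
after = [0.85, 0.05, 0.05, 0.05]

reward = info_gain_reward(before, after)
```

Such a per-turn reward can then be fed to any multi-turn RL objective, which is roughly how an information-driven credit assignment would slot into existing policy-optimization pipelines.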
✨MIBURI: Towards Expressive Interactive Gesture Synthesis
📝 Summary:
MIBURI is an online, real-time framework generating expressive full-body gestures and facial expressions for spoken dialogue. It uses body-part aware codecs and LLM embeddings to create natural, diverse, and contextually aligned motions causally, overcoming limitations of prior methods.
🔹 Publication Date: Published on Mar 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03282
• PDF: https://arxiv.org/pdf/2603.03282
• Project Page: https://vcai.mpi-inf.mpg.de/projects/MIBURI/
• Github: https://github.com/m-hamza-mughal/miburi
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#GestureSynthesis #AI #HumanComputerInteraction #NLP #RealtimeTech