Data Science | Machine Learning with Python for Researchers

The Data Science and Python channel is for researchers and advanced programmers

GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using Agents

📝 Summary:
GUI-360 is a large dataset and benchmark for computer-using agents, addressing gaps in real-world tasks and unified evaluation. It contains over 1.2M action steps in Windows apps for GUI grounding, screen parsing, and action prediction. Benchmarking reveals significant shortcomings in current mod...

🔹 Publication Date: Nov 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.04307
• PDF: https://arxiv.org/pdf/2511.04307

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #ComputerAgents #GUIAgents #Dataset #Benchmark
OmniParser for Pure Vision Based GUI Agent

📝 Summary:
OmniParser enhances GPT-4V's ability to act as a GUI agent by improving screen parsing. It identifies interactable icons and understands element semantics using specialized models. This significantly boosts GPT-4V's performance on benchmarks like ScreenSpot, Mind2Web, and AITW.
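To make the idea of screen parsing concrete, here is a minimal sketch of how parsed screen elements might be handed to a vision-language model as a numbered list. This is not OmniParser's actual API or output format: the `ParsedElement` class, its fields, and the `elements_to_prompt` and `click_point` helpers are hypothetical names chosen for illustration, and the element data is made up.

```python
from dataclasses import dataclass


@dataclass
class ParsedElement:
    """One detected screen element (hypothetical schema, for illustration only)."""
    element_id: int
    bbox: tuple[float, float, float, float]  # (x1, y1, x2, y2), normalized to [0, 1]
    interactable: bool
    description: str  # short semantic label, e.g. from a captioning model


def elements_to_prompt(elements: list[ParsedElement]) -> str:
    """Render only the interactable elements as numbered lines a model can refer to."""
    lines = []
    for el in elements:
        if not el.interactable:
            continue
        x1, y1, x2, y2 = el.bbox
        lines.append(
            f"[{el.element_id}] {el.description} @ ({x1:.2f}, {y1:.2f}, {x2:.2f}, {y2:.2f})"
        )
    return "\n".join(lines)


def click_point(el: ParsedElement) -> tuple[float, float]:
    """Return the center of the element's bounding box as a click target."""
    x1, y1, x2, y2 = el.bbox
    return ((x1 + x2) / 2, (y1 + y2) / 2)


# Made-up example data: two interactable elements and one static one.
elements = [
    ParsedElement(0, (0.10, 0.05, 0.20, 0.10), True, "Save button"),
    ParsedElement(1, (0.00, 0.00, 1.00, 0.04), False, "Window title bar"),
    ParsedElement(2, (0.30, 0.05, 0.40, 0.10), True, "Search icon"),
]
prompt = elements_to_prompt(elements)
print(prompt)
```

With a list like this, the model only has to choose an element ID, and the click coordinates can be recovered from the chosen element's bounding box instead of being predicted from raw pixels.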

🔹 Publication Date: Aug 1, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2408.00203
• PDF: https://arxiv.org/pdf/2408.00203
• Github: https://github.com/microsoft/omniparser

🔹 Models citing this paper:
https://huggingface.co/microsoft/OmniParser
https://huggingface.co/microsoft/OmniParser-v2.0
https://huggingface.co/banao-tech/OmniParser

🔹 Datasets citing this paper:
https://huggingface.co/datasets/mlfoundations/Click-100k

🔹 Spaces citing this paper:
https://huggingface.co/spaces/callmeumer/OmniParser-v2
https://huggingface.co/spaces/nofl/OmniParser-v2
https://huggingface.co/spaces/SheldonLe/OmniParser-v2

#GUIagents #ComputerVision #GPT4V #AIagents #DeepLearning