Machine learning books and papers
Admin: @Raminmousa
WhatsApp: +989333900804
ID: @Machine_learn
link: https://t.iss.one/Machine_learn
ChatGPT.Prompts.Mastering.pdf
757.3 KB
ChatGPT Prompts Mastering: A Guide to Crafting Clear and Effective Prompts – Beginners to Advanced Guide (2023)
Author: Christian Brown
#book #GPT #2023
@Machine_learn
Developing Apps With GPT-4 and ChatGPT (2023).pdf
3 MB
Book: Developing Apps with GPT-4 and ChatGPT: Build Intelligent Chatbots, Content Generators, and More
Authors: Olivier Caelen, Marie-Alice Blete
ISBN: 978-1-098-15248-2
year: 2023
pages: 117
Tags: #GPT
@Machine_learn
πŸ‘1
DocsGPT

DocsGPT is an open-source tool that streamlines the process of finding information in project documentation. By integrating GPT models, it lets developers ask questions about a project and receive accurate answers.

Say goodbye to time-consuming manual searches, and let DocsGPT help you quickly find the information you need. Try it out and see how it revolutionizes your project documentation experience. Contribute to its development and be a part of the future of AI-powered assistance.

Creator: Arc53
Stars ⭐️: 7.4k
Forked By: 769
https://github.com/arc53/DocsGPT

#DocsGPT #GPT

@Machine_learn
πŸ‘1
OmniParser for Pure Vision Based GUI Agent

1 Aug 2024 · Yadong Lu, Jianwei Yang, Yelong Shen, Ahmed Awadallah

The recent success of large vision language models shows great potential in driving agent systems that operate on user interfaces. However, we argue that the power of multimodal models like GPT-4V as a general agent on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce OmniParser, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of #GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. We first curated an interactable icon detection dataset using popular webpages and an icon description dataset. These datasets were utilized to fine-tune specialized models: a detection model to parse interactable regions on the screen and a caption model to extract the functional semantics of the detected elements. #OmniParser significantly improves GPT-4V's performance on the ScreenSpot benchmark. On the #Mind2Web and AITW benchmarks, OmniParser with screenshot-only input #outperforms the GPT-4V baselines that require additional information outside of the screenshot.
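
To make the idea of "structured elements" concrete, here is a minimal Python sketch of the two-stage flow the abstract describes: detected regions are captioned, then an instruction is grounded to one region. It is not OmniParser's code or API. The names UIElement, parse_screen, and ground_action are hypothetical, the detector output and captions are hard-coded stand-ins for the fine-tuned models, and the word-overlap matching is a toy substitute for the GPT-4V reasoning step.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a box is (left, top, right, bottom) in screen pixels.
Box = Tuple[int, int, int, int]


@dataclass
class UIElement:
    """One parsed screen region (hypothetical structure, not OmniParser's)."""
    element_id: int
    box: Box
    caption: str  # functional description produced by a caption model

    def center(self) -> Tuple[int, int]:
        left, top, right, bottom = self.box
        return ((left + right) // 2, (top + bottom) // 2)


def parse_screen(boxes: List[Box], caption_fn: Callable[[Box], str]) -> List[UIElement]:
    """Turn detector boxes into structured elements by captioning each region."""
    return [UIElement(i, box, caption_fn(box)) for i, box in enumerate(boxes)]


def ground_action(elements: List[UIElement], instruction: str) -> UIElement:
    """Pick the element whose caption shares the most words with the instruction.

    Toy stand-in for the language model's choice of target element.
    """
    words = set(instruction.lower().split())

    def overlap(element: UIElement) -> int:
        return len(words & set(element.caption.lower().split()))

    return max(elements, key=overlap)


# Stand-in outputs for the detection and caption models (made-up values).
detected_boxes = [(10, 10, 50, 30), (60, 10, 120, 30)]
fake_captions = {
    (10, 10, 50, 30): "search button",
    (60, 10, 120, 30): "settings menu",
}

elements = parse_screen(detected_boxes, lambda box: fake_captions[box])
target = ground_action(elements, "open the settings menu")
print(target.element_id, target.center())  # prints: 1 (90, 20)
```

The point of the structure is that the action is tied to a concrete region: once an element is chosen, its box gives the click coordinates directly.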

Paper: https://arxiv.org/pdf/2408.00203v1.pdf

Code: https://github.com/microsoft/omniparser

Dataset: ScreenSpot


@Machine_learn
πŸ‘3
GPT 4.1 Prompting Guide
#GPT
📚 Guide

@Machine_learn
πŸ‘5