Machine learning books and papers

ChatGPT Prompts Mastering: A Guide to Crafting Clear and Effective Prompts – Beginners to Advanced Guide (2023)
Author: Christian Brown
#book #GPT #2023
@Machine_learn

5.4K views · Ramin Mousa, edited 20:30

Developing Apps With GPT-4 and ChatGPT (2023).pdf

3 MB

Book: Developing Apps with GPT-4 and ChatGPT: Build Intelligent Chatbots, Content Generators, and More
Authors: Olivier Caelen, Marie-Alice Blete
ISBN: 978-1-098-15248-2
Year: 2023
Pages: 117
Tags: #GPT
@Machine_learn

4.89K views · edited 15:12

DocsGPT

DocsGPT is an open-source tool for finding information in project documentation. It integrates GPT models so that developers can ask questions about a project in plain language and get answers drawn from its documentation.

It replaces time-consuming manual searches with direct question answering over the docs. The project is open source and welcomes contributions.

Creator: Arc53
Stars ⭐️: 7.4k
Forked By: 769
https://github.com/arc53/DocsGPT

#DocsGPT #GPT

@Machine_learn

DocsGPT is an open-source genAI tool that helps users get reliable answers from knowledge source, while avoiding hallucinations. It enables private and reliable information retrieval, with tooling ...

4.98K views · 17:06

OmniParser for Pure Vision Based GUI Agent

1 Aug 2024 · Yadong Lu, Jianwei Yang, Yelong Shen, Ahmed Awadallah

The recent success of large vision language models shows great potential in driving agent systems that operate on user interfaces. However, we argue that the power of multimodal models like GPT-4V as general agents on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce OmniParser, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. We first curated an interactable icon detection dataset using popular webpages and an icon description dataset. These datasets were used to fine-tune specialized models: a detection model to parse interactable regions on the screen and a caption model to extract the functional semantics of the detected elements. OmniParser significantly improves GPT-4V's performance on the ScreenSpot benchmark. On the Mind2Web and AITW benchmarks, OmniParser with screenshot-only input outperforms the GPT-4V baselines that require additional information beyond the screenshot.

#OmniParser #GPT #Mind2Web

Paper: https://arxiv.org/pdf/2408.00203v1.pdf

Code: https://github.com/microsoft/omniparser

Dataset: ScreenSpot
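
The two-stage idea in the abstract (a detection model finds interactable regions, a caption model describes them, and the result is handed to the vision language model as structured elements) might be sketched roughly as below. This is not the OmniParser code or API: `detect_regions`, `caption_region`, `ScreenElement`, and the hard-coded boxes and captions are hypothetical stand-ins used only to show the data flow.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class ScreenElement:
    """One parsed UI element: an ID, a bounding box, and a functional description."""
    element_id: int
    box: Box
    description: str


def detect_regions(screenshot) -> List[Box]:
    """Hypothetical stand-in for the fine-tuned detection model (returns fixed boxes)."""
    return [(10, 10, 50, 50), (60, 10, 200, 50)]


def caption_region(screenshot, box: Box) -> str:
    """Hypothetical stand-in for the fine-tuned caption model (fixed lookup)."""
    captions = {
        (10, 10, 50, 50): "settings icon",
        (60, 10, 200, 50): "search field",
    }
    return captions[box]


def parse_screenshot(screenshot) -> List[ScreenElement]:
    """Combine detection and captioning into a list of structured elements."""
    return [
        ScreenElement(i, box, caption_region(screenshot, box))
        for i, box in enumerate(detect_regions(screenshot))
    ]


def elements_to_prompt(elements: List[ScreenElement]) -> str:
    """Serialize elements as numbered lines that a model could refer to by ID."""
    return "\n".join(
        f"[{e.element_id}] {e.description} at {e.box}" for e in elements
    )
```

With real models plugged in, the text from `elements_to_prompt(parse_screenshot(image))` would accompany the screenshot so the model can ground an action in a specific element ID instead of guessing raw coordinates.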

@Machine_learn

2.02K views · 11:19

GPT-4.1 Prompting Guide
#GPT
📚 Guide

@Machine_learn

2.88K views · 12:19
