Data Science | Machine Learning with Python for Researchers
Admin: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

QLoRA: Efficient Finetuning of Quantized LLMs

The resulting model family, Guanaco, outperforms all previously released open models on the Vicuna benchmark, reaching 99.3% of ChatGPT's performance level while requiring only 24 hours of finetuning on a single GPU.
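
The core recipe: load the base model in 4-bit NF4 and train only LoRA adapters on top. A minimal sketch with Hugging Face transformers, peft and bitsandbytes (not the repo's exact training script; the base checkpoint name is just an example):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# NF4 4-bit quantization with double quantization, as used in the QLoRA paper
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = "huggyllama/llama-7b"  # example checkpoint; swap in your own
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Freeze the 4-bit base model and attach small trainable LoRA adapters
model = prepare_model_for_kbit_training(model)
model = get_peft_model(
    model, LoraConfig(r=64, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
)
model.print_trainable_parameters()  # only the adapters are updated during finetuning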

πŸ–₯ Github: https://github.com/artidoro/qlora

⏩ Paper: https://arxiv.org/abs/2305.14314

⭐️ Demo: https://huggingface.co/spaces/uwnlp/guanaco-playground-tgi

πŸ“Œ Dataset: https://paperswithcode.com/dataset/ffhq

https://t.iss.one/DataScienceT
Large Language Models as Tool Makers

This work takes an initial step towards removing the dependency on externally crafted tools by proposing a closed-loop framework, LLMs As Tool Makers (LATM), in which LLMs create their own reusable tools for problem-solving.
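
The loop in a nutshell: a strong "tool maker" LLM writes a reusable Python function for a task class once, and a cheaper "tool user" applies it to new instances. An illustrative sketch only, where llm() is a hypothetical placeholder for your chat-completion client:

def llm(prompt: str) -> str:
    # Hypothetical helper: call your preferred LLM API and return its text output.
    raise NotImplementedError("plug in your LLM client here")

def make_tool(task_description: str, examples: list[str]) -> str:
    # Tool maker: ask a capable LLM to write a reusable solver for the task class.
    prompt = (
        "Write a Python function solve(instance) for tasks like this:\n"
        f"{task_description}\n\nExamples:\n" + "\n".join(examples)
    )
    return llm(prompt)  # returns Python source code as a string

def use_tool(tool_source: str, instance: str):
    # Tool user: execute the cached tool on a new instance (only run vetted code).
    namespace: dict = {}
    exec(tool_source, namespace)
    return namespace["solve"](instance)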

πŸ–₯ Github: https://github.com/ctlllll/llm-toolmaker

⏩ Paper: https://arxiv.org/pdf/2305.17126v1.pdf

πŸ“Œ Dataset: https://paperswithcode.com/dataset/big-bench

https://t.iss.one/DataScienceT
πŸ‘2❀1
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models

The performance of text-to-image generation depends heavily on the text prompt. Prompt-Free Diffusion needs no prompt at all, only a reference image.
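
The repo ships its own pipeline and demo. As a rough illustration of the idea (conditioning on a reference image's embedding instead of a text prompt), here is a sketch using diffusers' image-variation pipeline, a related but different model, not the Prompt-Free Diffusion code itself:

from diffusers import StableDiffusionImageVariationPipeline
from PIL import Image

pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers", revision="v2.0"
).to("cuda")

reference = Image.open("reference.jpg").convert("RGB")
result = pipe(reference, guidance_scale=3.0).images[0]  # no text prompt involved
result.save("variation.png")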

πŸ–₯ Github: https://github.com/shi-labs/prompt-free-diffusion

πŸ”Ž Demo: https://huggingface.co/spaces/shi-labs/Prompt-Free-Diffusion

⏩ Paper: https://arxiv.org/abs/2305.16223v1

πŸ“Œ Dataset: https://paperswithcode.com/dataset/ffhq

https://t.iss.one/DataScienceT
❀‍πŸ”₯2❀1πŸ‘1
Large Language Models as Tool Makers

In this work, we take an initial step towards removing this dependency by proposing a closed-loop framework, referred to as LLMs A s Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving.

πŸ–₯ Github: https://github.com/ctlllll/llm-toolmaker

⏩ Paper: https://arxiv.org/pdf/2305.17126v1.pdf

πŸ“Œ Dataset: https://paperswithcode.com/dataset/big-bench

https://t.iss.one/DataScienceT
❀‍πŸ”₯1πŸ‘1
πŸ–₯ A Practical Toolkit for Multilingual Question and Answer Generation

Multilingual and multi-domain question generation datasets, models, and a Python library for question and answer generation.
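
Install with pip install lmqg; a minimal sketch following the interface shown in the repo's README (class and method names per that README):

from lmqg import TransformersQG

# English QA generation; other languages and base models are listed in the README
model = TransformersQG(language="en")
context = (
    "William Turner was an English painter who specialised in watercolour landscapes."
)
qa_pairs = model.generate_qa(context)
print(qa_pairs)  # list of (question, answer) pairs extracted from the context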

πŸ–₯ Github: https://github.com/asahi417/lm-question-generation

⏩ Paper: https://arxiv.org/abs/2305.17416v1

πŸ“Œ Dataset: https://paperswithcode.com/dataset/squad

https://t.iss.one/DataScienceT
πŸ‘1
πŸ¦™ BigTrans πŸš€

BigTrans adapts LLaMA, which covers only 20 languages, and augments it with multilingual translation capability across more than 100 languages.
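
BigTrans is a LLaMA-style causal LM, so once you obtain the weights per the repo's instructions, inference is plain transformers generation. The local path and the instruction wording below are illustrative assumptions, not the repo's documented prompt template:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./bigtrans-13b"  # hypothetical local path to the converted weights
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Translate the following sentence from German to English:\nGuten Morgen, wie geht es dir?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))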

πŸ–₯ Github: https://github.com/ZNLP/BigTrans/tree/main

⏩ Paper: https://arxiv.org/abs/2305.18098v1

πŸ“Œ Dataset: https://paperswithcode.com/dataset/flores-200

https://t.iss.one/DataScienceT
πŸ”₯ GPT4Tools: Teaching LLM to Use Tools via Self-instruction

GPT4Tools is a centralized system that can control multiple visual foundation models. It is based on Vicuna (LLaMA) and 71K self-built instruction-following examples.

πŸ–₯ Github: https://github.com/stevengrove/gpt4tools

⏩ Paper: https://arxiv.org/abs/2305.18752v1

πŸ“Œ Project: https://gpt4tools.github.io/

https://t.iss.one/DataScienceT
Introducing BERTopic Integration with the Hugging Face Hub

BERTopic is a powerful tool for uncovering significant topics within text collections, and fitted topic models can now be shared and loaded through the Hugging Face Hub.

pip install bertopic
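
A minimal end-to-end sketch: fit a topic model on 20 Newsgroups, then push it to the Hub (placeholder repo id; requires a huggingface_hub login):

from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic

docs = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))["data"]

topic_model = BERTopic(verbose=True)
topics, probs = topic_model.fit_transform(docs)
print(topic_model.get_topic_info().head())

# Share the fitted model on the Hugging Face Hub (placeholder repo id)
topic_model.push_to_hf_hub(repo_id="my-username/my-bertopic-model")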

πŸ€— Hugging face: https://huggingface.co/blog/bertopic

πŸ–₯ Github: https://github.com/MaartenGr/BERTopic

⏩ Colab: https://colab.research.google.com/#fileId=https://huggingface.co/spaces/davanstrien/blog_notebooks/blob/main/BERTopic_hub_starter.ipynb

πŸ“Œ Docs: https://maartengr.github.io/BERTopic/getting_started/quickstart/quickstart.html

https://t.iss.one/DataScienceT
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

Hiera is a hierarchical vision transformer that is fast, powerful, and, above all, simple. It outperforms the state-of-the-art across a wide array of image and video tasks while being much faster.

pip install hiera-transformer
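
A minimal inference sketch, assuming the hiera-transformer package exposes the pretrained constructors listed in its README:

import torch
import hiera

# ImageNet-1k pretrained and finetuned checkpoint, per the repo's README
model = hiera.hiera_base_224(pretrained=True, checkpoint="mae_in1k_ft_in1k").eval()

x = torch.randn(1, 3, 224, 224)      # dummy image batch
with torch.no_grad():
    logits = model(x)                # (1, 1000) ImageNet class logits
print(logits.shape)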

πŸ–₯ Github: https://github.com/facebookresearch/hiera

⏩ Paper: https://arxiv.org/abs/2306.00989v1

πŸ“Œ Dataset: https://paperswithcode.com/dataset/inaturalist

https://t.iss.one/DataScienceT
❀‍πŸ”₯3πŸ‘1
Wuerstchen: Efficient Pretraining of Text-to-Image Models

A novel technique for text-to-image synthesis that combines competitive performance with unprecedented cost-effectiveness and ease of training on constrained hardware.
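
Wuerstchen later landed in diffusers; assuming that integration and the warp-ai/wuerstchen checkpoint, a minimal text-to-image sketch looks like this (the authors' Colab below remains the reference entry point):

import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "warp-ai/wuerstchen", torch_dtype=torch.float16
).to("cuda")

image = pipe("an astronaut riding a horse, detailed watercolour").images[0]
image.save("wuerstchen.png")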

πŸ–₯ Github: https://github.com/dome272/wuerstchen

⏩ Paper: https://arxiv.org/abs/2306.00637v1

πŸ“Œ Colab: https://colab.research.google.com/drive/1UTP9Xn2UIrVbAXyL-SKEvyLmgVWdw-Vy

https://t.iss.one/DataScienceT
If you’re a developer wanting to use large language model tools, our new course is for you.

You’ll learn how to use different prompts at various stages in the system-building process, strategies for parsing long documents, and much more!

Join for free:
https://learn.deeplearning.ai/chatgpt-building-system

βœ… More reaction = more posts

@CodeProgrammer β™₯️
πŸ”­ GRES: Generalized Referring Expression Segmentation

A new benchmark, GRES, which extends classic referring expression segmentation (RES) to allow expressions that refer to an arbitrary number of target objects.

πŸ–₯ Github: https://github.com/henghuiding/ReLA

⏩ Paper: https://arxiv.org/abs/2306.00968

πŸ”Ž Project: https://henghuiding.github.io/GRES/

πŸ“Œ New dataset: https://github.com/henghuiding/gRefCOCO

https://t.iss.one/DataScienceT