Data Science | Machine Learning with Python for Researchers
31.5K subscribers
1.54K photos
102 videos
22 files
1.82K links
Admin: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

Buy ads: https://telega.io/c/dataScienceT
Large Language Models as Tool Makers

Prior work on tool-augmented LLMs depends on the availability of existing tools. This work takes an initial step towards removing that dependency by proposing a closed-loop framework, LLMs As Tool Makers (LATM), in which LLMs create their own reusable tools for problem-solving.
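
A minimal sketch of the "make a tool once, reuse it afterwards" idea, not the authors' implementation. The names make_tool, ToolCache, and solve are hypothetical, and make_tool is a hard-coded stand-in for the LLM call that would write the tool's source code.

```python
from typing import Callable, Dict


def make_tool(task_type: str) -> str:
    """Hypothetical stand-in for the tool-making LLM.

    A real system would prompt a model with a few solved examples and
    receive Python source back; here the source is hard-coded.
    """
    if task_type == "sort_words":
        return (
            "def tool(text):\n"
            "    return ' '.join(sorted(text.split()))\n"
        )
    raise ValueError(f"no tool recipe for task type: {task_type}")


class ToolCache:
    """Builds a tool once per task type and reuses it on later requests."""

    def __init__(self, maker: Callable[[str], str]) -> None:
        self.maker = maker
        self.tools: Dict[str, Callable[[str], str]] = {}
        self.maker_calls = 0  # counts how often the (expensive) maker ran

    def get(self, task_type: str) -> Callable[[str], str]:
        if task_type not in self.tools:
            self.maker_calls += 1
            namespace: dict = {}
            # Assumption: generated code is trusted. Outside a toy example,
            # model-written code should only run inside a sandbox.
            exec(self.maker(task_type), namespace)
            self.tools[task_type] = namespace["tool"]
        return self.tools[task_type]


def solve(cache: ToolCache, task_type: str, instance: str) -> str:
    """Tool-user step: fetch the cached tool and apply it to one instance."""
    return cache.get(task_type)(instance)


cache = ToolCache(make_tool)
first = solve(cache, "sort_words", "pear apple fig")
second = solve(cache, "sort_words", "kiwi banana")
print(first, "|", second, "| maker calls:", cache.maker_calls)
```

The second request reuses the cached tool, so the tool maker runs only once for the task type.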

🖥 Github: https://github.com/ctlllll/llm-toolmaker

⏩ Paper: https://arxiv.org/pdf/2305.17126v1.pdf

📌 Dataset: https://paperswithcode.com/dataset/big-bench

https://t.iss.one/DataScienceT
🖥 A Practical Toolkit for Multilingual Question and Answer Generation

Multilingual and multidomain question generation datasets, models, and a Python library for question generation.

🖥 Github: https://github.com/asahi417/lm-question-generation

⏩ Paper: https://arxiv.org/abs/2305.17416v1

📌 Dataset: https://paperswithcode.com/dataset/squad

🦙 BigTrans 🚀

BigTrans adapts LLaMA, which covers only 20 languages, and enhances it with multilingual translation capability across more than 100 languages.

🖥 Github: https://github.com/ZNLP/BigTrans/tree/main

⏩ Paper: https://arxiv.org/abs/2305.18098v1

📌 Dataset: https://paperswithcode.com/dataset/flores-200

🔥 GPT4Tools: Teaching LLM to Use Tools via Self-instruction

GPT4Tools is a centralized system that can control multiple visual foundation models. It builds on Vicuna (LLaMA) and 71K self-built instruction examples.

🖥 Github: https://github.com/stevengrove/gpt4tools

⏩ Paper: https://arxiv.org/abs/2305.18752v1

📌 Project: https://gpt4tools.github.io/

Introducing BERTopic Integration with the Hugging Face Hub

BERTopic provides a powerful tool for users to uncover significant topics within text collections, thereby gaining valuable insights.

pip install bertopic

🤗 Hugging Face: https://huggingface.co/blog/bertopic

🖥 Github: https://github.com/MaartenGr/BERTopic

⏩ Colab: https://colab.research.google.com/#fileId=https://huggingface.co/spaces/davanstrien/blog_notebooks/blob/main/BERTopic_hub_starter.ipynb

📌 Docs: https://maartengr.github.io/BERTopic/getting_started/quickstart/quickstart.html

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

Hiera is a hierarchical vision transformer that is fast, powerful, and, above all, simple. It outperforms the state-of-the-art across a wide array of image and video tasks while being much faster.

pip install hiera-transformer

🖥 Github: https://github.com/facebookresearch/hiera

⏩ Paper: https://arxiv.org/abs/2306.00989v1

📌 Dataset: https://paperswithcode.com/dataset/inaturalist

Wuerstchen: Efficient Pretraining of Text-to-Image Models

A novel technique for text-to-image synthesis that unites competitive performance with unprecedented cost-effectiveness and ease of training on constrained hardware.

🖥 Github: https://github.com/dome272/wuerstchen

⏩ Paper: https://arxiv.org/abs/2306.00637v1

📌 Colab: https://colab.research.google.com/drive/1UTP9Xn2UIrVbAXyL-SKEvyLmgVWdw-Vy

If you're a developer wanting to use large language model tools, our new course is for you.

You'll learn how to use different prompts at various stages in the system-building process, strategies for parsing long documents, and much more!

Join for free:
https://learn.deeplearning.ai/chatgpt-building-system

✅ More reactions = more posts

@CodeProgrammer ♥️
🔭 GRES: Generalized Referring Expression Segmentation

A new benchmark, GRES, that extends classic RES to allow expressions to refer to an arbitrary number of target objects.

🖥 Github: https://github.com/henghuiding/ReLA

⏩ Paper: https://arxiv.org/abs/2306.00968

🔎 Project: https://henghuiding.github.io/GRES/

📌 New dataset: https://github.com/henghuiding/gRefCOCO

🦍 Gorilla: Large Language Model Connected with Massive APIs

Gorilla is a fine-tuned LLaMA-based model that surpasses GPT-4 at writing API calls.

🖥 Github: https://github.com/ShishirPatil/gorilla

📕 Paper: https://arxiv.org/abs/2305.15334

🔗 Demo: https://drive.google.com/file/d/1E0k5mG1mTiaz0kukyK1PdeohJipTFh6j/view?usp=share_link

👉 Project: https://shishirpatil.github.io/gorilla/

⭐️ Colab: https://colab.research.google.com/drive/1DEBPsccVLF_aUnmD0FwPeHFrtdC0QIUP?usp=sharing

Segment Anything 3D

SAM-3D: a toolbox that transfers 2D SAM segments into 3D scene-level point clouds.

🖥 Github: https://github.com/pointcept/segmentanything3d

⏩ Paper: https://arxiv.org/abs/2306.03908v1

📌 Dataset: https://paperswithcode.com/dataset/scannet

🐼 PandaLM: Reproducible and Automated Language Model Assessment

PandaLM is a judge large language model trained to identify the superior model among several LLMs. Its focus extends beyond the objective correctness of responses, which is the main focus of traditional evaluation datasets.
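
A rough sketch of how pairwise verdicts from a judge model could be tallied to pick the stronger model among several candidates. This is an illustration under stated assumptions, not PandaLM's code: toy_judge and rank_models are hypothetical names, and the length-based judge is only a placeholder for a call to a trained judge model.

```python
from itertools import combinations
from typing import Callable, Dict

# Verdict convention assumed for this sketch:
# 1 = first response is better, 2 = second is better, 0 = tie.
Judge = Callable[[str, str, str], int]


def toy_judge(instruction: str, resp_a: str, resp_b: str) -> int:
    """Placeholder judge: prefers the longer response.

    This is NOT how a trained judge decides; it only makes the
    example runnable without a model.
    """
    if len(resp_a) > len(resp_b):
        return 1
    if len(resp_b) > len(resp_a):
        return 2
    return 0


def rank_models(instruction: str, responses: Dict[str, str], judge: Judge) -> Dict[str, int]:
    """Compare every pair of responses once and count wins per model."""
    wins = {name: 0 for name in responses}
    for name_a, name_b in combinations(responses, 2):
        verdict = judge(instruction, responses[name_a], responses[name_b])
        if verdict == 1:
            wins[name_a] += 1
        elif verdict == 2:
            wins[name_b] += 1
        # verdict == 0 is a tie: neither model gets a win
    return wins


responses = {
    "model_a": "Paris.",
    "model_b": "The capital of France is Paris.",
    "model_c": "Paris.",
}
wins = rank_models("What is the capital of France?", responses, toy_judge)
best = max(wins, key=wins.get)
print(wins, "best:", best)
```

Swapping toy_judge for a function that queries a real judge model keeps the tallying logic unchanged.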

🖥 Github: https://github.com/weopenml/pandalm

📕 Paper: https://arxiv.org/abs/2306.05087v1

🔗 Dataset: https://github.com/tatsu-lab/stanford_alpaca#data-release
