Data Science | Machine Learning with Python for Researchers
31.4K subscribers
1.53K photos
102 videos
22 files
1.81K links
Admin: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models

🖥 Github: https://github.com/daochenzha/neuroshard

Paper: https://arxiv.org/pdf/2305.01868v1.pdf

https://t.iss.one/DataScienceT
⭐️ Towards Building the Federated GPT: Federated Instruction Tuning

Shepherd: A lightweight, foundational framework enabling federated instruction tuning for large language models

🖥 Github: https://github.com/jayzhang42/federatedgpt-shepherd

Paper: https://arxiv.org/pdf/2305.05644.pdf

📌 Data Preparation: https://github.com/jayzhang42/federatedgpt-shepherd#Data_Preparation

https://t.iss.one/DataScienceT
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding

You can easily plug in any 3D backbone model and pre-train it using our framework to get a jump-start on various downstream tasks!

🖥 Github: https://github.com/salesforce/ulip

Paper: https://arxiv.org/abs/2305.08275v1

📌 Dataset: https://paperswithcode.com/dataset/objaverse

https://t.iss.one/DataScienceT
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

FastComposer uses subject embeddings extracted by an image encoder to augment the generic text conditioning in diffusion models, enabling personalized image generation based on subject images and textual instructions with only forward passes.
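
To make the idea of augmenting text conditioning with subject embeddings concrete, here is a minimal sketch. It is not FastComposer's code: the `fuse` function is a hypothetical stand-in (a fixed blend) for whatever learned fusion module the model uses, and the 2-dimensional vectors are toy values chosen for illustration.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def fuse(text_vec: Sequence[float], subject_vec: Sequence[float], weight: float = 0.5) -> Vector:
    """Hypothetical fusion step: a fixed blend of a text and a subject embedding.

    A real model would use a learned module here; the blend is an assumption.
    """
    if len(text_vec) != len(subject_vec):
        raise ValueError("embedding sizes must match")
    return [(1.0 - weight) * t + weight * s for t, s in zip(text_vec, subject_vec)]


def augment_conditioning(
    token_embeddings: Sequence[Sequence[float]],
    subject_embeddings: Dict[int, Sequence[float]],
) -> List[Vector]:
    """Return a copy of the text conditioning with subject tokens augmented.

    `subject_embeddings` maps a token position to the embedding an image
    encoder produced for the subject that token refers to.
    """
    augmented = [list(vec) for vec in token_embeddings]
    for position, subject_vec in subject_embeddings.items():
        augmented[position] = fuse(token_embeddings[position], subject_vec)
    return augmented


# Toy prompt "a man and a dog" with made-up 2-dimensional embeddings.
tokens = ["a", "man", "and", "a", "dog"]
text_conditioning = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
# Positions 1 ("man") and 4 ("dog") each have a reference image embedding.
subject_embeddings = {1: [3.0, 2.0], 4: [2.0, 5.0]}

conditioning = augment_conditioning(text_conditioning, subject_embeddings)
```

Only the subject token positions change; every other token keeps its generic text embedding, and no gradient step is involved, which is the sense in which the approach needs only forward passes.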

🖥 Github: https://github.com/mit-han-lab/fastcomposer

Paper: https://arxiv.org/abs/2305.10431v1

📌 Dataset: https://paperswithcode.com/dataset/ffhq

⭐️ Project: https://fastcomposer.mit.edu/

https://t.iss.one/DataScienceT
FunASR: A Fundamental End-to-End Speech Recognition Toolkit

FunASR is an open-source speech recognition toolkit designed to bridge the gap between academic research and industrial applications.

🖥 Github: https://github.com/alibaba-damo-academy/FunASR

⭐️ Docs: https://alibaba-damo-academy.github.io/FunASR/en/index.html

Paper: https://arxiv.org/abs/2305.11013v1

📌 Dataset: https://paperswithcode.com/dataset/wenetspeech

https://t.iss.one/DataScienceT
Segment Any Anomaly without Training via Hybrid Prompt Regularization

This project addresses zero-shot anomaly detection by combining SAM and Grounding DINO.

🖥 Github: https://github.com/caoyunkang/segment-any-anomaly

🖥 Colab: https://colab.research.google.com/drive/1Rwio_KfziuLp79Qh_ugum64Hjnq4ZwsE?usp=sharing

Paper: https://arxiv.org/abs/2305.10724

📌 Dataset: https://paperswithcode.com/dataset/visa

https://t.iss.one/DataScienceT
Diff-Pruning: Structural Pruning for Diffusion Models

Structural Pruning for Diffusion Models.
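
As a rough illustration of what structural pruning means in general, the sketch below removes whole hidden units from a tiny two-layer network, deleting both the unit's row in the first weight matrix and the matching column in the second. The L1-magnitude ranking is a simple stand-in chosen for this example and is not the importance criterion used by Diff-Pruning; all names and values are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def l1_norm(row: Sequence[float]) -> float:
    """Sum of absolute values, used here as a simple importance score."""
    return sum(abs(v) for v in row)


def prune_hidden_units(w1: Matrix, w2: Matrix, keep: int) -> Tuple[Matrix, Matrix, List[int]]:
    """Keep the `keep` hidden units whose rows in `w1` have the largest L1 norm.

    w1 has shape (hidden, inputs); w2 has shape (outputs, hidden).
    Removing hidden unit i drops row i of w1 and column i of w2, so the
    pruned network is genuinely smaller rather than merely sparse.
    """
    if not 0 < keep <= len(w1):
        raise ValueError("keep must be between 1 and the number of hidden units")
    ranked = sorted(range(len(w1)), key=lambda i: l1_norm(w1[i]), reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original unit order
    pruned_w1 = [list(w1[i]) for i in kept]
    pruned_w2 = [[row[i] for i in kept] for row in w2]
    return pruned_w1, pruned_w2, kept


# Toy weights: three hidden units, two inputs, one output.
w1 = [[1.0, -2.0], [0.1, 0.0], [0.5, 0.5]]
w2 = [[1.0, 9.0, 2.0]]

pruned_w1, pruned_w2, kept_units = prune_hidden_units(w1, w2, keep=2)
```

Hidden unit 1 has the smallest score and is dropped, so both weight matrices shrink along the hidden dimension.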

🖥 Github: https://github.com/vainf/diff-pruning

Paper: https://arxiv.org/abs/2305.10924v1

📌 Dataset: https://paperswithcode.com/dataset/lsun

https://t.iss.one/DataScienceT
🔥 Here's a list of 32 datasets that you can go through over the weekend:
https://datasciencedojo.com/blog/datasets-data-science-skills/

More reactions = more projects

@CodeProgrammer ♥️
How to Encrypt and Decrypt Image Using Python | How to Encrypt any Image File Using Python
https://morioh.com/p/978e38a1f65b?f=5c21fb01c16e2556b555ab32
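
The linked tutorial's exact method is not reproduced here. As a minimal sketch of the general idea, assuming a simple XOR scheme, the snippet below flips every byte of a file against a repeating key; applying the same function twice restores the original. The function names and the key are hypothetical, and XOR with a short repeating key is not secure, so treat this as a demonstration rather than real protection.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of `data` with a repeating `key`.

    The operation is its own inverse: xor_bytes(xor_bytes(d, k), k) == d.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def xor_file(src_path: str, dst_path: str, key: bytes) -> None:
    """Read a file in binary mode, XOR it with `key`, and write the result."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(xor_bytes(data, key))


# In-memory demo using the 8-byte PNG signature as stand-in image data.
png_header = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
key = b"demo-key"

encrypted = xor_bytes(png_header, key)
decrypted = xor_bytes(encrypted, key)
```

For an image on disk, `xor_file("photo.png", "photo.enc", key)` followed by `xor_file("photo.enc", "photo_restored.png", key)` would round-trip the file; the file names are placeholders.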

More reactions = more projects

@CodeProgrammer ♥️
🦙 LLM-Pruner: On the Structural Pruning of Large Language Models

Compress your LLMs to any size.

🖥 Github: https://github.com/horseee/llm-pruner

Paper: https://arxiv.org/abs/2305.11627v1

📌 Dataset: https://paperswithcode.com/dataset/piqa

https://t.iss.one/DataScienceT
Mask-Free Video Instance Segmentation

MaskFreeVIS achieves highly competitive VIS performance while using only bounding box annotations for the object state.

🖥 Github: https://github.com/SysCV/maskfreevis

Paper: https://arxiv.org/pdf/2303.15904.pdf

📌 Project: https://www.vis.xyz/pub/maskfreevis/

https://t.iss.one/DataScienceT
📎 Instruction-tuning Stable Diffusion with InstructPix2Pix

An InstructPix2Pix training strategy for teaching Stable Diffusion to follow more specific instructions in image translation tasks (such as cartoonization) and low-level image processing (such as image deraining).

🖥 Post: https://huggingface.co/blog/instruction-tuning-sd

⭐️ Training and inference code: https://github.com/huggingface/instruction-tuned-sd

📌 Demo: https://huggingface.co/spaces/instruction-tuning-sd/instruction-tuned-sd

InstructPix2Pix: https://huggingface.co/timbrooks/instruct-pix2pix

🔍Datasets and models from this post: https://huggingface.co/instruction-tuning-sd

https://t.iss.one/DataScienceT
QLoRA: Efficient Finetuning of Quantized LLMs

The resulting model family, Guanaco, outperforms all previous openly released models on the Vicuna benchmark, reaching 99.3% of the performance level of ChatGPT while requiring only 24 hours of finetuning on a single GPU.
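
The core idea behind the title can be sketched in a few lines: the base weights stay frozen in a quantized form, and only a small low-rank adapter is trained on top of them. The snippet below is a toy illustration under stated assumptions, not the QLoRA implementation. It swaps the paper's 4-bit data type for plain absmax rounding, and every name and number in it is hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def quantize_absmax(weights: Matrix, levels: int = 7) -> Tuple[List[List[int]], float]:
    """Round weights to integers in [-levels, levels] (a stand-in for real 4-bit formats)."""
    largest = max(abs(w) for row in weights for w in row)
    scale = largest / levels if largest else 1.0
    return [[round(w / scale) for w in row] for row in weights], scale


def dequantize(q: Sequence[Sequence[int]], scale: float) -> Matrix:
    """Map stored integers back to approximate floating-point weights."""
    return [[v * scale for v in row] for row in q]


def matvec(m: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def adapter_forward(q, scale, lora_a: Matrix, lora_b: Matrix, x, alpha: float = 1.0) -> List[float]:
    """Frozen quantized path plus trainable low-rank path: W_q x + alpha * B (A x)."""
    base = matvec(dequantize(q, scale), x)
    low_rank = matvec(lora_b, matvec(lora_a, x))
    return [b + alpha * l for b, l in zip(base, low_rank)]


# Toy 2x2 layer; 0.3 is rounded to the nearest representable value (0.25).
w = [[1.75, -0.5], [0.3, 0.0]]
q, scale = quantize_absmax(w)

x = [1.0, 2.0]
lora_a = [[1.0, 1.0]]                 # rank-1 down-projection
zero_b = [[0.0], [0.0]]               # B starts at zero, so the adapter is a no-op at first
out_before = adapter_forward(q, scale, lora_a, zero_b, x)

trained_b = [[0.5], [-1.0]]           # pretend values after some training
out_after = adapter_forward(q, scale, lora_a, trained_b, x)
```

Only `lora_a` and `lora_b` would receive gradient updates in this picture; the integer matrix `q` is never modified, which is where the memory savings come from.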

🖥 Github: https://github.com/artidoro/qlora

Paper: https://arxiv.org/abs/2305.14314

⭐️ Demo: https://huggingface.co/spaces/uwnlp/guanaco-playground-tgi

📌 Dataset: https://paperswithcode.com/dataset/ffhq

https://t.iss.one/DataScienceT
Large Language Models as Tool Makers

This work takes an initial step towards removing the dependency on the availability of existing tools by proposing a closed-loop framework, referred to as LLMs As Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving.
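
The make-once, reuse-many-times loop can be sketched as follows. This is a toy illustration, not the LATM code: `fake_tool_maker` is a hypothetical stand-in for a call to a language model that returns Python source, the verification step and cache are assumptions about how such a loop might be wired, and running model-generated code with `exec` would need sandboxing in any real setting.

```python
from typing import Callable, Dict, List, Tuple

maker_calls = 0  # counts how often the (expensive) tool maker is invoked


def fake_tool_maker(task_description: str) -> str:
    """Hypothetical stand-in for an LLM that writes a reusable tool as source code."""
    global maker_calls
    maker_calls += 1
    return (
        "def tool(numbers):\n"
        "    ordered = sorted(numbers)\n"
        "    mid = len(ordered) // 2\n"
        "    if len(ordered) % 2:\n"
        "        return ordered[mid]\n"
        "    return (ordered[mid - 1] + ordered[mid]) / 2\n"
    )


def build_tool(source: str, checks: List[Tuple[list, float]]) -> Callable:
    """Compile the generated source and verify it on a few known examples."""
    namespace: dict = {}
    exec(source, namespace)  # unsafe for untrusted code; sandbox in practice
    tool = namespace["tool"]
    for arguments, expected in checks:
        if tool(arguments) != expected:
            raise ValueError("generated tool failed verification")
    return tool


tool_cache: Dict[str, Callable] = {}


def solve(task_type: str, description: str, checks, instance):
    """Make a tool the first time a task type is seen, then reuse it."""
    if task_type not in tool_cache:
        tool_cache[task_type] = build_tool(fake_tool_maker(description), checks)
    return tool_cache[task_type](instance)


checks = [([1, 3, 2], 2)]
first = solve("median", "Return the median of a list", checks, [5, 1, 4, 2])
second = solve("median", "Return the median of a list", checks, [7, 9, 8])
```

The second request is served from the cache without another maker call, which is the reuse the post describes. Here the cached tool is called directly; in the framework described, a tool-user model would be the one applying it to new requests.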

🖥 Github: https://github.com/ctlllll/llm-toolmaker

Paper: https://arxiv.org/pdf/2305.17126v1.pdf

📌 Dataset: https://paperswithcode.com/dataset/big-bench

https://t.iss.one/DataScienceT
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models

The performance of Text2Image is largely dependent on text prompts. In Prompt-Free Diffusion, no prompt is needed, just a reference image.

🖥 Github: https://github.com/shi-labs/prompt-free-diffusion

🔎 Demo: https://huggingface.co/spaces/shi-labs/Prompt-Free-Diffusion

Paper: https://arxiv.org/abs/2305.16223v1

📌 Dataset: https://paperswithcode.com/dataset/ffhq

https://t.iss.one/DataScienceT