ML Research Hub – Telegram

ML Research Hub

32.6K subscribers

3.39K photos

132 videos

23 files

3.61K links

Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho

✨FG-CLIP: Fine-Grained Visual and Textual Alignment

📝 Summary:
FG-CLIP improves fine-grained multimodal understanding, addressing CLIP's limitations when trained on coarse captions. It uses large models to generate long captions, a high-quality dataset with region boxes and detailed captions, and hard negative samples. FG-CLIP outperforms existing methods on fine-grained and ge...
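
To make the "hard negative samples" idea concrete, here is a minimal sketch of a contrastive loss in which one image embedding is scored against its correct caption plus a few near-miss captions. This is an illustration under stated assumptions, not the FG-CLIP authors' implementation: the function name `hard_negative_loss`, the temperature value of 0.07, and the toy 2-D embeddings are all hypothetical.

```python
import math

def normalize(v):
    # Scale a vector to unit length (zero vectors are left unchanged).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hard_negative_loss(image, positive, hard_negatives, temperature=0.07):
    """Cross-entropy over [positive caption, hard negative captions].

    image, positive: embedding vectors (lists of floats).
    hard_negatives: list of embedding vectors for near-miss captions.
    temperature: assumed softmax temperature (hypothetical default).
    """
    img = normalize(image)
    candidates = [positive] + list(hard_negatives)
    # Cosine similarity of the image to every candidate caption.
    logits = [dot(img, normalize(c)) / temperature for c in candidates]
    # Numerically stable log-sum-exp; the positive sits at index 0.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

# Toy embeddings: an unrelated caption versus a near-miss caption.
easy = hard_negative_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
hard = hard_negative_loss([1.0, 0.0], [1.0, 0.0], [[1.0, 0.01]])
print(f"easy negative loss: {easy:.6f}")
print(f"hard negative loss: {hard:.6f}")
```

With an unrelated negative the loss is close to zero, so the model gets almost no training signal. With a near-miss negative the loss approaches log 2, which is what pushes the model to separate captions that differ only in fine details.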

🔹 Publication Date: May 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.05071
• PDF: https://arxiv.org/pdf/2505.05071
• Github: https://github.com/360CVGroup/FG-CLIP

🔹 Models citing this paper:
• https://huggingface.co/qihoo360/fg-clip2-base
• https://huggingface.co/qihoo360/fg-clip-large
• https://huggingface.co/qihoo360/fg-clip-base

🔹 Datasets citing this paper:
• https://huggingface.co/datasets/qihoo360/FineHARD
• https://huggingface.co/datasets/qihoo360/DCI-CN
• https://huggingface.co/datasets/qihoo360/DOCCI-CN

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/qihoo360/FG-CLIP-Retrieval-demo
• https://huggingface.co/spaces/qihoo360/FG-CLIP-Densefeature-demo
• https://huggingface.co/spaces/qihoo360/FG-CLIP2-Retrieval-demo

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#FGCLIP #FineGrainedAI #MultimodalLearning #ComputerVision #DeepLearning

FG-CLIP: Fine-Grained Visual and Textual Alignment

Contrastive Language-Image Pre-training (CLIP) excels in multimodal tasks such as image-text retrieval and zero-shot classification but struggles with fine-grained understanding due to its focus...
