ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho
FG-CLIP: Fine-Grained Visual and Textual Alignment

📝 Summary:
FG-CLIP improves fine-grained multimodal understanding, addressing the limitations CLIP inherits from coarse captions. It combines long captions generated by large models, a high-quality dataset with region boxes and detailed captions, and hard negative samples. FG-CLIP outperforms existing methods on fine-grained and ge...
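
💡 Sketch: the summary does not spell out how hard negative samples enter training, so the snippet below is only a generic illustration, not the paper's loss. It assumes each image comes with one matching caption and K deliberately similar but wrong captions, and scores the match against those negatives with a softmax over cosine similarities. The function name, argument shapes, and the temperature value are all hypothetical.

```python
import numpy as np


def _l2_normalize(v):
    # Scale vectors to unit length so dot products become cosine similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def hard_negative_loss(image_emb, pos_text_emb, neg_text_emb, temperature=0.07):
    """Illustrative loss (assumption, not taken from the FG-CLIP paper).

    image_emb:    (B, D) image embeddings
    pos_text_emb: (B, D) embedding of the correct caption per image
    neg_text_emb: (B, K, D) embeddings of K hard negative captions per image
    """
    img = _l2_normalize(image_emb)
    pos = _l2_normalize(pos_text_emb)
    neg = _l2_normalize(neg_text_emb)

    pos_sim = np.sum(img * pos, axis=-1, keepdims=True)   # (B, 1)
    neg_sim = np.einsum("bd,bkd->bk", img, neg)           # (B, K)

    # Column 0 holds the correct caption; the rest are hard negatives.
    logits = np.concatenate([pos_sim, neg_sim], axis=1) / temperature

    # Numerically stable log-softmax over the K + 1 candidates.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the correct caption as the target class.
    return float(-log_probs[:, 0].mean())
```

The loss shrinks as the correct caption scores higher than its near-miss negatives, which is the pressure that pushes a model toward fine-grained distinctions.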

🔹 Publication Date: May 8, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.05071
• PDF: https://arxiv.org/pdf/2505.05071
• GitHub: https://github.com/360CVGroup/FG-CLIP

🔹 Models citing this paper:
https://huggingface.co/qihoo360/fg-clip2-base
https://huggingface.co/qihoo360/fg-clip-large
https://huggingface.co/qihoo360/fg-clip-base

🔹 Datasets citing this paper:
https://huggingface.co/datasets/qihoo360/FineHARD
https://huggingface.co/datasets/qihoo360/DCI-CN
https://huggingface.co/datasets/qihoo360/DOCCI-CN

🔹 Spaces citing this paper:
https://huggingface.co/spaces/qihoo360/FG-CLIP-Retrieval-demo
https://huggingface.co/spaces/qihoo360/FG-CLIP-Densefeature-demo
https://huggingface.co/spaces/qihoo360/FG-CLIP2-Retrieval-demo


#FGCLIP #FineGrainedAI #MultimodalLearning #ComputerVision #DeepLearning