✨FG-CLIP: Fine-Grained Visual and Textual Alignment
📝 Summary:
FG-CLIP enhances fine-grained multimodal understanding, overcoming CLIP's limitations with coarse captions. It uses large multimodal models to generate long captions, a high-quality dataset with region bounding boxes and detailed captions, and hard negative samples. FG-CLIP outperforms existing methods on both fine-grained and general benchmarks. A minimal code sketch of the hard-negative contrastive idea follows below.
🔹 Publication Date: Published on May 8, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.05071
• PDF: https://arxiv.org/pdf/2505.05071
• Github: https://github.com/360CVGroup/FG-CLIP
🔹 Models citing this paper:
• https://huggingface.co/qihoo360/fg-clip2-base
• https://huggingface.co/qihoo360/fg-clip-large
• https://huggingface.co/qihoo360/fg-clip-base
✨ Datasets citing this paper:
• https://huggingface.co/datasets/qihoo360/FineHARD
• https://huggingface.co/datasets/qihoo360/DCI-CN
• https://huggingface.co/datasets/qihoo360/DOCCI-CN
✨ Spaces citing this paper:
• https://huggingface.co/spaces/qihoo360/FG-CLIP-Retrieval-demo
• https://huggingface.co/spaces/qihoo360/FG-CLIP-Densefeature-demo
• https://huggingface.co/spaces/qihoo360/FG-CLIP2-Retrieval-demo
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#FGCLIP #FineGrainedAI #MultimodalLearning #ComputerVision #DeepLearning
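The hard-negative objective mentioned in the summary pairs each image (or region) with its correct caption plus a semantically close but incorrect one. Below is a minimal PyTorch sketch of that idea, assuming precomputed embeddings; the function name, tensor shapes, and the one-negative-per-image setup are illustrative assumptions, not the authors' implementation.
```python
import torch
import torch.nn.functional as F

def contrastive_loss_with_hard_negatives(img_emb, txt_emb, hard_neg_emb, temperature=0.07):
    """InfoNCE-style loss where each image also sees one hard-negative caption.

    img_emb:      (B, D) image or region embeddings
    txt_emb:      (B, D) matching caption embeddings
    hard_neg_emb: (B, D) semantically close but incorrect caption embeddings
    (Shapes and the single-negative setup are illustrative assumptions.)
    """
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    neg = F.normalize(hard_neg_emb, dim=-1)

    # In-batch similarities: (B, B); diagonal entries are the positives.
    logits = img @ txt.t() / temperature
    # Append each image's own hard negative as an extra column: (B, B+1).
    hard_logits = (img * neg).sum(dim=-1, keepdim=True) / temperature
    logits = torch.cat([logits, hard_logits], dim=1)

    targets = torch.arange(img.size(0), device=img.device)
    return F.cross_entropy(logits, targets)

# Usage with random tensors standing in for encoder outputs:
B, D = 8, 512
loss = contrastive_loss_with_hard_negatives(torch.randn(B, D), torch.randn(B, D), torch.randn(B, D))
```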
✨AIonopedia: an LLM agent orchestrating multimodal learning for ionic liquid discovery
📝 Summary:
AIonopedia is an LLM agent that orchestrates multimodal learning for ionic liquid (IL) discovery. It enables accurate property prediction and molecular design through hierarchical search, is validated by real-world wet-lab experiments, and significantly accelerates IL discovery. A generic code sketch of the hierarchical-search pattern follows below.
🔹 Publication Date: Published on Nov 14, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11257
• PDF: https://arxiv.org/pdf/2511.11257
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMAgents #IonicLiquids #MultimodalLearning #MaterialsScience #AIforScience
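The hierarchical search mentioned in the summary can be pictured as a coarse-to-fine loop: a generator proposes candidates, a property predictor scores them, and the best candidates seed the next, narrower round. The sketch below is a generic illustration of that pattern only; `propose_candidates` and `predict_property` are hypothetical stand-ins, not AIonopedia's actual components.
```python
import random

def propose_candidates(seeds, n, noise):
    """Hypothetical generator: perturb seed 'molecules' (here just floats)."""
    return [s + random.uniform(-noise, noise) for s in seeds for _ in range(n)]

def predict_property(candidate):
    """Hypothetical property predictor; a real one would be a trained model."""
    return -(candidate - 1.5) ** 2  # peak at 1.5 stands in for a target property

def hierarchical_search(init_seeds, rounds=3, beam=4):
    seeds = init_seeds
    for r in range(rounds):
        noise = 1.0 / (r + 1)                      # narrow the search each round
        pool = propose_candidates(seeds, n=10, noise=noise)
        pool.sort(key=predict_property, reverse=True)
        seeds = pool[:beam]                        # keep the best candidates as new seeds
    return seeds[0]

print(hierarchical_search([0.0, 3.0]))  # converges toward the hypothetical optimum 1.5
```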
✨Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data
📝 Summary:
Uni-MoE-2.0-Omni is an open-source omnimodal large model that improves multimodal understanding, reasoning, and generation. It uses dynamic mixture-of-experts (MoE) routing and progressive training to achieve state-of-the-art results across 85 benchmarks, outperforming leading models such as Qwen2.5-Omni. A minimal code sketch of top-k MoE routing follows below.
🔹 Publication Date: Published on Nov 16, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12609
• PDF: https://arxiv.org/pdf/2511.12609
• Project Page: https://idealistxy.github.io/Uni-MoE-v2.github.io/
• Github: https://github.com/HITsz-TMG/Uni-MoE
🔹 Models citing this paper:
• https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Omni
• https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Base
• https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Image
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#OmnimodalAI #LLMs #MixtureOfExperts #MultimodalLearning #AIResearch
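The dynamic mixture-of-experts design referenced above routes each token to a small subset of expert networks chosen by a learned gate. Below is a minimal top-k MoE layer in PyTorch; the layer sizes, expert count, and top-k routing are generic illustrative choices, not Uni-MoE-2.0's actual configuration.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative, not Uni-MoE's exact design)."""

    def __init__(self, dim=256, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)          # learned router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1) # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(16, 256)
print(TopKMoE()(tokens).shape)  # torch.Size([16, 256])
```
Routing only k of the experts per token keeps per-token compute roughly constant as the expert count grows, which is the usual motivation for scaling models with MoE layers.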