✨AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models
📝 Summary:
AffordBot couples multimodal large language models (MLLMs) with chain-of-thought reasoning for fine-grained 3D embodied reasoning. Given a task instruction, it predicts the location, motion type, and motion axis of the relevant affordance elements in a 3D scene, and it reaches state-of-the-art performance by projecting 3D elements into 2D views that image-based MLLMs can reason over (see the projection sketch after this post).
🔹 Publication Date: Published on Nov 13, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10017
• PDF: https://arxiv.org/pdf/2511.10017
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AffordBot #MLLM #EmbodiedAI #3DReasoning #Robotics
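A hedged sketch of the key step above: projecting 3D affordance elements into a 2D view so an image-based MLLM can reason about them. This is a minimal pinhole-camera illustration; the function and variable names are invented here, not taken from the paper's code.
```python
# Toy 3D-to-2D projection: map candidate 3D affordance element
# centers to pixel coordinates for a 2D MLLM. Illustrative only.
import numpy as np

def project_points(points_3d, K, R, t):
    """Project Nx3 world points to pixels with a pinhole camera.

    K: 3x3 intrinsics; R (3x3), t (3,): world-to-camera extrinsics.
    Returns Nx2 pixel coordinates and a mask of points with positive depth.
    """
    cam = points_3d @ R.T + t        # world frame -> camera frame
    in_front = cam[:, 2] > 1e-6      # drop points behind the camera
    pix = cam @ K.T                  # apply intrinsics
    pix = pix[:, :2] / pix[:, 2:3]   # perspective divide
    return pix, in_front

# Usage: one element 2 m straight ahead of an identity-pose camera.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
pix, ok = project_points(np.array([[0.0, 0.0, 2.0]]), K, np.eye(3), np.zeros(3))
print(pix[ok])  # [[320. 240.]] -- the image center
```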
✨Part-X-MLLM: Part-aware 3D Multimodal Large Language Model
📝 Summary:
Part-X-MLLM is a part-aware 3D multimodal large language model that unifies diverse 3D tasks by generating structured programs from RGB point clouds and language prompts. It outputs part-level data and edit commands, enabling state-of-the-art 3D generation and editing through a single interface (see the parser sketch after this post).
🔹 Publication Date: Published on Nov 17, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13647
• PDF: https://arxiv.org/pdf/2511.13647
• Project Page: https://chunshi.wang/Part-X-MLLM/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#3D #MLLM #GenerativeAI #ComputerVision #AIResearch
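To make the "structured program" interface above concrete, here is a hypothetical sketch of how such a program might be parsed into part-level operations. The grammar, field names, and commands below are invented for illustration; the paper's actual token format differs.
```python
# Toy parser for an MLLM-emitted "structured program" of part-level
# operations. The grammar here is illustrative, not Part-X-MLLM's.
from dataclasses import dataclass

@dataclass
class PartOp:
    op: str     # e.g. "add_part" or "edit_part"
    name: str   # part identifier
    args: dict  # op arguments, e.g. a bounding box or an edit prompt

def parse_program(program: str) -> list:
    """Parse lines like 'add_part seat bbox=...' into PartOp records."""
    ops = []
    for line in program.strip().splitlines():
        op, name, *rest = line.split()
        args = dict(kv.split("=", 1) for kv in rest)
        ops.append(PartOp(op, name, args))
    return ops

demo = """
add_part seat bbox=0.0,0.0,0.0,0.4,0.4,0.1
edit_part seat prompt=make_it_rounder
"""
for step in parse_program(demo):
    print(step)
```
A single parsed command stream like this is what would let one interface drive both generation and editing.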
✨MicroVQA++: High-Quality Microscopy Reasoning Dataset with Weakly Supervised Graphs for Multimodal Large Language Model
📝 Summary:
MicroVQA++ is a new high-quality microscopy visual question answering (VQA) dataset built via a three-stage pipeline. A key stage is HiCQA-Graph, a novel weakly supervised filtering method that combines NLI, CLIP, and MLLM signals. The resulting dataset enables strong microscopy reasoning performance for MLLMs (see the filtering sketch after this post).
🔹 Publication Date: Published on Nov 14, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11407
• PDF: https://arxiv.org/pdf/2511.11407
• GitHub: https://github.com/ieellee/MicroVQA-PlusPlus
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MLLM #Microscopy #VQA #AIResearch #Dataset
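A hedged sketch of the filtering idea behind HiCQA-Graph: each (image, caption, QA) sample is scored by an NLI model, CLIP, and an MLLM judge, and weakly consistent samples are dropped. The real method reasons over a graph connecting these signals; the sketch below collapses that to a per-sample gate with made-up scores and thresholds.
```python
# Toy consistency filter combining NLI, CLIP, and MLLM-judge scores.
# Stand-in for HiCQA-Graph; models, scores, and threshold are invented.
from dataclasses import dataclass

@dataclass
class Sample:
    image_id: str
    caption: str
    question: str
    answer: str

def keep_sample(nli: float, clip: float, mllm: float,
                threshold: float = 0.5) -> bool:
    """Keep a sample only if every signal rates it consistent.

    nli:  does the caption entail the QA pair? (0..1)
    clip: image-text similarity, rescaled to 0..1
    mllm: MLLM judge's consistency rating (0..1)
    """
    return min(nli, clip, mllm) >= threshold

def filter_dataset(samples, scores):
    """scores maps image_id -> a (nli, clip, mllm) tuple."""
    return [s for s in samples if keep_sample(*scores[s.image_id])]

demo = [Sample("img1", "a mitotic cell in metaphase",
               "What mitotic phase is shown?", "metaphase")]
print(filter_dataset(demo, {"img1": (0.9, 0.8, 0.7)}))  # sample kept
```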