ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

10 Feb 2025 · Yangguang Li, Zi-Xin Zou, Zexiang Liu, Dehu Wang, Yuan Liang, Zhipeng Yu, Xingchao Liu, Yuan-Chen Guo, Ding Liang, Wanli Ouyang, Yan-Pei Cao

Recent advancements in diffusion techniques have propelled image and video generation to unprecedented levels of quality, significantly accelerating the deployment and application of generative AI. However, 3D shape generation technology has so far lagged behind, constrained by limitations in 3D data scale, complexity of 3D data processing, and insufficient exploration of advanced techniques in the 3D domain. Current approaches to 3D shape generation face substantial challenges in terms of output quality, generalization capability, and alignment with input conditions. We present TripoSG, a new streamlined shape diffusion paradigm capable of generating high-fidelity 3D meshes with precise correspondence to input images. Specifically, we propose: 1) A large-scale rectified flow transformer for 3D shape generation, achieving state-of-the-art fidelity through training on extensive, high-quality data. 2) A hybrid supervised training strategy combining SDF, normal, and eikonal losses for 3D VAE, achieving high-quality 3D reconstruction performance. 3) A data processing pipeline to generate 2 million high-quality 3D samples, highlighting the crucial rules for data quality and quantity in training 3D generative models. Through comprehensive experiments, we have validated the effectiveness of each component in our new framework. The seamless integration of these parts has enabled TripoSG to achieve state-of-the-art performance in 3D shape generation. The resulting 3D shapes exhibit enhanced detail due to high-resolution capabilities and demonstrate exceptional fidelity to input images. Moreover, TripoSG demonstrates improved versatility in generating 3D models from diverse image styles and contents, showcasing strong generalization capabilities. To foster progress and innovation in the field of 3D generation, we will make our model publicly available.
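
The abstract names an eikonal loss among the VAE training terms but does not give its form. As a minimal sketch, assuming the standard eikonal regularizer (penalize the gap between the SDF gradient norm and 1), the idea can be illustrated in plain Python. The sphere SDF, the sample points, and the finite-difference gradient are illustrative assumptions, not TripoSG code; a real training loop would use automatic differentiation on a learned network.

```python
import math

def sphere_sdf(p, radius=1.0):
    """Exact signed distance from point p to a sphere centered at the origin."""
    return math.sqrt(sum(c * c for c in p)) - radius

def grad_norm(f, p, h=1e-4):
    """Central finite-difference estimate of |grad f| at point p."""
    sq = 0.0
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        d = (f(hi) - f(lo)) / (2 * h)
        sq += d * d
    return math.sqrt(sq)

def eikonal_loss(f, points):
    """Mean squared deviation of the gradient norm from 1 over the sample points."""
    return sum((grad_norm(f, p) - 1.0) ** 2 for p in points) / len(points)

# Hypothetical sample points away from the origin, where the sphere SDF is smooth.
points = [(0.3, 0.2, -0.5), (1.5, 0.0, 0.7), (-0.8, 1.1, 0.4)]

# A true SDF has unit gradient norm, so its eikonal loss is close to zero.
true_loss = eikonal_loss(sphere_sdf, points)

# Scaling the field by 2 breaks the distance property; the gradient norm becomes 2.
scaled_loss = eikonal_loss(lambda p: 2.0 * sphere_sdf(p), points)
```

The scaled field has the same zero level set, so a loss on surface points alone would not tell the two fields apart; the eikonal term is what pushes the field toward a true distance function.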


Paper: https://arxiv.org/pdf/2502.06608v3.pdf

Codes:
https://github.com/VAST-AI-Research/TripoSG
https://github.com/tencent/flashvdm

Dataset: 100poisonMpts

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #LLM #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek #RAG #Agents #GPT4

https://t.iss.one/DataScienceT
🤖🧠 Ling-1T by inclusionAI: The Future of Smarter, Faster and More Efficient AI Models

🗓️ 09 Oct 2025
📚 AI News & Trends

Artificial Intelligence is evolving at lightning speed, and inclusionAI’s Ling-1T is one of the most exciting innovations leading the charge. Built on the advanced Ling 2.0 architecture, Ling-1T is a trillion-parameter model designed to combine incredible reasoning power, speed and scalability in one open-source system. Unlike many AI models that ...

#Ling1T #inclusionAI #ArtificialIntelligence #OpenSourceAI #LargeLanguageModels #AIArchitecture
🤖🧠 Quivr AI: Building Your Second Brain with Open-Source Generative Intelligence

🗓️ 12 Oct 2025
📚 AI News & Trends

In the rapidly evolving landscape of artificial intelligence, developers and businesses are seeking solutions that merge flexibility, power, and simplicity. Enter Quivr — an open-source framework designed to help you build your own “second brain” powered by Generative AI. Whether you’re an indie developer, startup founder or enterprise engineer, it makes it possible to integrate ...

#QuivrAI #SecondBrain #GenerativeAI #OpenSourceAI #AIFramework #AIProductivity
🤖🧠 HunyuanWorld-Mirror: Tencent’s Breakthrough in Universal 3D Reconstruction

🗓️ 03 Nov 2025
📚 AI News & Trends

The race toward achieving universal 3D understanding has reached a significant milestone with Tencent’s HunyuanWorld-Mirror, a cutting-edge open-source model designed to revolutionize 3D reconstruction. In an era dominated by visual intelligence and immersive digital experiences, this new model stands out by offering a feed-forward, geometry-aware framework that can predict multiple 3D outputs in a single ...

#HunyuanWorld #Tencent #3DReconstruction #UniversalAI #GeometryAware #OpenSourceAI
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

📝 Summary:
WebSailor is a post-training method that teaches open-source AI models to systematically reduce uncertainty in complex information-seeking tasks. Using synthetic high-uncertainty tasks and an RL algorithm, it enables open-source agents to match the performance of proprietary systems.

🔹 Publication Date: Published on Sep 16, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13305
• PDF: https://arxiv.org/pdf/2509.13305
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #ReinforcementLearning #OpenSourceAI #AIAgents #MachineLearning
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

📝 Summary:
InternVL3 introduces a native multimodal pre-training paradigm, jointly learning from multimodal and text data to overcome conventional alignment challenges. This unified approach, combined with advanced techniques, achieves state-of-the-art performance on multimodal tasks, rivaling proprietary m...

🔹 Publication Date: Published on Apr 14, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.10479
• PDF: https://arxiv.org/pdf/2504.10479
• Project Page: https://internvl.github.io/blog/2025-04-11-InternVL-3.0/

🔹 Models citing this paper:
https://huggingface.co/OpenGVLab/InternVL3-78B
https://huggingface.co/OpenGVLab/InternVL3_5-241B-A28B
https://huggingface.co/OpenGVLab/InternVL3-8B

Datasets citing this paper:
https://huggingface.co/datasets/OpenGVLab/MMPR-v1.2-prompts

Spaces citing this paper:
https://huggingface.co/spaces/AntResearchNLP/ViLaBench
https://huggingface.co/spaces/TIGER-Lab/MEGA-Bench
https://huggingface.co/spaces/prithivMLmods/Tiny-VLMs-Lab

#MultimodalAI #DeepLearning #AIResearch #OpenSourceAI #GenerativeAI
🤖🧠 vLLM Semantic Router: The Next Frontier in Intelligent Model Routing for LLMs

🗓️ 11 Nov 2025
📚 AI News & Trends

As large language models (LLMs) continue to evolve, organizations face new challenges in optimizing performance, accuracy and cost across various AI workloads. Running multiple models efficiently, each specialized for specific tasks, has become essential for scalable AI deployment. Enter vLLM Semantic Router, an open-source innovation that introduces a new layer of intelligence to the ...

#vLLMSemanticRouter #LargeLanguageModels #AIScaling #ModelRouting #OpenSourceAI #LLMOptimization
🤖🧠 Plandex AI: The Future of Autonomous Coding Agents for Large-Scale Development

🗓️ 11 Nov 2025
📚 AI News & Trends

As software development becomes increasingly complex, developers are turning to AI tools that can manage, understand and automate large portions of the coding workflow. Among the most promising innovations in this space is Plandex AI, an open-source terminal-based coding agent designed for real-world, large-scale projects. Unlike simple AI coding assistants that handle small snippets, Plandex ...

#PlandexAI #AutonomousCoding #LargeScaleDevelopment #AICoding #OpenSourceAI #CodeAutomation
🤖🧠 Bytebot: The Future of AI Desktop Automation

🗓️ 12 Nov 2025
📚 AI News & Trends

In the era of rapid digital transformation, automation is the driving force behind business efficiency and innovation. While most AI agents are limited to browsers or APIs, a groundbreaking open-source project called Bytebot has redefined what AI can achieve. Bytebot introduces a self-hosted AI desktop agent — a virtual computer that performs complex, multi-step tasks ...

#Bytebot #AIDesktopAutomation #SelfHostedAI #OpenSourceAI #AIAgents #TaskAutomation
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

📝 Summary:
MiroThinker v1.0 is an open-source research agent introducing 'interactive scaling.' It trains models with reinforcement learning for deeper agent-environment interactions, performing up to 600 tool calls per task. This achieves state-of-the-art performance and establishes interaction depth as a ...

🔹 Publication Date: Published on Nov 14, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11793
• PDF: https://arxiv.org/pdf/2511.11793
• Project Page: https://dr.miromind.ai/
• Github: https://github.com/MiroMindAI/MiroThinker

🔹 Models citing this paper:
https://huggingface.co/miromind-ai/MiroThinker-v1.0-72B
https://huggingface.co/miromind-ai/MiroThinker-v1.0-8B
https://huggingface.co/miromind-ai/MiroThinker-v1.0-30B

#MiroThinker #ResearchAgents #ReinforcementLearning #OpenSourceAI #LLM
Mobile-Agent-v3: Fundamental Agents for GUI Automation

📝 Summary:
GUI-Owl and Mobile-Agent-v3 are open-source GUI agent models achieving state-of-the-art performance on GUI benchmarks. GUI-Owl introduces large-scale environment infrastructure, diverse agent capabilities, and scalable reinforcement learning, with Mobile-Agent-v3 further improving these results.

🔹 Publication Date: Published on Aug 21, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.15144
• PDF: https://arxiv.org/pdf/2508.15144
• Project Page: https://github.com/X-PLUG/MobileAgent
• Github: https://github.com/X-PLUG/MobileAgent

🔹 Models citing this paper:
https://huggingface.co/mPLUG/GUI-Owl-7B
https://huggingface.co/mPLUG/GUI-Owl-32B
https://huggingface.co/mPLUG/GUI-Owl-7B-Desktop-RL

#GUIAgent #Automation #ReinforcementLearning #AIResearch #OpenSourceAI
Scaling Open-Ended Reasoning to Predict the Future

📝 Summary:
This work trains language models for open-ended future prediction using a new dataset synthesized from news. The resulting OpenForecaster 8B model matches larger proprietary models in accuracy, calibration, and consistency. All resources are open-sourced.
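
The summary mentions calibration without saying how it is measured. As a rough illustration only, one common way to score probabilistic forecasts is the Brier score (mean squared error between predicted probabilities and 0/1 outcomes, lower is better). Using it here is an assumption; the paper's own metrics are not stated in this post, and the forecasts below are made-up numbers.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and binary outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Hypothetical resolved questions: 1 means the event happened, 0 means it did not.
outcomes = [1, 1, 0]

# A forecaster that is confident and right scores well.
confident = brier_score([0.9, 0.8, 0.1], outcomes)

# A forecaster that always answers 50% scores 0.25 whatever the outcomes are.
hedging = brier_score([0.5, 0.5, 0.5], outcomes)
```

Because the score is computed per question and averaged, it rewards forecasts that are both accurate and appropriately confident.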

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.25070
• PDF: https://arxiv.org/pdf/2512.25070
• Project Page: https://www.openforecaster.github.io
• Github: https://github.com/OpenForecaster/scaling-forecasting-training

#LLMs #FuturePrediction #AI #OpenSourceAI #MachineLearning
BitNet b1.58 2B4T Technical Report

📝 Summary:
BitNet b1.58 2B4T is the first open-source, native 1-bit Large Language Model at the 2-billion-parameter scale. It performs on par with full-precision LLMs of similar size while offering significant gains in computational efficiency, such as reduced memory footprint and energy use. The model weights are openly released for research.
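
The "b1.58" in the name refers to ternary weights: each weight takes one of three values, which carries about log2(3) ≈ 1.58 bits. A minimal sketch of absmean ternary quantization, the scheme described for BitNet b1.58, is shown below. The function name, the toy matrix, and the pure-Python lists are illustrative assumptions; this is not the released implementation, which applies quantization during training rather than as a one-off conversion.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a 2D weight matrix to {-1, 0, 1} using the mean absolute value as scale."""
    flat = [w for row in weights for w in row]
    scale = sum(abs(w) for w in flat) / len(flat)
    quantized = [
        # Divide by the scale, round to the nearest integer, then clip to [-1, 1].
        [max(-1, min(1, round(w / (scale + eps)))) for w in row]
        for row in weights
    ]
    return quantized, scale

# Hypothetical full-precision weights.
W = [[0.9, -0.05, -1.3],
     [0.2, 0.6, -0.7]]

Wq, scale = absmean_ternary(W)
# Wq holds only -1, 0, and 1; multiplying by scale gives a coarse reconstruction of W.
```

With ternary weights, a matrix multiply reduces to additions and subtractions of activations plus one final scaling, which is where the memory and energy savings come from.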

🔹 Publication Date: Published on Apr 16, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.12285
• PDF: https://arxiv.org/pdf/2504.12285
• Github: https://github.com/microsoft/bitnet

🔹 Models citing this paper:
https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf
https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16

Spaces citing this paper:
https://huggingface.co/spaces/suayptalha/Chat-with-Bitnet-b1.58-2B-4T
https://huggingface.co/spaces/aizip-dev/SLM-RAG-Arena
https://huggingface.co/spaces/Tonic/Native_1-bit_LLM

#LLM #AI #Quantization #OpenSourceAI #DeepLearning