🔹 Title: NVSpeech: An Integrated and Scalable Pipeline for Human-Like Speech Modeling with Paralinguistic Vocalizations
🔹 Publication Date: Published on Aug 6
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.04195
• PDF: https://arxiv.org/pdf/2508.04195
• Project Page: https://nvspeech170k.github.io/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
🔹 Title: Adversarial Video Promotion Against Text-to-Video Retrieval
🔹 Publication Date: Published on Aug 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.06964
• PDF: https://arxiv.org/pdf/2508.06964
• Github: https://github.com/michaeltian108/ViPro
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: WGAST: Weakly-Supervised Generative Network for Daily 10 m Land Surface Temperature Estimation via Spatio-Temporal Fusion
🔹 Publication Date: Published on Aug 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.06485
• PDF: https://arxiv.org/pdf/2508.06485
• Github: https://github.com/Sofianebouaziz1/WGAST
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay
🔹 Publication Date: Published on Aug 6
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.04676
• PDF: https://arxiv.org/pdf/2508.04676
• Github: https://github.com/Qznan/GeRe
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability
🔹 Publication Date: Published on Aug 6
🔹 AI-generated summary: The ISEval framework evaluates large multimodal models' ability to detect flawed inputs, revealing challenges in identifying certain types of errors and modality-specific biases.
🔹 Abstract: Large Multimodal Models (LMMs) have witnessed remarkable growth, showcasing formidable capabilities in handling intricate multimodal tasks with exceptional performance. Recent research has underscored the inclination of large language models to passively accept defective inputs, often resulting in futile reasoning on invalid prompts. However, the critical question of whether LMMs can actively detect and scrutinize erroneous inputs remains unexplored. To address this gap, we introduce the Input Scrutiny Ability Evaluation Framework (ISEval), which encompasses seven categories of flawed premises and three evaluation metrics. Our extensive evaluation of ten advanced LMMs has identified key findings. Most models struggle to actively detect flawed textual premises without guidance, which reflects a strong reliance on explicit prompts for premise error identification. Error type affects performance: models excel at identifying logical fallacies but struggle with surface-level linguistic errors and certain conditional flaws. Modality trust varies: Gemini 2.5 Pro and Claude Sonnet 4 balance visual and textual information, while aya-vision-8b over-relies on text in conflicts. These findings underscore the urgent need to enhance LMMs' proactive verification of input validity and offer new insights into mitigating the problem. The code is available at https://github.com/MLGroupJLU/LMM_ISEval.
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.04017
• PDF: https://arxiv.org/pdf/2508.04017
• Github: https://github.com/MLGroupJLU/LMM_ISEval
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: DeCRED: Decoder-Centric Regularization for Encoder-Decoder Based Speech Recognition
🔹 Publication Date: Published on Aug 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.08938
• PDF: https://arxiv.org/pdf/2508.08938
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors
🔹 Publication Date: Published on Aug 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.08896
• PDF: https://arxiv.org/pdf/2508.08896
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: BiasGym: Fantastic Biases and How to Find (and Remove) Them
🔹 Publication Date: Published on Aug 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.08855
• PDF: https://arxiv.org/pdf/2508.08855
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Multi-human Interactive Talking Dataset
🔹 Publication Date: Published on Aug 5
🔹 AI-generated summary: MIT, a large-scale dataset for multi-human talking video generation, includes fine-grained annotations and is used to demonstrate CovOG, a baseline model integrating a Multi-Human Pose Encoder and an Interactive Audio Driver.
🔹 Abstract: Existing studies on talking video generation have predominantly focused on single-person monologues or isolated facial animations, limiting their applicability to realistic multi-human interactions. To bridge this gap, we introduce MIT, a large-scale dataset specifically designed for multi-human talking video generation. To this end, we develop an automatic pipeline that collects and annotates multi-person conversational videos. The resulting dataset comprises 12 hours of high-resolution footage, each featuring two to four speakers, with fine-grained annotations of body poses and speech interactions. It captures natural conversational dynamics in multi-speaker scenarios, offering a rich resource for studying interactive visual behaviors. To demonstrate the potential of MIT, we further propose CovOG, a baseline model for this novel task. It integrates a Multi-Human Pose Encoder (MPE) to handle varying numbers of speakers by aggregating individual pose embeddings, and an Interactive Audio Driver (IAD) to modulate head dynamics based on speaker-specific audio features. Together, these components showcase the feasibility and challenges of generating realistic multi-human talking videos, establishing MIT as a valuable benchmark for future research. The code is available at: https://github.com/showlab/Multi-human-Talking-Video-Dataset.
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03050
• PDF: https://arxiv.org/pdf/2508.03050
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Democratizing Diplomacy: A Harness for Evaluating Any Large Language Model on Full-Press Diplomacy
🔹 Publication Date: Published on Aug 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.07485
• PDF: https://arxiv.org/pdf/2508.07485
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation
🔹 Publication Date: Published on Aug 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.08680
• PDF: https://arxiv.org/pdf/2508.08680
• Github: https://github.com/ArmelRandy/topxgen
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/almanach/topxgen-gemma-3-27b-and-nllb-3.3b
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Optimization-Free Style Transfer for 3D Gaussian Splats
🔹 Publication Date: Published on Aug 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.05813
• PDF: https://arxiv.org/pdf/2508.05813
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Improving Masked Style Transfer using Blended Partial Convolution
🔹 Publication Date: Published on Aug 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.05769
• PDF: https://arxiv.org/pdf/2508.05769
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Technical Report: Full-Stack Fine-Tuning for the Q Programming Language
🔹 Publication Date: Published on Aug 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.06813
• PDF: https://arxiv.org/pdf/2508.06813
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/morganstanley/qqWEN-overview
==================================
🔹 Title: Complex Logical Instruction Generation
🔹 Publication Date: Published on Aug 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.09125
• PDF: https://arxiv.org/pdf/2508.09125
• Github: https://github.com/mianzhang/LogicIF
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: RedDino: A foundation model for red blood cell analysis
🔹 Publication Date: Published on Aug 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.08180
• PDF: https://arxiv.org/pdf/2508.08180
• Github: https://github.com/Snarci/RedDino
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Text-conditioned State Space Model For Domain-generalized Change Detection Visual Question Answering
🔹 Publication Date: Published on Aug 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.08974
• PDF: https://arxiv.org/pdf/2508.08974
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants
🔹 Publication Date: Published on Aug 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03936
• PDF: https://arxiv.org/pdf/2508.03936
• Project Page: https://purcl.github.io/astra-web/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Putnam-AXIOM: A Functional and Static Benchmark
🔹 Publication Date: Published on Aug 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.08292
• PDF: https://arxiv.org/pdf/2508.08292
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================