🌟 MiraData: Large, long-duration video dataset with structured annotations.

When training generative models, the training dataset plays an important role in the quality of reference of ready-made models.
One of the good sources can be MiraData from Tencent - a ready-made dataset with a total video duration of 16 thousand hours, designed for training models for generating text in videos. It includes long videos (average 72.1 seconds) with high motion intensity and detailed structured annotations (average 318 words per video).

To assess the quality of the dataset, a system of MiraBench benchmarks was even specially created, consisting of 17 metrics that evaluate temporal consistency, movement in the frame, video quality, and other parameters. According to their results, MiroData outperforms other well-known datasets available in open sources, which mainly consist of short videos with floating quality and short descriptions.

🟡 Project page
🟡 Arxiv
🤗 Hugging Face
🖥 GitHub

#Text2Video #Dataset #ML