Big Data Science
@bdscience
19.07.2024 15:59
💡 Large video dataset with long duration and structured annotations

Tencent's MiraData is a ready-to-use dataset with a total video duration of 16 thousand hours, designed for training text-to-video generation models. It contains long videos (72.1 seconds on average) with high motion intensity and detailed structured annotations (318 words per video on average).

To evaluate the dataset's quality, the authors also built a dedicated benchmark, MiraBench, with 17 metrics covering temporal consistency, in-frame motion, video quality, and other properties. According to their results, MiraData outperforms other publicly available datasets, which mostly consist of short videos of inconsistent quality with brief descriptions.
GitHub
GitHub - mira-space/MiraData: Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
