💡 Large video dataset with long duration and structured annotations
Tencent's MiraData is a ready-to-use dataset with a total video duration of 16 thousand hours, designed for training text-to-video generation models. It consists of long videos (72.1 seconds on average) with high motion intensity and detailed structured annotations (318 words per video on average).
To evaluate the quality of the dataset, a dedicated benchmark, MiraBench, was also created, with 17 metrics that assess temporal consistency, the amount of motion in the frame, video quality, and other properties. According to its results, MiraData outperforms other well-known openly available datasets, which mostly consist of short videos of inconsistent quality with brief descriptions.