Big Data Science
@bdscience
12.04.2024 20:59
😎📊 Data used to train the MA-LMM model
MA-LMM (Memory-Augmented Large Multimodal Model) is a large multimodal model with a memory bank, designed for understanding long videos.
The model can work with long context while using significantly less GPU memory. Most existing models try to process more frames at once; MA-LMM instead processes the video online, frame by frame, and stores information about past frames in a memory bank.
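To make the idea concrete, here is a minimal sketch of online processing with a bounded memory bank. It is an illustration only, not the code from the MA-LMM repository: the names `MemoryBank` and `process_video`, the `capacity` parameter, and the rule of merging the two most similar neighbouring entries when the bank overflows are all assumptions made for this example, and plain Python lists stand in for real frame features.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryBank:
    """Hypothetical bounded store of past frame features."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.features = []

    def add(self, feature):
        # Store the new frame, then shrink back to capacity if needed.
        self.features.append(list(feature))
        if len(self.features) > self.capacity:
            self._compress()

    def _compress(self):
        # Assumed rule: average the most similar pair of adjacent entries,
        # so temporal order is preserved and the length drops by one.
        sims = [
            cosine(self.features[i], self.features[i + 1])
            for i in range(len(self.features) - 1)
        ]
        i = max(range(len(sims)), key=sims.__getitem__)
        merged = [(x + y) / 2 for x, y in zip(self.features[i], self.features[i + 1])]
        self.features[i:i + 2] = [merged]


def process_video(frames, capacity=4):
    """Feed frames one at a time; memory never exceeds `capacity` entries."""
    bank = MemoryBank(capacity)
    for frame in frames:
        bank.add(frame)
    return bank.features
```

For example, `process_video(frames, capacity=3)` keeps at most three entries no matter how many frames arrive, which is why memory use stays flat as the video gets longer.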
The data used to train the model has been made publicly available. It consists of 2 very large datasets, which can be downloaded via the link below.
GitHub
GitHub - boheumd/MA-LMM: (2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding