Beepsandbreakthroughs
multimodal
A model that processes multiple types of input data, usually vision (images/video from cameras) and language (task descriptions, goals, or human instructions).