VST Collection A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities. • 6 items • Updated Feb 1 • 6
MolmoAct Data Mixture Collection All datasets for the MolmoAct (Multimodal Open Language Model for Action) release. • 4 items • Updated Dec 23, 2025 • 18
MolmoAct Collection All models for the MolmoAct (Multimodal Open Language Model for Action) release. • 10 items • Updated Dec 23, 2025 • 35
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 489
Cohere Labs Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated Jul 31, 2025 • 72
Cosmos Collection ⚠️ This collection is archived. https://huggingface.co/collections/nvidia/nvidia-cosmos-2 • 14 items • Updated 1 day ago • 300
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 658
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Dec 23, 2025 • 309
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions Paper • 2309.10150 • Published Sep 18, 2023 • 26
Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping Paper • 2309.07970 • Published Sep 14, 2023 • 8
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control Paper • 2307.00117 • Published Jun 30, 2023 • 6