🚀 ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum: cold-start pre-training, multimodal reinforcement ...
Shawn
csfufu
AI & ML interests
None yet
Recent Activity
upvoted a paper 22 days ago
Large-Scale Terminal Agentic Trajectory Generation from Dockerized Environments
upvoted a paper about 1 month ago
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models
upvoted a paper about 1 month ago
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models