Perceptual-Evidence Anchored Reinforced Learning for Multimodal Reasoning
Zhangchi
Rex1090
AI & ML interests
None yet
Recent Activity
updated a model about 1 month ago
Rex1090/PEARL-8B
upvoted a paper about 1 month ago
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR
updated a dataset about 2 months ago
Rex1090/test
Organizations
None yet