MJ-Bench-Team

community

https://mj-bench.github.io

Recent Activity

Zhaorun submitted a paper 25 days ago

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

zhuokai authored a paper about 2 months ago

Synthetic Sandbox for Training Machine Learning Engineering Agents

zhuokai authored a paper 5 months ago

Preference Optimization with Multi-Sample Comparisons

submitted a paper to Daily Papers 25 days ago

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

Paper • 2605.04808 • Published 30 days ago • 20

authored a paper about 2 months ago

Synthetic Sandbox for Training Machine Learning Engineering Agents

Paper • 2604.04872 • Published Apr 6 • 14

authored 2 papers 5 months ago

Preference Optimization with Multi-Sample Comparisons

Paper • 2410.12138 • Published Oct 16, 2024

Token-Level LLM Collaboration via FusionRoute

Paper • 2601.05106 • Published Jan 8 • 40

authored 11 papers 7 months ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published Nov 5, 2025 • 83

From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding

Paper • 2412.06474 • Published Dec 9, 2024

Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Paper • 2501.09620 • Published Jan 16, 2025

Transfer between Modalities with MetaQueries

Paper • 2504.06256 • Published Apr 8, 2025 • 2

S'MoRE: Structural Mixture of Residual Experts for LLM Fine-tuning

Paper • 2504.06426 • Published Apr 8, 2025 • 2

CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning

Paper • 2503.19900 • Published Mar 25, 2025

Boosting LLM Reasoning via Spontaneous Self-Correction

Paper • 2506.06923 • Published Jun 7, 2025

RecoWorld: Building Simulated Environments for Agentic Recommender Systems

Paper • 2509.10397 • Published Sep 12, 2025 • 7

StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding

Paper • 2508.15717 • Published Aug 21, 2025 • 1

Let it Calm: Exploratory Annealed Decoding for Verifiable Reinforcement Learning

Paper • 2510.05251 • Published Oct 6, 2025 • 8

Thought Communication in Multiagent Collaboration

Paper • 2510.20733 • Published Oct 23, 2025 • 15

published and updated a dataset 7 months ago

MJ-Bench/MJ-Bench

Viewer • Updated Oct 23, 2025 • 7.56k • 81

authored a paper 10 months ago

Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models

Paper • 2508.02120 • Published Aug 4, 2025 • 20

authored a paper about 1 year ago

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Paper • 2503.07597 • Published Mar 10, 2025 • 2

updated a Space over 1 year ago

README