
Pasquale Minervini

pminervini
https://www.neuralnoise.com
  • pminervini
  • pminervini
  • pasquale-minervini-phd-47a08324
  • neuralnoise.com

AI & ML interests

NLP, ML, AI

Recent Activity

authored a paper 6 days ago
VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models
authored a paper 6 days ago
SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks
upvoted a paper 7 days ago
VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models

Organizations

  • BigScience Workshop
  • NLP @ University of Edinburgh
  • ChatArena
  • EdinburghNLP - Natural Language Processing Group at the University of Edinburgh
  • Open Life Science AI
  • Ping Nie
  • hallucinations-leaderboard
  • Miniml
  • LLMAccountability
  • Edinburgh Dataset Analytics Working Group
  • OpenBox
  • Poster Summarization
  • LEMUR Decoding
  • DateTimeReasoning
  • Inverse Scaling
  • ML intern explorers

liked a dataset over 1 year ago

edinburgh-dawg/mmlu-redux-2.0

Viewer • Updated Feb 25, 2025 • 5.7k • 16.6k • 37
liked 3 Spaces about 2 years ago
Open Ita Llm Leaderboard 🏆
Running on CPU Upgrade • 84
Track, rank and evaluate open LLMs in the Italian language!

Open CoT Leaderboard 🥇
Running on CPU Upgrade • 60
Track, rank and evaluate open LLMs' CoT quality

Hallucinations Leaderboard 🔥
Runtime error • 145
View and submit LLM evaluations

liked 2 Spaces over 2 years ago
Example Leaderboard Template 🥇
Runtime error • 16
Duplicate this leaderboard to initialize your own!

Open LLM Leaderboard 🏆
Running on CPU Upgrade • 14k
Track, rank and evaluate open LLMs and chatbots
