Prompting and multiple-choice question (MCQ) benchmarks for evaluating the reasoning capabilities of LLMs using the game Mastermind.
Papers
FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition
FLERT: Document-Level Features for Named Entity Recognition