The Pedagogy Benchmark
AI models do well on student exams, but do they know about pedagogy and how to help students learn? We built The Pedagogy Benchmark to test whether models can pass teacher exams. For comparison, we also show results on the MMLU benchmark, which tests models on student exam questions. Each percentage is the share of questions a model answered correctly. Find out more here.
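To make the percentage scores concrete, here is a minimal Python sketch of how a score like this could be computed, assuming each question has a single correct answer and a model's reply is marked either right or wrong. The function name `accuracy_percent` and the sample answers are made up for illustration; they are not taken from the benchmark itself.

```python
from typing import Sequence


def accuracy_percent(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of questions answered correctly.

    Hypothetical helper: assumes one prediction per question and
    exact-match scoring against a single correct answer.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Count the positions where the model's choice matches the answer key.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Made-up example: four multiple-choice questions, three answered correctly.
model_choices = ["A", "C", "B", "D"]
answer_key = ["A", "B", "B", "D"]
print(f"{accuracy_percent(model_choices, answer_key):.1f}%")
```

Under these assumptions a model's score on the teacher exams and its score on MMLU are computed the same way, which is what makes the two percentages comparable side by side.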