OCPSG Benchmarking LLMs is a project for benchmark development, documentation, and reproducible evaluation workflows for Large Language Models (LLMs) and fine-tuned models, applied to multilingual policy-agenda annotation of parliamentary speeches.
The project is developed within the Oxford Computational Political Science Group (OCPSG), a research initiative supported by Oxford’s Department of Politics and International Relations.