beyondbench added to PyPI

BeyondBench: Contamination-Resistant Evaluation of Reasoning in Language Models

Latest News
Date      Update
Feb 2026  v0.0.1 released: 44 tasks, 117 variations, 101+ models
Jan 2026  Paper accepted at ICLR 2026… [+7923 chars]