"frontier ai systems" Papers
2 papers found
Conference
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.
NEURIPS 2025, arXiv:2412.01784
7 citations
Position: Require Frontier AI Labs To Release Small "Analog" Models
Shriyash Upadhyay, Philip Quirke, Narmeen Oozeer et al.
NEURIPS 2025