"scalable oversight" Papers
3 papers found
Preference Learning with Lie Detectors can Induce Honesty or Evasion
Chris Cundy, Adam Gleave
NeurIPS 2025, arXiv:2505.13787
4 citations
Scaling Laws For Scalable Oversight
Joshua Engels, David Baek, Subhash Kantamneni et al.
NeurIPS 2025 (spotlight), arXiv:2504.18530
4 citations
Assessing Large Language Models on Climate Information
Jannis Bulian, Mike Schäfer, Afra Amini et al.
ICML 2024, arXiv:2310.02932
34 citations