He He
6 papers, 286 total citations
Papers (6)
Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples
NeurIPS 2023, arXiv, 124 citations
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
ICML 2024, arXiv, 79 citations
Language Models Learn to Mislead Humans via RLHF
ICLR 2025, arXiv, 78 citations
Predicting Empirical AI Research Outcomes with Language Models
NeurIPS 2025, arXiv, 5 citations
IRM—when it works and when it doesn't: A test case of natural language inference
NeurIPS 2021, 0 citations
SeqPATE: Differentially Private Text Generation via Knowledge Distillation
NeurIPS 2022, 0 citations