Satyapriya Krishna
Affiliations
Harvard University
5 papers, 287 total citations
Papers (5)
OpenXAI: Towards a Transparent Evaluation of Model Explanations
NeurIPS 2022 (arXiv), 176 citations
Post Hoc Explanations of Language Models Can Improve Language Models
NeurIPS 2023 (arXiv), 76 citations
Understanding the Effects of Iterative Prompting on Truthfulness
ICML 2024 (arXiv), 21 citations
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
ICLR 2025 (arXiv), 10 citations
Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning
COLM 2025 (arXiv), 4 citations