ResearchAlpha Leak

Neel Nanda

Topic trends: 32,543 papers · similarity ≥ 0.4 · year ≥ 2024 · Data sourced from Semantic Scholar

34,598 papers | Abstracts: 31,650 (91.5%) | Citations: 34,598 (100.0%) | arXiv: 26,074 (75.4%)

Built: Feb 15, 2026, 1:13 AM AMS

8 papers · 514 total citations

Papers (8)

Towards Best Practices of Activation Patching in Language Models: Metrics and Methods

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

Are Sparse Autoencoders Useful? A Case Study in Sparse Probing

Sparse Autoencoders Do Not Find Canonical Units of Analysis

Explorations of Self-Repair in Language Models

Scaling Sparse Feature Circuits For Studying In-Context Learning