Aldo Pacchiano
22 papers, 739 total citations
Papers (22)
Effective Diversity in Population Based Reinforcement Learning. NeurIPS 2020. arXiv. 180 citations.
Supervised Pretraining Can Learn In-Context Reinforcement Learning. NeurIPS 2023. arXiv. 129 citations.
Model Selection in Contextual Stochastic Bandit Problems. NeurIPS 2020. arXiv. 102 citations.
Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity. NeurIPS 2022. arXiv. 96 citations.
Tactical Optimism and Pessimism for Deep Reinforcement Learning. NeurIPS 2021. arXiv. 71 citations.
On the Theory of Reinforcement Learning with Once-per-Episode Feedback. NeurIPS 2021. arXiv. 32 citations.
Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian. NeurIPS 2020. arXiv. 25 citations.
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection. NeurIPS 2021. arXiv. 22 citations.
Language Model Personalization via Reward Factorization. COLM 2025. arXiv. 14 citations.
Learning General World Models in a Handful of Reward-Free Deployments. NeurIPS 2022. arXiv. 11 citations.
Best of Both Worlds Model Selection. NeurIPS 2022. arXiv. 11 citations.
Improving Offline RL by Blending Heuristics. ICLR 2024. arXiv. 11 citations.
Neural Pseudo-Label Optimism for the Bank Loan Problem. NeurIPS 2021. arXiv. 8 citations.
Anytime Model Selection in Linear Bandits. NeurIPS 2023. arXiv. 7 citations.
Second Order Bounds for Contextual Bandits with Function Approximation. ICLR 2025. arXiv. 7 citations.
Experiment Planning with Function Approximation. NeurIPS 2023. arXiv. 5 citations.
Near Optimal Policy Optimization via REPS. NeurIPS 2021. arXiv. 3 citations.
Adaptive Exploration for Multi-Reward Multi-Policy Evaluation. ICML 2025. arXiv. 2 citations.
Multiple-policy Evaluation via Density Estimation. ICML 2025. arXiv. 1 citation.
A Unified Model and Dimension for Interactive Estimation. NeurIPS 2023. arXiv. 1 citation.
Provable Interactive Learning with Hindsight Instruction Feedback. ICML 2024. arXiv. 1 citation.
Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward. NeurIPS 2025. arXiv. 0 citations.