Poster "policy evaluation" Papers

14 papers found

Doubly Optimal Policy Evaluation for Reinforcement Learning

Shuze Liu, Claire Chen, Shangtong Zhang

ICLR 2025, arXiv:2410.02226, 5 citations

Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning

Claire Chen, Shuze Liu, Shangtong Zhang

ICLR 2025, arXiv:2410.05655, 1 citation

Estimation and Inference in Distributional Reinforcement Learning

Liangyu Zhang, Yang Peng, Jiadong Liang et al.

NeurIPS 2025, arXiv:2309.17262, 4 citations

Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning

Yang Xu, Washim Mondal, Vaneet Aggarwal

NeurIPS 2025, arXiv:2502.16816, 8 citations

IRASim: A Fine-Grained World Model for Robot Manipulation

Fangqi Zhu, Hongtao Wu, Song Guo et al.

ICCV 2025, arXiv:2406.14540, 22 citations

Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol

Pai Liu, Lingfeng Zhao, Shivangi Agarwal et al.

NeurIPS 2025, arXiv:2502.08021, 4 citations

On Evaluating Policies for Robust POMDPs

Merlijn Krale, Eline M. Bovy, Maris F. L. Galesloot et al.

NeurIPS 2025

Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do

Yoav Wald, Mark Goldstein, Yonathan Efroni et al.

ICLR 2025, arXiv:2503.15890

Combining Experimental and Historical Data for Policy Evaluation

Ting Li, Chengchun Shi, Qianglin Wen et al.

ICML 2024, arXiv:2406.00317, 4 citations

Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design

Shuze Liu, Shangtong Zhang

ICML 2024, arXiv:2301.13734, 7 citations

Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

Yassir Jedra, William Réveillard, Stefan Stojanovic et al.

ICML 2024, arXiv:2402.15739, 2 citations

Policy-conditioned Environment Models are More Generalizable

Ruifeng Chen, Xiong-Hui Chen, Yihao Sun et al.

ICML 2024

SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP

Subhojyoti Mukherjee, Josiah Hanna, Robert Nowak

ICML 2024, arXiv:2406.02165

Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks

Khurram Javed, Haseeb Shah, Richard Sutton et al.

ICML 2024, arXiv:2302.05326, 10 citations