Wen Sun
26 papers
1,062 total citations
papers (26)
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
NEURIPS 2020
arXiv
249
citations
Information Theoretic Regret Bounds for Online Nonlinear Control
NEURIPS 2020
arXiv
130
citations
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
NEURIPS 2020
arXiv
120
citations
Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates
NEURIPS 2020
arXiv
90
citations
Constrained episodic reinforcement learning in concave-convex and knapsack settings
NEURIPS 2020
arXiv
56
citations
Learning To Detect Mobile Objects From LiDAR Scans Without Labels
CVPR 2022
arXiv
51
citations
MobILE: Model-Based Imitation Learning From Observation Alone
NEURIPS 2021
arXiv
45
citations
Provable Offline Preference-Based Reinforcement Learning
ICLR 2024
arXiv
43
citations
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems
NEURIPS 2022
arXiv
43
citations
Making RL with Preference-based Feedback Efficient via Randomization
ICLR 2024
arXiv
37
citations
Learning the Linear Quadratic Regulator from Nonlinear Observations
NEURIPS 2020
arXiv
31
citations
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning
NEURIPS 2023
arXiv
25
citations
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
NEURIPS 2023
arXiv
24
citations
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
ICLR 2025
arXiv
18
citations
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
ICML 2024
arXiv
17
citations
Reward Finetuning for Faster and More Accurate Unsupervised Object Discovery
NEURIPS 2023
arXiv
15
citations
Selective Sampling and Imitation Learning via Online Regression
NEURIPS 2023
arXiv
14
citations
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
NEURIPS 2025
arXiv
12
citations
Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds
ICLR 2025
arXiv
10
citations
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
NEURIPS 2023
arXiv
8
citations
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
ICLR 2024
arXiv
8
citations
Value-Guided Search for Efficient Chain-of-Thought Reasoning
NEURIPS 2025
arXiv
7
citations
On Speeding Up Language Model Evaluation
ICLR 2025
arXiv
6
citations
Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics
ICLR 2025
arXiv
3
citations
Contextual Bandits and Imitation Learning with Preference-Based Active Queries
NEURIPS 2023
arXiv
0
citations
Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage
NEURIPS 2021
0
citations