Dale Schuurmans
24 papers
16,359 total citations
papers (24)
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
NEURIPS 2022
arXiv
15,395
citations
Learning Universal Policies via Text-Guided Video Generation
NEURIPS 2023
arXiv
424
citations
Off-Policy Evaluation via the Regularized Lagrangian
NEURIPS 2020
arXiv
121
citations
Combiner: Full Attention Transformer with Sparse Computation Cost
NEURIPS 2021
arXiv
93
citations
CoinDICE: Off-Policy Confidence Interval Estimation
NEURIPS 2020
arXiv
89
citations
Chain of Thought Imitation with Procedure Cloning
NEURIPS 2022
arXiv
46
citations
Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration
NEURIPS 2020
arXiv
32
citations
The Role of Baselines in Policy Gradient Optimization
NEURIPS 2022
arXiv
25
citations
On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games
NEURIPS 2022
arXiv
24
citations
Understanding the Effect of Stochasticity in Policy Optimization
NEURIPS 2021
arXiv
20
citations
Toward Understanding In-context vs. In-weight Learning
ICLR 2025
arXiv
15
citations
Optimal Scaling for Locally Balanced Proposals in Discrete Spaces
NEURIPS 2022
arXiv
14
citations
A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs
NEURIPS 2020
arXiv
12
citations
Plastic Learning with Deep Fourier Features
ICLR 2025
arXiv
12
citations
A Simple Decentralized Cross-Entropy Method
NEURIPS 2022
arXiv
11
citations
Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning
ICML 2024
arXiv
7
citations
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
ICML 2024
arXiv
7
citations
Improving Large Language Model Planning with Action Sequence Similarity
ICLR 2025
arXiv
5
citations
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
NEURIPS 2023
arXiv
4
citations
Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off
NEURIPS 2023
arXiv
3
citations
Escaping the Gravitational Pull of Softmax
NEURIPS 2020
0
citations
Position: Video as the New Language for Real-World Decision Making
ICML 2024
0
citations
DISCS: A Benchmark for Discrete Sampling
NEURIPS 2023
0
citations
The World Is Bigger: A Computationally-Embedded Perspective on the Big World Hypothesis
NEURIPS 2025
arXiv
0
citations