Poster "sample efficiency" Papers

52 papers found • Page 1 of 2

A Differential and Pointwise Control Approach to Reinforcement Learning

Minh Nguyen, Chandrajit Bajaj

NEURIPS 2025 • arXiv:2404.15617
1 citation

Avoiding exp(R) scaling in RLHF through Preference-based Exploration

Mingyu Chen, Yiding Chen, Wen Sun et al.

NEURIPS 2025
3 citations

BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models

Peiyan Li, Yixiang Chen, Hongtao Wu et al.

NEURIPS 2025 • arXiv:2506.07961
30 citations

Causal Information Prioritization for Efficient Reinforcement Learning

Hongye Cao, Fan Feng, Tianpei Yang et al.

ICLR 2025 • arXiv:2502.10097
5 citations

Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering

Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach

ICLR 2025 • arXiv:2410.01660
5 citations

Direct Alignment with Heterogeneous Preferences

Ali Shirali, Arash Nasr-Esfahany, Abdullah Alomar et al.

NEURIPS 2025 • arXiv:2502.16320
10 citations

Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient

Wenlong Wang, Ivana Dusparic, Yucheng Shi et al.

ICLR 2025 • arXiv:2410.08893
3 citations

Efficient Reinforcement Learning with Large Language Model Priors

Xue Yan, Yan Song, Xidong Feng et al.

ICLR 2025 • arXiv:2410.07927
21 citations

JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data

Runjian Chen, Wenqi Shao, Bo Zhang et al.

CVPR 2025 • arXiv:2503.08422
2 citations

Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning

Oleh Kolner, Thomas Ortner, Stanisław Woźniak et al.

ICLR 2025 • arXiv:2409.20213

Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning

Chenjie Hao, Weyl Lu, Yifan Xu et al.

CVPR 2025 • arXiv:2504.07095
5 citations

ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding

Indraneil Paul, Haoyi Yang, Goran Glavaš et al.

ICLR 2025 • arXiv:2504.00019
3 citations

Off-policy Reinforcement Learning with Model-based Exploration Augmentation

Likun Wang, Xiangteng Zhang, Yinuo Wang et al.

NEURIPS 2025 • arXiv:2510.25529

On scalable and efficient training of diffusion samplers

Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.

NEURIPS 2025 • arXiv:2505.19552
6 citations

PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment

Daiwei Chen, Yi Chen, Aniket Rege et al.

ICLR 2025
9 citations

Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning

Ziheng Cheng, Tianyu Xie, Shiyue Zhang et al.

NEURIPS 2025 • arXiv:2502.04491
2 citations

ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning

Timo Kaufmann, Yannick Metz, Daniel Keim et al.

NEURIPS 2025 • arXiv:2512.25023

Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts

Minh Le, Chau Nguyen, Huy Nguyen et al.

ICLR 2025 • arXiv:2410.02200
14 citations

Safety-Prioritizing Curricula for Constrained Reinforcement Learning

Cevahir Koprulu, Thiago Simão, Nils Jansen et al.

ICLR 2025

Sample- and Parameter-Efficient Auto-Regressive Image Models

Elad Amrani, Leonid Karlinsky, Alex M. Bronstein

CVPR 2025 • arXiv:2411.15648
2 citations

Sample-Efficient Multi-Round Generative Data Augmentation for Long-Tail Instance Segmentation

Byunghyun Kim, Minyoung Bae, Jae-Gil Lee

NEURIPS 2025

Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization

Daniel Palenicek, Florian Vogt, Joe Watson et al.

NEURIPS 2025 • arXiv:2502.07523
9 citations

ShiQ: Bringing back Bellman to LLMs

Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.

NEURIPS 2025 • arXiv:2505.11081
2 citations

Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning

Samuel Garcin, Trevor McInroe, Pablo Samuel Castro et al.

ICLR 2025 • arXiv:2503.06343
5 citations

Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control

Georgios Papoudakis, Thomas Coste, Jianye Hao et al.

NEURIPS 2025 • arXiv:2509.01720

Thompson Sampling in Function Spaces via Neural Operators

Rafael Oliveira, Xuesong Wang, Kian Ming Chai et al.

NEURIPS 2025 • arXiv:2506.21894

Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound

Tal Fiskus, Uri Shaham

NEURIPS 2025 • arXiv:2507.11269

Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment

Chen Zhang, Qiang He, Yuan Zhou et al.

ICML 2024 • arXiv:2406.01103
6 citations

A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback

Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar et al.

ICML 2024 • arXiv:2405.12421
2 citations

Better & Faster Large Language Models via Multi-token Prediction

Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Roziere et al.

ICML 2024 • arXiv:2404.19737
232 citations

Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation

Michelle Pan, Mariah Schrum, Vivek Myers et al.

ICML 2024 • arXiv:2406.06714
2 citations

Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming

Hany Hamed, Subin Kim, Dongyeong Kim et al.

ICML 2024 • arXiv:2402.18866
6 citations

Feasible Reachable Policy Iteration

Shentao Qin, Yujie Yang, Yao Mu et al.

ICML 2024

Hieros: Hierarchical Imagination on Structured State Space Sequence World Models

Paul Mattes, Rainer Schlosser, Ralf Herbrich

ICML 2024 • arXiv:2310.05167
8 citations

How Does Goal Relabeling Improve Sample Efficiency?

Sirui Zheng, Chenjia Bai, Zhuoran Yang et al.

ICML 2024

Learning to Play Atari in a World of Tokens

Pranav Agarwal, Sheldon Andrews, Samira Ebrahimi Kahou

ICML 2024 • arXiv:2406.01361
6 citations

LLM-Empowered State Representation for Reinforcement Learning

Boyuan Wang, Yun Qu, Yuhang Jiang et al.

ICML 2024 • arXiv:2407.13237
24 citations

Model-based Reinforcement Learning for Parameterized Action Spaces

Renhao Zhang, Haotian Fu, Yilin Miao et al.

ICML 2024 • arXiv:2404.03037
8 citations

Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL

Yu Luo, Tianying Ji, Fuchun Sun et al.

ICML 2024 • arXiv:2405.18520
7 citations

Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning

Michal Nauman, Michał Bortkiewicz, Piotr Milos et al.

ICML 2024 • arXiv:2403.00514
41 citations

Quality-Diversity with Limited Resources

Ren-Jian Wang, Ke Xue, Cong Guan et al.

ICML 2024 • arXiv:2406.03731
3 citations

Reflective Policy Optimization

Yaozhong Gan, Renye Yan, Zhe Wu et al.

ICML 2024 • arXiv:2406.03678
2 citations

Reinforcement Learning within Tree Search for Fast Macro Placement

Zijie Geng, Jie Wang, Ziyan Liu et al.

ICML 2024

Reward Shaping for Reinforcement Learning with An Assistant Reward Agent

Haozhe Ma, Kuankuan Sima, Thanh Vinh Vo et al.

ICML 2024

Rich-Observation Reinforcement Learning with Continuous Latent Dynamics

Yuda Song, Lili Wu, Dylan Foster et al.

ICML 2024 • arXiv:2405.19269
2 citations

Sample-Efficient Multiagent Reinforcement Learning with Reset Replay

Yaodong Yang, Guangyong Chen, Jianye Hao et al.

ICML 2024

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

Ziping Xu, Zifan Xu, Runxuan Jiang et al.

ICLR 2024 • arXiv:2403.01636
2 citations

SAPG: Split and Aggregate Policy Gradients

Jayesh Singla, Ananye Agarwal, Deepak Pathak

ICML 2024 • arXiv:2407.20230
13 citations

Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic

Tianying Ji, Yu Luo, Fuchun Sun et al.

ICML 2024 • arXiv:2306.02865
21 citations

SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning

Matthias Weissenbacher, Rishabh Agarwal, Yoshinobu Kawahara

ICML 2024 • arXiv:2406.15025
1 citation