"sample efficiency" Papers

72 papers found • Page 1 of 2

Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees

Sangwoo Park, Matteo Zecchin, Osvaldo Simeone

NEURIPS 2025 (spotlight) • arXiv:2505.18659 • 4 citations

A Differential and Pointwise Control Approach to Reinforcement Learning

Minh Nguyen, Chandrajit Bajaj

NEURIPS 2025 • arXiv:2404.15617 • 1 citation

Avoiding exp(R) scaling in RLHF through Preference-based Exploration

Mingyu Chen, Yiding Chen, Wen Sun et al.

NEURIPS 2025 • 3 citations

BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models

Peiyan Li, Yixiang Chen, Hongtao Wu et al.

NEURIPS 2025 • arXiv:2506.07961 • 30 citations

Causal Information Prioritization for Efficient Reinforcement Learning

Hongye Cao, Fan Feng, Tianpei Yang et al.

ICLR 2025 • arXiv:2502.10097 • 5 citations

Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering

Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach

ICLR 2025 • arXiv:2410.01660 • 5 citations

Contrastive Representation for Interactive Recommendation

Jingyu Li, Zhiyong Feng, Dongxiao He et al.

AAAI 2025 • arXiv:2412.18396

Direct Alignment with Heterogeneous Preferences

Ali Shirali, Arash Nasr-Esfahany, Abdullah Alomar et al.

NEURIPS 2025 • arXiv:2502.16320 • 10 citations

Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient

Wenlong Wang, Ivana Dusparic, Yucheng Shi et al.

ICLR 2025 • arXiv:2410.08893 • 3 citations

DyMoDreamer: World Modeling with Dynamic Modulation

Boxuan Zhang, Runqing Wang, Wei Xiao et al.

NEURIPS 2025 (oral) • arXiv:2509.24804

EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling

Jia-Hua Lee, Bor-Jiun Lin, Wei-Fang Sun et al.

NEURIPS 2025 (spotlight) • arXiv:2502.00466 • 2 citations

Efficient Multi-Policy Evaluation for Reinforcement Learning

Shuze Daniel Liu, Claire Chen, Shangtong Zhang

AAAI 2025 • arXiv:2408.08706 • 2 citations

Efficient Reinforcement Learning with Large Language Model Priors

Xue Yan, Yan Song, Xidong Feng et al.

ICLR 2025 • arXiv:2410.07927 • 21 citations

Flow Equivariant Recurrent Neural Networks

Andy Keller

NEURIPS 2025 (spotlight) • arXiv:2507.14793 • 4 citations

GLAM: Global-Local Variation Awareness in Mamba-based World Model

Qian He, Wenqi Liang, Chunhui Hao et al.

AAAI 2025 • arXiv:2501.11949 • 1 citation

Investigating Relational State Abstraction in Collaborative MARL

Sharlin Utke, Jeremie Houssineau, Giovanni Montana

AAAI 2025 • arXiv:2412.15388 • 1 citation

JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data

Runjian Chen, Wenqi Shao, Bo Zhang et al.

CVPR 2025 • arXiv:2503.08422 • 2 citations

Kernel Learning for Sample Constrained Black-Box Optimization

Rajalaxmi Rajagopalan, Yu-Lin Wei, Romit Roy Choudhury

AAAI 2025 • arXiv:2507.20533

Learning (Approximately) Equivariant Networks via Constrained Optimization

Andrei Manolache, Luiz Chamon, Mathias Niepert

NEURIPS 2025 (oral) • arXiv:2505.13631 • 3 citations

Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning

Oleh Kolner, Thomas Ortner, Stanisław Woźniak et al.

ICLR 2025 • arXiv:2409.20213

Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning

Chenjie Hao, Weyl Lu, Yifan Xu et al.

CVPR 2025 • arXiv:2504.07095 • 5 citations

ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding

Indraneil Paul, Haoyi Yang, Goran Glavaš et al.

ICLR 2025 • arXiv:2504.00019 • 3 citations

Off-policy Reinforcement Learning with Model-based Exploration Augmentation

Likun Wang, Xiangteng Zhang, Yinuo Wang et al.

NEURIPS 2025 • arXiv:2510.25529

On scalable and efficient training of diffusion samplers

Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.

NEURIPS 2025 • arXiv:2505.19552 • 6 citations

PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment

Daiwei Chen, Yi Chen, Aniket Rege et al.

ICLR 2025 • 9 citations

Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning

Ziheng Cheng, Tianyu Xie, Shiyue Zhang et al.

NEURIPS 2025 • arXiv:2502.04491 • 2 citations

QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing

Grace Zhang, Ayush Jain, Injune Hwang et al.

ICLR 2025 (oral) • arXiv:2302.00671 • 5 citations

ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning

Timo Kaufmann, Yannick Metz, Daniel Keim et al.

NEURIPS 2025 • arXiv:2512.25023

Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts

Minh Le, Chau Nguyen, Huy Nguyen et al.

ICLR 2025 • arXiv:2410.02200 • 14 citations

Safety-Prioritizing Curricula for Constrained Reinforcement Learning

Cevahir Koprulu, Thiago Simão, Nils Jansen et al.

ICLR 2025

Sample- and Parameter-Efficient Auto-Regressive Image Models

Elad Amrani, Leonid Karlinsky, Alex M. Bronstein

CVPR 2025 • arXiv:2411.15648 • 2 citations

Sample-Efficient Multi-Round Generative Data Augmentation for Long-Tail Instance Segmentation

Byunghyun Kim, Minyoung Bae, Jae-Gil Lee

NEURIPS 2025

Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization

Daniel Palenicek, Florian Vogt, Joe Watson et al.

NEURIPS 2025 • arXiv:2502.07523 • 9 citations

ShiQ: Bringing back Bellman to LLMs

Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.

NEURIPS 2025 • arXiv:2505.11081 • 2 citations

Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning

Samuel Garcin, Trevor McInroe, Pablo Samuel Castro et al.

ICLR 2025 • arXiv:2503.06343 • 5 citations

Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control

Georgios Papoudakis, Thomas Coste, Jianye Hao et al.

NEURIPS 2025 • arXiv:2509.01720

Thompson Sampling in Function Spaces via Neural Operators

Rafael Oliveira, Xuesong Wang, Kian Ming Chai et al.

NEURIPS 2025 • arXiv:2506.21894

Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning

Yunpeng Jiang, Jianshu Hu, Paul Weng et al.

NEURIPS 2025 (oral) • arXiv:2505.13925

Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound

Tal Fiskus, Uri Shaham

NEURIPS 2025 • arXiv:2507.11269

When Should We Prefer State-to-Visual DAgger over Visual Reinforcement Learning?

Tongzhou Mu, Zhaoyang Li, Stanisław Wiktor Strzelecki et al.

AAAI 2025 • arXiv:2412.13662 • 5 citations

Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment

Chen Zhang, Qiang HE, Yuan Zhou et al.

ICML 2024 • arXiv:2406.01103 • 6 citations

A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback

Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar et al.

ICML 2024 • arXiv:2405.12421 • 2 citations

Better & Faster Large Language Models via Multi-token Prediction

Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Roziere et al.

ICML 2024 • arXiv:2404.19737 • 232 citations

Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning

Zizhao Wang, Caroline Wang, Xuesu Xiao et al.

AAAI 2024 • arXiv:2401.12497 • 9 citations

Contextual Pre-planning on Reward Machine Abstractions for Enhanced Transfer in Deep Reinforcement Learning

Guy Azran, Mohamad H Danesh, Stefano Albrecht et al.

AAAI 2024 • arXiv:2307.05209 • 2 citations

Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation

Michelle Pan, Mariah Schrum, Vivek Myers et al.

ICML 2024 • arXiv:2406.06714 • 2 citations

Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming

Hany Hamed, Subin Kim, Dongyeong Kim et al.

ICML 2024 • arXiv:2402.18866 • 6 citations

EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data

Shengjie Wang, Shaohuai Liu, Weirui Ye et al.

ICML 2024 (spotlight) • arXiv:2403.00564 • 31 citations

Episodic Return Decomposition by Difference of Implicitly Assigned Sub-trajectory Reward

Haoxin Lin, Hongqiu Wu, Jiaji Zhang et al.

AAAI 2024 • arXiv:2312.10642 • 3 citations

Feasible Reachable Policy Iteration

Shentao Qin, Yujie Yang, Yao Mu et al.

ICML 2024