Poster "reinforcement learning" Papers

220 papers found • Page 4 of 5

Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts

Yanting Yang, Minghao Chen, Qibo Qiu et al.

ECCV 2024 • arXiv:2407.14872 • 5 citations

A Hierarchical Adaptive Multi-Task Reinforcement Learning Framework for Multiplier Circuit Design

Zhihai Wang, Jie Wang, Dongsheng Zuo et al.

ICML 2024

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning

Duojun Huang, Xinyu Xiong, Jie Ma et al.

CVPR 2024 • arXiv:2406.00480 • 24 citations

A Neural-Guided Dynamic Symbolic Network for Exploring Mathematical Expressions from Data

Wenqiang Li, Weijun Li, Lina Yu et al.

ICML 2024 • arXiv:2309.13705 • 13 citations

An Information Theoretic Approach to Interaction-Grounded Learning

Xiaoyan Hu, Farzan Farnia, Ho-fung Leung

ICML 2024 • arXiv:2401.05015 • 3 citations

Augmenting Decision with Hypothesis in Reinforcement Learning

Nguyen Minh Quang, Hady Lauw

ICML 2024

Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays

Qingyuan Wu, Simon Zhan, Yixuan Wang et al.

ICML 2024 • arXiv:2402.03141 • 4 citations

Cross-Domain Policy Adaptation by Capturing Representation Mismatch

Jiafei Lyu, Chenjia Bai, Jing-Wen Yang et al.

ICML 2024 • arXiv:2405.15369 • 25 citations

Dealing With Unbounded Gradients in Stochastic Saddle-point Optimization

Gergely Neu, Nneka Okolo

ICML 2024 • arXiv:2402.13903 • 7 citations

Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design

Shuze Liu, Shangtong Zhang

ICML 2024 • arXiv:2301.13734 • 7 citations

Efficient World Models with Context-Aware Tokenization

Vincent Micheli, Eloi Alonso, François Fleuret

ICML 2024 • arXiv:2406.19320 • 18 citations

EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search

Pengyi Li, Yan Zheng, Hongyao Tang et al.

ICML 2024

Fair Resource Allocation in Multi-Task Learning

Hao Ban, Kaiyi Ji

ICML 2024 • arXiv:2402.15638 • 35 citations

Feedback Efficient Online Fine-Tuning of Diffusion Models

Masatoshi Uehara, Yulai Zhao, Kevin Black et al.

ICML 2024 • arXiv:2402.16359 • 44 citations

Finite Time Logarithmic Regret Bounds for Self-Tuning Regulation

Rahul Singh, Akshay Mete, Avik Kar et al.

ICML 2024

Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning

Hengkai Tan, Songming Liu, Kai Ma et al.

ICML 2024 • arXiv:2405.19885 • 3 citations

FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning

Yuwei Fu, Haichao Zhang, Di Wu et al.

ICML 2024 • arXiv:2406.00645 • 26 citations

Hieros: Hierarchical Imagination on Structured State Space Sequence World Models

Paul Mattes, Rainer Schlosser, Ralf Herbrich

ICML 2024 • arXiv:2310.05167 • 8 citations

Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States

Noam Razin, Yotam Alexander, Edo Cohen-Karlik et al.

ICML 2024 • arXiv:2402.07875 • 4 citations

Improving Token-Based World Models with Parallel Observation Prediction

Lior Cohen, Kaixin Wang, Bingyi Kang et al.

ICML 2024 • arXiv:2402.05643 • 12 citations

Iterative Regularized Policy Optimization with Imperfect Demonstrations

Xudong Gong, Dawei Feng, Kele Xu et al.

ICML 2024

Knowledge-aware Reinforced Language Models for Protein Directed Evolution

Yuhao Wang, Qiang Zhang, Ming Qin et al.

ICML 2024

Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game

Zelai Xu, Chao Yu, Fei Fang et al.

ICML 2024 • arXiv:2310.18940 • 136 citations

Large-scale Reinforcement Learning for Diffusion Models

Yinan Zhang, Eric Tzeng, Yilun Du et al.

ECCV 2024 • arXiv:2401.12244 • 77 citations

Learning Causal Dynamics Models in Object-Oriented Environments

Zhongwei Yu, Jingqing Ruan, Dengpeng Xing

ICML 2024 • arXiv:2405.12615 • 4 citations

Learning the Target Network in Function Space

Kavosh Asadi, Yao Liu, Shoham Sabach et al.

ICML 2024 • arXiv:2406.01838 • 1 citation

Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces

Brahma Pavse, Matthew Zurek, Yudong Chen et al.

ICML 2024 • arXiv:2306.01896 • 3 citations

Linguistic Calibration of Long-Form Generations

Neil Band, Xuechen Li, Tengyu Ma et al.

ICML 2024 • arXiv:2404.00474 • 48 citations

LLM-Empowered State Representation for Reinforcement Learning

Boyuan Wang, Yun Qu, Yuhang Jiang et al.

ICML 2024 • arXiv:2407.13237 • 24 citations

Multimodal Label Relevance Ranking via Reinforcement Learning

Taian Guo, Taolin Zhang, Haoqian Wu et al.

ECCV 2024 • arXiv:2407.13221 • 1 citation

Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback

Asaf Cassel, Haipeng Luo, Aviv Rosenberg et al.

ICML 2024 • arXiv:2405.07637 • 5 citations

No-Regret Reinforcement Learning in Smooth MDPs

Davide Maran, Alberto Maria Metelli, Matteo Papini et al.

ICML 2024 • arXiv:2402.03792 • 4 citations

OMPO: A Unified Framework for RL under Policy and Dynamics Shifts

Yu Luo, Tianying Ji, Fuchun Sun et al.

ICML 2024 • arXiv:2405.19080 • 4 citations

On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation

Álvaro Labarca Silva, Denis Parra, Rodrigo A Toro Icarte

ICML 2024

Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners

Chengjie Wu, Hao Hu, Yiqin Yang et al.

ICML 2024

Policy-conditioned Environment Models are More Generalizable

Ruifeng Chen, Xiong-Hui Chen, Yihao Sun et al.

ICML 2024

Position: Social Environment Design Should be Further Developed for AI-based Policy-Making

Edwin Zhang, Sadie Zhao, Tonghan Wang et al.

ICML 2024

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF

Han Shen, Zhuoran Yang, Tianyi Chen

ICML 2024 • arXiv:2402.06886 • 30 citations

Probabilistic Constrained Reinforcement Learning with Formal Interpretability

Yanran Wang, Qiuchen Qian, David Boyle

ICML 2024 • arXiv:2307.07084 • 5 citations

Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback

Guojun Xiong, Jian Li

ICML 2024 • arXiv:2405.00950 • 1 citation

Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent

Yingru Li, Jiawei Xu, Lei Han et al.

ICML 2024 • arXiv:2402.10228 • 8 citations

Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design

Quan Nguyen, Adji Bousso Dieng

ICML 2024 • arXiv:2405.02449 • 15 citations

Random Latent Exploration for Deep Reinforcement Learning

Srinath Mahankali, Zhang-Wei Hong, Ayush Sekhari et al.

ICML 2024 • arXiv:2407.13755 • 8 citations

ReCoRe: Regularized Contrastive Representation Learning of World Model

Rudra P. K. Poudel, Harit Pandya et al.

CVPR 2024 • arXiv:2312.09056 • 14 citations

Reinforcement Learning and Regret Bounds for Admission Control

Lucas Weber, Ana Busic, Jiamin Zhu

ICML 2024 • arXiv:2406.04766 • 1 citation

Reinforcement Learning Meets Visual Odometry

Nico Messikommer, Giovanni Cioffi, Mathias Gehrig et al.

ECCV 2024 • arXiv:2407.15626 • 14 citations

Reinforcement Learning via Auxiliary Task Distillation

Abhinav Narayan Harish, Larry Heck, Josiah P Hanna et al.

ECCV 2024

Reinforcement Learning within Tree Search for Fast Macro Placement

Zijie Geng, Jie Wang, Ziyan Liu et al.

ICML 2024

Remembering to Be Fair: Non-Markovian Fairness in Sequential Decision Making

Parand A. Alamdari, Toryn Q. Klassen, Elliot Creager et al.

ICML 2024 • arXiv:2312.04772 • 7 citations

Rethinking Transformers in Solving POMDPs

Chenhao Lu, Ruizhe Shi, Yuyao Liu et al.

ICML 2024 • arXiv:2405.17358 • 9 citations