"reward signal design" Papers
2 papers found
Conference
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
Chris Rawles, Sarah Clinckemaillie, Yifan Chang et al.
ICLR 2025, arXiv:2405.14573
207 citations
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Jorge (Zhoujun) Cheng, Shibo Hao, Tianyang Liu et al.
NeurIPS 2025, arXiv:2506.14965
40 citations