"reward alignment" Papers
5 papers found
Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences
Jing-An Sun, Hang Fan, Junchao Gong et al.
NEURIPS 2025 | arXiv:2505.22008
2 citations
Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach
Haitong Ma, Haoran Yu, Haobo Fu et al.
NEURIPS 2025
Moment- and Power-Spectrum-Based Gaussianity Regularization for Text-to-Image Models
Jisung Hwang, Jaihoon Kim, Minhyuk Sung
NEURIPS 2025 | arXiv:2509.07027
1 citation
Unhackable Temporal Reward for Scalable Video MLLMs
En Yu, Kangheng Lin, Liang Zhao et al.
ICLR 2025 (oral) | arXiv:2502.12081
22 citations
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Yuwei Fu, Haichao Zhang, Di Wu et al.
ICML 2024 | arXiv:2406.00645
26 citations