Poster "value function learning" Papers
2 papers found
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
Ziyu Wan, Xidong Feng, Muning Wen et al.
ICML 2024, arXiv:2309.17179
304 citations
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
Chang Chen, Junyeob Baek, Fei Deng et al.
ICML 2024, arXiv:2406.06793
4 citations