"off-policy training" Papers
3 papers found
Conference
Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
Kiyoung Seong, Seonghyun Park, Seonghwan Kim et al.
ICLR 2025, arXiv:2405.19961
9 citations
DistiLLM: Towards Streamlined Distillation for Large Language Models
Jongwoo Ko, Sungnyun Kim, Tianyi Chen et al.
ICML 2024, arXiv:2402.03898
73 citations
Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing
Jinmin He, Kai Li, Yifan Zang et al.
AAAI 2024, arXiv:2312.14472
11 citations