Oral "visual token compression" Papers
3 papers found
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li, Shujie Hu, Shujie Liu et al.
ICLR 2025, oral, arXiv:2410.20502
28 citations
OpenMMEgo: Enhancing Egocentric Understanding for LMMs with Open Weights and Data
Hao Luo, Zihao Yue, Wanpeng Zhang et al.
NeurIPS 2025, oral
StreamForest: Efficient Online Video Understanding with Persistent Event Memory
Xiangyu Zeng, Kefan Qiu, Qingyu Zhang et al.
NeurIPS 2025, oral, arXiv:2509.24871
6 citations