Poster "multimodal instruction tuning" Papers
3 papers found
Harnessing Webpage UIs for Text-Rich Visual Understanding
Junpeng Liu, Tianyue Ou, Yifan Song et al.
ICLR 2025, arXiv:2410.13824
22 citations
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
Qingkai Fang, Shoutao Guo, Yan Zhou et al.
ICLR 2025, arXiv:2409.06666
135 citations
Re-Imagining Multimodal Instruction Tuning: A Representation View
Yiyang Liu, James Liang, Ruixiang Tang et al.
ICLR 2025, arXiv:2503.00723
13 citations