Poster "image understanding" Papers

10 papers found

Auto-Vocabulary Semantic Segmentation

Osman Ülger, Maksymilian Kulicki, Yuki Asano et al.

ICCV 2025arXiv:2312.04539
4
citations

Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification

Dongseob Kim, Hyunjung Shim

CVPR 2025arXiv:2503.16873

G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model

Jiahui Gao, Renjie Pi, Jipeng Zhang et al.

ICLR 2025arXiv:2312.11370
174
citations

LMFusion: Adapting Pretrained Language Models for Multimodal Generation

Weijia Shi, Xiaochuang Han, Chunting Zhou et al.

NEURIPS 2025arXiv:2412.15188
86
citations

SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning

Jiaqi Huang, Zunnan Xu, Jun Zhou et al.

NEURIPS 2025arXiv:2505.22596
11
citations

SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference

Samir Khaki, Junxian Guo, Jiaming Tang et al.

ICCV 2025arXiv:2510.17777
1
citations

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Hao Li, Changyao TIAN, Jie Shao et al.

CVPR 2025arXiv:2412.09604
35
citations

Vinci: Deep Thinking in Text-to-Image Generation using Unified Model with Reinforcement Learning

wang lin, Wentao Hu, Liyu Jia et al.

NEURIPS 2025

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Liang Chen, Haozhe Zhao, Tianyu Liu et al.

ECCV 2024arXiv:2403.06764
368
citations

Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency

Hyeongjin Kim, Sangwon Kim, Dasom Ahn et al.

ICML 2024arXiv:2405.12648
7
citations