Poster "zero-shot generalization" Papers

72 papers found • Page 1 of 2

$\texttt{G1}$: Teaching LLMs to Reason on Graphs with Reinforcement Learning

Xiaojun Guo, Ang Li, Yifei Wang et al.

NEURIPS 2025

Aether: Geometric-Aware Unified World Modeling

Haoyi Zhu, Yifan Wang, Jianjun Zhou et al.

ICCV 2025 • arXiv:2503.18945

Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

Jianyang Zhang, Qianli Luo, Guowu Yang et al.

CVPR 2025 • arXiv:2503.20301

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Songhua Liu, Zhenxiong Tan, Xinchao Wang

NEURIPS 2025 • arXiv:2412.16112

Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

Zigeng Chen, Xinyin Ma, Gongfan Fang et al.

CVPR 2025 • arXiv:2411.17787

Compositional Entailment Learning for Hyperbolic Vision-Language Models

Avik Pal, Max van Spengler, Guido D'Amely di Melendugno et al.

ICLR 2025 • arXiv:2410.06912

Cross-Embodiment Dexterous Grasping with Reinforcement Learning

Haoqi Yuan, Bohan Zhou, Yuhui Fu et al.

ICLR 2025 • arXiv:2410.02479

DEFOM-Stereo: Depth Foundation Model Based Stereo Matching

Hualie Jiang, Zhiqiang Lou, Laiyan Ding et al.

CVPR 2025 • arXiv:2501.09466

Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera

Yuliang Guo, Sparsh Garg, S. Mahdi H. Miangoleh et al.

CVPR 2025 • arXiv:2501.02464

Detect Anything 3D in the Wild

Hanxue Zhang, Haoran Jiang, Qingsong Yao et al.

ICCV 2025 • arXiv:2504.07958

Disentangling Representations through Multi-task Learning

Pantelis Vafidis, Aman Bhargava, Antonio Rangel

ICLR 2025 • arXiv:2407.11249

DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?

Tianhong Zhou, Xu Yin, Yingtao Zhu et al.

NEURIPS 2025 • arXiv:2505.24173

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer

Yuxuan Zhang, Yirui Yuan, Yiren Song et al.

ICCV 2025 • arXiv:2503.07027

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Zhuofan Zong, Dongzhi Jiang, Bingqi Ma et al.

ICML 2025 • arXiv:2412.09618

EmbodiedSAM: Online Segment Any 3D Thing in Real Time

Xiuwei Xu, Huangxing Chen, Linqing Zhao et al.

ICLR 2025 • arXiv:2408.11811

Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games

Runyu Lu, Peng Zhang, Ruochuan Shi et al.

NEURIPS 2025 • arXiv:2511.00811

Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization

Jiaming Zhou, Ke Ye, Jiayi Liu et al.

NEURIPS 2025 • arXiv:2505.15660

GenM3: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation

Junyu Shi, Lijiang Liu, Yong Sun et al.

ICCV 2025

IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning

Quan Zhang, Yuxin Qi, Xi Tang et al.

ICLR 2025 • arXiv:2502.02454

IPFormer: Visual 3D Panoptic Scene Completion with Context-Adaptive Instance Proposals

Markus Gross, Aya Fahmy, Danit Niwattananan et al.

NEURIPS 2025 • arXiv:2506.20671

Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation

Jiyuan Wang, Chunyu Lin, Cheng Guan et al.

NEURIPS 2025 • arXiv:2503.15905

KGARevion: An AI Agent for Knowledge-Intensive Biomedical QA

Xiaorui Su, Yibo Wang, Shanghua Gao et al.

ICLR 2025 • arXiv:2410.04660

Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks

Michael Matthews, Michael Beukman, Chris Lu et al.

ICLR 2025 • arXiv:2410.23208

Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels

Zhizheng Liu, Joe Lin, Wayne Wu et al.

ICLR 2025 • arXiv:2410.07500

Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings

Yehya Farhat, Hamza ElMokhtar Shili, Fangshuo Liao et al.

NEURIPS 2025 • arXiv:2306.08586

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

Haian Jin, Hanwen Jiang, Hao Tan et al.

ICLR 2025 • arXiv:2410.17242

Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules

Yueqi Zhang, Peiwen Yuan, Yiwei Li et al.

NEURIPS 2025 • arXiv:2505.24292

Mint: A Simple Test-Time Adaptation of Vision-Language Models against Common Corruptions

Wenxuan Bao, Ruxi Deng, Jingrui He

NEURIPS 2025 • arXiv:2510.22127

On the Out-Of-Distribution Generalization of Large Multimodal Models

Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.

CVPR 2025

OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-time Emotional Speech Synthesis

Run Luo, Ting-En Lin, Haonan Zhang et al.

NEURIPS 2025

OW-OVD: Unified Open World and Open Vocabulary Object Detection

Xing Xi, Yangyang Huang, Ronghua Luo et al.

CVPR 2025

PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency

Haotian Wang, Aoran Xiao, Xiaoqin Zhang et al.

ICCV 2025 • arXiv:2507.07374

Re-Thinking Inverse Graphics With Large Language Models

Haiwen Feng, Michael J Black, Weiyang Liu et al.

ICLR 2025 • arXiv:2404.15228

Scalable Autoregressive Monocular Depth Estimation

Jinhong Wang, Jintai Chen, Jian Liu et al.

CVPR 2025 • arXiv:2411.11361

Scale-invariant attention

Ben Anson, Xi Wang, Laurence Aitchison

NEURIPS 2025 • arXiv:2505.17083

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Jensen Zhou, Hang Gao, Vikram Voleti et al.

ICCV 2025 • arXiv:2503.14489

Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

Weilong Yan, Ming Li, Li Haipeng et al.

CVPR 2025 • arXiv:2503.20211

Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models

Chengyu Du, Jinyi Han, Yizhou Ying et al.

ICLR 2025 • arXiv:2410.13413

Tree-Guided Diffusion Planner

Hyeonseong Jeon, Cheolhong Min, Jaesik Park

NEURIPS 2025 • arXiv:2508.21800

UGM2N: An Unsupervised and Generalizable Mesh Movement Network via M-Uniform Loss

Zhichao Wang, Xinhai Chen, Qinglin Wang et al.

NEURIPS 2025 • arXiv:2508.08615

UKBOB: One Billion MRI Labeled Masks for Generalizable 3D Medical Image Segmentation

Emmanuelle Bourigault, Amir Jamaludin, Abdullah Hamdi

ICCV 2025 • arXiv:2504.06908

UniGTE: Unified Graph–Text Encoding for Zero-Shot Generalization across Graph Tasks and Domains

Duo Wang, Yuan Zuo, Guangyue Lu et al.

NEURIPS 2025 • arXiv:2510.16885

Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation

Jingbo Sun, Songjun Tu, Qichao Zhang et al.

ICLR 2025

vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation

Bastian Wittmann, Yannick Wattenberg, Tamaz Amiranashvili et al.

CVPR 2025 • arXiv:2411.17386

Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers

Omer Sahin Tas, Royden Wagner

ICLR 2025 • arXiv:2406.11624

ZEBRA: Towards Zero-Shot Cross-Subject Generalization for Universal Brain Visual Decoding

Haonan Wang, Jingyu Lu, Hongrui Li et al.

NEURIPS 2025 • arXiv:2510.27128

Zero-shot Inexact CAD Model Alignment from a Single Image

Pattaramanee Arsomngern, Sasikarn Khwanmuang, Matthias Nießner et al.

ICCV 2025 • arXiv:2507.03292

Zero-Shot Monocular Scene Flow Estimation in the Wild

Yiqing Liang, Abhishek Badki, Hang Su et al.

CVPR 2025 • arXiv:2501.10357

BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models

Ye-Bin Moon, Nam Hyeon-Woo, Wonseok Choi et al.

ECCV 2024 • arXiv:2407.13442

Bridging Environments and Language with Rendering Functions and Vision-Language Models

Théo Cachet, Christopher Dance, Olivier Sigaud

ICML 2024
