"zero-shot generalization" Papers

87 papers found • Page 1 of 2

$\texttt{G1}$: Teaching LLMs to Reason on Graphs with Reinforcement Learning

Xiaojun Guo, Ang Li, Yifei Wang et al.

NEURIPS 2025
4
citations

Aether: Geometric-Aware Unified World Modeling

Haoyi Zhu, Yifan Wang, Jianjun Zhou et al.

ICCV 2025arXiv:2503.18945
50
citations

Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

Jianyang Zhang, Qianli Luo, Guowu Yang et al.

CVPR 2025arXiv:2503.20301

Bokehlicious: Photorealistic Bokeh Rendering with Controllable Apertures

Tim Seizinger, Florin-Alexandru Vasluianu, Marcos Conde et al.

ICCV 2025highlightarXiv:2503.16067
5
citations

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Songhua Liu, Zhenxiong Tan, Xinchao Wang

NEURIPS 2025arXiv:2412.16112
20
citations

Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

Zigeng Chen, Xinyin Ma, Gongfan Fang et al.

CVPR 2025arXiv:2411.17787
21
citations

Compositional Entailment Learning for Hyperbolic Vision-Language Models

Avik Pal, Max van Spengler, Guido D'Amely di Melendugno et al.

ICLR 2025arXiv:2410.06912
37
citations

ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning

Hongshu Guo, Zeyuan Ma, Jiacheng Chen et al.

AAAI 2025paperarXiv:2412.07507
12
citations

Cross-Embodiment Dexterous Grasping with Reinforcement Learning

Haoqi Yuan, Bohan Zhou, Yuhui Fu et al.

ICLR 2025arXiv:2410.02479
20
citations

DEFOM-Stereo: Depth Foundation Model Based Stereo Matching

Hualie Jiang, Zhiqiang Lou, Laiyan Ding et al.

CVPR 2025arXiv:2501.09466
40
citations

Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera

Yuliang Guo, Sparsh Garg, S. Mahdi H. Miangoleh et al.

CVPR 2025arXiv:2501.02464
17
citations

Detect Anything 3D in the Wild

Hanxue Zhang, Haoran Jiang, Qingsong Yao et al.

ICCV 2025arXiv:2504.07958
13
citations

Disentangling Representations through Multi-task Learning

Pantelis Vafidis, Aman Bhargava, Antonio Rangel

ICLR 2025arXiv:2407.11249
5
citations

DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?

Tianhong Zhou, xu yin, Yingtao Zhu et al.

NEURIPS 2025arXiv:2505.24173
5
citations

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer

Yuxuan Zhang, Yirui Yuan, Yiren Song et al.

ICCV 2025arXiv:2503.07027
75
citations

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Zhuofan Zong, Dongzhi Jiang, Bingqi Ma et al.

ICML 2025arXiv:2412.09618
20
citations

EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation

Carl Qi, Dan Haramati, Tal Daniel et al.

ICLR 2025oralarXiv:2412.18907
4
citations

EmbodiedSAM: Online Segment Any 3D Thing in Real Time

Xiuwei Xu, Huangxing Chen, Linqing Zhao et al.

ICLR 2025arXiv:2408.11811
35
citations

Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games

Runyu Lu, Peng Zhang, Ruochuan Shi et al.

NEURIPS 2025arXiv:2511.00811
2
citations

Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization

Jiaming Zhou, Ke Ye, Jiayi Liu et al.

NEURIPS 2025arXiv:2505.15660
18
citations

GenM3: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation

Junyu Shi, Lijiang LIU, Yong Sun et al.

ICCV 2025
3
citations

IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning

Quan Zhang, Yuxin Qi, Xi Tang et al.

ICLR 2025arXiv:2502.02454
7
citations

IPFormer: Visual 3D Panoptic Scene Completion with Context-Adaptive Instance Proposals

Markus Gross, Aya Fahmy, Danit Niwattananan et al.

NEURIPS 2025arXiv:2506.20671

Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation

Jiyuan Wang, Chunyu Lin, cheng guan et al.

NEURIPS 2025arXiv:2503.15905
13
citations

KGARevion: An AI Agent for Knowledge-Intensive Biomedical QA

Xiaorui Su, Yibo Wang, Shanghua Gao et al.

ICLR 2025arXiv:2410.04660
19
citations

Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks

Michael Matthews, Michael Beukman, Chris Lu et al.

ICLR 2025arXiv:2410.23208
21
citations

Know Thyself by Knowing Others: Learning Neuron Identity from Population Context

Vinam Arora, Divyansha Lachi, Ian Knight et al.

NEURIPS 2025oralarXiv:2512.01199

Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels

Zhizheng Liu, Joe Lin, Wayne Wu et al.

ICLR 2025arXiv:2410.07500
2
citations

Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings

Yehya Farhat, Hamza ElMokhtar Shili, Fangshuo Liao et al.

NEURIPS 2025arXiv:2306.08586
3
citations

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

Haian Jin, Hanwen Jiang, Hao Tan et al.

ICLR 2025arXiv:2410.17242
96
citations

Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules

Yueqi Zhang, Peiwen Yuan, Yiwei Li et al.

NEURIPS 2025arXiv:2505.24292

Mint: A Simple Test-Time Adaptation of Vision-Language Models against Common Corruptions

Wenxuan Bao, Ruxi Deng, Jingrui He

NEURIPS 2025arXiv:2510.22127
1
citations

OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints

Mingjie Pan, Jiyao Zhang, Tianshu Wu et al.

CVPR 2025highlightarXiv:2501.03841
47
citations

On the Out-Of-Distribution Generalization of Large Multimodal Models

Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.

CVPR 2025
4
citations

OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-time Emotional Speech Synthesis

Run Luo, Ting-En Lin, Haonan Zhang et al.

NEURIPS 2025

OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts

Shiting (Ginny) Xiao, Rishabh Kabra, Yuhang Li et al.

NEURIPS 2025spotlightarXiv:2507.05427
2
citations

OW-OVD: Unified Open World and Open Vocabulary Object Detection

Xing Xi, Yangyang Huang, Ronghua Luo et al.

CVPR 2025
3
citations

PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency

Haotian Wang, Aoran Xiao, Xiaoqin Zhang et al.

ICCV 2025arXiv:2507.07374

Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues

Chinmay Talegaonkar, Nikhil Gandudi Suresh, Zachary Novack et al.

NEURIPS 2025spotlightarXiv:2505.17358
1
citations

Re-Thinking Inverse Graphics With Large Language Models

Haiwen Feng, Michael J Black, Weiyang Liu et al.

ICLR 2025arXiv:2404.15228
16
citations

Scalable Autoregressive Monocular Depth Estimation

Jinhong Wang, Jintai Chen, Jian liu et al.

CVPR 2025arXiv:2411.11361
4
citations

Scale-invariant attention

Ben Anson, Xi Wang, Laurence Aitchison

NEURIPS 2025arXiv:2505.17083
2
citations

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Jensen Zhou, Hang Gao, Vikram Voleti et al.

ICCV 2025arXiv:2503.14489
87
citations

Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

Weilong Yan, Ming Li, Li Haipeng et al.

CVPR 2025arXiv:2503.20211
16
citations

Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models

Chengyu Du, Jinyi Han, Yizhou Ying et al.

ICLR 2025arXiv:2410.13413
5
citations

Tree-Guided Diffusion Planner

Hyeonseong Jeon, Cheolhong Min, Jaesik Park

NEURIPS 2025arXiv:2508.21800
1
citations

UGM2N: An Unsupervised and Generalizable Mesh Movement Network via M-Uniform Loss

Zhichao Wang, Xinhai Chen, Qinglin Wang et al.

NEURIPS 2025arXiv:2508.08615
1
citations

UKBOB: One Billion MRI Labeled Masks for Generalizable 3D Medical Image Segmentation

Emmanuelle Bourigault, Amir Jamaludin, Abdullah Hamdi

ICCV 2025arXiv:2504.06908
4
citations

UniGTE: Unified Graph–Text Encoding for Zero-Shot Generalization across Graph Tasks and Domains

Duo Wang, Yuan Zuo, Guangyue Lu et al.

NEURIPS 2025arXiv:2510.16885

Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation

Jingbo Sun, Songjun Tu, Qichao Zhang et al.

ICLR 2025
PreviousNext