Most Cited 2025 "conditional diffusion transformer" Papers

22,274 papers found • Page 86 of 112

#17001

CogCM: Cognition-Inspired Contextual Modeling for Audio-Visual Speech Enhancement

Feixiang Wang, Shuang Yang, Shiguang Shan et al.

ICCV 2025

#17002

End-to-End Entity-Predicate Association Reasoning for Dynamic Scene Graph Generation

LiWei Wang, YanDuo Zhang, Tao Lu et al.

ICCV 2025

#17003

AnomalyCoT: A Multi-Scenario Chain-of-Thought Dataset for Multimodal Large Language Models

Jiaxi Cheng, Yuliang Xu, Shoupeng Wang et al.

NEURIPS 2025

#17004

Towards Safer and Understandable Driver Intention Prediction

Mukilan Karuppasamy, Shankar Gangisetty, Shyam Nandan Rai et al.

ICCV 2025 • arXiv:2510.09200

#17005

Debiased Curriculum Adaptation for Safe Transfer Learning in Chest X-ray Classification

Mingyang Liu, Xinyang Chen, Yang Shu et al.

ICCV 2025

#17006

InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation

Zhuoran Yang, Xi Guo, Chenjing Ding et al.

ICCV 2025

#17007

GRAE-3DMOT: Geometry Relation-Aware Encoder for Online 3D Multi-Object Tracking

Hyunseop Kim, Hyo-Jun Lee, Yonguk Lee et al.

CVPR 2025

#17008

Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging

Bo Wang, Dingwei Tan, Yen-Ling Kuo et al.

CVPR 2025 • arXiv:2411.09176

#17009

Rethinking DPO-style Diffusion Aligning Frameworks

XUN WU, Shaohan Huang, Lingjie Jiang et al.

ICCV 2025 • highlight

#17010

NormalLoc: Visual Localization on Textureless 3D Models using Surface Normals

Jiro Abe, Gaku Nakano, Kazumine Ogura

ICCV 2025

#17011

MMD-Regularized Unbalanced Optimal Transport

SakethaNath Jagarlapudi, Pratik Jawanpuria, Piyushi Manupriya

ICLR 2025

#17012

SPD: Shallow Backdoor Protecting Deep Backdoor Against Backdoor Detection

Shunjie Yuan, Xinghua Li, Xuelin Cao et al.

ICCV 2025

#17013

INSTINCT: Instance-Level Interaction Architecture for Query-Based Collaborative Perception

yunjiang xu, Yupeng Ouyang, Lingzhi Li et al.

ICCV 2025 • arXiv:2509.23700

#17014

Navigating the Unseen: Zero-shot Scene Graph Generation via Capsule-Based Equivariant Features

Wenhuan Huang, Yi JI, guiqian zhu et al.

CVPR 2025

#17015

Non-Natural Image Understanding with Advancing Frequency-based Vision Encoders

Wang Lin, Qingsong Wang, Yueying Feng et al.

CVPR 2025

#17016

Hunyuan-Portrait: Implicit Condition Control for Enhanced Portrait Animation

Zunnan Xu, Zhentao Yu, Zixiang Zhou et al.

CVPR 2025

#17017

DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers

Mert Bülent Sarıyıldız, Philippe Weinzaepfel, Thomas Lucas et al.

CVPR 2025

#17018

Kaleidoscopic Background Attack: Disrupting Pose Estimation with Multi-Fold Radial Symmetry Textures

Xinlong Ding, Hongwei Yu, Jiawei Li et al.

ICCV 2025 • highlight • arXiv:2507.10265

#17019

Task-Aware Clustering for Prompting Vision-Language Models

Fusheng Hao, Fengxiang He, Fuxiang Wu et al.

CVPR 2025

#17020

NGD: Neural Gradient Based Deformation for Monocular Garment Reconstruction

Soham Dasgupta, Shanthika Naik, Preet Savalia et al.

ICCV 2025 • arXiv:2508.17712

#17021

Vision-Language Neural Graph Featurization for Extracting Retinal Lesions

Taimur Hassan, Anabia Sohail, Muzammal Naseer et al.

ICCV 2025

#17022

Data-Free Group-Wise Fully Quantized Winograd Convolution via Learnable Scales

Shuokai Pan, Gerti Tuzi, Sudarshan Sreeram et al.

CVPR 2025 • arXiv:2412.19867

#17023

Beyond Words: Augmenting Discriminative Richness via Diffusions in Unsupervised Prompt Learning

Hairui Ren, Fan Tang, He Zhao et al.

CVPR 2025 • arXiv:2504.11930

#17024

Activating Sparse Part Concepts for 3D Class Incremental Learning

Zhenya Tian, Jun Xiao, Liu lupeng et al.

CVPR 2025

#17025

Lifting the Structural Morphing for Wide-Angle Images Rectification: Unified Content and Boundary Modeling

Wenting Luan, Siqi Lu, Yongbin Zheng et al.

ICCV 2025

#17026

Model Diagnosis and Correction via Linguistic and Implicit Attribute Editing

Xuanbai Chen, Xiang Xu, Zhihua Li et al.

CVPR 2025

#17027

PS-EIP: Robust Photometric Stereo Based on Event Interval Profile

Kazuma Kitazawa, Takahito Aoto, Satoshi Ikehata et al.

CVPR 2025 • arXiv:2503.18341

#17028

ROAR: Reducing Inversion Error in Generative Image Watermarking

Hanyi Wang, Han Fang, Shi-Lin Wang et al.

ICCV 2025

#17029

Repurposing in AI: A Distinct Approach or an Extension of Creative Problem Solving?

Aissatou Diallo, Antonis Bikakis, Luke Dickens et al.

ICLR 2025

#17030

Three-view Focal Length Recovery From Homographies

Yaqing Ding, Viktor Kocur, Zuzana Berger Haladova et al.

CVPR 2025 • arXiv:2501.07499

#17031

Bayesian-Inspired Space-Time Superpixels

Kent Gauen, Stanley Chan

ICCV 2025

#17032

Learning Normals of Noisy Points by Local Gradient-Aware Surface Filtering

Qing Li, Huifang Feng, Xun Gong et al.

ICCV 2025 • arXiv:2507.03394

#17033

Free2Guide: Training-Free Text-to-Video Alignment using Image LVLM

Jaemin Kim, Bryan Sangwoo Kim, Jong Ye

ICCV 2025

#17034

Visual Representation Learning through Causal Intervention for Controllable Image Editing

Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.

CVPR 2025 • highlight

#17035

Dynamic Content Prediction with Motion-aware Priors for Blind Face Video Restoration

Lianxin Xie, csbingbing zheng, Si Wu et al.

CVPR 2025

#17036

Diffusion Transformer meets Multi-level Wavelet Spectrum for Single Image Super-Resolution

Peng Du, Hui Li, Han Xu et al.

ICCV 2025 • arXiv:2511.01175

#17037

When Pixel Difference Patterns Meet ViT: PiDiViT for Few-Shot Object Detection

Hongliang Zhou, Yongxiang Liu, Canyu Mo et al.

ICCV 2025

#17038

LightBSR: Towards Lightweight Blind Super-Resolution via Discriminative Implicit Degradation Representation Learning

Jiang Yuan, ji ma, Bo Wang et al.

ICCV 2025 • arXiv:2506.22710

#17039

RayletDF: Raylet Distance Fields for Generalizable 3D Surface Reconstruction from Point Clouds or Gaussians

Shenxing Wei, Jinxi Li, Yafei YANG et al.

ICCV 2025 • highlight • arXiv:2508.09830

#17040

Semantic-guided Camera Ray Regression for Visual Localization

Yesheng Zhang, Xu Zhao

ICCV 2025

#17041

Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability

Seungju Yoo, Hyuk Kwon, Joong-Won Hwang et al.

ICCV 2025 • arXiv:2508.12082

#17042

Beyond Worst-Case Dimensionality Reduction for Sparse Vectors

Sandeep Silwal, David Woodruff, Qiuyi (Richard) Zhang

ICLR 2025 • arXiv:2502.19865

#17043

Polarimetric Neural Field via Unified Complex-Valued Wave Representation

Chu Zhou, Yixin Yang, Junda Liao et al.

ICCV 2025

#17044

High-Precision 3D Measurement of Complex Textured Surfaces Using Multiple Filtering Approach

Yuchong Chen, Jian Yu, Shaoyan Gai et al.

ICCV 2025

#17045

Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities

Yiyuan Zhang, Handong Li, Jing Liu et al.

ICCV 2025

#17046

Learning to See Inside Opaque Liquid Containers using Speckle Vibrometry

Matan Kichler, Shai Bagon, Mark Sheinin

ICCV 2025 • arXiv:2507.20757

#17047

From Gallery to Wrist: Realistic 3D Bracelet Insertion in Videos

Chenjian Gao, Lihe Ding, Rui Han et al.

ICCV 2025 • arXiv:2507.20331

#17048

Adversarial Training for Probabilistic Robustness

YI ZHANG, Yuhang Chen, Zhen Chen et al.

ICCV 2025

#17049

CHORDS: Diffusion Sampling Accelerator with Multi-core Hierarchical ODE Solvers

Jiaqi Han, Haotian Ye, Puheng Li et al.

ICCV 2025 • arXiv:2507.15260

#17050

Chain-of-Thought Provably Enables Learning the (Otherwise) Unlearnable

Chenxiao Yang, Zhiyuan Li, David Wipf

ICLR 2025

#17051

Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View

Kaiyue Wen, Zhiyuan Li, Jason Wang et al.

ICLR 2025

#17052

Pattern Analogies: Learning to Perform Programmatic Image Edits by Analogy

Aditya Ganeshan, Thibault Groueix, Paul Guerrero et al.

CVPR 2025 • arXiv:2412.12463

#17053

Differential Transformer

Tianzhu Ye, Li Dong, Yuqing Xia et al.

ICLR 2025 • arXiv:2410.05258

#17054

PhysPDE: Rethinking PDE Discovery and a Physical HYpothesis Selection Benchmark

Mingquan Feng, Yixin Huang, Yizhou Liu et al.

ICLR 2025

#17055

Diagnosing Pretrained Models for Out-of-distribution Detection

Haipeng Xiong, Kai Xu, Angela Yao

ICCV 2025

#17056

HiNeuS: High-fidelity Neural Surface Mitigating Low-texture and Reflective Ambiguity

Yida Wang, Xueyang Zhang, Kun Zhan et al.

ICCV 2025 • highlight • arXiv:2506.23854

#17057

CoralSRT: Revisiting Coral Reef Semantic Segmentation by Feature Rectifying via Self-supervised Guidance

Zheng Ziqiang, Wong Kwan, Binh-Son Hua et al.

ICCV 2025

#17058

OV3D-CG: Open-vocabulary 3D Instance Segmentation with Contextual Guidance

Mingquan Zhou, Chen He, Ruiping Wang et al.

ICCV 2025

#17059

EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights

Zhenghao Xing, Hao Chen, Binzhu Xie et al.

CVPR 2025

#17060

Harnessing Global-Local Collaborative Adversarial Perturbation for Anti-Customization

Long Xu, Jiakai Wang, Haojie Hao et al.

CVPR 2025

#17061

PVMamba: Parallelizing Vision Mamba via Dynamic State Aggregation

Fei Xie, Zhongdao Wang, Weijia Zhang et al.

ICCV 2025

#17062

Plug-and-Play PPO: An Adaptive Point Prompt Optimizer Making SAM Greater

Xueyu Liu, Rui Wang, Yexin Lai et al.

CVPR 2025

#17063

FVGen: Accelerating Novel-View Synthesis with Adversarial Video Diffusion Distillation

Wenbin Teng, Gonglin Chen, Haiwei Chen et al.

ICCV 2025 • arXiv:2508.06392

#17064

TC-MoE: Augmenting Mixture of Experts with Ternary Expert Choice

Shen Yan, Xingyan Bin, Sijun Zhang et al.

ICLR 2025

#17065

VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE

Yazhou Xing, Yang Fei, Yingqing He et al.

ICCV 2025

#17066

Motion-2-to-3: Leveraging 2D Motion Data for 3D Motion Generations

Ruoxi Guo, Huaijin Pi, Zehong Shen et al.

ICCV 2025

#17067

I2-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting

Zhimin Liao, Ping Wei, Ruijie Zhang et al.

ICCV 2025

#17068

InsideOut: Integrated RGB-Radiative Gaussian Splatting for Comprehensive 3D Object Representation

Jungmin Lee, Seonghyuk Hong, Juyong Lee et al.

ICCV 2025 • arXiv:2510.17864

#17069

RIOcc: Efficient Cross-Modal Fusion Transformer with Collaborative Feature Refinement for 3D Semantic Occupancy Prediction

Baojie Fan, Xiaotian Li, Yuhan Zhou et al.

ICCV 2025

#17070

A Unified Approach to Interpreting Self-supervised Pre-training Methods for 3D Point Clouds via Interactions

Qiang Li, Jian Ruan, Fanghao Wu et al.

CVPR 2025 • highlight

#17071

Geometric Alignment and Prior Modulation for View-Guided Point Cloud Completion on Unseen Categories

Jingqiao Xiu, Yicong Li, Na Zhao et al.

ICCV 2025

#17072

Do vision models perceive objects like toddlers?

Arthur Aubret, Jochen Triesch

ICLR 2025

#17073

Open Set Label Shift with Test Time Out-of-Distribution Reference

Changkun Ye, Russell Tsuchida, Lars Petersson et al.

CVPR 2025 • arXiv:2505.05868

#17074

DIA: The Adversarial Exposure of Deterministic Inversion in Diffusion Models

SeungHoo Hong, GeonHo Son, Juhun Lee et al.

ICCV 2025 • arXiv:2510.00778

#17075

Biologically Constrained Barrel Cortex Model Integrates Whisker Inputs and Replicates Key Brain Network Dynamics

Tianfang Zhu, Dongli Hu, Jiandong Zhou et al.

ICLR 2025 • oral

#17076

Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy

Ishank Juneja, Carlee Joe-Wong, Osman Yagan

ICLR 2025 • arXiv:2501.10290

#17077

PDFactor: Learning Tri-Perspective View Policy Diffusion Field for Multi-Task Robotic Manipulation

Jingyi Tian, Le Wang, Sanping Zhou et al.

CVPR 2025

#17078

Incomplete Multi-View Multi-label Learning via Disentangled Representation and Label Semantic Embedding

Xu Yan, Jun Yin, Jie Wen

CVPR 2025

#17079

CocoER: Aligning Multi-Level Feature by Competition and Coordination for Emotion Recognition

Xuli Shen, Hua Cai, Weilin Shen et al.

CVPR 2025

#17080

Brain-Inspired Spiking Neural Networks for Energy-Efficient Object Detection

Ziqi Li, Tao Gao, Yisheng An et al.

CVPR 2025

#17081

PointSR: Self-Regularized Point Supervision for Drone-View Object Detection

Weizhuo Li, Yue Xi, Wenjing Jia et al.

CVPR 2025

#17082

Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning

Menglong Zhang, Fuyuan Qian, Quanying Liu

ICLR 2025 • oral • arXiv:2506.19785

#17083

CryoGEN: Generative Energy-based Models for Cryogenic Electron Tomography Reconstruction

Yunfei Teng, Yuxuan Ren, Kai Chen et al.

ICLR 2025

#17084

LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing

Federico Girella, Davide Talon, Ziyue Liu et al.

ICCV 2025 • arXiv:2507.22627

#17085

Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos

Yuang Feng, Shuyong Gao, Fuzhen Yan et al.

ICCV 2025 • arXiv:2503.17050

#17086

Multi-View Slot Attention Using Paraphrased Texts for Face Anti-Spoofing

Jeongmin Yu, Susang Kim, Kisu Lee et al.

ICCV 2025 • arXiv:2509.06336

#17087

KAN: Kolmogorov–Arnold Networks

Ziming Liu, Yixuan Wang, Sachin Vaidya et al.

ICLR 2025

#17088

GFPack++: Attention-Driven Gradient Fields for Optimizing 2D Irregular Packing

Tianyang Xue, Lin Lu, Yang Liu et al.

ICCV 2025 • highlight

#17089

Online Clustering with Nearly Optimal Consistency

T-H. Hubert Chan, Shaofeng Jiang, Tianyi Wu et al.

ICLR 2025

#17090

Mitigating Catastrophic Overfitting in Fast Adversarial Training via Label Information Elimination

Chao Pan, Ke Tang, Li Qing et al.

ICCV 2025

#17091

MetaScope: Optics-Driven Neural Network for Ultra-Micro Metalens Endoscopy

Wuyang Li, Wentao Pan, Xiaoyuan Liu et al.

ICCV 2025 • highlight • arXiv:2508.03596

#17092

Dual-Rate Dynamic Teacher for Source-Free Domain Adaptive Object Detection

Qi He, Xiao Wu, Jun-Yan He et al.

ICCV 2025

#17093

Camouflage Anything: Learning to Hide using Controlled Out-painting and Representation Engineering

Biplab Das, Viswanath Gopalakrishnan

CVPR 2025

#17094

Leveraging Temporal Cues for Semi-Supervised Multi-View 3D Object Detection

Jinhyung Park, Navyata Sanghvi, Hiroki Adachi et al.

CVPR 2025

#17095

Interpretable point cloud classification using multiple instance learning

Matt De Vries, Reed Naidoo, Olga Fourkioti et al.

ICCV 2025 • highlight

#17096

Mitigating Geometric Degradation in Fast DownSampling via FastAdapter for Point Cloud Segmentation

Shuofeng Sun, Haibin Yan

ICCV 2025

#17097

TryOn-Refiner: Conditional Rectified-flow-based TryOn Refiner for More Accurate Detail Reconstruction

Wen Qian

ICCV 2025

#17098

Regularized Proportional Fairness Mechanism for Resource Allocation Without Money

Sujay Bhatt, Alec Koppel, Sumitra Ganesh et al.

ICLR 2025 • arXiv:2501.01111

#17099

Dynamic Neural Fortresses: An Adaptive Shield for Model Extraction Defense

Siyu Luan, Zhenyi Wang, Li Shen et al.

ICLR 2025

#17100

Compositional Targeted Multi-Label Universal Perturbations

Hassan Mahmood, Ehsan Elhamifar

CVPR 2025

#17101

Protein Language Model Fitness is a Matter of Preference

Cade Gordon, Amy Lu, Pieter Abbeel

ICLR 2025

#17102

ODA-GAN: Orthogonal Decoupling Alignment GAN Assisted by Weakly-supervised Learning for Virtual Immunohistochemistry Staining

Tong Wang, Mingkang Wang, Zhongze Wang et al.

CVPR 2025

#17103

Watch Less, Do More: Implicit Skill Discovery for Video-Conditioned Policy

Wang, Zongqing Lu

ICLR 2025

#17104

Learning and aligning single-neuron invariance manifolds in visual cortex

Mohammad Bashiri, Luca Baroni, Ján Antolík et al.

ICLR 2025

#17105

LACONIC: A 3D Layout Adapter for Controllable Image Creation

Léopold Maillard, Tom Durand, Adrien RAMANANA RAHARY et al.

ICCV 2025 • arXiv:2507.03257

#17106

ViKIENet: Towards Efficient 3D Object Detection with Virtual Key Instance Enhanced Network

Zhuochen Yu, Bijie Qiu, Andy W. H. Khong

CVPR 2025

#17107

SEHDR: Single-Exposure HDR Novel View Synthesis via 3D Gaussian Bracketing

Yiyu Li, Haoyuan Wang, Ke Xu et al.

ICCV 2025 • arXiv:2509.20400

#17108

Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint

Jiafei Lyu, Mengbei Yan, Zhongjian Qiao et al.

ICLR 2025

#17109

High-Resolution Spatiotemporal Modeling with Global-Local State Space Models for Video-Based Human Pose Estimation

Runyang Feng, Hyung Jin Chang, Tze Ho Elden Tse et al.

ICCV 2025 • arXiv:2510.11017

#17110

Beyond Single-Modal Boundary: Cross-Modal Anomaly Detection through Visual Prototype and Harmonization

Kai Mao, Ping Wei, Yiyang Lian et al.

CVPR 2025

#17111

C2MIL: Synchronizing Semantic and Topological Causalities in Multiple Instance Learning for Robust and Interpretable Survival Analysis

Min Cen, Zhenfeng Zhuang, Yuzhe Zhang et al.

ICCV 2025

#17112

Text Augmented Correlation Transformer For Few-shot Classification & Segmentation

Srinivasa Rao Nandam, Sara Atito, Zhenhua Feng et al.

CVPR 2025

#17113

TARS: Traffic-Aware Radar Scene Flow Estimation

Jialong Wu, Marco Braun, Dominic Spata et al.

ICCV 2025 • arXiv:2503.10210

#17114

EYE3: Turn Anything into Naked-eye 3D

Yingde Song, Zongyuan Yang, Baolin Liu et al.

ICCV 2025

#17115

TAGA: Self-supervised Learning for Template-free Animatable Gaussian Articulated Model

Zhichao Zhai, Guikun Chen, Wenguan Wang et al.

CVPR 2025

#17116

Lost in Prediction: Why Social Media Narratives Don't Help Macroeconomic Forecasting?

Almog Gueta, Roi Reichart, Amir Feder et al.

ICLR 2025

#17117

All-Day Multi-Camera Multi-Target Tracking

Huijie Fan, Yu Qiao, Yihao Zhen et al.

CVPR 2025

#17118

Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding

Wenbo Chen, Zhen Xu, Ruotao Xu et al.

CVPR 2025

#17119

Conditional Visual Autoregressive Modeling for Pathological Image Restoration

Ziyi Liu, Zhe Xu, Jiabo MA et al.

ICCV 2025

#17120

A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds

Jizong Peng, Tze Ho Elden Tse, Kai Xu et al.

ICCV 2025 • highlight • arXiv:2504.09129

#17121

Parametric Shadow Control for Portrait Generation in Text-to-Image Diffusion Models

Haoming Cai, Tsung-Wei Huang, Shiv Gehlot et al.

ICCV 2025 • arXiv:2503.21943

#17122

Leaps and Bounds: An Improved Point Cloud Winding Number Formulation for Fast Normal Estimation and Surface Reconstruction

Chamin Hewa Koneputugodage, Dylan Campbell, Stephen Gould

ICCV 2025

#17123

Hazy Low-Quality Satellite Video Restoration Via Learning Optimal Joint Degradation Patterns and Continuous-Scale Super-Resolution Reconstruction

Ning Ni, Libao Zhang

CVPR 2025

#17124

ADD: Attribution-Driven Data Augmentation Framework for Boosting Image Super-Resolution

Zeyu Mi, Yu-Bin Yang

CVPR 2025

#17125

LOIRE: LifelOng learning on Incremental data via pre-trained language model gRowth Efficiently

Xue Han, Yitong Wang, Junlan Feng et al.

ICLR 2025

#17126

SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds

Jinfeng Xu, Xianzhi Li, Yuan Tang et al.

CVPR 2025 • arXiv:2506.13224

#17127

A View-consistent Sampling Method for Regularized Training of Neural Radiance Fields

Aoxiang Fan, Corentin Dumery, Nicolas Talabot et al.

ICCV 2025 • arXiv:2507.04408

#17128

MultimodalStudio: A Heterogeneous Sensor Dataset and Framework for Neural Rendering across Multiple Imaging Modalities

Federico Lincetto, Gianluca Agresti, Mattia Rossi et al.

CVPR 2025 • arXiv:2503.19673

#17129

EEGMirror: Leveraging EEG data in the wild via Montage-Agnostic Self-Supervision for EEG to Video Decoding

Xuan-Hao Liu, Bao-liang Lu, Wei-Long Zheng

ICCV 2025

#17130

All-Optical Nonlinear Diffractive Deep Network for Ultrafast Image Denoising

Xiaoling Zhou, Zhemg Lee, Wei Ye et al.

CVPR 2025 • highlight

#17131

DejaVid: Encoder-Agnostic Learned Temporal Matching for Video Classification

Darryl Ho, Samuel Madden

CVPR 2025 • arXiv:2506.12585

#17132

Hierarchical Knowledge Prompt Tuning for Multi-task Test-Time Adaptation

Qiang Zhang, Mengsheng Zhao, Jiawei Liu et al.

CVPR 2025

#17133

Harnessing Text-to-Image Diffusion Models for Point Cloud Self-Supervised Learning

Yiyang Chen, Shanshan Zhao, Lunhao Duan et al.

ICCV 2025 • arXiv:2507.09102

#17134

A Focused Human Body Model for Accurate Anthropometric Measurements Extraction

Shuhang Chen, Xianliang Huang, Zhizhou Zhong et al.

CVPR 2025

#17135

Optimality of Matrix Mechanism on $\ell_p^p$-metric

Zongrui Zou, Jingcheng Liu, Jalaj Upadhyay

ICLR 2025 • arXiv:2406.02140

#17136

OD-RASE: Ontology-Driven Risk Assessment and Safety Enhancement for Autonomous Driving

Kota Shimomura, Masaki Nambata, Atsuya Ishikawa et al.

ICCV 2025

#17137

Accelerating Diffusion Sampling via Exploiting Local Transition Coherence

shangwen zhu, Han Zhang, Zhantao Yang et al.

ICCV 2025 • arXiv:2503.09675

#17138

Exploring Timeline Control for Facial Motion Generation

Yifeng Ma, Jinwei Qi, Chaonan Ji et al.

CVPR 2025 • arXiv:2505.20861

#17139

Simulating Dual-Pixel Images From Ray Tracing For Depth Estimation

Fengchen He, Dayang Zhao, Hao Xu et al.

ICCV 2025 • arXiv:2503.11213

#17140

Aligning Global Semantics and Local Textures in Generative Video Enhancement

Zhikai Chen, Fuchen Long, Zhaofan Qiu et al.

ICCV 2025

#17141

Completing 3D Partial Assemblies with View-Consistent 2D-3D Correspondence

Weihao Wang, Yu Lan, Mingyu You et al.

ICCV 2025

#17142

MDP-Omni: Parameter-free Multimodal Depth Prior-based Sampling for Omnidirectional Stereo Matching

Eunjin Son, HyungGi Jo, Wookyong Kwon et al.

ICCV 2025

#17143

Text-to-Any-Skeleton Motion Generation Without Retargeting

Qingyuan Liu, Ke Lv, Kun Dong et al.

ICCV 2025

#17144

STEP-DETR: Advancing DETR-based Semi-Supervised Object Detection with Super Teacher and Pseudo-Label Guided Text Queries

Tahira Shehzadi, Khurram Azeem Hashmi, Shalini Sarode et al.

ICCV 2025

#17145

Bad-PFL: Exploiting Backdoor Attacks against Personalized Federated Learning

Mingyuan Fan, Zhanyi Hu, Fuyi Wang et al.

ICLR 2025

#17146

KDA: Knowledge Diffusion Alignment with Enhanced Context for Video Temporal Grounding

Ran Ran, Jiwei Wei, Shiyuan He et al.

ICCV 2025

#17147

Be More Specific: Evaluating Object-centric Realism in Synthetic Images

Anqi Liang, Ciprian Adrian Corneanu, Qianli Feng et al.

CVPR 2025

#17148

GPVK-VL: Geometry-Preserving Virtual Keyframes for Visual Localization under Large Viewpoint Changes

Yunxuan Li, Lei Fan, Xiaoying Xing et al.

CVPR 2025

#17149

EDM: Efficient Deep Feature Matching

Xi Li, Tong Rao, Cihui Pan

ICCV 2025 • highlight • arXiv:2503.05122

#17150

AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction

Xuying Zhang, Yupeng Zhou, Kai Wang et al.

ICCV 2025

#17151

SAMora: Enhancing SAM through Hierarchical Self-Supervised Pre-Training for Medical Images

Shuhang Chen, Hangjie Yuan, Pengwei Liu et al.

ICCV 2025 • arXiv:2511.08626

#17152

Layered Motion Fusion: Lifting Motion Segmentation to 3D in Egocentric Videos

Vadim Tschernezki, Diane Larlus, Andrea Vedaldi et al.

CVPR 2025 • arXiv:2506.05546

#17153

Relation-Aware Diffusion for Heterogeneous Graphs with Partially Observed Features

Daeho Um, Yoonji Lee, Jiwoong Park et al.

ICLR 2025

#17154

UniversalBooth: Model-Agnostic Personalized Text-to-Image Generation

Songhua Liu, Ruonan Yu, Xinchao Wang

ICCV 2025

#17155

An Illustrated Guide to Automatic Sparse Differentiation

Adrian Hill, Guillaume Dalle, Alexis Montoison

ICLR 2025

#17156

Adapting Pre-trained 3D Models for Point Cloud Video Understanding via Cross-frame Spatio-temporal Perception

Baixuan Lv, Yaohua Zha, Tao Dai et al.

CVPR 2025

#17157

EMoTive: Event-guided Trajectory Modeling for 3D Motion Estimation

Zengyu Wan, Wei Zhai, Yang Cao et al.

ICCV 2025 • arXiv:2503.11371

#17158

Task-Decoupled Bézier Surface Constraint for Uneven Low-Light Image Enhancement

Xingxiang Zhou, Xiangdong Su, Haoran Zhang et al.

ICCV 2025

#17159

Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D Motion

Saad Lahlali, Sandra Kara, Hejer AMMAR et al.

CVPR 2025 • arXiv:2503.15022

#17160

3D Test-time Adaptation via Graph Spectral Driven Point Shift

Xin Wei, Qin Yang, Yijie Fang et al.

ICCV 2025 • arXiv:2507.18225

#17161

TOTP: Transferable Online Pedestrian Trajectory Prediction with Temporal-Adaptive Mamba Latent Diffusion

Ziyang Ren, Ping Wei, Shangqi Deng et al.

ICCV 2025

#17162

Rethinking Reconstruction and Denoising in the Dark: New Perspective, General Architecture and Beyond

Long Ma, Tengyu Ma, Ziye Li et al.

CVPR 2025

#17163

Wave-MambaAD: Wavelet-driven State Space Model for Multi-class Unsupervised Anomaly Detection

Qiao Zhang, Mingwen Shao, Xinyuan Chen et al.

ICCV 2025

#17164

Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation

Yuheng Feng, Changsong Wen, Zelin Peng et al.

CVPR 2025

#17165

Semantic Discrepancy-aware Detector for Image Forgery Identification

Wang Ziye, Minghang Yu, Chunyan Xu et al.

ICCV 2025 • arXiv:2508.12341

#17166

GeoAvatar: Geometrically-Consistent Multi-Person Avatar Reconstruction from Sparse Multi-View Videos

Soohyun Lee, SeoYeon Kim, HeeKyung Lee et al.

CVPR 2025

#17167

Spatially-Varying Autofocus

Yingsi Qin, Aswin Sankaranarayanan, Matthew O'Toole

ICCV 2025

#17168

Robust System Identification: Finite-sample Guarantees and Connection to Regularization

Hank Park, Grani A. Hanasusanto, Yingying Li

ICLR 2025

#17169

Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy

Zesen Cheng, Hang Zhang, Kehan Li et al.

CVPR 2025 • highlight

#17170

GeoMM: On Geodesic Perspective for Multi-modal Learning

Shibin Mei, Hang Wang, Bingbing Ni

CVPR 2025 • arXiv:2505.11216

#17171

Learnable Fractional Reaction-Diffusion Dynamics for Under-Display ToF Imaging and Beyond

Xin Qiao, Matteo Poggi, Xing Wei et al.

ICCV 2025 • arXiv:2511.01704

#17172

Diffusion Curriculum: Synthetic-to-Real Data Curriculum via Image-Guided Diffusion

Yijun Liang, Shweta Bhardwaj, Tianyi Zhou

ICCV 2025 • arXiv:2410.13674

#17173

Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning

Shouhang Zhu, Chenglin Li, Yuankun Jiang et al.

CVPR 2025

#17174

Font-Agent: Enhancing Font Understanding with Large Language Models

Yingxin Lai, Cuijie Xu, Haitian Shi et al.

CVPR 2025

#17175

Multi-Modal Contrastive Masked Autoencoders: A Two-Stage Progressive Pre-training Approach for RGBD Datasets

Muhammad Abdullah Jamal, Omid Mohareri

CVPR 2025

#17176

HuPerFlow: A Comprehensive Benchmark for Human vs. Machine Motion Estimation Comparison

Yung-Hao Yang, Zitang Sun, Taiki Fukiage et al.

CVPR 2025 • highlight

#17177

STINR: Deciphering Spatial Transcriptomics via Implicit Neural Representation

Yisi Luo, Xile Zhao, Kai Ye et al.

CVPR 2025

#17178

3D-SLNR: A Super Lightweight Neural Representation for Large-scale 3D Mapping

Chenhui Shi, Fulin Tang, Ning An et al.

CVPR 2025

#17179

Intricacies of Feature Geometry in Large Language Models

Satvik Golechha, Lucius Bushnaq, Euan Ong et al.

ICLR 2025

#17180

Shape as Line Segments: Accurate and Flexible Implicit Surface Representation

Siyu Ren, Junhui Hou

ICLR 2025

#17181

NExUME: Adaptive Training and Inference for DNNs under Intermittent Power Environments

Cyan Subhra Mishra, Deeksha Chaudhary, Jack Sampson et al.

ICLR 2025

#17182

Flow-MIL: Constructing Highly-expressive Latent Feature Space For Whole Slide Image Classification Using Normalizing Flow

Yingfan MA, Bohan An, Ao Shen et al.

ICCV 2025

#17183

SET: Spectral Enhancement for Tiny Object Detection

Huixin Sun, Runqi Wang, Yanjing Li et al.

CVPR 2025

#17184

The Source Image is the Best Attention for Infrared and Visible Image Fusion

Song Wang, Xie Han, Liqun Kuang et al.

ICCV 2025

#17185

Illumination Spectrum Estimation for Multispectral Images via Surface Reflectance Modeling and Spatial-Spectral Feature Generation

Hyejin Oh, Woo-Shik Kim, Sangyoon Lee et al.

CVPR 2025

#17186

Event-based Visual Vibrometry

Xinyu Zhou, Peiqi Duan, Yeliduosi Xiaokaiti et al.

ICCV 2025

#17187

S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction

Guangting Zheng, Jiajun Deng, Xiaomeng Chu et al.

ICCV 2025 • arXiv:2503.08217

#17188

Cross-Category Subjectivity Generalization for Style-Adaptive Sketch Re-ID

Zechao Hu, Zhengwei Yang, Hao Li et al.

ICCV 2025

#17189

Rethinking the Adversarial Robustness of Multi-Exit Neural Networks in an Attack-Defense Game

Keyizhi Xu, Chi Zhang, Zhan Chen et al.

CVPR 2025

#17190

Towards Human-like Virtual Beings: Simulating Human Behavior in 3D Scenes

CHEN LIANG, Wenguan Wang, Yi Yang

ICCV 2025

#17191

EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection

Ming Sun, Rui Wang, Zixuan Zhu et al.

CVPR 2025

#17192

Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research

Michał Bortkiewicz, Władysław Pałucki, Vivek Myers et al.

ICLR 2025

#17193

How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game

Ziyue Wang, Yurui Dong, Fuwen Luo et al.

ICCV 2025

#17194

Frequency-Semantic Enhanced Variational Autoencoder for Zero-Shot Skeleton-based Action Recognition

Wenhan Wu, Zhishuai Guo, Chen Chen et al.

ICCV 2025 • arXiv:2506.22179

#17195

Frequency-Guided Diffusion for Training-Free Text-Driven Image Translation

Zheng Gao, Jifei Song, Zhensong Zhang et al.

ICCV 2025

#17196

StyleSRN: Scene Text Image Super-Resolution with Text Style Embedding

Shengrong Yuan, Runmin Wang, Ke Hao et al.

ICCV 2025

#17197

VolFormer: Explore More Comprehensive Cube Interaction for Hyperspectral Image Restoration and Beyond

Dabing Yu, Zheng Gao

CVPR 2025

#17198

Incremental Few-Shot Semantic Segmentation via Multi-Level Switchable Visual Prompts

Maoxian Wan, Kaige Li, Qichuan Geng et al.

ICCV 2025

#17199

Neuroverse3D: Developing In-Context Learning Universal Model for Neuroimaging in 3D

Jiesi Hu, Hanyang Peng, Yanwu Yang et al.

ICCV 2025 • arXiv:2503.02410

#17200

Splat-based 3D Scene Reconstruction with Extreme Motion-blur

Hyeonjoong Jang, Dongyoung Choi, Donggun Kim et al.

ICCV 2025
