All Papers

Verifying message-passing neural networks via topology-based bounds tightening

Christopher Hojny, Shiqiang Zhang, Juan Campos et al.

ICML 2024, arXiv:2402.13937
14 citations

VersatileGaussian: Real-time Neural Rendering for Versatile Tasks using Gaussian Splatting

Renjie Li, Zhiwen Fan, Bohua Wang et al.

ECCV 2024
3 citations

Versatile Incremental Learning: Towards Class and Domain-Agnostic Incremental Learning

Minyeong Park, Jae-Ho Lee, Gyeong-Moon Park

ECCV 2024, arXiv:2409.10956
6 citations

Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation

Xiaoyang Chen, Hao Zheng, Yuemeng Li et al.

CVPR 2024, arXiv:2311.10696
15 citations

Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy

Gengyu Zhang, Hao Tang, Yan Yan

CVPR 2024, arXiv:2404.02176
4 citations

VersVideo: Leveraging Enhanced Temporal Diffusion Models for Versatile Video Generation

Jinxi Xiang, Ricong Huang, Jun Zhang et al.

ICLR 2024 (oral)

VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks

Zhaomin Wu, Junyi Hou, Bingsheng He

ICLR 2024, arXiv:2307.02040
7 citations

VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking

Jens Hellekes, Manuel Mühlhaus, Reza Bahmanyar et al.

ECCV 2024
3 citations

VFLAIR: A Research Library and Benchmark for Vertical Federated Learning

Tianyuan Zou, Zixuan Gu, Yu He et al.

ICLR 2024, arXiv:2310.09827
15 citations

VF-NeRF: Viewshed Fields for Rigid NeRF Registration

Leo Segre, Shai Avidan

ECCV 2024, arXiv:2404.03349
2 citations

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

ECCV 2024, arXiv:2403.12034
54 citations

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.

CVPR 2024 (highlight)

V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs

Penghao Wu, Saining Xie

CVPR 2024, arXiv:2312.14135
345 citations

ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders

Jefferson Hernandez, Ruben Villegas, Vicente Ordonez

ECCV 2024, arXiv:2303.12001
13 citations

VicTR: Video-conditioned Text Representations for Activity Recognition

Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani et al.

CVPR 2024, arXiv:2304.02560
38 citations

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

Jiaming Liu, Senqiao Yang, Peidong Jia et al.

ICLR 2024, arXiv:2306.04344
63 citations

Video2Game: Real-time Interactive Realistic and Browser-Compatible Environment from a Single Video

Hongchi Xia, Chih-Hao Lin, Wei-Chiu Ma et al.

CVPR 2024

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

Yue Fan, Xiaojian Ma, Rujie Wu et al.

ECCV 2024, arXiv:2403.11481
161 citations

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

Xiaohan Wang, Yuhui Zhang, Orr Zohar et al.

ECCV 2024, arXiv:2403.10517
231 citations

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024, arXiv:2403.19926
11 citations

VideoBooth: Diffusion-based Video Generation with Image Prompts

Yuming Jiang, Tianxing Wu, Shuai Yang et al.

CVPR 2024, arXiv:2312.00777
123 citations

VideoClusterNet: Self-Supervised and Adaptive Face Clustering for Videos

Devesh Bilwakumar Walawalkar, Pablo Garrido

ECCV 2024, arXiv:2407.12214
3 citations

VideoCon: Robust Video-Language Alignment via Contrast Captions

Hritik Bansal, Yonatan Bitton, Idan Szpektor et al.

CVPR 2024, arXiv:2311.10111
30 citations

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Haoxin Chen, Yong Zhang, Xiaodong Cun et al.

CVPR 2024, arXiv:2401.09047
512 citations

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

XuDong Wang, Ishan Misra, Ziyun Zeng et al.

CVPR 2024, arXiv:2308.14710
37 citations

Video Decomposition Prior: Editing Videos Layer by Layer

Gaurav Shrivastava, Ser-Nam Lim, Abhinav Shrivastava

ICLR 2024

Video Editing via Factorized Diffusion Distillation

Uriel Singer, Amit Zohar, Yuval Kirstain et al.

ECCV 2024, arXiv:2403.09334
29 citations

Video Event Extraction with Multi-View Interaction Knowledge Distillation

Kaiwen Wei, Du Runyan, Li Jin et al.

AAAI 2024 (paper)

Video Frame Interpolation via Direct Synthesis with the Event-based Reference

Yuhan Liu, Yongjian Deng, Hao Chen et al.

CVPR 2024

Video Frame Prediction from a Single Image and Events

Juanjuan Zhu, Zhexiong Wan, Yuchao Dai

AAAI 2024 (paper)
6 citations

VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding

Syed Talal Wasim, Muzammal Naseer, Salman Khan et al.

CVPR 2024

Video Harmonization with Triplet Spatio-Temporal Variation Patterns

Zonghui Guo, XinYu Han, Jie Zhang et al.

CVPR 2024

Video Interpolation with Diffusion Models

Siddhant Jain, Daniel Watson, Aleksander Holynski et al.

CVPR 2024, arXiv:2404.01203
66 citations

Video-Language Aligned Transformer for Video Question Answering

AAAI 2024 (paper)

Video Language Planning

Yilun Du, Sherry Yang, Pete Florence et al.

ICLR 2024, arXiv:2310.10625
147 citations

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

Yang Jin, Zhicheng Sun, Kun Xu et al.

ICML 2024 (oral), arXiv:2402.03161
82 citations

VideoLLM-online: Online Video Large Language Model for Streaming Video

Joya Chen, Zhaoyang Lv, Shiwei Wu et al.

CVPR 2024, arXiv:2406.11816
116 citations

VideoMAC: Video Masked Autoencoders Meet ConvNets

Gensheng Pei, Tao Chen, Xiruo Jiang et al.

CVPR 2024, arXiv:2402.19082
21 citations

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

ECCV 2024, arXiv:2407.08476
24 citations

VideoMamba: State Space Model for Efficient Video Understanding

Kunchang Li, Xinhao Li, Yi Wang et al.

ECCV 2024, arXiv:2403.06977
407 citations

Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition

Hao Fei, Shengqiong Wu, Wei Ji et al.

ICML 2024 (oral), arXiv:2501.03230
146 citations

Video-P2P: Video Editing with Cross-attention Control

Shaoteng Liu, Yuechen Zhang, Wenbo Li et al.

CVPR 2024, arXiv:2303.04761
312 citations

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dan Kondratyuk, Lijun Yu, Xiuye Gu et al.

ICML 2024, arXiv:2312.14125
420 citations

Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes

Gaurav Shrivastava, Abhinav Shrivastava

CVPR 2024
16 citations

VideoPrism: A Foundational Visual Encoder for Video Understanding

Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.

ICML 2024, arXiv:2402.13217
73 citations

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024, arXiv:2312.00937
37 citations

Video ReCap: Recursive Captioning of Hour-Long Videos

Md Mohaiminul Islam, Vu Bao Ngan Ho, Xitong Yang et al.

CVPR 2024, arXiv:2402.13250
85 citations

Video Recognition in Portrait Mode

Mingfei Han, Linjie Yang, Xiaojie Jin et al.

CVPR 2024, arXiv:2312.13746
7 citations

VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams

Liao Wang, Kaixin Yao, Chengcheng Guo et al.

CVPR 2024, arXiv:2312.01407
22 citations

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

Guangzhi Sun, Wenyi Yu, Changli Tang et al.

ICML 2024 (oral), arXiv:2406.15704
76 citations