"object localization" Papers
16 papers found
Conference
CrossOver: 3D Scene Cross-Modal Alignment
Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys et al.
FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection
Yao Xiao, Tingfa Xu, Yu Xin et al.
Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360° Firefighting Video
Aditi Tiwari, Farzaneh Masoud, Dac Nguyen et al.
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels
Yongshuo Zong, Qin ZHANG, DONGSHENG An et al.
MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval
Huaying Yuan, Jian Ni, Zheng Liu et al.
RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations
Savya Khosla, Sethuraman T V, Alexander G. Schwing et al.
SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
Rong Li, Shijie Li, Lingdong Kong et al.
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
Guanxing Lu, Ziwei Wang, Changliu Liu et al.
Adaptive Multi-task Learning for Few-shot Object Detection
Yan Ren, Yanling Li, Wai-Kin Adams Kong
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Chanho Lee, Jinsu Son, Hyounguk Shon et al.
Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models
Yufei Zhan, Yousong Zhu, Zhiyang Chen et al.
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding
Jin-Chuan Shi, Miao Wang, Haobin Duan et al.
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek et al.
Point Segment and Count: A Generalized Framework for Object Counting
Zhizhong Huang, Mingliang Dai, Yi Zhang et al.
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna et al.
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World
Weiyun Wang Weiyun, yiming ren, Haowen Luo et al.