VisualLens: Personalization through Task-Agnostic Visual History
Abstract
Existing recommendation systems either rely on user interaction logs, such as online shopping history for shopping recommendations, or focus on text signals. However, item-based histories are not always accessible and do not generalize to multimodal recommendation. We hypothesize that a user's visual history -- comprising images from daily life -- offers rich, task-agnostic insights into their interests and preferences, and can thus be leveraged for effective personalization. To this end, we propose VisualLens, a novel framework that leverages multimodal large language models (MLLMs) to enable personalization from a task-agnostic visual history. VisualLens extracts, filters, and refines a spectrum of user-profile signals from the visual history to support personalized recommendation. We created two new benchmarks, Google-Review-V and Yelp-V, with task-agnostic visual histories, and show that VisualLens improves over state-of-the-art item-based multimodal recommendation by 5-10% on Hit@3 and outperforms GPT-4o by 2-5%. Further analysis shows that VisualLens is robust across varying history lengths and adapts well to both longer histories and unseen content categories.
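The Hit@3 numbers above follow the standard top-k hit-rate definition: a query counts as a hit if the ground-truth item appears among the top k ranked candidates, averaged over all queries. The abstract does not give the evaluation code, so the sketch below is a minimal, generic illustration of the metric, not the paper's implementation; `hit_at_k` and `mean_hit_at_k` are hypothetical names.

```python
from typing import Iterable, Sequence, Set, Tuple

def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 3) -> float:
    """1.0 if any relevant item appears in the top-k of the ranking, else 0.0."""
    return 1.0 if set(ranked[:k]) & relevant else 0.0

def mean_hit_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 3) -> float:
    """Average Hit@k over (ranking, relevant-set) pairs, one per evaluation query."""
    runs = list(runs)
    return sum(hit_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)

# Example: the ground-truth item is ranked second, so this query is a hit at k=3.
print(hit_at_k(["coffee shop A", "ramen bar B", "museum C"], {"ramen bar B"}))  # 1.0
```

Under this metric, the reported gains mean that VisualLens places a correct item in its top three suggestions 5-10% more often than item-based multimodal baselines.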