Jing Wang

papers

713

total citations

papers (29)

From Two to One: A New Scene Text Recognizer With Visual Language Modeling Network

ICCV 2021, arXiv

177

citations

SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering

AAAI 2024


Online Video Understanding: OVBench and VideoChat-Online

CVPR 2025, arXiv


Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation

ECCV 2022, arXiv


WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

CVPR 2025, arXiv


An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

CVPR 2025, arXiv


SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering

AAAI 2025, arXiv


What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context

ICLR 2025, arXiv


Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation

ICCV 2025, arXiv


StreamForest: Efficient Online Video Understanding with Persistent Event Memory

NeurIPS 2025, arXiv


PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution

CVPR 2025, arXiv


AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering

CVPR 2025


Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling

ECCV 2024, arXiv


CoDTS: Enhancing Sparsely Supervised Collaborative Perception with a Dual Teacher-Student Framework

AAAI 2025, arXiv


Detecting Tampered Scene Text in the Wild

ECCV 2022


Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

CVPR 2021


Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation

ICCV 2025


Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation

CVPR 2025, arXiv


MCAM: Multimodal Causal Analysis Model for Ego-Vehicle-Level Driving Video Understanding

ICCV 2025, arXiv


Handling Heterogeneous Curvatures in Bandit LQR Control

ICML 2024


SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition

ICCV 2025


Learning with Adaptive Resource Allocation

ICML 2024



papers (29)

From Two to One: A New Scene Text Recognizer With Visual Language Modeling Network

Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation

Learning To Filter: Siamese Relation Network for Robust Tracking

AlphaVC: High-Performance and Efficient Learned Video Compression

Scene Text Retrieval via Joint Text Detection and Similarity Learning

WISA: World simulator assistant for physics-aware text-to-video generation

Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement

Content-Oriented Learned Image Compression

SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering

Online Video Understanding: OVBench and VideoChat-Online

Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering

What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context

Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation

StreamForest: Efficient Online Video Understanding with Persistent Event Memory

PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution

AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering

Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling

CoDTS: Enhancing Sparsely Supervised Collaborative Perception with a Dual Teacher-Student Framework

Detecting Tampered Scene Text in the Wild

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation

Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation

MCAM: Multimodal Causal Analysis Model for Ego-Vehicle-Level Driving Video Understanding

Handling Heterogeneous Curvatures in Bandit LQR Control

SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition

Learning with Adaptive Resource Allocation
