Most Cited AAAI "predictive accuracy optimization" Papers
CUTS+: High-Dimensional Causal Discovery from Irregular Time-Series
Yuxiao Cheng, Lianglong Li, Tingxiong Xiao et al.
Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning
Samyadeep Basu, Shell Hu, Daniela Massiceti et al.
MENTOR: Multi-level Self-supervised Learning for Multimodal Recommendation
Jinfeng Xu, Zheyu Chen, Shuo Yang et al.
Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition
Jianyang Xie, Yanda Meng, Yitian Zhao et al.
Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation
Yuyang Ye, Zhi Zheng, Yishan Shen et al.
Gramformer: Learning Crowd Counting via Graph-Modulated Transformer
Hui LIN, Zhiheng Ma, Xiaopeng Hong et al.
Feature Fusion from Head to Tail for Long-Tailed Visual Recognition
Mengke Li, Zhikai HU, Yang Lu et al.
ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank
Zhanjie Zhang, Quanwei Zhang, Wei Xing et al.
Improving Audio-Visual Segmentation with Bidirectional Generation
Dawei Hao, Yuxin Mao, Bowen He et al.
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators
Chen Zhang, L. F. D’Haro, Yiming Chen et al.
Reinforced Adaptive Knowledge Learning for Multimodal Fake News Detection
Litian Zhang, Xiaoming Zhang, Chaozhuo Li et al.
Improving Automatic VQA Evaluation Using Large Language Models
Oscar Mañas, Benno Krojer, Aishwarya Agrawal
DeS3: Adaptive Attention-Driven Self and Soft Shadow Removal Using ViT Similarity
Yeying Jin, Wenhan Yang, W. Ye et al.
Affordances-Oriented Planning Using Foundation Models for Continuous Vision-Language Navigation
Jiaqi Chen, Bingqian Lin, Xinmin Liu et al.
DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis
Pan Wang, Qiang Zhou, Yawen Wu et al.
Unifying Visual and Vision-Language Tracking via Contrastive Learning
AltDiffusion: A Multilingual Text-to-Image Diffusion Model
Fulong Ye, Guang Liu, Xinya Wu et al.
DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
Namhyuk Ahn, Junsoo Lee, Chunggi Lee et al.
Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang et al.
End-to-End Autonomous Driving Through V2X Cooperation
Haibao Yu, Wenxian Yang, Jiaru Zhong et al.
MultiBooth: Towards Generating All Your Concepts in an Image from Text
Chenyang Zhu, Kai Li, Yue Ma et al.
HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection
Zican Shi, Jing Hu, Jie Ren et al.
HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven Harmony Search and Genetic Algorithm Using LLMs
Pham Vu Tuan Dat, Long Doan, Huynh Thi Thanh Binh
Image Conductor: Precision Control for Interactive Video Synthesis
Yaowei Li, Xintao Wang, Zhaoyang Zhang et al.
DS-AL: A Dual-Stream Analytic Learning for Exemplar-Free Class-Incremental Learning
Huiping Zhuang, Run He, Kai Tong et al.
TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation
Yuhao Wang, Xuehu Liu, Pingping Zhang et al.
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
Xiangpeng Yang, Linchao Zhu, Xiaohan Wang et al.
Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning
Alexey Skrynnik, Anton Andreychuk, Maria Nesterova et al.
Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification
Zhen-Xiang Ma, Zhen-Duo Chen, Li-Jun Zhao et al.
Fine-Grained Prototypes Distillation for Few-Shot Object Detection
Zichen Wang, Bo Yang, Haonan Yue et al.
Unsupervised Continual Anomaly Detection with Contrastively-Learned Prompt
Jiaqi Liu, Kai Wu, Qiang Nie et al.
Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition
Kun Li, Dan Guo, Guoliang Chen et al.
Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries
Xinyi He, Mengyu Zhou, Xinrun Xu et al.
LLMRG: Improving Recommendations through Large Language Model Reasoning Graphs
Yan Wang, Zhixuan Chu, Xin Ouyang et al.
Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style
Shuai Tan, Bin Ji, Ye Pan
Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang et al.
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
Wanggui He, Siming Fu, Mushui Liu et al.
Towards Real-World Test-Time Adaptation: Tri-net Self-Training with Balanced Normalization
Yongyi Su, Xun Xu, Kui Jia
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling
Shimin Zhang, Qu Yang, Chenxiang Ma et al.
Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving
Yu Yang, Jianbiao Mei, Yukai Ma et al.
Debiasing Multimodal Sarcasm Detection with Contrastive Learning
Mengzhao Jia, Can Xie, Liqiang Jing
PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine
Chenrui Zhang, Lin Liu, Chuyuan Wang et al.
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection
XiaoHui Zhang, Jiangyan Yi, Chenglong Wang et al.
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models
Yubin Wang, Xinyang Jiang, De Cheng et al.
Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
Qingping Zheng, Yuanfan Guo, Jiankang Deng et al.
ENCODER: Entity Mining and Modification Relation Binding for Composed Image Retrieval
Zixu Li, Zhiwei Chen, Haokun Wen et al.
Object-Aware Domain Generalization for Object Detection
WooJu Lee, Dasol Hong, Hyungtae Lim et al.
Learning to Prompt with Text Only Supervision for Vision-Language Models
Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer et al.
Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking
Xiantao Hu, Ying Tai, Xu Zhao et al.
Fine-Grained Distillation for Long Document Retrieval
Yucheng Zhou, Tao Shen, Xiubo Geng et al.
EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer
Fei Wang, Dan Guo, Kun Li et al.
Transformer Layers as Painters
Qi Sun, Marc Pickett, Aakash Kumar Nain et al.
Large Language Models Are Neurosymbolic Reasoners
Meng Fang, Shilong Deng, Yudi Zhang et al.
SUTrack: Towards Simple and Unified Single Object Tracking
Xin Chen, Ben Kang, Wanting Geng et al.
Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector
An Lao, Qi Zhang, Chongyang Shi et al.
TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Han Shu, Wenshuo Li, Yehui Tang et al.
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval
Zhihang Liu, Jun Li, Hongtao Xie et al.
A Diffusion-Based Framework for Multi-Class Anomaly Detection
Haoyang He, Jiangning Zhang, Hongxu Chen et al.
Attribute-Missing Graph Clustering Network
Wenxuan Tu, Renxiang Guan, Sihang Zhou et al.
Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Understanding
Taolin Zhang, Sunan He, Tao Dai et al.
Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models
Liang Li, Qingyuan Li, Bo Zhang et al.
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing
Conglong Li, Zhewei Yao, Xiaoxia Wu et al.
Text-Guided Molecule Generation with Diffusion Language Model
Haisong Gong, Qiang Liu, Shu Wu et al.
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models
Shuang Li, Jiangjie Chen, Siyu Yuan et al.
Devignet: High-Resolution Vignetting Removal via a Dual Aggregated Fusion Transformer with Adaptive Channel Expansion
Shenghong Luo, Xuhang Chen, Weiwen Chen et al.
XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning
Pritam Sarkar, Ali Etemad
Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection
Soopil Kim, Sion An, Philip Chikontwe et al.
Rethinking Propagation for Unsupervised Graph Domain Adaptation
Meihan Liu, Zeyu Fang, Zhen Zhang et al.
Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation
Xinshuo Hu, Dongfang Li, Zihao Zheng et al.
How to Protect Copyright Data in Optimization of Large Language Models?
Timothy Chu, Zhao Song, Chiwun Yang
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation
Clément Chadebec, Onur Tasar, Eyal Benaroche et al.
Evaluating the Evaluator: Measuring LLMs’ Adherence to Task Evaluation Instructions
Bhuvanashree Murugadoss, Christian Poelitz, Ian Drosos et al.
Approximating the Shapley Value without Marginal Contributions
Patrick Kolpaczki, Viktor Bengs, Maximilian Muschalik et al.
Controllable Mind Visual Diffusion Model
Bohan Zeng, Shanglin Li, Xuhui Liu et al.
No Prejudice! Fair Federated Graph Neural Networks for Personalized Recommendation
Nimesh Agrawal, Anuj Sirohi, Sandeep Kumar et al.
Multi-Objective Evolution of Heuristic Using Large Language Model
Shunyu Yao, Fei Liu, Xi Lin et al.
RATT: A Thought Structure for Coherent and Correct LLM Reasoning
Jinghan Zhang, Xiting Wang, Weijieying Ren et al.
STEM: Unleashing the Power of Embeddings for Multi-Task Recommendation
Liangcai Su, Junwei Pan, Ximei Wang et al.
CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility
Bojia Zi, Shihao Zhao, Xianbiao Qi et al.
Towards Continual Knowledge Graph Embedding via Incremental Distillation
Jiajun Liu, Ke Wenjun, Peng Wang et al.
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers
Xuan Shen, Zhao Song, Yufa Zhou et al.
Rethinking Reverse Distillation for Multi-Modal Anomaly Detection
Zhihao Gu, Jiangning Zhang, Liang Liu et al.
Latent Space Editing in Transformer-Based Flow Matching
Vincent Tao Hu, Wei Zhang, Meng Tang et al.
Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle
Zhenyu Tang, Junwu Zhang, Xinhua Cheng et al.
Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
Jinsong Shi, Pan Gao, Jie Qin
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer
Yaoting Wang, Liu Weisong, Guangyao Li et al.
U-mixer: An Unet-Mixer Architecture with Stationarity Correction for Time Series Forecasting
Xiang Ma, Xuemei Li, Lexin Fang et al.
Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning
Shangchao Su, Mingzhao Yang, Bin Li et al.
Multi-Architecture Multi-Expert Diffusion Models
Yunsung Lee, Jin-Young Kim, Hyojun Go et al.
SlowTrack: Increasing the Latency of Camera-Based Perception in Autonomous Driving Using Adversarial Examples
Chen Ma, Ningfei Wang, Qi Alfred Chen et al.
MCL-NER: Cross-Lingual Named Entity Recognition via Multi-View Contrastive Learning
Ying Mo, Jian Yang, Jiahao Liu et al.
GFlow: Recovering 4D World from Monocular Video
Shizun Wang, Xingyi Yang, Qiuhong Shen et al.
Deep Variational Incomplete Multi-View Clustering: Exploring Shared Clustering Structures
Gehui Xu, Jie Wen, Chengliang Liu et al.
Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model
Decheng Liu, Xijun Wang, Chunlei Peng et al.
Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models
Liqi He, Zuchao Li, Xiantao Cai et al.
SAM-PARSER: Fine-Tuning SAM Efficiently by Parameter Space Reconstruction
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community
Jiancheng Pan, Yanxing Liu, Yuqian Fu et al.
Parallel Vertex Diffusion for Unified Visual Grounding
Zesen Cheng, Kehan Li, Peng Jin et al.
MathAttack: Attacking Large Language Models towards Math Solving Ability
Zihao Zhou, Qiufeng Wang, Mingyu Jin et al.
Robust Node Classification on Graph Data with Graph and Label Noise
Yonghua Zhu, Lei Feng, Zhenyun Deng et al.
Exploiting Label Skews in Federated Learning with Model Concatenation
Yiqun Diao, Qinbin Li, Bingsheng He
AMSP-UOD: When Vortex Convolution and Stochastic Perturbation Meet Underwater Object Detection
Jingchun Zhou, Zongxin He, Kin-Man Lam et al.
When Model Meets New Normals: Test-Time Adaptation for Unsupervised Time-Series Anomaly Detection
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions
Ryota Tanaka, Taichi Iki, Kyosuke Nishida et al.
Mono3DVG: 3D Visual Grounding in Monocular Images
Yangfan Zhan, Yuan Yuan, Zhitong Xiong
When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming
Hussein Mozannar, Gagan Bansal, Adam Fourney et al.
Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance
Wenhao Sun, Xue-Mei Dong, Benlei Cui et al.
NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views
Han Huang, Yulun Wu, Junsheng Zhou et al.
MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL
Arian Askari, Christian Poelitz, Xinye Tang
LION: Implicit Vision Prompt Tuning
Haixin Wang, Jianlong Chang, Yihang Zhai et al.
Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement
Dehuan Zhang, Jingchun Zhou, Chunle Guo et al.
xPatch: Dual-Stream Time Series Forecasting with Exponential Seasonal-Trend Decomposition
Artyom Stitsyuk, Jaesik Choi
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
Haojun Shi, Suyu Ye, Xinyu Fang et al.
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong, Yunji Kim, Sanghyuk Chun et al.
VLM2Scene: Self-Supervised Image-Text-LiDAR Learning with Foundation Models for Autonomous Driving Scene Understanding
Guibiao Liao, Jiankun Li, Xiaoqing Ye
Causal Prompting: Debiasing Large Language Model Prompting Based on Front-Door Adjustment
Congzhi Zhang, Linhai Zhang, Jialong Wu et al.
DiffBEV: Conditional Diffusion Model for Bird’s Eye View Perception
Jiayu Zou, Kun Tian, Zheng Zhu et al.
Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models
Lingzhi Wang, Xingshan Zeng, Jinsong Guo et al.
DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input
Qijian Tian, Xin Tan, Yuan Xie et al.
Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition
Qianrui Zhou, Hua Xu, Hao Li et al.
Training-Free Quantum Architecture Search
Zhimin He, Maijie Deng, Shenggen Zheng et al.
Generative Multi-Modal Knowledge Retrieval with Large Language Models
Xinwei Long, Jiali Zeng, Fandong Meng et al.
Improving Retrieval Augmented Language Model with Self-Reasoning
Yuan Xia, Jingbo Zhou, Zhenhui Shi et al.
Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting
Yifan Hu, Peiyuan Liu, Peng Zhu et al.
Propagation Tree Is Not Deep: Adaptive Graph Contrastive Learning Approach for Rumor Detection
FedMut: Generalized Federated Learning via Stochastic Mutation
Ming Hu, Cao Yue, Anran Li et al.
SCALM: Detecting Bad Practices in Smart Contracts Through LLMs
Zongwei Li, Xiaoqi Li, Wenkai Li et al.
LLM-Powered User Simulator for Recommender System
Zijian Zhang, Shuchang Liu, Ziru Liu et al.
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval
Hailang Huang, Zhijie Nie, Ziqiao Wang et al.
Learning Continuous Implicit Field with Local Distance Indicator for Arbitrary-Scale Point Cloud Upsampling
Shujuan Li, Junsheng Zhou, Baorui Ma et al.
SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization
Zhenlong Yuan, Jiakai Cao, Zhaoxin Li et al.
Efficient Spiking Neural Networks with Sparse Selective Activation for Continual Learning
Jiangrong Shen, Wenyao Ni, Qi Xu et al.
Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation
Derong Xu, Xinhang Li, Ziheng Zhang et al.
GIN-SD: Source Detection in Graphs with Incomplete Nodes via Positional Encoding and Attentive Fusion
Le Cheng, Peican Zhu, Keke Tang et al.
DGCLUSTER: A Neural Framework for Attributed Graph Clustering via Modularity Maximization
Aritra Bhowmick, Mert Kosan, Zexi Huang et al.
Towards Effective and General Graph Unlearning via Mutual Evolution
Xunkai Li, Yulin Zhao, Zhengyu Wu et al.
Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction
Senqiao Yang, Jiarui Wu, Jiaming Liu et al.
Online Boosting Adaptive Learning under Concept Drift for Multistream Classification
En Yu, Jie Lu, Bin Zhang et al.
MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models
Yan Cai, Linlin Wang, Ye Wang et al.
An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction
Urchade Zaratiana, Nadi Tomeh, Pierre Holat et al.
Probabilities of Causation with Nonbinary Treatment and Effect
Ang Li, Judea Pearl
Hierarchical Multi-Marginal Optimal Transport for Network Alignment
Zhichen Zeng, Boxin Du, Si Zhang et al.
Concept-Guided Prompt Learning for Generalization in Vision-Language Models
Yi Zhang, Ce Zhang, Ke Yu et al.
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
Yiwen Tang, Ray Zhang, Zoey Guo et al.
Fair Text-to-Image Diffusion via Fair Mapping
Jia Li, Lijie Hu, Jingfeng Zhang et al.
On the Affinity, Rationality, and Diversity of Hierarchical Topic Modeling
Xiaobao Wu, Fengjun Pan, Thong Nguyen et al.
Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation
Qihan Huang, Siming Fu, Jinlong Liu et al.
Urban Region Embedding via Multi-View Contrastive Prediction
Zechen Li, Weiming Huang, Kai Zhao et al.
Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference
Barys Liskavets, Maxim Ushakov, Shuvendu Roy et al.
Guided Real Image Dehazing Using YCbCr Color Space
Wenxuan Fang, Junkai Fan, Yu Zheng et al.
Explaining Generalization Power of a DNN Using Interactive Concepts
Huilin Zhou, Hao Zhang, Huiqi Deng et al.
Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles
Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer et al.
Multi-Prompts Learning with Cross-Modal Alignment for Attribute-Based Person Re-identification
Yajing Zhai, Yawen Zeng, Zhiyong Huang et al.
Graph-Aware Contrasting for Multivariate Time-Series Classification
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection
Tianxiang Chen, Zhentao Tan, Qi Chu et al.
Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization
Tianrui Jia, Haoyang Li, Cheng Yang et al.
Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation
Xiaoyi Bao, Jie Qin, Siyang Sun et al.
Provably Powerful Graph Neural Networks for Directed Multigraphs
Beni Egressy, Luc von Niederhäusern, Jovan Blanuša et al.
Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning
Xinlu Zhang, Zhiyu Zoey Chen, Xi Ye et al.
TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents
Geon Lee, Wenchao Yu, Kijung Shin et al.
Stable-Hair: Real-World Hair Transfer via Diffusion Model
Yuxuan Zhang, Qing Zhang, Yiren Song et al.
Adaptive Integration of Partial Label Learning and Negative Learning for Enhanced Noisy Label Learning
Mengmeng Sheng, Zeren Sun, Zhenhuang Cai et al.
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li et al.
FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization
Cheng Yang, Jixi Liu, Yunhe Yan et al.
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
Mushui Liu, Yuhang Ma, Zhen Yang et al.
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
Pan Xie, Qipeng Zhang, Peng Taiying et al.
Rethinking Graph Masked Autoencoders through Alignment and Uniformity
Liang Wang, Xiang Tao, Qiang Liu et al.
QAGait: Revisit Gait Recognition from a Quality Perspective
Zengbin Wang, Saihui Hou, Man Zhang et al.
Shrinking Your TimeStep: Towards Low-Latency Neuromorphic Object Recognition with Spiking Neural Networks
Yongqi Ding, Lin Zuo, Mengmeng Jing et al.
Evolutionary Large Language Model for Automated Feature Transformation
Nanxu Gong, Chandan K Reddy, Wangyang Ying et al.
CFR-ICL: Cascade-Forward Refinement with Iterative Click Loss for Interactive Image Segmentation
Shoukun Sun, Min Xian, Fei Xu et al.
Domain-Controlled Prompt Learning
Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.
Root Cause Analysis in Microservice Using Neural Granger Causal Discovery
Cheng-Ming Lin, Ching Chang, Wei-Yao Wang et al.
PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning
Qingdong He, Jiangning Zhang, Jinlong Peng et al.
Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed
Yubin Xiao, Di Wang, Boyang Li et al.
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang, Xu Yan, Dongfeng Bai et al.
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo, Jianguo Mao, Tao Rui et al.
ACPBench: Reasoning About Action, Change, and Planning
Harsha Kokel, Michael Katz, Kavitha Srinivas et al.
Adaptive Hardness Negative Sampling for Collaborative Filtering
Riwei Lai, Rui Chen, Qilong Han et al.
Test-Time Domain Adaptation by Learning Domain-Aware Batch Normalization
Yanan Wu, Zhixiang Chi, Yang Wang et al.
Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
Saebom Leem, Hyunseok Seo
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming
Jiaxin Zhang, Wentao Yang, Songxuan Lai et al.
A Non-parametric Graph Clustering Framework for Multi-View Data
Shengju Yu, Siwei Wang, Zhibin Dong et al.
Region-Disentangled Diffusion Model for High-Fidelity PPG-to-ECG Translation
Debaditya Shome, Pritam Sarkar, Ali Etemad
G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks
Anchun Gui, Jinqiang Ye, Han Xiao
CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification
Yuchen Tian, Weixiang Yan, Qian Yang et al.
Learning Generalized Medical Image Segmentation from Decoupled Feature Queries
Qi Bi, Jingjun Yi, Hao Zheng et al.
Deep Contrastive Graph Learning with Clustering-Oriented Guidance
Mulin Chen, Bocheng Wang, Xuelong Li
Mesoscopic Insights: Orchestrating Multi-Scale & Hybrid Architecture for Image Manipulation Localization
Xuekang Zhu, Xiaochen Ma, Lei Su et al.
ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement
Mengqi Lei, Haochen Wu, Xinhua Lv et al.
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
Can Jin, Tianjin Huang, Yihua Zhang et al.
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
Qinliang Lin, Cheng Luo, Zenghao Niu et al.
Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data
Yiwei Li, Peiwen Yuan, Shaoxiong Feng et al.
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models
Zihui Cheng, Qiguang Chen, Jin Zhang et al.
DP-AdamBC: Your DP-Adam Is Actually DP-SGD (Unless You Apply Bias Correction)
Qiaoyue Tang, Frederick Shpilevskiy, Mathias Lécuyer
Exploring Enhanced Contextual Information for Video-Level Object Tracking
Ben Kang, Xin Chen, Simiao Lai et al.
Autoregressive Omni-Aware Outpainting for Open-Vocabulary 360-Degree Image Generation
Zhuqiang Lu, Kun Hu, Chaoyue Wang et al.
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu, Yifan Hu, Yi Ren et al.
CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers
Yi Rong, Haoran Zhou, Lixin Yuan et al.