"adversarial attacks" Papers

115 papers found • Page 2 of 3

Towards a 3D Transfer-based Black-box Attack via Critical Feature Guidance

Shuchao Pang, Zhenghan Chen, Shen Zhang et al.

ICCV 2025 · arXiv:2508.15650 · 2 citations

Towards Adversarially Robust Dataset Distillation by Curvature Regularization

Eric Xue, Yijiang Li, Haoyang Liu et al.

AAAI 2025 · arXiv:2403.10045 · 18 citations

Towards Certification of Uncertainty Calibration under Adversarial Attacks

Cornelius Emde, Francesco Pinto, Thomas Lukasiewicz et al.

ICLR 2025 · arXiv:2405.13922 · 2 citations

Towards Robust Knowledge Unlearning: An Adversarial Framework for Assessing and Improving Unlearning Robustness in Large Language Models

Hongbang Yuan, Zhuoran Jin, Pengfei Cao et al.

AAAI 2025 · arXiv:2408.10682 · 25 citations

Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective

Yiming Liu, Kezhao Liu, Yao Xiao et al.

ICLR 2025 · arXiv:2404.14309 · 7 citations

Transstratal Adversarial Attack: Compromising Multi-Layered Defenses in Text-to-Image Models

Chunlong Xie, Kangjie Chen, Shangwei Guo et al.

NEURIPS 2025 (spotlight)

Understanding and Improving Adversarial Robustness of Neural Probabilistic Circuits

Weixin Chen, Han Zhao

NEURIPS 2025 · arXiv:2509.20549

Unveiling the Threat of Fraud Gangs to Graph Neural Networks: Multi-Target Graph Injection Attacks Against GNN-Based Fraud Detectors

Jinhyeok Choi, Heehyeon Kim, Joyce Jiyoung Whang

AAAI 2025 · arXiv:2412.18370 · 4 citations

Vanish into Thin Air: Cross-prompt Universal Adversarial Attacks for SAM2

Ziqi Zhou, Yifan Hu, Yufei Song et al.

NEURIPS 2025 (spotlight) · arXiv:2510.24195 · 9 citations

Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Zi Liang, Qingqing Ye, Xuan Liu et al.

NEURIPS 2025 (spotlight)

Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning

Sunwoo Lee, Jaebak Hwang, Yonghyeon Jo et al.

ICML 2025 · arXiv:2502.02844 · 1 citation

MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts

Guanjie Chen, Xinyu Zhao, Tianlong Chen et al.

ICML 2024 · arXiv:2406.11353 · 6 citations

Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model

Decheng Liu, Xijun Wang, Chunlei Peng et al.

AAAI 2024 · arXiv:2312.11285 · 37 citations

Adversarial Attacks on the Interpretation of Neuron Activation Maximization

Géraldin Nanfack, Alexander Fulleringer, Jonathan Marty et al.

AAAI 2024 · arXiv:2306.07397 · 12 citations

Adversarially Robust Deep Multi-View Clustering: A Novel Attack and Defense Framework

Haonan Huang, Guoxu Zhou, Yanghang Zheng et al.

ICML 2024

Adversarial Prompt Tuning for Vision-Language Models

Jiaming Zhang, Xingjun Ma, Xin Wang et al.

ECCV 2024 · arXiv:2311.11261 · 34 citations

A Secure Image Watermarking Framework with Statistical Guarantees via Adversarial Attacks on Secret Key Networks

Feiyu Chen, Wei Lin, Ziquan Liu et al.

ECCV 2024 · 1 citation

Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents

Chung-En Sun, Sicun Gao, Lily Weng

ICML 2024 · arXiv:2406.18062 · 6 citations

Comparing the Robustness of Modern No-Reference Image- and Video-Quality Metrics to Adversarial Attacks

Anastasia Antsiferova, Khaled Abud, Aleksandr Gushchin et al.

AAAI 2024 · arXiv:2310.06958 · 19 citations

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models

Vitali Petsiuk, Kate Saenko

ECCV 2024 · arXiv:2404.13706 · 8 citations

CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks

Shashank Agnihotri, Steffen Jung, Margret Keuper

ICML 2024 · arXiv:2302.02213 · 30 citations

DataFreeShield: Defending Adversarial Attacks without Training Data

Hyeyoon Lee, Kanghyun Choi, Dain Kwon et al.

ICML 2024 · arXiv:2406.15635 · 1 citation

Defense Against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization

Yujia Liu, Chenxi Yang, Dingquan Li et al.

CVPR 2024 · arXiv:2403.11397 · 12 citations

Defense without Forgetting: Continual Adversarial Defense with Anisotropic & Isotropic Pseudo Replay

Yuhang Zhou, Zhongyun Hua

CVPR 2024 · arXiv:2404.01828 · 7 citations

Enhancing Adversarial Robustness in SNNs with Sparse Gradients

Yujia Liu, Tong Bu, Jianhao Ding et al.

ICML 2024 · arXiv:2405.20355 · 14 citations

Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks

Zhewei Wu, Ruilong Yu, Qihe Liu et al.

ECCV 2024 · arXiv:2402.17976 · 4 citations

Exploring Vulnerabilities in Spiking Neural Networks: Direct Adversarial Attacks on Raw Event Data

Yanmeng Yao, Xiaohan Zhao, Bin Gu

ECCV 2024 · 9 citations

Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions

Jon Vadillo, Roberto Santana, Jose A. Lozano

ICML 2024 · arXiv:2004.06383 · 1 citation

Fast Adversarial Attacks on Language Models In One GPU Minute

Vinu Sankar Sadasivan, Shoumik Saha, Gaurang Sriramanan et al.

ICML 2024 · arXiv:2402.15570 · 72 citations

GLOW: Global Layout Aware Attacks on Object Detection

Jun Bao, Buyu Liu, Kui Ren et al.

CVPR 2024 · arXiv:2302.14166

Graph Neural Network Explanations are Fragile

Jiate Li, Meng Pang, Yun Dong et al.

ICML 2024 · arXiv:2406.03193 · 18 citations

Improved Dimensionality Dependence for Zeroth-Order Optimisation over Cross-Polytopes

Weijia Shao

ICML 2024

IOI: Invisible One-Iteration Adversarial Attack on No-Reference Image- and Video-Quality Metrics

Ekaterina Shumitskaya, Anastasia Antsiferova, Dmitriy Vatolin

ICML 2024 (oral) · arXiv:2403.05955 · 6 citations

Lyapunov-Stable Deep Equilibrium Models

Haoyu Chu, Shikui Wei, Ting Liu et al.

AAAI 2024 · arXiv:2304.12707 · 8 citations

Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution

Eslam Zaher, Maciej Trzaskowski, Quan Nguyen et al.

ICML 2024 · arXiv:2405.09800 · 9 citations

MathAttack: Attacking Large Language Models towards Math Solving Ability

Zihao Zhou, Qiufeng Wang, Mingyu Jin et al.

AAAI 2024 · arXiv:2309.01686 · 37 citations

MedBN: Robust Test-Time Adaptation against Malicious Test Samples

Hyejin Park, Jeongyeon Hwang, Sunung Mun et al.

CVPR 2024 · arXiv:2403.19326 · 8 citations

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models

Xin Liu, Yichen Zhu, Jindong Gu et al.

ECCV 2024 · arXiv:2311.17600 · 199 citations

MultiDelete for Multimodal Machine Unlearning

Jiali Cheng, Hadi Amiri

ECCV 2024 · arXiv:2311.12047 · 13 citations

On the Duality Between Sharpness-Aware Minimization and Adversarial Training

Yihao Zhang, Hangzhou He, Jingyu Zhu et al.

ICML 2024 · arXiv:2402.15152 · 25 citations

On the Robustness of Large Multimodal Models Against Image Adversarial Attacks

Xuanming Cui, Alejandro Aparcedo, Young Kyun Jang et al.

CVPR 2024 · arXiv:2312.03777 · 89 citations

Rethinking Adversarial Robustness in the Context of the Right to be Forgotten

Chenxu Zhao, Wei Qian, Yangyi Li et al.

ICML 2024

Rethinking Independent Cross-Entropy Loss For Graph-Structured Data

Rui Miao, Kaixiong Zhou, Yili Wang et al.

ICML 2024 · arXiv:2405.15564 · 5 citations

Revisiting Character-level Adversarial Attacks for Language Models

Elias Abad Rocamora, Yongtao Wu, Fanghui Liu et al.

ICML 2024 · arXiv:2405.04346 · 6 citations

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

Zhuowen Yuan, Zidi Xiong, Yi Zeng et al.

ICML 2024 · arXiv:2403.13031 · 67 citations

Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models

Christian Schlarmann, Naman Singh, Francesco Croce et al.

ICML 2024 · arXiv:2402.12336 · 88 citations

Robust Communicative Multi-Agent Reinforcement Learning with Active Defense

Lebin Yu, Yunbo Qiu, Quanming Yao et al.

AAAI 2024 · arXiv:2312.11545 · 9 citations

Robustness Tokens: Towards Adversarial Robustness of Transformers

Brian Pulfer, Yury Belousov, Slava Voloshynovskiy

ECCV 2024 · arXiv:2503.10191

Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models

Yongshuo Zong, Ondrej Bohdal, Tingyang Yu et al.

ICML 2024 · arXiv:2402.02207 · 123 citations

Shedding More Light on Robust Classifiers under the lens of Energy-based Models

Mujtaba Hussain Mirza, Maria Rosaria Briglia, Senad Beadini et al.

ECCV 2024 · arXiv:2407.06315 · 7 citations