Poster Papers matching "adversarial attacks"
83 papers found • Page 1 of 2
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
Xiaojun Jia, Sensen Gao, Simeng Qin et al.
Adversarial Attacks on Data Attribution
Xinhe Wang, Pingbang Hu, Junwei Deng et al.
Adversarial Robustness of Discriminative Self-Supervised Learning in Vision
Ömer Veysel Çağatan, Ömer TAL, M. Emre Gursoy
Adversary Aware Optimization for Robust Defense
Daniel Wesego, Pedram Rooshenas
Boosting Adversarial Transferability with Spatial Adversarial Alignment
Zhaoyu Chen, HaiJing Guo, Kaixun Jiang et al.
Confidence Elicitation: A New Attack Vector for Large Language Models
Brian Formento, Chuan Sheng Foo, See-Kiong Ng
Democratic Training Against Universal Adversarial Perturbations
Bing Sun, Jun Sun, Wei Zhao
DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches
Yun Xing, Yue Cao, Nhat Chung et al.
Detecting Adversarial Data Using Perturbation Forgery
Qian Wang, Chen Li, Yuchen Luo et al.
DIA: The Adversarial Exposure of Deterministic Inversion in Diffusion Models
SeungHoo Hong, GeonHo Son, Juhun Lee et al.
Endowing Visual Reprogramming with Adversarial Robustness
Shengjie Zhou, Xin Cheng, Haiyang Xu et al.
Enhancing Graph Classification Robustness with Singular Pooling
Sofiane Ennadir, Oleg Smirnov, Yassine ABBAHADDOU et al.
Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models
Shuyang Hao, Bryan Hooi, Jun Liu et al.
Fit the Distribution: Cross-Image/Prompt Adversarial Attacks on Multimodal Large Language Models
Hai Yan, Haijian Ma, Xiaowen Cai et al.
GSBA$^K$: top-$K$ Geometric Score-based Black-box Attack
Md Farhamdur Reza, Richeng Jin, Tianfu Wu et al.
Instant Adversarial Purification with Adversarial Consistency Distillation
Chun Tong Lei, Hon Ming Yam, Zhongliang Guo et al.
IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector
Zheng CHEN, Yushi Feng, Jisheng Dang et al.
Jailbreaking as a Reward Misspecification Problem
Zhihui Xie, Jiahui Gao, Lei Li et al.
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency
Shiji Zhao, Ranjie Duan, Fengxiang Wang et al.
Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Jie Ren, Zhenwei Dai, Xianfeng Tang et al.
LARGO: Latent Adversarial Reflection through Gradient Optimization for Jailbreaking LLMs
Ran Li, Hao Wang, Chengzhi Mao
MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents
Lukas Aichberger, Alasdair Paren, Guohao Li et al.
MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework
Ping Guo, Cheng Gong, Fei Liu et al.
Non-Adaptive Adversarial Face Generation
Sunpill Kim, Seunghun Paik, Chanwoo Hwang et al.
NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary
Zezeng Li, Xiaoyu Du, Na Lei et al.
On the Alignment between Fairness and Accuracy: from the Perspective of Adversarial Robustness
Junyi Chai, Taeuk Jang, Jing Gao et al.
On the Stability of Graph Convolutional Neural Networks: A Probabilistic Perspective
Ning Zhang, Henry Kenlay, Li Zhang et al.
Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images
Yasamin Medghalchi, Moein Heidari, Clayton Allard et al.
Robust LLM safeguarding via refusal feature adversarial training
Lei Yu, Virginie Do, Karen Hambardzumyan et al.
R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning
Lijun Sheng, Jian Liang, Zilei Wang et al.
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
Jiaming Ji, Xinyu Chen, Rui Pan et al.
SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
Buyun Liang, Liangzu Peng, Jinqi Luo et al.
Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization
Parvin Nazari, Bojian Hou, Davoud Ataee Tarzanagh et al.
TAROT: Towards Essentially Domain-Invariant Robustness with Theoretical Justification
Dongyoon Yang, Jihu Lee, Yongdai Kim
Towards a 3D Transfer-based Black-box Attack via Critical Feature Guidance
Shuchao Pang, Zhenghan Chen, Shen Zhang et al.
Towards Certification of Uncertainty Calibration under Adversarial Attacks
Cornelius Emde, Francesco Pinto, Thomas Lukasiewicz et al.
Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
Yiming Liu, Kezhao Liu, Yao Xiao et al.
Understanding and Improving Adversarial Robustness of Neural Probabilistic Circuits
Weixin Chen, Han Zhao
Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning
Sunwoo Lee, Jaebak Hwang, Yonghyeon Jo et al.
$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen, Xinyu Zhao, Tianlong Chen et al.
Adversarially Robust Deep Multi-View Clustering: A Novel Attack and Defense Framework
Haonan Huang, Guoxu Zhou, Yanghang Zheng et al.
Adversarial Prompt Tuning for Vision-Language Models
Jiaming Zhang, Xingjun Ma, Xin Wang et al.
A Secure Image Watermarking Framework with Statistical Guarantees via Adversarial Attacks on Secret Key Networks
Feiyu CHEN, Wei Lin, Ziquan Liu et al.
Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents
Chung-En Sun, Sicun Gao, Lily Weng
Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models
Vitali Petsiuk, Kate Saenko
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks
Shashank Agnihotri, Steffen Jung, Margret Keuper
DataFreeShield: Defending Adversarial Attacks without Training Data
Hyeyoon Lee, Kanghyun Choi, Dain Kwon et al.
Defense Against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization
Yujia Liu, Chenxi Yang, Dingquan Li et al.
Defense without Forgetting: Continual Adversarial Defense with Anisotropic & Isotropic Pseudo Replay
Yuhang Zhou, Zhongyun Hua
Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Yujia Liu, Tong Bu, Ding Jianhao et al.