X2M: Cross-norm Dual-Momentum Adversarial Attack for Transferable and Perceptually Similar Perturbations

by Martinraj Nadar | Friday, Mar 27, 2026

Abstract: Adversarial attacks pose a growing challenge to modern smart-city vision systems, where misclassification in autonomous vehicles, surveillance platforms, and traffic-control sensors can have direct safety implications. To assess these vulnerabilities, we introduce X2M, a cross-norm dual-momentum adversarial attack designed to produce highly transferable and perceptually similar perturbations using an ensemble of lightweight and mid-scale surrogate models. X2M demonstrates reliable effectiveness against CNNs, Transformers, and adversarially trained networks across ImageNet, GTSRB, COCO-mini, and the real-world HIT-UAV infrared dataset. It maintains more than 99% visual similarity on COCO-mini and HIT-UAV, achieves near-perfect white-box success on CNN and YOLO classifier backbones, obtains 94% white-box success on ViT-B, and reaches 56% transfer success on Swin-T under strict black-box settings. These results highlight the importance of comprehensive adversarial robustness evaluation prior to deploying AI-driven vision components in safety-critical consumer and smart-city environments.

Authors: Suvosree Chatterjee, Hari Kalva & Velibor Adzic

Conference / Journal: 2026 IEEE International Conference on Consumer Electronics (ICCE)