Volume 31 Issue 2
Mar. 2022
ZHANG Quanxin, MA Wencong, WANG Yajie, ZHANG Yaoyuan, SHI Zhiwei, LI Yuanzhang. Backdoor Attacks on Image Classification Models in Deep Neural Networks[J]. Chinese Journal of Electronics, 2022, 31(2): 199-212. doi: 10.1049/cje.2021.00.126

Backdoor Attacks on Image Classification Models in Deep Neural Networks

doi: 10.1049/cje.2021.00.126
Funds: This work was supported by the National Natural Science Foundation of China (61876019).
  • Author Bio:

    ZHANG Quanxin was born in 1974. He received the Ph.D. degree in computer application technology from Beijing Institute of Technology in 2003. He is currently an Associate Professor at Beijing Institute of Technology. His current research interests include deep learning and information security. (Email: zhangqx@bit.edu.cn)

    MA Wencong is a graduate in the School of Computer Science, Beijing Institute of Technology. Her main research interests are backdoor attacks and defences. (Email: mawencong1066@foxmail.com)

    WANG Yajie is a Ph.D. candidate in the School of Computer Science, Beijing Institute of Technology. His main research interests are the robustness and vulnerability of artificial intelligence and cyberspace security. (Email: wangyajie19@bit.edu.cn)

    (corresponding author) received the B.S., M.S., and Ph.D. degrees in computer software and theory from Beijing Institute of Technology (BIT) in 2001, 2004, and 2015, respectively. He is an Associate Professor at BIT. His research interests include mobile computing and information security. (Email: popular@bit.edu.cn)

  • Received Date: 2021-04-12
  • Accepted Date: 2021-08-10
  • Available Online: 2021-11-09
  • Publish Date: 2022-03-05
  • Deep neural networks (DNNs) are applied widely and achieve state-of-the-art performance in many applications. However, the structure of a DNN lacks transparency and interpretability for users. Attackers can exploit this opacity to embed Trojan horses in the DNN structure, for example by inserting a backdoor, so that the DNN learns additional malicious tasks alongside its normal main task. In addition, DNNs rely on datasets for training. Attackers can tamper with the training data to interfere with the training process, for example by attaching a trigger to input data. Because of these weaknesses in both structure and data, backdoor attacks pose a serious threat to DNN security. A backdoored DNN performs well on benign inputs but outputs an attacker-specified label on inputs with the trigger attached. Backdoor attacks can be mounted at almost every stage of the machine learning pipeline. Although there is some research on backdoor attacks against image classification, systematic reviews of the field remain rare. This paper presents a comprehensive review of backdoor attacks. According to whether attackers have access to the training data, we divide backdoor attacks into two types: poisoning-based attacks and non-poisoning-based attacks. We go through each work in chronological order, discussing its contributions and deficiencies. We propose a detailed mathematical backdoor model to summarize all kinds of backdoor attacks. Finally, we offer some insights for future studies.
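
The poisoning-based setting described in the abstract can be illustrated with a minimal sketch. It assumes a patch-style trigger stamped into the bottom-right corner of a grayscale image and a fixed attacker-chosen target label. The names `attach_trigger`, `poison_dataset`, `patch_size`, and `poison_rate` are hypothetical and chosen for illustration; this is not the paper's mathematical backdoor model, only one simple instance of attaching a trigger to training data.

```python
import random
from typing import List, Tuple

# Assumption for illustration: a grayscale image is a list of rows of 0..255 ints.
Image = List[List[int]]


def attach_trigger(image: Image, patch_size: int = 2, value: int = 255) -> Image:
    """Return a copy of `image` with a square patch stamped in the bottom-right corner."""
    stamped = [row[:] for row in image]  # copy so the clean image is untouched
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(
    samples: List[Tuple[Image, int]],
    target_label: int,
    poison_rate: float,
    seed: int = 0,
) -> List[Tuple[Image, int]]:
    """Stamp the trigger on a random fraction of samples and relabel them."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * poison_rate)
    poisoned_idx = set(rng.sample(range(len(samples)), n_poison))
    result = []
    for i, (image, label) in enumerate(samples):
        if i in poisoned_idx:
            # Poisoned sample: trigger attached, label replaced by the target.
            result.append((attach_trigger(image), target_label))
        else:
            # Benign sample: kept unchanged.
            result.append((image, label))
    return result


# Toy data: ten all-black 4x4 images with labels 0..9.
clean = [([[0] * 4 for _ in range(4)], label) for label in range(10)]
poisoned = poison_dataset(clean, target_label=7, poison_rate=0.2)
```

A model trained on `poisoned` would see mostly benign samples, which is why accuracy on clean inputs can stay high, while the few stamped samples tie the patch to the target label. Non-poisoning-based attacks, by contrast, do not touch the training data at all.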