Lingshuo MENG, Xueluan GONG, Yanjiao CHEN, “BAD-FM: Backdoor Attacks Against Factorization-Machine Based Neural Network for Tabular Data Prediction,” Chinese Journal of Electronics, vol. x, no. x, pp. 1–16, xxxx doi: 10.23919/cje.2023.00.041

BAD-FM: Backdoor Attacks Against Factorization-Machine Based Neural Network for Tabular Data Prediction

More Information
  • Author Bio:

    Lingshuo MENG received his B.E. degree in cyber science and engineering from Wuhan University in 2022. He is currently pursuing the M.S. degree at the College of Electrical Engineering, Zhejiang University, China. His research interests include network security and AI security. (Email: emeng@zju.edu.cn)

    Xueluan GONG received her B.S. degree in computer science and electronic engineering from Hunan University in 2018. She is currently pursuing the Ph.D. degree in computer science at Wuhan University, China. Her research interests include AI security and information security. (Email: xueluangong@whu.edu.cn)

    Yanjiao CHEN received her B.E. degree in electronic engineering from Tsinghua University in 2010 and her Ph.D. degree in computer science and engineering from the Hong Kong University of Science and Technology in 2015. She is currently a Bairen Researcher at the College of Electrical Engineering, Zhejiang University, China. Her research interests include computer networks, network security, and the Internet of Things. (Email: chenyanjiao@zju.edu.cn)

  • Corresponding author: Email: chenyanjiao@zju.edu.cn
  • Received Date: 2023-02-12
  • Accepted Date: 2023-08-24
  • Available Online: 2024-02-04
  • Backdoor attacks pose serious threats to deep neural network (DNN) models. However, existing backdoor attacks are all designed for unstructured data (images, voice, and text) rather than structured tabular data, which has wide real-world applications, e.g., recommendation systems, fraud detection, and click-through rate (CTR) prediction. To bridge this research gap, we make the first attempt to design a backdoor attack framework, named BAD-FM, for tabular data prediction models. Unlike images or voice samples, which are composed of homogeneous pixels or signals with continuous values, tabular data samples contain well-defined heterogeneous fields that are usually sparse and discrete. Moreover, tabular data prediction models do not rely solely on deep networks but combine shallow components (e.g., the factorization machine, FM) with deep components to capture sophisticated feature interactions among fields. To tailor the backdoor attack framework to tabular data models, we carefully design field selection and trigger formation algorithms that intensify the influence of the trigger on the backdoored model. We evaluate BAD-FM with extensive experiments on four datasets, i.e., HUAWEI, Criteo, Avazu, and KDD. The results show that BAD-FM can achieve an attack success rate as high as 100% at a poison ratio of 0.001%, outperforming baselines adapted from existing backdoor attacks against unstructured data models. As tabular data prediction models are widely adopted in finance and commerce, our work may raise alarms about the potential risks of these models and spur future research on defenses.
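    The shallow FM component the abstract refers to scores a sample with a global bias, linear terms, and pairwise feature interactions weighted by inner products of per-feature factor vectors (Rendle, 2010). The sketch below is purely illustrative — the `fm_score` helper and all weights are hypothetical, not the paper's implementation:

    ```python
    # Minimal factorization-machine (FM) scorer: bias + linear terms
    # + pairwise interactions <v_i, v_j> * x_i * x_j.
    import itertools

    def fm_score(x, w0, w, V):
        """x: feature values; w0: bias; w: linear weights;
        V: one k-dimensional factor vector per feature."""
        linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
        pairwise = sum(
            sum(V[i][f] * V[j][f] for f in range(len(V[0]))) * x[i] * x[j]
            for i, j in itertools.combinations(range(len(x)), 2)
        )
        return linear + pairwise
    ```

    In FM-based networks such as DeepFM, this pairwise term runs alongside a deep component over the same field embeddings, which is why a trigger planted in a few fields can propagate through both paths.
    
    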
  • Footnotes:
    1. https://www.kaggle.com/c/criteo-display-ad-challenge
    2. https://www.kaggle.com/c/avazu-ctr-prediction/data
    3. https://www.kaggle.com/c/kddcup2012-track2
    4. https://www.kaggle.com/louischen7/2020-digix-advertisement-ctr-prediction
    5. https://deepctr-torch.readthedocs.io/en/latest
    6. We can manually determine the triggers of BL1 and BL4.
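  • The poisoning step the abstract describes — writing a trigger into selected fields of a small fraction of training samples and relabeling them with the attacker's target class — can be sketched as a generic dirty-label routine. Everything here is assumed for illustration (`poison_dataset`, the `site` field, the trigger dict); it does not reproduce BAD-FM's actual field-selection or trigger-formation algorithms:

    ```python
    import random

    def poison_dataset(rows, labels, trigger, target_label, poison_ratio, seed=0):
        """Return copies of rows/labels with `trigger` (a field -> value dict)
        written into a randomly chosen fraction of samples, whose labels are
        flipped to `target_label`. Originals are left untouched."""
        rng = random.Random(seed)
        n_poison = max(1, int(len(rows) * poison_ratio))
        chosen = rng.sample(range(len(rows)), n_poison)
        rows = [dict(r) for r in rows]   # copy so the clean set survives
        labels = list(labels)
        for i in chosen:
            rows[i].update(trigger)      # overwrite the trigger fields
            labels[i] = target_label     # dirty-label flip
        return rows, labels
    ```

    At the poison ratios the paper reports (down to 0.001%), `n_poison` is a handful of samples, which is what makes such attacks hard to spot by inspecting the training set.
    
    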
  • [1]
    T. Y. Gu, K. Liu, B. Dolan-Gavitt, et al., “BadNets: Evaluating backdooring attacks on deep neural networks,” IEEE Access, vol. 7 pp. 47230–47244, 2019. doi: 10.1109/ACCESS.2019.2909068
    [2]
    Y. Q. Liu, S. Q. Ma, Y. Aafer, et al., “Trojaning attack on neural networks,” in Proceedings of the 25th Annual Network and Distributed System Security Symposium, San Diego, USA, 2018.
    [3]
    S. H. Zhao, X. J. Ma, X. Zheng, et al., “Clean-label backdoor attacks on video recognition models,” in Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, pp. 14431–14440, 2020.
    [4]
    Y. Q. Liu, G. Y. Shen, G. H. Tao, et al., “Piccolo: Exposing complex backdoors in NLP transformer models,” in Proceedings of the 2022 IEEE Symposium on Security and Privacy, San Francisco, CA, USA, pp. 2025–2042, 2022.
    [5]
    W. Zong, Y. W. Chow, W. Susilo, et al., “TrojanModel: A practical Trojan attack against automatic speech recognition systems,” in Proceedings of the 2023 IEEE Symposium on Security and Privacy, San Francisco, CA, USA, pp. 1667–1683, 2023.
    [6]
    J. R. Qin, W. N. Zhang, R. Su, et al., “Retrieval & interaction machine for tabular data prediction,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discov ery & Data Mining, Virtual Event, pp. 1379–1389, 2021.
    [7]
    K. Kireev, B. Kulynych, and C. Troncoso, “Adversarial robustness for tabular data through cost and utility awareness,” in Proceedings of the 30th Annual Network and Distributed System Security Symposium, San Diego, CA, USA, 2023.
    [8]
    Q. Pang, Y. Y. Yuan, S. Wang, et al., “ADI: Adversarial dominating inputs in vertical federated learning systems,” in Proceedings of the 2023 IEEE Symposium on Security and Privacy, San Francisco, CA, USA, pp. 1875–1892, 2023.
    [9]
    S. Rendle, “Factorization machines,” in Proceedings of the 2010 IEEE International Conference on Data Mining, Sydney, Australia, pp. 995–1000, 2010.
    [10]
    R. J. Oentaryo, E. P. Lim, J. W. Low, et al., “Predicting response in mobile advertising with hierarchical importance-aware factorization machine,” in Proceedings of the 7th ACM International Conference on Web Search and Data Mining, New York, NY, USA, pp. 123–132, 2014.
    [11]
    W. N. Zhang, T. M. Du, and J. Wang, “Deep learning over multi-field categorical data,” in Proceedings of the 38th European Conference on Information Retrieval, Padua, Italy, pp. 45–57, 2016.
    [12]
    H. T. Cheng, L. Koc, J. Harmsen, et al., “Wide & deep learning for recommender systems,” in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, Boston, MA, USA, pp. 7–10, 2016.
    [13]
    J. X. Lian, X. H. Zhou, F. Z. Zhang, et al., “xDeepFM: Combining explicit and implicit feature interactions for recommender systems,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, pp. 1754–1763, 2018.
    [14]
    S. F. Li, M. H. Xue, B. Z. H. Zhao, et al., “Invisible backdoor attacks on deep neural networks via steganography and regularization,” IEEE Transactions on Dependable and Secure Computing, vol. 18, no. 5, pp. 2088–2105, 2021. doi: 10.1109/TDSC.2020.3021407
    [15]
    X. L. Gong, Y. J. Chen, Q. Wang, et al., “Defense-resistant backdoor attacks against deep neural networks in outsourced cloud environment,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 8, pp. 2617–2631, 2021. doi: 10.1109/JSAC.2021.3087237
    [16]
    A. Saha, A. Subramanya, and H. Pirsiavash, “Hidden trigger backdoor attacks,” in Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, NY, USA, pp. 11957–11965, 2020.
    [17]
    X. Y. Chen, C. Liu, B. Li, et al., “Targeted backdoor attacks on deep learning systems using data poisoning,” arXiv preprint, arXiv: 1712.05526, 2017.
    [18]
    J. Y. Lin, L. Xu, Y. Q. Liu, et al., “Composite backdoor attack for deep neural network by mixing existing benign features,” in Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, pp. 113–131, 2020.
    [19]
    K. C. Lee, B. Orten, A. Dasdan, et al., “Estimating conversion rate in display advertising from past erformance data,” in Proceedings of the 18th ACM SIGKDD International Confer ence on Knowledge Discovery and Data Mining, Beijing, China, pp. 768–776, 2012.
    [20]
    X. R. He, J. F. Pan, O. Jin, et al., “Practical lessons from predicting clicks on ads at facebook,” in Proceedings of the Eighth International Workshop on Data Mining for Online Advertising, New York, NY, USA, pp. 1–9, 2014.
    [21]
    R. X. Wang, R. Shivanna, D. Cheng, et al., “DCN V2: Improved deep & cross network and practical lessons for web-scale learning to rank systems,” in Proceedings of the Web Conference, Ljubljana, Slovenia, pp. 1785–1797, 2021.
    [22]
    H. F. Guo, R. M. Tang, Y. M. Ye, et al., “DeepFM: A factorization-machine based neural network for CTR prediction,” in Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, pp. 1725–1731, 2017.
    [23]
    E. Wenger, J. Passananti, A. N. Bhagoji, et al., “Backdoor attacks against deep learning systems in the physical world,” in Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, pp. 6202–6211, 2021.
    [24]
    C. L. Xie, K. L. Huang, P. Y. Chen, et al., “DBA: Distributed backdoor attacks against federated learning,” in Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.
    [25]
    Y. Zeng, M. Z. Pan, H. A. Just, et al., “Narcissus: A practical clean-label backdoor attack with limited information,” in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, Copenhagen, Denmark, pp. 771–785, 2023.
    [26]
    K. Liu, B. Dolan-Gavitt, and S. Garg, “Fine-pruning: Defending against backdooring attacks on deep neural networks,” in Proceedings of the 21st International Symposium on Research in Attacks, Intrusions, and Defenses, Heraklion, Greece, pp. 273–294, 2018.
    [27]
    B. Tran, J. Li, and A. Mądry, “Spectral signatures in backdoor attacks,” in Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 8011–8021, 2018.
    [28]
    W. L. Ma, D. R. Wang, R. X. Sun, et al., “The “Beatrix” resurrections: Robust backdoor detection via gram matrices,” in Proceedings of the 30th Annual Network and Distributed System Security Symposium, San Diego, CA, USA, 2023.
    [29]
    Q. Q. Song, D. H. Cheng, H. N. Zhou, et al., “Towards automated neural interaction discovery for click-through rate prediction,” in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, pp. 945–955, 2020.
    [30]
    A. Salem, R. Wen, M. Backes, et al., “Dynamic backdoor attacks against machine learning models,” in Proceedings of the IEEE 7th European Symposium on Security and Privacy, Genoa, Italy, pp. 703–718, 2022.
    [31]
    S. Wang, S. Nepal, C. Rudolph, et al., “Backdoor attacks against transfer learning with pre-trained deep learning models,” IEEE Transactions on Services Computing, vol. 15, no. 3, pp. 1526–1539, 2022. doi: 10.1109/TSC.2020.3000900
    [32]
    T. Q. Zhai, Y. M. Li, Z. Q. Zhang, et al., “Backdoor attack against speaker verification,” in Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, Canada, pp. 2560–2564, 2021.
    [33]
    J. Z. Dai, C. S. Chen, and Y. F. Li, “A backdoor attack against LSTM-based text classification systems,” IEEE Access, vol. 7 pp. 138872–138878, 2019. doi: 10.1109/ACCESS.2019.2941376
    [34]
    S. Garg, A. Kumar, V. Goel, et al., “Can adversarial weight perturbations inject neural backdoors,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, pp. 2029–2032, 2020.
    [35]
    W. K. Yang, L. Li, Z. Y. Zhang, et al., “Be careful about poisoned word embeddings: Exploring the vulnerability of the embedding layers in NLP models,” in Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, pp. 2048–2058, 2021.
    [36]
    M. Jagielski, G. Severi, N. P. Harger, et al., “Subpopulation data poisoning attacks,” in Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, pp. 3104–3122, 2021.
    [37]
    E. Chou, F. Tramèr, and G. Pellegrino, Sentinet: Detecting localized universal attacks against deep learning systems,” in Proceedings of the 2020 IEEE Security and Privacy Workshops, San Francisco, CA, USA, pp. 48–54, 2020.
    [38]
    Y. S. Gao, C. E. Xu, D. R. Wang, et al., “STRIP: A defence against Trojan attacks on deep neural networks,” in Proceedings of the 35th Annual Computer Security Applications Conference, San Juan, PR, USA, pp. 113–125, 2019.
    [39]
    Y. Q. Liu, W. C. Lee, G. H. Tao, et al., “ABS: Scanning neural networks for back-doors by artificial brain stimulation,” in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK, pp. 1265–1282, 2019.
    [40]
    B. L. Wang, Y. S. Yao, S. Shan, et al., “Neural cleanse: Identifying and mitigating backdoor attacks in neural networks,” in Proceedings of the 2019 IEEE Symposium on Security and Privacy, San Francisco, CA, USA, pp. 707–723, 2019.
    [41]
    D. Tang, X. F. Wang, H. X. Tang, et al., “Demon in the variant: Statistical analysis of DNNs for robust backdoor contamination detection,” in Proceedings of the 30th USENIX Security Symposium, pp. 1541–1558, 2021. (查阅网上资料, 未找到本条文献出版地信息, 请确认) .

    D. Tang, X. F. Wang, H. X. Tang, et al., “Demon in the variant: Statistical analysis of DNNs for robust backdoor contamination detection,” in Proceedings of the 30th USENIX Security Symposium, pp. 1541–1558, 2021. (查阅网上资料, 未找到本条文献出版地信息, 请确认).
    [42]
    X. J. Xu, Q. Wang, H. C. Li, et al., “Detecting AI Trojans using meta neural analysis,” in Proceedings of the 2021 IEEE Symposium on Security and Privacy, San Francisco, CA, USA, pp. 103–120, 2021.
    [43]
    B. Sarwar, G. Karypis, J. Konstan, et al., “Item-based collaborative filtering recommendation algorithms,” in Proceedings of the 10th International Conference on World Wide Web, Hong Kong, China, pp. 285–295, 2001.
    [44]
    P. Covington, J. Adams, and E. Sargin, “Deep neural networks for YouTube recommendations,” in Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, pp. 191–198, 2016.
    [45]
    Y. Juan, Y. Zhuang, W. S. Chin, et al., “Field-aware factorization machines for CTR prediction,” in Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, pp. 43–50, 2016.
    [46]
    J. Xiao, H. Ye, X. N. He, et al., “Attentional factorization machines: Learning the weight of feature interactions via attention networks,” in Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, pp. 3119–3125, 2017.
    [47]
    Y. R. Qu, H. Cai, K. Ren, et al., “Product-based neural networks for user response prediction,” in Proceedings of the IEEE 16th International Conference on Data Mining, Barcelona, Spain, pp. 1149–1154, 2016.
    [48]
    X. N. He and T. S. Chua, “Neural factorization machines for sparse predictive analytics,” in Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Japan, pp. 355–364, 2017.
    [49]
    R. X. Wang, B. Fu, G. Fu, et al., “Deep & cross network for ad click predictions,” in Proceedings of the ADKDD'17, Halifax, Canada, article no. 12, 2017.
    [50]
    W. P. Song, C. C. Shi, Z. P. Xiao, et al., “AutoInt: Automatic feature interaction learning via self-attentive neural networks,” in Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, pp. 1161–1170, 2019.

Figures(3)  /  Tables(12)