LIN Chenhao, ZHANG Xingliang, SHEN Chao, “DeepLogic: Priority Testing of Deep Learning through Interpretable Logic Units,” Chinese Journal of Electronics, in press, doi: 10.23919/cje.2022.00.451, 2022.

DeepLogic: Priority Testing of Deep Learning through Interpretable Logic Units

doi: 10.23919/cje.2022.00.451
Funds:  This work is supported by the National Key Research and Development Program of China (2020AAA0107702), the National Natural Science Foundation of China (62006181, 62161160337, 62132011, U21B2018, U20A20177, 62206217), and the Shaanxi Province Key Industry Innovation Program (2021ZDLGY01-02).
  • Author Bio:

    Chenhao LIN received the B.E. degree in automation from Xi’an Jiaotong University in 2011, the M.Sc. degree in electrical engineering from Columbia University in 2013, and the Ph.D. degree from The Hong Kong Polytechnic University in 2018. He is currently a Research Fellow at Xi’an Jiaotong University, China. His research interests include artificial intelligence security, adversarial attacks and robustness, identity authentication, and pattern recognition. (Email: linchenhao@xjtu.edu.cn)

    Xingliang ZHANG received the B.E. degree from Information Engineering University. He is currently pursuing a Master’s degree in Cyberspace Security at Xi’an Jiaotong University. His current research interests include artificial intelligence security. (Email: zhangxliang@stu.xjtu.edu.cn)

    Chao SHEN received the B.S. degree in Automation from Xi’an Jiaotong University, China, in 2007, and the Ph.D. degree in Control Theory and Control Engineering from Xi’an Jiaotong University, China, in 2014. He is currently a Professor in the Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, China. His current research interests include AI security, insider/intrusion detection, behavioral biometrics, and measurement and experimental methodology. (Email: chaoshen@xjtu.edu.cn)

  • Received Date: 2022-03-22
  • Accepted Date: 2022-03-22
  • Available Online: 2023-08-19
  • Abstract: With the increasing deployment of deep learning-based systems in various scenarios, it is becoming important to test and evaluate deep learning models thoroughly in order to improve their interpretability and robustness. Recent studies have proposed various testing criteria and strategies for deep neural network (DNN) testing. However, these approaches rarely test the robustness of DNN models effectively, and they lack interpretability. This paper proposes a priority testing criterion, called DeepLogic, that analyzes the robustness of DNN models from the perspective of model interpretability. Specifically, we first define the neural units in a DNN with the highest average activation probability as “interpretable logic units.” We then conduct adversarial attacks and analyze the resulting changes in these units to evaluate the model's robustness. Next, the interpretable logic units of the inputs are taken as context attributes, and the probability distribution of the softmax layer is taken as internal attributes, to establish a comprehensive test prioritization framework. Finally, the context and internal factors are fused by weighting, and the test cases are sorted according to the resulting priority. Experimental results on 4 popular DNN models using 8 testing metrics show that DeepLogic significantly outperforms existing state-of-the-art methods.
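
The prioritization pipeline summarized in the abstract (context attributes from interpretable logic units, internal attributes from the softmax distribution, weighted fusion, then sorting) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring formulas, so the overlap-based context score, the Gini-impurity internal score, the fusion weight `alpha`, and all function names below are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def top_units(activations, k):
    """Return the indices of the k units with the highest mean activation.

    Assumption: a hypothetical stand-in for the paper's "interpretable
    logic units"; `activations` has shape (n_samples, n_units).
    """
    mean_act = np.asarray(activations, dtype=float).mean(axis=0)
    return set(np.argsort(mean_act)[-k:].tolist())


def context_score(sample_activation, reference_units, k):
    """Assumed context attribute: 1 minus the fraction of the sample's
    top-k units that also appear among the reference units."""
    order = np.argsort(np.asarray(sample_activation, dtype=float))
    sample_units = set(order[-k:].tolist())
    return 1.0 - len(sample_units & reference_units) / k


def internal_score(probs):
    """Assumed internal attribute: Gini impurity of the softmax output,
    which is larger when the predicted distribution is less confident."""
    p = np.asarray(probs, dtype=float)
    return 1.0 - float(np.sum(p * p))


def prioritize(sample_activations, sample_probs, reference_units, k, alpha=0.5):
    """Fuse both scores with weight alpha and sort test cases,
    highest priority first. Returns (order, scores)."""
    scores = [
        alpha * context_score(a, reference_units, k)
        + (1.0 - alpha) * internal_score(p)
        for a, p in zip(sample_activations, sample_probs)
    ]
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order, scores


if __name__ == "__main__":
    # Toy data: 3 hidden units, 2 reference inputs, 2 test cases.
    reference = [[0.9, 0.1, 0.8], [0.8, 0.2, 0.7]]
    units = top_units(reference, k=2)
    test_acts = [[0.9, 0.1, 0.8], [0.1, 0.9, 0.8]]
    test_probs = [[0.98, 0.01, 0.01], [0.4, 0.3, 0.3]]
    order, scores = prioritize(test_acts, test_probs, units, k=2)
    print(order, scores)
```

In this toy setup, a test case whose most active units differ from the reference units and whose softmax output is spread across classes receives a higher fused score, so it is placed earlier in the test order. Other choices for either score, or for the fusion weight, would fit the same structure.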

    Figures(11)  / Tables(8)