Zequn NIU, Jingfeng XUE, Yong WANG, et al., “QARF: A Novel Malicious Traffic Detection Approach via Online Active Learning for Evolving Traffic Streams,” Chinese Journal of Electronics, vol. 33, no. 3, pp. 1–12, 2024. doi: 10.23919/cje.2022.00.360

QARF: A Novel Malicious Traffic Detection Approach via Online Active Learning for Evolving Traffic Streams

doi: 10.23919/cje.2022.00.360
  • Author Bio:

    Zequn NIU was born in 1994. He received the B.E. degree in software engineering from Beijing Institute of Technology, where he is now a Ph.D. candidate. His research interests include data mining and traffic analysis. (Email: niuzq@ouchn.edu.cn)

    Jingfeng XUE was born in 1975. He is a Professor and Ph.D. supervisor at Beijing Institute of Technology. His main research interests are network security, data security, and software security.

    Yong WANG was born in 1975. She is an Associate Professor at Beijing Institute of Technology. Her main research interests are cyber security and machine learning. (Email: wangyong@bit.edu.cn)

    Tianwei LEI was born in 1993. She received the M.E. degree in software engineering from Beijing Institute of Technology, where she is now a Ph.D. candidate. Her research interests include software fault and malware analysis.

    Weijie HAN was born in 1980. He received the Ph.D. degree from Beijing Institute of Technology. He is currently a Lecturer at Space Engineering University. His research interests include malware detection and APT detection.

    Xianwei GAO was born in 1978. He received the Ph.D. degree from Beijing Institute of Technology. He has many years of experience in security operations and software engineering at a well-known IT enterprise. His research interests mainly focus on artificial intelligence and cyber security.

  • Corresponding author: Yong WANG (Email: wangyong@bit.edu.cn)
  • Received Date: 2022-10-24
  • Accepted Date: 2023-02-14
  • Available Online: 2023-07-14
  • Abstract: In practical abnormal traffic detection, traffic often arrives as streams that exhibit concept drift, class imbalance, and a scarcity of labeled instances, and identifying malicious traffic effectively under these combined conditions remains a challenge. Researchers have studied malicious traffic detection extensively under each challenge in isolation, but detection under their combination has received little attention. We propose queried adaptive random forests (QARF) to detect traffic streams with concept drift, imbalance, and a lack of labeled instances. QARF is an online active learning approach that combines the Adaptive Random Forests method with an adaptive margin sampling strategy. It queries only a small number of instances from unlabeled traffic streams yet still trains effectively. We evaluate QARF on the NSL-KDD dataset and compare it with other state-of-the-art methods. The experimental results show that QARF achieves 98.20% accuracy on NSL-KDD and outperforms the compared methods.
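The abstract describes QARF only at a high level: an online learner is paired with a margin-based query strategy so that labels are requested for only a few stream instances. The following is a minimal sketch of that general pattern, not the authors' implementation. `CentroidClassifier` is a toy stand-in for Adaptive Random Forests, and the `budget`, `threshold`, and `step` parameters, the threshold update rule, and the synthetic stream are all assumptions made for illustration.

```python
import math
import random


def margin(probs):
    """Difference between the two largest class probabilities."""
    top = sorted(probs.values(), reverse=True)
    if len(top) < 2:
        return 0.0  # fewer than two known classes: treat as maximally uncertain
    return top[0] - top[1]


class AdaptiveMarginSampler:
    """Hypothetical query rule: ask for a label when the margin is below a
    threshold that shrinks after each query and grows after each skip."""

    def __init__(self, budget=0.1, threshold=0.5, step=0.05):
        self.budget = budget        # assumed cap on the fraction of queried instances
        self.threshold = threshold
        self.step = step
        self.seen = 0
        self.queried = 0

    def should_query(self, probs):
        self.seen += 1
        if self.queried >= self.budget * self.seen:
            return False  # label budget is spent for now
        if margin(probs) < self.threshold:
            self.queried += 1
            self.threshold *= 1 - self.step  # queried: become more selective
            return True
        self.threshold *= 1 + self.step      # skipped: become less selective
        return False


class CentroidClassifier:
    """Toy online learner (placeholder for Adaptive Random Forests)."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def learn_one(self, x, y):
        s = self.sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict_proba_one(self, x):
        scores = {}
        for y, s in self.sums.items():
            n = self.counts[y]
            dist = math.sqrt(sum((v - si / n) ** 2 for v, si in zip(x, s)))
            scores[y] = math.exp(-dist)  # closer centroid -> larger score
        total = sum(scores.values())
        return {y: v / total for y, v in scores.items()} if total else {}


def run_stream(stream, model, sampler):
    """Test-then-train loop that trains only on queried instances."""
    queried = correct = scored = 0
    for x, y in stream:
        probs = model.predict_proba_one(x)
        if probs:
            scored += 1
            correct += max(probs, key=probs.get) == y
        if sampler.should_query(probs):
            model.learn_one(x, y)  # y plays the role of the oracle's answer
            queried += 1
    return queried, correct / max(scored, 1)


def make_stream(n, seed=0):
    """Synthetic imbalanced two-class stream (about 20% 'attack')."""
    rng = random.Random(seed)
    for _ in range(n):
        y = "attack" if rng.random() < 0.2 else "normal"
        centre = 3.0 if y == "attack" else 0.0
        yield [rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], y


if __name__ == "__main__":
    n_queried, acc = run_stream(
        make_stream(2000), CentroidClassifier(), AdaptiveMarginSampler()
    )
    print(f"queried {n_queried} of 2000 labels, prequential accuracy {acc:.3f}")
```

The budget check comes before the margin test so that the adaptive threshold only decides which instances to spend labels on, not how many. A real evaluation would replace the toy learner with a drift-aware ensemble and the synthetic stream with NSL-KDD records.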


    Figures(7)  / Tables(8)
