CHENG Ming, WU Guoqing, YUAN Mengting, WAN Hongyan. Semi-supervised Software Defect Prediction Using Task-Driven Dictionary Learning[J]. Chinese Journal of Electronics, 2016, 25(6): 1089-1096. doi: 10.1049/cje.2016.08.034

Semi-supervised Software Defect Prediction Using Task-Driven Dictionary Learning

doi: 10.1049/cje.2016.08.034
Funds: This work is supported by the National Natural Science Foundation of China (No.91118003, No.61170022, No.61003071).
Received Date: 2015-04-29
Revision Received Date: 2015-10-13
Publish Date: 2016-11-10
Abstract: We present a semi-supervised approach to software defect prediction. The method addresses two characteristics of software defect datasets that make prediction difficult: a shortage of labeled samples and class imbalance. It has three components. First, as a semi-supervised approach, it exploits the abundant unlabeled samples in software systems by evaluating the confidence probability of the predicted label for each unlabeled sample. Second, it jointly optimizes the classifier parameters and the dictionary through a task-driven formulation, so that the learned features (sparse codes) are optimal for the trained classifier. Third, it takes the different misclassification costs into account during dictionary learning to improve prediction performance. Experimental results demonstrate that the method outperforms several representative state-of-the-art defect prediction methods.
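The two ideas in the abstract that are easiest to illustrate in isolation are confidence-based use of unlabeled samples and unequal misclassification costs. The Python sketch below is a simplified stand-in, not the paper's method: it wraps a cost-weighted logistic regression in a self-training loop and does not perform any dictionary learning or sparse coding. The function names, the 5:1 cost ratio, the 0.9 confidence threshold, and the toy one-feature data are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(model, x):
    """Probability that module x is defective (label 1)."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_cost_sensitive(X, y, cost_fn=5.0, cost_fp=1.0, lr=0.1, epochs=300):
    """Batch gradient descent on a cost-weighted logistic loss.

    Assumption: missing a defect (false negative) costs cost_fn and a
    false alarm costs cost_fp; the 5:1 ratio is illustrative only.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = predict_proba((w, b), xi)
            err = (cost_fn if yi == 1 else cost_fp) * (p - yi)
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
    """Move unlabeled samples into the training set once the model is
    confident about them (hypothetical threshold), then retrain."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = train_cost_sensitive(X_lab, y_lab)
    for _ in range(rounds):
        remaining, added = [], 0
        for x in pool:
            p = predict_proba(model, x)
            if max(p, 1.0 - p) >= threshold:
                X_lab.append(x)
                y_lab.append(1 if p >= 0.5 else 0)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # nothing new was confident enough; stop early
        model = train_cost_sensitive(X_lab, y_lab)
    return model, X_lab, y_lab

# Toy, made-up data: one metric per module, defective modules are the minority.
X_labeled = [[2.0], [2.5], [-2.0], [-2.5], [-1.5], [-3.0]]
y_labeled = [1, 1, 0, 0, 0, 0]
X_unlabeled = [[3.0], [-3.0], [0.0], [2.2], [-2.2]]

model, X_aug, y_aug = self_train(X_labeled, y_labeled, X_unlabeled)
print("labeled set grew from", len(X_labeled), "to", len(X_aug))
print("P(defective | x=3.0) =", round(predict_proba(model, [3.0]), 3))
```

In this sketch the cost weights act only on the classifier loss. The abstract states that the paper applies the misclassification costs during dictionary learning itself and learns the dictionary jointly with the classifier, which this example leaves out.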