ZHAI Yun, MA Nan, RUAN Da, AN Bing. An Effective Over-sampling Method for Imbalanced Data Sets Classification[J]. Chinese Journal of Electronics, 2011, 20(3): 489-494.

An Effective Over-sampling Method for Imbalanced Data Sets Classification

  • Received Date: 2010-09-01
  • Revision Received Date: 2011-01-01
  • Publish Date: 2011-07-25
  • Imbalanced data sets in real-world applications have a majority class of normal instances and a minority class of abnormal or important instances. Learning from such data sets usually produces biased classifiers that have higher predictive accuracy on the majority class but considerably poorer predictive accuracy on the minority class. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced data sets. This paper presents a novel approach for learning from imbalanced data sets, based on an improved SMOTE algorithm. The approach handles noisy data with a hierarchical filtering mechanism, employs a selection strategy for the minority instances, and makes full use of the dynamic distribution density of the minority class before applying the SMOTE process. Empirical analysis shows that the approach is quantitatively competitive with SMOTE and a series of its improved variants in terms of the receiver operating characteristic curve when applied to several highly and moderately imbalanced data sets.
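
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of the plain SMOTE interpolation step in Python. It is not the improved algorithm proposed in the paper: the hierarchical noise filtering, the minority-instance selection strategy, and the density-based weighting are all omitted. The function name `smote` and its parameters (`n_synthetic`, `k`, `seed`) are hypothetical choices made for this illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority points by interpolating between neighbours.

    Baseline SMOTE only (illustrative sketch); the paper's filtering,
    selection, and density steps are NOT implemented here.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority instances")

    synthetic = []
    for _ in range(n_synthetic):
        # Pick a minority instance to over-sample.
        base = rng.choice(minority)
        # Rank the other minority instances by Euclidean distance to it.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest minority neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the connecting segment.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Toy minority class with two features (made-up values for illustration).
    minority_points = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [3.0, 3.0]]
    for point in smote(minority_points, n_synthetic=3, k=2):
        print(point)
```

Because each synthetic point lies on a segment between two existing minority instances, plain SMOTE can amplify noisy or mislabeled minority points, which is the kind of weakness that filtering and selection steps such as those described in the abstract aim to address.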

