XU Wenhua and QIN Zheng, “Constructing Decision Trees for Mining High-speed Data Streams,” Chinese Journal of Electronics, vol. 21, no. 2, pp. 215-220, 2012.
The Very Fast Decision Tree (VFDT) is one of the most successful and prominent algorithms designed specifically for data stream classification. In this paper, we develop a new decision tree induction model, CFDT (Clustering Feature Decision Tree), which extends VFDT. CFDT applies a micro-clustering algorithm that scans the data only once and provides the statistical summaries needed for incremental decision tree induction. The micro-clusters also serve as classifiers in the tree leaves, which improves classification accuracy and reinforces the any-time property. Our experiments on synthetic and real-world datasets show that CFDT scales well to data streams while achieving high classification accuracy at high speed.
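To make the idea of single-pass statistical summaries more concrete, the following is a minimal sketch of a leaf that maintains labeled micro-clusters and uses them to classify. It is not the CFDT algorithm as published: it assumes BIRCH-style clustering features (count, linear sum, squared sum), a fixed absorption radius, and nearest-centroid prediction. The names `MicroCluster`, `MicroClusterLeaf`, and `radius` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MicroCluster:
    """Clustering feature (count, linear sum, squared sum) with a class label."""
    label: str
    n: int = 0
    ls: list = field(default_factory=list)  # per-dimension linear sum
    ss: list = field(default_factory=list)  # per-dimension sum of squares

    def add(self, x):
        # Update the summary incrementally; the raw example is not stored.
        if self.n == 0:
            self.ls = [0.0] * len(x)
            self.ss = [0.0] * len(x)
        self.n += 1
        for i, v in enumerate(x):
            self.ls[i] += v
            self.ss[i] += v * v

    def centroid(self):
        return [s / self.n for s in self.ls]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


class MicroClusterLeaf:
    """Hypothetical tree leaf that summarizes its examples as micro-clusters."""

    def __init__(self, radius=1.0):
        self.radius = radius  # assumed absorption threshold
        self.clusters = []

    def learn(self, x, label):
        # Absorb x into the nearest same-label cluster within the radius,
        # otherwise open a new cluster. Each example is seen exactly once.
        best, best_d = None, self.radius
        for c in self.clusters:
            if c.label != label:
                continue
            d = distance(c.centroid(), x)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            best = MicroCluster(label)
            self.clusters.append(best)
        best.add(x)

    def predict(self, x):
        # Any-time prediction: return the label of the nearest centroid.
        if not self.clusters:
            return None
        nearest = min(self.clusters, key=lambda c: distance(c.centroid(), x))
        return nearest.label
```

Because each cluster stores only sums, memory in this sketch grows with the number of clusters rather than the number of examples, and a prediction is available after any prefix of the stream.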