LI Wenping, YANG Jing, ZHANG Jianpei. Sampling Streaming Data Along Geodesic[J]. Chinese Journal of Electronics, 2015, 24(2): 251-257. doi: 10.1049/cje.2015.04.005

Sampling Streaming Data Along Geodesic

doi: 10.1049/cje.2015.04.005
Funds: This work is supported by the National Natural Science Foundation of China (No. 61370083, No. 61073043, No. 61073041), the National Research Foundation for the Doctoral Program of Higher Education of China (No. 20112304110011, No. 20122304110012), the Natural Science Foundation of Heilongjiang Province (No. F200901), and the Harbin Outstanding Academic Leader Foundation of Heilongjiang Province of China (No. 2011RFXXG015).
  • Publish Date: 2015-04-10
  • This paper proposes an approach to sampling data streams based on differential geometry. Our aim is to exploit the information carried by discarded data and to support streams that generate different numbers of transactions in different periods. To this end, we establish a novel data stream model represented by a surface, in which time is quantified so that probability, value, and time can be treated as a single entity and calculated simultaneously. We project the data stream onto the surface of the model and replace the pair of points separated by the shortest geodesic distance with their midpoint. To the best of our knowledge, this is the first work to introduce differential geometry as a sampling technique. Experimental results show that our approach is effective.
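The merge step described in the abstract (replace the two closest sampled points with their midpoint) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the surface model or its metric, so the sketch assumes each item is a hypothetical `(time, value, probability)` tuple and uses straight-line distance as a stand-in for geodesic distance. The names `distance`, `midpoint`, `add_point`, and `capacity` are illustrative only.

```python
import math
from itertools import combinations


def distance(a, b):
    # ASSUMPTION: straight-line distance stands in for the geodesic
    # distance on the paper's surface model, which the abstract does not define.
    return math.dist(a, b)


def midpoint(a, b):
    # Coordinate-wise midpoint of two points of equal dimension.
    return tuple((x + y) / 2 for x, y in zip(a, b))


def add_point(sample, point, capacity):
    """Append a point; if the sample exceeds capacity, merge its closest pair."""
    sample.append(point)
    if len(sample) > capacity:
        # Exhaustive search over all index pairs (i < j) for the closest pair.
        i, j = min(
            combinations(range(len(sample)), 2),
            key=lambda ij: distance(sample[ij[0]], sample[ij[1]]),
        )
        merged = midpoint(sample[i], sample[j])
        # Remove the larger index first so the smaller index stays valid.
        sample.pop(j)
        sample.pop(i)
        sample.append(merged)
    return sample


# Hypothetical stream of (time, value, probability) tuples.
stream = [
    (0.0, 1.0, 0.5),
    (1.0, 1.1, 0.5),
    (2.0, 5.0, 0.2),
    (3.0, 5.2, 0.2),
    (4.0, 9.0, 0.1),
]

sample = []
for p in stream:
    add_point(sample, p, capacity=3)

print(sample)
```

Because a merged point summarizes the two points it replaces, information from "discarded" data is retained in the sample rather than dropped outright. The exhaustive pair search costs quadratic time in the sample size per arrival, which is acceptable only for small samples; the sketch makes no claim about how the paper computes distances efficiently.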

