LI Dong, ZHANG Qixu, LIANG Xiaochong, GUAN Jida, XU Yang. Selectivity Estimation for String Predicates Based on Modified Pruned Count-Suffix Tree[J]. Chinese Journal of Electronics, 2015, 24(1): 76-82.
Citation: LI Dong, ZHANG Qixu, LIANG Xiaochong, GUAN Jida, XU Yang. Selectivity Estimation for String Predicates Based on Modified Pruned Count-Suffix Tree[J]. Chinese Journal of Electronics, 2015, 24(1): 76-82.

Selectivity Estimation for String Predicates Based on Modified Pruned Count-Suffix Tree

Funds:  This work is supported by National Natural Science Foundation of China (No.71090403), Education Ministry of Education of P.R.C (No.x2rjB7110020), Bureau of Science and Information Technology of Guangzhou (No.2011J4100061), Bureau of Science and Information Technology of Zhaoqing (No.2013C001).
  • Received Date: 2013-11-01
  • Rev Recd Date: 2014-04-01
  • Publish Date: 2015-01-10
  • The accuracy of predicates selectivity estimation is one of the important factors affecting query optimization performance. State-of-art selectivity estimation algorithms for string predicates based on Pruned count-suffix tree (PST) often suffer severe underestimating and overestimating problems, thus their average relative errors are not good. We analyze the main causes of the underestimating and overestimating prob-lems, propose a novel Restricted pruned count-suffix tree (RPST) and a new pruning strategy. Based on these, we present the EKVI algorithm and the EMO algorithm which are extended from the KVI algorithm and the MO algorithm respectively. The experiments compare the EKVI algorithm and the EMO algorithm with the traditional KVI algorithm and the MO algorithm, and the results show that the average relative errors of our selectivity estimation algorithms are significantly better than the traditional selectivity estimation algorithms. The EMO algorithm is the best among these algorithms from the overall view.
  • loading
  • P. Krishnan, J.S. Vitter and B. Iyer, "Estimating alphanumeric selectivity in the presence of wildcards", Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Vol.2, No.25, pp.282-293, 1996.
    H.V. Jagadish, R.T. Ng and D. Srivastava, "Substring selectivity estimation", Proceedings of the Eighteenth ACM SIGMODSIGACT- SIGART Symposium on Principles of Database Systems, Philadelphia, Pennsylvania, USA, pp.249-260, 1999.
    L. Lim, M. Wang and J.S. Vitter, "CXHist: An on-line classification-based histogram for XML string selectivity estimation", Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, pp.1187-1198, 2005.
    H.V. Jagadish, O. Kapitskaia, R.T. Ng and D. Srivastava, "One dimensional and multi-dimensional substring selectivity estimation", The VLDB Journal, Vol.9, No.3, pp.214-230, 2000.
    A. Wagner, V. Bicer and T.D. Tran, "Selectivity estimation for hybrid queries over text-rich data graphs", Proceedings of the 16th International Conference on Extending Database Technology, Genoa, Italy, pp.383-394, 2013.
    D.Z. Wang, L. Wei, Y. Li, et al., "Selectivity estimation for extraction operators over text data", 2011 IEEE 27th International Conference on Data Engineering (ICDE), Hannover, Germany, pp.685-696, 2011.
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Article Metrics

    Article views (266) PDF downloads(919) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return