Selectivity Estimation for String Predicates Based on Modified Pruned Count-Suffix Tree
-
Graphical Abstract
-
Abstract
The accuracy of predicates selectivity estimation is one of the important factors affecting query optimization performance. State-of-art selectivity estimation algorithms for string predicates based on Pruned count-suffix tree (PST) often suffer severe underestimating and overestimating problems, thus their average relative errors are not good. We analyze the main causes of the underestimating and overestimating prob-lems, propose a novel Restricted pruned count-suffix tree (RPST) and a new pruning strategy. Based on these, we present the EKVI algorithm and the EMO algorithm which are extended from the KVI algorithm and the MO algorithm respectively. The experiments compare the EKVI algorithm and the EMO algorithm with the traditional KVI algorithm and the MO algorithm, and the results show that the average relative errors of our selectivity estimation algorithms are significantly better than the traditional selectivity estimation algorithms. The EMO algorithm is the best among these algorithms from the overall view.
-
-