The problem of rewriting queries has been heavily explored in recent years, including in work on query processing and optimization, semantic query refinement in decentralized environments, the rewriting of queries using views, and view maintenance. Previous work has made the restrictive assumption that the rewritten query must be equivalent to the initially given query. We propose to relax this assumption to allow for query rewriting in situations where equivalent rewritings may not exist, yet alternative, not necessarily equivalent, rewritings may still be preferable to users over receiving no answers at all. Our approach is based on a preference model, an extension of SQL called E-SQL, that captures the intent of the query in terms of how much deviation from the original query would still be acceptable to the user. In this paper, we introduce an analytical model of query rewritings that incorporates measures of the quality of a query in addition to the commonly studied measures of cost (query performance). Quality is modeled as a function of the divergence from the intended view extent, in terms of the preservation of both the amount and the type of information. Quality and cost are integrated into one uniform model, called the QC-Model, to allow for a trade-off between these two measures. This model can be used to compare two alternative (even if not equivalent) rewritings, and thus to establish a ranking among a possibly large set of query rewritings. Our model is the first to allow for automatic selection of good solutions in environments with numerous non-equivalent query rewritings. We also report experimental studies that characterize trends, correlations, and independence among the different efficiency factors, and that demonstrate the utility of the proposed QC-Model in establishing a ranking among rewritings.
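The idea of ranking non-equivalent rewritings by a combined quality and cost measure can be illustrated with a minimal sketch. This is not the QC-Model's actual formulation, which the abstract does not give. The `Rewriting` class, the `qc_score` function, the linear weighting, the normalization of both measures to [0, 1], and the sample values are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rewriting:
    """A candidate rewriting (hypothetical structure, not from the paper)."""
    name: str
    quality: float  # assumed in [0, 1]; 1.0 means no divergence from the intended view extent
    cost: float     # assumed normalized to [0, 1]; higher means more expensive


def qc_score(rewriting: Rewriting, quality_weight: float = 0.5) -> float:
    """Combine quality and cost into one score (assumed linear trade-off)."""
    if not 0.0 <= quality_weight <= 1.0:
        raise ValueError("quality_weight must lie in [0, 1]")
    # Cost is inverted so that a higher score is always better.
    return quality_weight * rewriting.quality + (1.0 - quality_weight) * (1.0 - rewriting.cost)


def rank_rewritings(rewritings, quality_weight: float = 0.5):
    """Return the rewritings ordered from best to worst combined score."""
    return sorted(rewritings, key=lambda r: qc_score(r, quality_weight), reverse=True)


# Illustrative values only; they are not experimental results.
candidates = [
    Rewriting("V1", quality=1.0, cost=0.9),  # fully preserving but expensive
    Rewriting("V2", quality=0.8, cost=0.2),  # slightly divergent, cheap
    Rewriting("V3", quality=0.4, cost=0.1),  # heavily divergent, cheapest
]

balanced = [r.name for r in rank_rewritings(candidates, quality_weight=0.5)]
quality_first = [r.name for r in rank_rewritings(candidates, quality_weight=0.9)]
print(balanced)
print(quality_first)
```

Under these assumed weights, changing `quality_weight` changes which rewriting ranks first, which is the kind of trade-off a unified quality and cost measure is meant to expose.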