QoS: Quality Driven Data Abstraction for Large Databases

Wad, Charudatta V

Etd

QoS: Quality Driven Data Abstraction for Large Databases

Public

Data abstraction is the process of reducing a large dataset into one of moderate size, while maintaining dominant characteristics of the original dataset. Data abstraction quality refers to the degree by which the abstraction represents original data. Clearly, the quality of an abstraction directly affects the confidence an analyst can have in results derived from such abstracted views about the actual data. While some initial measures to quantify the quality of abstraction have been proposed, they currently can only be used as an after thought. While an analyst can be made aware of the quality of the data he works with, he cannot control the desired quality and the trade off between the size of the abstraction and its quality. While some analysts require atleast a certain minimal level of quality, others must be able to work with certain sized abstraction due to resource limitations. consider the quality of the data while generating an abstraction. To tackle these problems, we propose a new data abstraction generation model, called the QoS model, that presents the performance quality trade-off to the analyst and considers that quality of the data while generating an abstraction. As the next step, it generates abstraction based on the desired level of quality versus time as indicated by the analyst. The framework has been integrated into XmdvTool, a freeware multi-variate data visualization tool developed at WPI. Our experimental results show that our approach provides better quality with the same resource usage compared to existing abstraction techniques.

Creator