UBC Theses and Dissertations

Efficient data mining of constrained association rules

Pang, Chiu Yan (Alex)

Abstract

With recent advances in information technology, companies are collecting more and more data related to their business, and they are very interested in decision support systems that can discover knowledge from that data and help them gain insight into it. Data mining, which aims to discover non-trivial information or patterns hidden in large databases, has therefore recently become one of the most active research areas in database technology. Association rules relate items that tend to occur together in a given event or record, and mining them is one of the most important problems in data mining. However, the current framework suffers seriously from a lack of user interaction and focus. In this thesis, we propose a new paradigm called Constrained Association Rules, where (i) the mining of the rules is divided into two phases with various breakpoints for user feedback, and (ii) users can associate constraints with their queries. We analyze many SQL-style constraints and introduce the notions of succinctness and anti-monotonicity for their classification. We design a new algorithm called CAP for mining association rules that satisfy a set of given constraints. The idea is to check for satisfaction of the constraints as early as possible by exploiting the anti-monotonicity and succinctness of the constraints. Several optimization techniques are developed. Our experimental evaluation indicates that CAP runs much faster than several basic algorithms, sometimes by a factor of as much as 80.
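The idea of checking constraints as early as possible can be illustrated with a small sketch. The code below is not the CAP algorithm from the thesis; it is a plain level-wise (Apriori-style) search in which a user-supplied constraint is tested before support is counted. It assumes the constraint is anti-monotone in the usual sense (if an itemset violates it, every superset does too), so a violating candidate can be dropped without losing any valid result. The function name `frequent_itemsets`, the `prices` table, and the toy transactions are all hypothetical and exist only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, constraint=None):
    """Level-wise search for frequent itemsets.

    `constraint` is ASSUMED to be anti-monotone: once an itemset violates
    it, every superset violates it too, so pruning is safe.
    """
    if constraint is None:
        constraint = lambda itemset: True
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})

    level = [frozenset([item]) for item in items]  # candidates of size 1
    result = {}
    k = 1
    while level:
        survivors = []
        for cand in level:
            # Check the constraint first: a violating candidate is
            # discarded before any pass over the transactions.
            if not constraint(cand):
                continue
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                result[cand] = support
                survivors.append(cand)

        # Build size-(k+1) candidates whose size-k subsets all survived.
        survivor_set = set(survivors)
        next_level = set()
        for a, b in combinations(survivors, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in survivor_set
                for sub in combinations(union, k)
            ):
                next_level.add(union)
        level = sorted(next_level, key=sorted)
        k += 1
    return result


# Hypothetical data: with non-negative prices, "total price <= 40" is
# anti-monotone, because adding items can only increase the total.
prices = {"a": 10, "b": 20, "c": 50}
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
result = frequent_itemsets(
    transactions,
    min_support=2,
    constraint=lambda s: sum(prices[i] for i in s) <= 40,
)
```

In this toy run, item `c` fails the price constraint at the first level, so no itemset containing `c` is ever generated or counted, even though several of them would meet the support threshold.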

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.