Tehnički vjesnik, Vol. 25 No. 2, 2018.
Original scientific paper
https://doi.org/10.17559/TV-20180218021747
Mining Weighted Frequent Closed Episodes over Multiple Sequences
Guoqiong Liao; School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China
Xiaoting Yang; School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China
Sihong Xie; Computer Science and Engineering Department, Lehigh University, Bethlehem, PA 18015, USA
Philip S. Yu; Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607, USA
Changxuan Wan; School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China
Abstract
Frequent episode discovery was introduced to mine useful and interesting temporal patterns from sequential data. Existing episode mining methods mainly focus on mining a single long sequence of events under time constraints. In practice, however, there can be multiple sequences whose importance differs, because the persons or entities associated with each sequence differ in importance. To mine episodes from multiple sequences of different importance, we first define a new kind of episode, the weighted frequent closed episode, which takes sequence importance, episode distribution and occurrence frequency into account together. Second, to facilitate the mining of such episodes, we present a new concept called maximal duration serial episodes, which cuts a whole sequence into multiple maximum episodes using duration constraints, and we discuss its properties for episode shrinking. Finally, based on these theoretical properties, we propose a two-phase approach to mine the new episodes efficiently. In Phase I, we adopt a level-wise episode shrinking framework to discover the candidate frequent closed episodes with the same prefixes; in Phase II, we match the candidates with different prefixes to find the frequent closed episodes. Experiments on simulated and real datasets demonstrate that the proposed episode mining strategy is both effective and efficient.
Keywords
closed episodes; episode mining; frequent episodes; multiple sequences
Hrčak ID:
199150
Publication date:
21 April 2018