Original scientific paper

https://doi.org/10.17559/TV-20180218021747

Mining Weighted Frequent Closed Episodes over Multiple Sequences

Guoqiong Liao ; School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China
Xiaoting Yang ; School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China
Sihong Xie ; Computer Science and Engineering Department, Lehigh University, Bethlehem, PA 18015, USA
Philip S. Yu ; Department of Computer Science, University of Illinois at Chicago, Chicago, IL, 60607, USA
Changxuan Wan ; School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China


Full text: English, PDF, 579 KB

pp. 510-518


Abstract

Frequent episode discovery mines useful and interesting temporal patterns from sequential data. Existing episode mining methods mainly focus on a single long sequence of events with time constraints. However, data often consist of multiple sequences of differing importance, because the persons or entities associated with each sequence differ in importance. To mine episodes from such sequences, we first define a new kind of episode, the weighted frequent closed episode, which takes sequence importance, episode distribution and occurrence frequency into account together. Second, to facilitate mining these episodes, we introduce the concept of maximal duration serial episodes, which cut a whole sequence into multiple maximum episodes using duration constraints, and we discuss their properties for episode shrinking. Finally, based on these theoretical properties, we propose a two-phase approach to mine the new episodes efficiently. In Phase I, we adopt a level-wise episode shrinking framework to discover candidate frequent closed episodes that share the same prefix; in Phase II, we match candidates with different prefixes to find the frequent closed episodes. Experiments on simulated and real datasets demonstrate that the proposed episode mining strategy is effective and efficient.
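
To make the ideas of duration-based cutting and sequence weighting more concrete, the following Python sketch may help. It is not the paper's algorithm: the event layout, the function names (`cut_by_duration`, `occurs_from_start`, `weighted_support`) and the scoring rule (sequence weight multiplied by the number of anchored occurrences) are all assumptions made for illustration. The formal definitions of maximal duration serial episodes, closedness and the weighting scheme are given in the full text.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (event type, timestamp); this layout is an assumption


def cut_by_duration(sequence: List[Event], max_duration: int) -> List[List[Event]]:
    """Cut one sequence into windows, one per start event.

    Each window holds the start event plus every later event whose timestamp
    is at most max_duration after the start. This is an assumed reading of
    the duration constraint, not the paper's formal definition.
    """
    events = sorted(sequence, key=lambda e: e[1])
    return [
        [e for e in events[i:] if e[1] - start <= max_duration]
        for i, (_, start) in enumerate(events)
    ]


def occurs_from_start(episode: List[str], window: List[Event]) -> bool:
    """Check whether a serial episode occurs in order, anchored at the window's first event."""
    if not window or not episode or window[0][0] != episode[0]:
        return False
    pos = 1  # index of the next episode element still to be matched
    for event_type, _ in window[1:]:
        if pos < len(episode) and event_type == episode[pos]:
            pos += 1
    return pos == len(episode)


def weighted_support(
    episode: List[str],
    sequences: Dict[str, List[Event]],
    weights: Dict[str, float],
    max_duration: int,
) -> float:
    """Sum, over all sequences, of sequence weight times anchored occurrence count.

    The scoring rule is hypothetical and only illustrates how sequence
    importance and occurrence frequency could be combined.
    """
    total = 0.0
    for seq_id, seq in sequences.items():
        count = sum(
            occurs_from_start(episode, window)
            for window in cut_by_duration(seq, max_duration)
        )
        total += weights[seq_id] * count
    return total
```

Under these assumptions, an episode that occurs often in a highly weighted sequence scores above one that occurs equally often in a low-weight sequence, which is the intuition behind treating sequences of different importance differently.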

Keywords

closed episodes; episode mining; frequent episodes; multiple sequences

Hrčak ID:

199150

URI

https://hrcak.srce.hr/199150

Publication date:

21 April 2018
