
Original scientific paper

https://doi.org/10.24138/jcomss-2022-0065

Signature-based Tree for Finding Frequent Itemsets

Mohamed El Hadi Benelhadj ; Faculty of Science and Technologies, Tamanrasset University, Algeria
Mohamed Mahmoud Deye ; Cheikh Anta Diop University of Dakar, Senegal
Yahya Slimani ; Institute of Multimedia Art of Manouba (ISAMM), University of Manouba, Tunisia


Full text: English pdf 1.455 Kb

pp. 70-80

Downloads: 79



Abstract

The efficiency of a data mining process depends on the data structure used to find frequent itemsets. Two approaches are possible: use the original transaction dataset, or transform it into a more compact structure. Many algorithms use a tree as the compact structure, for example the FP-Tree and its associated algorithm, FP-Growth. Although this structure reduces the number of dataset scans to only two, its efficiency depends on two criteria: (i) the size of the support (small or large); (ii) the type of transaction dataset (sparse or dense). Depending on these two criteria, the generated tree can be very large. In this paper, we propose a new tree-based structure that focuses on transactions rather than on itemsets. We thereby avoid the negative impact that support values have on the generated tree.
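
To make the FP-Tree behaviour described in the abstract concrete, the following is a minimal sketch of the standard two-scan FP-Tree construction. It is not the signature-based tree proposed in the paper. The names `Node`, `build_fp_tree` and `count_nodes`, as well as the toy transactions, are assumptions introduced purely for illustration.

```python
from collections import Counter


class Node:
    """One FP-Tree node: an item, its count, and its children keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a plain FP-Tree with exactly two passes over the transactions."""
    # Scan 1: count in how many transactions each item occurs.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = Node(None)
    # Scan 2: insert each transaction, keeping only frequent items,
    # ordered by decreasing support (ties broken by item name).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical toy dataset, not taken from the paper.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]

tree_high, _ = build_fp_tree(transactions, min_support=2)
tree_low, _ = build_fp_tree(transactions, min_support=1)
print(count_nodes(tree_high), count_nodes(tree_low))
```

On this toy input, lowering the minimum support from 2 to 1 keeps the infrequent item `d` in the tree and so adds a node, which is a small-scale instance of the dependence on support that the abstract points out.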

Keywords

Data Mining; Data compression; Data storage; Tree structure; Signature

Hrčak ID:

299773

URI

https://hrcak.srce.hr/299773

Publication date:

31 March 2023

Visits: 217 *