Parallel and Distributed Multi-level Entropy-Based Approach for Adaptive Global Frequent Pattern Mining in Large Datasets

Essalmi, Houda; El Affar, Anass

doi:10.32985/ijeces.17.1.5

International journal of electrical and computer engineering systems, Vol. 17 No. 1, 2026.

Original scientific paper

https://doi.org/10.32985/ijeces.17.1.5


Houda Essalmi ; Laboratory of Engineering Sciences, Polydisciplinary Faculty of Taza, University of Sidi Mohamed Ben Abdellah Fez, Morocco *
Anass El Affar ; Laboratory of Engineering Sciences, Polydisciplinary Faculty of Taza, University of Sidi Mohamed Ben Abdellah Fez, Morocco

* Corresponding author.

Full text: English, PDF, 1,407 KB

pp. 49-64



Abstract

Frequent pattern mining in distributed settings remains a significant challenge because of its high computational cost and high communication overhead. This paper presents AGFPM (Adaptive Global Frequent Pattern Mining), a novel solution that integrates an extensible Master-Slave architecture with a pruning technique based on binary entropy and statistical quartiles. AGFPM introduces two primary data structures: the LP-Tree (Local Prefix Tree) and the GP-Tree (Global Prefix Tree). Each Slave site builds one LP-Tree in a single pass over its local data, and branches with low information value are pruned early using entropy and quartile thresholds. Rather than transferring complete trees, each Slave site sends only succinct metadata to the Master site, where the GP-Tree is built from items sorted globally by their entropy rankings. A significant aspect of AGFPM is its flexible pruning approach: the GP-Tree is either pruned or left unpruned, depending on user criteria. This allows the trade-off between performance and generality of results to be adjusted dynamically, giving control over the level of compression applied when generating global patterns. Global frequent patterns are then mined recursively from the GP-Tree using conditional sub-GP-Trees. Frequent patterns are extended at each level of the hierarchy by intersecting the common prefix paths, guided by a Global Header Table. AGFPM demonstrates improved execution time, scalability, and robustness at low support thresholds relative to existing methods.
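The abstract does not give the exact entropy measure or quartile rule, so the following Python fragment is only a minimal sketch of how an entropy and quartile pruning step could look at a single Slave site. It assumes that each item is scored by the binary entropy of its relative support and that items scoring below the first quartile are discarded. The function names (binary_entropy, item_entropies, prune_by_first_quartile), the toy transactions, and the choice of the first quartile as the cut-off are all illustrative assumptions, not the authors' definitions.

```python
import math
from collections import Counter
from statistics import quantiles


def binary_entropy(p):
    """Binary entropy H(p) in bits; taken as 0 at p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def item_entropies(transactions):
    """Score each item by the binary entropy of its relative support."""
    n = len(transactions)
    # set(t) counts an item at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    return {item: binary_entropy(c / n) for item, c in counts.items()}


def prune_by_first_quartile(entropies):
    """Keep items whose score reaches the first quartile (assumed rule)."""
    q1 = quantiles(list(entropies.values()), n=4)[0]
    return {item: h for item, h in entropies.items() if h >= q1}


# Hypothetical local dataset held by one Slave site.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "d"]]
entropies = item_entropies(transactions)
kept = prune_by_first_quartile(entropies)
print(sorted(kept))  # ['b', 'c', 'd']
```

In this toy dataset item "a" occurs in every transaction, so its binary entropy is 0 and it falls below the first quartile. Whether AGFPM treats such items this way cannot be determined from the abstract alone.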

Keywords

Data mining; Distributed datasets; FP-tree; Communication overhead; Frequent pattern mining; Binary entropy; Quartile-based pruning

Hrčak ID:

342322

URI

https://hrcak.srce.hr/342322

Publication date:

5 January 2026
