Original scientific article

https://doi.org/10.31803/tg-20250225100638

Replacing Backpropagation with the Forward-Forward (FF) Algorithm in Transformer Models: A Theoretical and Empirical Study on Scalable and Efficient Gradient-Free Training

Hyun Jung Kim (ORCID: orcid.org/0000-0003-3845-0560); Sang-Huh College and the Graduate School of Information & Communication, Department of Convergence Information Technology (Artificial Intelligence Major), Konkuk University, Republic of Korea
Sang Hyun Yoo (ORCID: orcid.org/0009-0008-9199-8238); Department of Computer Software, Kyungmin University, 545, Seo-ro, Uijeongbu-si, 11618 Gyeonggi-do, Republic of Korea *

* Corresponding author.


Full text: English (PDF, 1.082 KB)

pp. 452-460

Abstract

This study proposes a novel integration of the Forward-Forward (FF) algorithm into Transformer architectures as an efficient, gradient-free alternative to Backpropagation (BP). Motivated by the computational limitations of BP, such as high memory usage and gradient instability, we examine whether FF can maintain comparable model performance while improving training efficiency. We present both theoretical justifications and empirical evaluations on the IMDB sentiment analysis dataset. Our experiments show that FF reduces training time by approximately 20% and memory usage by 30%, with only a marginal decrease in BLEU score (27.8 vs. 28.3) and a slight increase in Perplexity (13.2 vs. 12.5). Furthermore, we extend our evaluation across varying model depths and hardware platforms (desktop GPU, cloud GPU, SoC-based laptop), and perform statistical testing and ablation studies to investigate FF's behavior within Transformer components. These results highlight the viability of FF for scalable, resource-efficient Transformer training and provide a foundation for future research in hybrid and distributed deep learning frameworks.
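The full paper (not reproduced on this page) describes how FF is wired into the Transformer blocks; the sketch below only illustrates the core Forward-Forward rule the abstract refers to, assuming PyTorch, a plain fully connected layer, and randomly generated stand-in data. The class name FFLayer, the goodness threshold of 2.0, and the way negative samples are produced are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    """One layer trained with the Forward-Forward rule: raise the 'goodness'
    (mean squared activation) of positive samples above a threshold and push
    that of negative samples below it, using only a layer-local loss."""

    def __init__(self, d_in, d_out, threshold=2.0, lr=1e-3):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Length-normalize the input so only the direction of the previous
        # layer's activity is passed on, as in Hinton's formulation.
        x = x / (x.norm(dim=-1, keepdim=True) + 1e-8)
        return F.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).mean(dim=-1)   # goodness of positives
        g_neg = self.forward(x_neg).pow(2).mean(dim=-1)   # goodness of negatives
        # Logistic loss: positives should exceed the threshold, negatives stay below it.
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()        # gradients never cross layer boundaries
        self.opt.step()
        # Detach outputs so no gradient flows back to earlier layers (no end-to-end BP).
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()


# Illustrative greedy, layer-wise training loop on random stand-in data.
layers = [FFLayer(128, 256), FFLayer(256, 256)]
x_pos = torch.randn(32, 128)   # stands in for embeddings of real (positive) inputs
x_neg = torch.randn(32, 128)   # stands in for corrupted (negative) inputs
for _ in range(5):
    h_pos, h_neg = x_pos, x_neg
    for layer in layers:
        h_pos, h_neg = layer.train_step(h_pos, h_neg)
```

Because each layer optimizes its own local objective and passes detached activations onward, no activation graph for the whole network has to be stored, which is the mechanism behind the memory and training-time savings the abstract reports.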

Keywords

Backpropagation Alternative; Computational Efficiency; Efficient AI Training; Forward-Forward (FF) Algorithm; Training Stability; Transformer Models

Hrčak ID:

332173

URI

https://hrcak.srce.hr/332173

Publication date:

15 September 2025
