Technical gazette, Vol. 31 No. 6, 2024.
Original scientific paper
https://doi.org/10.17559/TV-20231218001216
Tokenization and Memory Optimization for Reducing GPU Load in NLP Deep Learning Models
Dejan Dodić*
The Academy of Applied Technical and Preschool Studies, Department of Information-Communication Technologies, Beogradska 18, Niš, Serbia
Dušan Regodić
MB University, Faculty of Business and Law, Department of Advanced Information Technologies, Teodora Drajzera 27, Belgrade, Serbia
* Corresponding author.
Abstract
In the current landscape of advanced natural language processing (NLP), managing GPU memory effectively is crucial. This paper examines tokenization methods and data-handling strategies that improve the efficiency of NLP models, with a particular focus on avoiding "CUDA out of memory" errors. It shows how careful tokenization and control of text lengths in large datasets can improve model performance while reducing peak memory use. These insights are vital for optimizing resources and scaling NLP models, especially when GPU memory is limited. The paper also situates these challenges within the broader NLP field, underlining the significance of memory optimization as language models grow in complexity. It reviews key NLP technologies, including transformer models, and the memory-optimization challenges they pose, and connects the proposed techniques to ongoing research and trends in NLP. The overall aim is to advance natural language processing methods and make AI technologies more accessible.
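The abstract's central idea — controlling text lengths at tokenization time to bound GPU memory — can be illustrated with a minimal sketch. This is not the paper's exact pipeline; the tokenizer, vocabulary, and parameter names below are illustrative assumptions. The point is that truncating and padding every sequence to a fixed `max_length` makes the batch tensor's footprint predictable (proportional to batch size × sequence length, and quadratic in sequence length for transformer attention), which is what prevents "CUDA out of memory" errors on long outlier texts.

```python
# Illustrative sketch (not the authors' exact method): capping sequence
# length at tokenization time bounds the memory a batch can occupy.

def tokenize_batch(texts, vocab, max_length=8, pad_id=0, unk_id=1):
    """Whitespace-tokenize, map tokens to ids, truncate, and pad.

    Every returned row has exactly max_length ids, so a tensor built
    from the result has a fixed, predictable GPU footprint regardless
    of how long the raw texts are.
    """
    batch = []
    for text in texts:
        ids = [vocab.get(tok, unk_id) for tok in text.lower().split()]
        ids = ids[:max_length]                     # truncation caps memory
        ids += [pad_id] * (max_length - len(ids))  # padding fixes the shape
        batch.append(ids)
    return batch

# Tiny illustrative vocabulary (hypothetical ids).
vocab = {"gpu": 2, "memory": 3, "optimization": 4, "nlp": 5}
batch = tokenize_batch(
    ["GPU memory optimization for NLP", "NLP"], vocab, max_length=4
)
# batch -> [[2, 3, 4, 1], [5, 0, 0, 0]]: the long text is truncated,
# the short one padded, and both rows share one fixed shape.
```

Production tokenizers (e.g. those shipped with transformer libraries) expose the same idea through truncation and padding options; the sketch simply makes the memory argument concrete.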
Keywords
data tokenization; deep learning; CUDA out of memory; GPU memory optimization; machine learning; natural language processing (NLP)
Hrčak ID: 321922
Publication date: 31.10.2024.