THE ANALYSIS OF EXPERIMENTAL RESULTS OF MACHINE LEARNING APPROACH

Poliscuk, Jaroslav E.

Journal of Information and Organizational Sciences, Vol. 27 No. 1, 2003.

Original scientific paper

Jaroslav E. Poliscuk ; Department of Electrical Engineering, University of Montenegro, Podgorica, Montenegro

Full text: English, PDF, 245 KB

pp. 29-42

Cite

APA 6th Edition

Poliscuk, J.E. (2003). THE ANALYSIS OF EXPERIMENTAL RESULTS OF MACHINE LEARNING APPROACH. Journal of Information and Organizational Sciences, 27 (1), 29-42. Retrieved from https://hrcak.srce.hr/78379

MLA 8th Edition

Poliscuk, Jaroslav E. "THE ANALYSIS OF EXPERIMENTAL RESULTS OF MACHINE LEARNING APPROACH." Journal of Information and Organizational Sciences, vol. 27, no. 1, 2003, pp. 29-42. https://hrcak.srce.hr/78379. Accessed 18 Dec. 2024.

Chicago 17th Edition

Poliscuk, Jaroslav E. "THE ANALYSIS OF EXPERIMENTAL RESULTS OF MACHINE LEARNING APPROACH." Journal of Information and Organizational Sciences 27, no. 1 (2003): 29-42. https://hrcak.srce.hr/78379

Harvard

Poliscuk, J.E. (2003). 'THE ANALYSIS OF EXPERIMENTAL RESULTS OF MACHINE LEARNING APPROACH', Journal of Information and Organizational Sciences, 27(1), pp. 29-42. Available at: https://hrcak.srce.hr/78379 (Accessed: 18 December 2024)

Vancouver

Poliscuk JE. THE ANALYSIS OF EXPERIMENTAL RESULTS OF MACHINE LEARNING APPROACH. Journal of Information and Organizational Sciences [Internet]. 2003 [cited 2024 Dec 18];27(1):29-42. Available from: https://hrcak.srce.hr/78379

IEEE

J.E. Poliscuk, "THE ANALYSIS OF EXPERIMENTAL RESULTS OF MACHINE LEARNING APPROACH", Journal of Information and Organizational Sciences, vol. 27, no. 1, pp. 29-42, 2003. [Online]. Available: https://hrcak.srce.hr/78379. [Accessed: 18 December 2024]

Abstract

This article analyzes a reinforcement learning method in which a subject of learning is defined. The essence of the method is the selection of actions by trial and error and the awarding of delayed rewards. If an environment has the Markov property, its one-step dynamics make it possible to predict the next state and the next reward from the currently known state and action, which corresponds to a Markov decision process. The relationship between the value of the present state and the values of potential future states is defined by the Bellman equation. The article also discusses temporal difference learning and the mechanism of eligibility traces, together with their algorithms TD(0) and TD(Lambda). The theoretical analysis is supplemented by practical studies of an implementation of the Sarsa(Lambda) algorithm with replacing eligibility traces and an epsilon-greedy policy.
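As an illustration of the last point, the following is a minimal sketch of tabular Sarsa(Lambda) with replacing eligibility traces and an epsilon-greedy policy. It is not the implementation studied in the article: the corridor environment, the function names `epsilon_greedy` and `sarsa_lambda`, and all parameter values are assumptions chosen only to keep the example short and self-contained.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so all-zero initial estimates do not bias the choice.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def sarsa_lambda(n_states=6, episodes=300, alpha=0.1, gamma=0.9,
                 lam=0.8, epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) with replacing traces on a toy corridor.

    Hypothetical environment (an assumption, not from the article):
    states 0..n_states-1, actions 0 (left) and 1 (right). Entering the
    last state ends the episode with reward 1; every other reward is 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        e = [[0.0, 0.0] for _ in range(n_states)]  # traces reset per episode
        s = 0
        a = epsilon_greedy(q[s], epsilon, rng)
        while s != goal:
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            a2 = epsilon_greedy(q[s2], epsilon, rng)
            # On-policy TD error; the terminal state contributes no future value.
            target = r if s2 == goal else r + gamma * q[s2][a2]
            delta = target - q[s][a]
            e[s][a] = 1.0  # replacing trace: set to 1 instead of incrementing
            for i in range(n_states):
                for j in range(2):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam  # decay every trace
            s, a = s2, a2
    return q
```

Calling `sarsa_lambda()` returns the learned action-value table as a list of `[left, right]` pairs, one per state. Setting `lam=0` reduces the update to one-step Sarsa, which is the action-value counterpart of TD(0).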

Keywords

algorithm TD(0); algorithm TD(Lambda); Bellman equation; Markov decision making process; mechanism of eligibility traces; method of temporal difference learning; reinforcement learning method

Hrčak ID:

78379

URI

https://hrcak.srce.hr/78379

Publication date:

13 June 2003
