Original scientific paper

https://doi.org/10.31803/tg-20230715000112

Identification of LSA Data Retrieval Method and Temporal Graph for Document Retrieval

Shahla Rezvani ; Department of Information Science and Knowledge Management, Faculty of Management, University of Tehran, Tehran, Iran
Nader Naghshineh ; Department of Information Science and Knowledge Management, Faculty of Management, University of Tehran, Tehran, Iran *
Ahmad Khalilijafarabad ; Department of IT Management, Faculty of Management, University of Tehran, Tehran, Iran

* Corresponding author.


page 9-16


Abstract

Many approaches to expert finding have been proposed in both academia and industry, drawing on a variety of new techniques from related data fields. This study aims to identify an information retrieval method based on latent semantic analysis (LSA) and a temporal graph for document retrieval. Citation occurrence and author occurrence are the independent variables, and the measures of expert-author finding are the dependent variables. The method used to evaluate judgments of document and author relevance in the test set formation phase resembles survey methods. A library method is used to study the theoretical foundations and assess the literature. The study has three populations: a) the test set documents; b) the people who make queries and judge the relevance of the retrieved documents; c) the people who judge the relevance of the retrieved experts. Judgments of document relevance are measured with a method similar to peer tests: repeated results are placed among the retrieved results to determine the accuracy and reliability of each judge. The correlation obtained with this method is very high (0.98), indicating that the results are reliable. The LSA information retrieval model, which was ultimately used to retrieve expert authors, outperformed the baseline model. This was evident from the obtained metrics: precision at the top 5 results (P@5) of 0.895, mean average precision (MAP) of 0.839, and mean reciprocal rank (MRR) of 0.909. The improved retrieval performance can be attributed to the superior performance of the dimension reduction method compared to keyword matching.
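The three reported metrics can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes binary relevance judgments (1 = relevant, 0 = not relevant) listed in ranked order for each query, and it assumes that every relevant item appears somewhere in the ranked list. The function names and the sample judgments are hypothetical.

```python
from typing import Dict, Sequence


def precision_at_k(relevance: Sequence[int], k: int = 5) -> float:
    """Fraction of the top-k ranked results that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[int]) -> float:
    """Mean of precision@rank taken at each rank holding a relevant result.

    Assumption: all relevant items are present in the ranked list, so the
    number of hits is used as the denominator.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def evaluate(runs: Sequence[Sequence[int]], k: int = 5) -> Dict[str, float]:
    """Average each metric over all queries to obtain P@k, MAP and MRR."""
    n = len(runs)
    if n == 0:
        raise ValueError("at least one query is required")
    return {
        "p_at_k": sum(precision_at_k(r, k) for r in runs) / n,
        "map": sum(average_precision(r) for r in runs) / n,
        "mrr": sum(reciprocal_rank(r) for r in runs) / n,
    }


# Hypothetical judgments for two queries, in ranked order (not study data).
sample_runs = [
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
]
scores = evaluate(sample_runs, k=5)
```

Averaging per query before reporting means that each query contributes equally to the score, regardless of how many relevant results it has.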

Keywords

expert finding; expert retrieval; information retrieval; information systems; temporal graph; test set

Hrčak ID:

327589

URI

https://hrcak.srce.hr/327589

Publication date:

14.3.2025.
