
Original scientific paper

https://doi.org/10.24138/jcomss.v12i4.78

An Approach to Page Ranking Based on Discourse Structures

Subalalitha Chinnaudayar Navaneethakrishnan ; Department of Computer Science and Engineering SRM University, Kattankulathur Chennai, India
Anita Ramalingam ; Department of Computer Science and Engineering SRM University, Kattankulathur Chennai, India


Full text: English, PDF, 1.968 Kb

pages 195-200

downloads: 395

Abstract

The World Wide Web (WWW), today's predominant source for Information Retrieval (IR), is essentially a set of hyperlinked documents. A web page that contains a larger number of related hyperlinks can satisfy the user's needs within a single page, so IR systems should give high priority to such pages. When assigning a rank to a web page, existing web mining techniques such as Hypertext Induced Topic Selection (HITS) and page ranking algorithms focus on the number of in-links and out-links of the page. Rather than relying only on the number of links, discovering the semantic relations between a web page and the hyperlinks it contains can improve the quality of IR systems. Rhetorical Structure Theory (RST) is widely used to find semantic relations between text fragments by analysing the discourse structure of a text. In this paper, we propose a novel approach that uses RST-based discourse relations to find the semantic relation between a web page and the hyperlinks present in it. We have implemented and evaluated our approach on an IR system using 500 Tamil-language and 50 English tourism-domain-specific web pages. We also compare the proposed approach with an existing page ranking algorithm.

Keywords

Discourse structure; Link analysis; Rhetorical Structure Theory

Hrčak ID:

179757

URI

https://hrcak.srce.hr/179757

Publication date:

22.12.2016.
