Page Content Rank: an Approach to the Web Content Mining

Publication at Faculty of Mathematics and Physics | 2005

Abstract

The method, which the paper calls Page Content Rank (PCR), combines a number of heuristics that appear to be important for analyzing the content of Web pages. The importance of a page is determined on the basis of the importance of the terms that the page contains.

The importance of a term is specified with respect to a given query and is based on the term's statistical and linguistic features. PCR uses a neural network as its internal classification structure.
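
The general idea of scoring a page through its terms can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the two features (relative frequency and occurrence in the query), the hand-picked weights, the single logistic unit standing in for the neural network, and the averaging step are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_importance(term, counts, total_terms, query_terms):
    """Score one term in (0, 1) from two simple statistical features.

    Hypothetical stand-in for the neural network mentioned in the abstract:
    a single logistic unit with hand-picked weights (assumed, not learned).
    """
    rel_freq = counts[term] / total_terms          # statistical feature
    in_query = 1.0 if term in query_terms else 0.0  # query-dependent feature
    z = 4.0 * rel_freq + 2.0 * in_query - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def page_importance(page_text, query):
    """Aggregate term importances into a page score.

    The mean over distinct terms is an assumed aggregation; the abstract
    does not say how term scores are combined.
    """
    tokens = tokenize(page_text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    query_terms = set(tokenize(query))
    scores = [
        term_importance(t, counts, len(tokens), query_terms) for t in counts
    ]
    return sum(scores) / len(scores)
```

Under these assumptions, a page whose terms overlap with the query receives a higher score than an unrelated page, for example when comparing `page_importance("neural network ranking of web pages with a neural network", "neural network")` with the score of a page about cooking for the same query.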