International Journal of Soft Computing

Year: 2013
Volume: 8
Issue: 3
Page No. 149 - 153

Web Page Clustering Based on Novel Latent Semantic Approach

Authors: P. Manimaran and K. Duraiswamy

Abstract: Clustering algorithms are usually based on the Bag-of-Words (BOW) approach. A well-known limitation of the BOW model is that it ignores the semantic relationships among words. As a result, if two documents use different sets of core words to describe the same topic, they may be assigned to different clusters even though those words are synonyms or otherwise semantically related. The same limitation affects conventional web page clustering, which is often used to reveal the functional similarity of web pages. Tagging can help improve clustering performance, and several efforts have been made to exploit social tagging for clustering. However, tag-based web page clustering has drawbacks of its own. All existing approaches that exploit tag information for web page clustering assume that every web page is tagged, which is a restrictive assumption. In a more realistic setting, tags can be expected to be available for only a small number of web pages. This study proposes a new web page clustering approach based on the Probabilistic Latent Semantic Analysis (PLSA) model. An iterative algorithm based on the maximum likelihood principle is employed to overcome the aforementioned computational shortcoming.
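
The abstract does not spell out the iterative procedure, so the following is only a minimal sketch of how a PLSA model is commonly fitted: by Expectation-Maximization (EM), which raises the likelihood of the observed page-term counts at each step. The use of EM, the function name `plsa`, the toy count matrix, and the rule of assigning each page to its most probable latent topic are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a page-term count matrix given as a list of lists.

    Returns P(z|d), P(w|z) and the log-likelihood recorded at each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0:  # e.g. an empty page: fall back to a uniform distribution
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random positive initialisation; every row is a probability distribution.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    log_liks = []
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                p_w_given_d = sum(joint)
                log_lik += n_dw * math.log(p_w_given_d)
                for z in range(n_topics):
                    resp = n_dw * joint[z] / p_w_given_d
                    new_z_d[d][z] += resp
                    new_w_z[z][w] += resp
        # M-step: renormalise the expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
        log_liks.append(log_lik)  # likelihood under the pre-update parameters
    return p_z_d, p_w_z, log_liks

# Hypothetical toy data: rows are pages, columns are terms.
# Pages 0 and 1 share only the third term; pages 2 and 3 share only the fifth.
counts = [
    [4, 0, 3, 0, 0, 0],
    [0, 4, 3, 0, 0, 0],
    [0, 0, 0, 5, 2, 0],
    [0, 0, 0, 0, 3, 4],
]
p_z_d, p_w_z, log_liks = plsa(counts, n_topics=2)

# Assign each page to the latent topic with the highest P(z|d).
labels = [max(range(2), key=lambda z: row[z]) for row in p_z_d]
print(labels)
```

Because pages are compared through latent topics rather than raw term overlap, two pages that use different core words can still receive the same label when those words co-occur with shared terms. EM only converges to a local optimum, so the labels depend on the initialisation and are not guaranteed to match the intended grouping.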

How to cite this article:

P. Manimaran and K. Duraiswamy, 2013. Web Page Clustering Based on Novel Latent Semantic Approach. International Journal of Soft Computing, 8: 149-153.
