Authors: G. Naveen Sundar, D. Narmadha and A.P. Haran
Abstract: The World Wide Web gives every individual access to a vast amount of information, but discovering the significant pieces of that information becomes progressively more difficult. Web mining tries to tackle this problem by applying data mining techniques to web data and documents in order to extract knowledge from web sources. The data available on the web is so heterogeneous and so large that extracting the parts pertinent to a particular problem becomes a crucial task. This study focuses on detecting and extracting templates from heterogeneous web pages by means of an algorithm. Locality sensitive hashing finds the similarity between the input web documents and performs well, in terms of execution time, compared with the Minimum Description Length (MDL) principle and the hash cluster process.
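The abstract does not specify which locality sensitive hashing scheme the study uses. As an illustration only, the following minimal sketch assumes MinHash signatures with banding, a common LSH family for Jaccard similarity between documents. All function names, the shingle size, and the numbers of hash functions and bands are hypothetical choices and are not taken from the paper.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-token shingles of a whitespace-tokenized document."""
    tokens = text.split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed (assumed helper)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=64):
    """MinHash signature: the minimum hash value of the set under each seed."""
    return [min(_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def lsh_buckets(signatures, bands=16):
    """Group document ids whose signatures agree on at least one band."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = {}
    for doc_id, sig in signatures.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, set()).add(doc_id)
    # Only buckets with more than one document yield candidate pairs.
    return [ids for ids in buckets.values() if len(ids) > 1]
```

In this sketch, pages that land in a shared bucket would be treated as candidates for having the same template. Only those candidate pairs need a full comparison, which is where an LSH approach would be expected to save execution time.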
G. Naveen Sundar, D. Narmadha and A.P. Haran, 2014. Performance Intensification for Automatic Template Using World Wide Web. Research Journal of Applied Sciences, 9: 288-294.