Abstract: Web page classification is one of the main and most challenging subjects in the field of data mining. It helps users obtain useful information from massive data sets on the Internet automatically and efficiently. Researchers have made many efforts on web page classification; however, there is still room to improve current approaches. One of the main challenges in the educational categories is that the available data set is unbalanced: the number of pages in one subject differs from that in another, and the distribution across subjects is not uniform. Standard machine learning algorithms are dominated by the large (majority) classes and tend to ignore the smaller (minority) classes, so classification accuracy is reduced. To address this problem, this research proposes a new approach to web page classification based on an ensemble (collective grouping) of support vector machines. Principal component analysis and independent component analysis are used for feature reduction and feature selection, respectively. The results show that the proposed method performs better than other methods widely used for web page classification.
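The imbalance problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the class names (`course`, `faculty`), the 90/10 split, and the helper functions are hypothetical, and the inverse-frequency weighting shown is only one common heuristic for making a classifier pay more attention to small classes.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical toy labels: 90 pages in a large class, 10 in a small one.
labels = ["course"] * 90 + ["faculty"] * 10

# A classifier that always predicts the majority class.
majority = Counter(labels).most_common(1)[0][0]
predictions = [majority] * len(labels)

accuracy = sum(t == p for t, p in zip(labels, predictions)) / len(labels)
recall = per_class_recall(labels, predictions)
weights = balanced_class_weights(labels)

print("accuracy:", accuracy)
print("per-class recall:", recall)
print("balanced weights:", weights)
```

Overall accuracy looks high even though the small class is never recognized, which is why accuracy alone can hide the effect the abstract describes. The weights give each minority example a proportionally larger influence, so a weighted learner would no longer be rewarded for ignoring the small class.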
Reyhaneh Khademi and Mahdi Afzali, 2016. Features Selection from Data in Order to Improve Classification Methods Performance. Journal of Engineering and Applied Sciences, 11: 1859-1865.