Uppsala University Publications
Prediction of the Type for Web Page: A Practical Application in Classification
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Informatics and Media, Information Systems.
2013 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

As more and more data are generated in daily life, traditional data analysis methods reach their limits and often fail to discover unknown factors hidden deep inside the data, which has driven the adoption of data mining. Classification means mapping data into known groups, and it is one of the primary tasks of data mining. This thesis seeks an automated solution for predicting whether the value of a web page holds up as time goes on, in other words whether the page is evergreen or not. When recommending web pages to users according to their interests, it is valuable to know which pages are evergreen. The problem is therefore a classification task. Solving it requires knowledge and techniques from machine learning and web text mining. A number of models, or classifiers, are built during the implementation based on different features and optimizations, and they are evaluated by cross-validation. The best solution in this thesis is an ensemble of several simple models, which achieves the highest prediction accuracy. Limitations of the solution are also presented, and future improvements are suggested.
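The evaluation idea in the abstract (several simple classifiers combined into an ensemble and scored by cross-validation) can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy data, the one-feature threshold rules, the majority vote, and every function name below are assumptions made only for illustration.

```python
from collections import Counter
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_stump(X, y, j):
    """Fit a one-feature rule: threshold halfway between the class means."""
    pos = [x[j] for x, label in zip(X, y) if label == 1]
    neg = [x[j] for x, label in zip(X, y) if label == 0]
    if not pos or not neg:
        # Only one class in the training data: always predict it.
        only = 1 if pos else 0
        return lambda x: only
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    direction = 1 if mean_pos >= mean_neg else -1
    return lambda x: 1 if direction * (x[j] - threshold) > 0 else 0


def fit_ensemble(X, y):
    """Fit one stump per feature and combine them by majority vote."""
    stumps = [fit_stump(X, y, j) for j in range(len(X[0]))]
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]


def cross_validate(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Toy data (hypothetical): label 1 stands for "evergreen", 0 for "not".
# Features 0 and 1 separate the classes; feature 2 is wrong on some rows.
X, y = [], []
for i in range(20):
    label = i % 2
    sign = 1 if label == 1 else -1
    noisy = -sign if i % 4 == 0 else sign
    X.append([sign * 3.0, sign * (3.0 + (i // 2) % 2), noisy * 3.0])
    y.append(label)

ensemble_accuracy = cross_validate(fit_ensemble, X, y)
noisy_accuracy = cross_validate(lambda A, b: fit_stump(A, b, 2), X, y)
print("ensemble:", ensemble_accuracy, "noisy stump alone:", noisy_accuracy)
```

On this toy data the vote lets the two reliable rules outvote the unreliable one, which is the general motivation for combining simple models; real web page features would be far noisier.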

Place, publisher, year, edition, pages
2013. , 40 p.
National Category
Social Sciences Information Systems, Social aspects
URN: urn:nbn:se:uu:diva-212963
OAI: oai:DiVA.org:uu-212963
DiVA: diva2:679885
Subject / course
Information Systems
Educational program
Master Programme in Social Sciences
2013-12-13, A311, Ekonomikum (plan 3), Kyrkogårdsg. 10, Uppsala, 16:00 (English)
Available from: 2013-12-17 Created: 2013-12-17 Last updated: 2013-12-17 Bibliographically approved

Open Access in DiVA

fulltext (908 kB), 608 downloads
File information
File name: FULLTEXT01.pdf
File size: 908 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf
