Uppsala University Publications (uu.se)
Event-Centric Clustering of News Articles
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2013 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Entertainity AB plans to build a news service that provides news to end users in an innovative way. The service must be able to automatically group series of news articles from different sources and publications, based on the stories they cover.

This thesis includes three contributions: a survey of known clustering methods, an evaluation of human versus human results when grouping news articles in an event-centric manner, and finally an evaluation of an incremental clustering algorithm to see whether it is possible to consider a reduced input size and still get a sufficient result.

The conclusions are that the result of the human evaluation indicates that users differ enough that the difference needs to be taken into account when evaluating algorithms. It is also important that this difference is considered when conducting cluster analysis, to avoid overfitting. The evaluation of an incremental event-centric algorithm shows that it is desirable to adjust the similarity threshold depending on the result one wants. When running tests with different input sizes, the results imply that a short summary of a news article is a natural feature selection when performing cluster analysis.
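To make the role of the similarity threshold concrete, the following is a minimal sketch of one common form of incremental (single-pass) clustering. It is not taken from the thesis: the bag-of-words representation, the cosine similarity measure, the summed-count centroid, and the names `term_vector`, `cosine` and `cluster_incrementally` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts (a deliberately crude, assumed representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_incrementally(articles, threshold):
    """Single-pass clustering (illustrative, not the thesis algorithm).

    Each article joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [article indices]}
    for index, text in enumerate(articles):
        vector = term_vector(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vector, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(index)
            # Summed counts serve as the centroid; cosine ignores the scale.
            best["centroid"].update(vector)
        else:
            clusters.append({"centroid": Counter(vector), "members": [index]})
    return [cluster["members"] for cluster in clusters]


articles = [
    "Earthquake strikes coastal city, hundreds evacuated",
    "Hundreds evacuated after earthquake strikes coastal city",
    "Central bank raises interest rates again",
]
print(cluster_incrementally(articles, threshold=0.5))
print(cluster_incrementally(articles, threshold=0.95))
```

In this sketch a lower threshold merges the two earthquake headlines into one event, while a stricter threshold keeps every article in its own cluster, which illustrates why the threshold has to be tuned to the result one wants. Passing a short summary instead of the full text only changes the strings in `articles`.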

Place, publisher, year, edition, pages
2013.
Series
IT, 13 072
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-209654
OAI: oai:DiVA.org:uu-209654
DiVA: diva2:658886
Educational program
Master Programme in Computer Science
Supervisors
Examiners
Available from: 2013-10-23. Created: 2013-10-23. Last updated: 2013-12-02. Bibliographically approved.

Open Access in DiVA

Full text (897 kB), 1816 downloads
File information
File name: FULLTEXT03.pdf
File size: 897 kB
Checksum (SHA-512): d1f4c104333a40133bb6637acb22ea806805c3217b157511b3bb14b35e62367651931975f9d23ffc602578bf53dbb21857968550b5c1a0ffac1b932b0cde922f
Type: fulltext
Mimetype: application/pdf

Total: 1818 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 737 hits