Uppsala University Publications (uu.se)
Non-parametric anomaly detection in sentiment time series data
Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences.
Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences.
2015 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
Abstract [en]

The importance of finding extreme events or unexpected patterns has increased over the last two decades, mainly due to rapid advancements in technology. These events or patterns are referred to as anomalies. This thesis focuses on detecting anomalies in the form of sudden peaks in time series generated from online text analysis in Gavagai’s live environment. To our knowledge, only a limited number of sequential peak detection models are applicable in this domain. We introduce a novel technique using the Local Outlier Factor model, as well as a model built on simple linear regression with a Bayesian error function, both operating in real time. We also study a model based on linear Poisson regression. Gavagai requires that the models be easy to set up for different targets, which in turn requires them to be non-parametric. The Local Outlier Factor model and the simple linear regression model show promising results compared with Gavagai’s current working model. All models were tested on three datasets representing three different sentiment targets: positivity, negativity and frequency. Not only do our models detect the anomalies more successfully, they also do so with fixed parameters that are independent of the target. This means that our models have a lower error rate even though they are non-parametric by construction, whereas Gavagai’s current model requires tuning per target of interest to operate with sufficient accuracy.
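For readers unfamiliar with the Local Outlier Factor (LOF) mentioned in the abstract, the following is a minimal sketch of the standard LOF score applied to one-dimensional values. It is not the thesis's model: the function name `lof_scores`, the neighbourhood size `k=3`, the example series, and the simplification of using exactly k neighbours (ignoring distance ties) are all assumptions made for illustration.

```python
def lof_scores(values, k=3):
    """Return a Local Outlier Factor score for every point in a 1-D list.

    Illustrative sketch only: uses exactly k nearest neighbours per point
    (ties broken by index) rather than the full tie-inclusive neighbourhood.
    """
    n = len(values)
    if n <= k:
        raise ValueError("need more than k points")

    neighbours = []  # indices of the k nearest neighbours of each point
    k_dist = []      # distance from each point to its k-th nearest neighbour
    for i in range(n):
        others = sorted(
            (abs(values[i] - values[j]), j) for j in range(n) if j != i
        )
        nearest = others[:k]
        neighbours.append([j for _, j in nearest])
        k_dist.append(nearest[-1][0])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(p, o) = max(k_dist(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = sum(
            max(k_dist[j], abs(values[i] - values[j])) for j in neighbours[i]
        )
        # Guard against division by zero when neighbours are exact duplicates.
        lrd.append(k / max(reach, 1e-12))

    # LOF: mean density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical frequency counts with one sudden peak at index 7.
    series = [10, 11, 9, 10, 12, 11, 10, 48, 11, 9]
    for index, score in enumerate(lof_scores(series, k=3)):
        print(index, series[index], round(score, 2))
```

A score close to 1 means a point lies in a region about as dense as its neighbours' regions, while a score well above 1 marks a point that is isolated relative to them, such as the peak in the hypothetical series. A real-time variant would presumably score each new observation against a sliding window of recent values, but how the thesis does this is not stated in the abstract.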

Place, publisher, year, edition, pages
2015.
Series
UPTEC F, ISSN 1401-5757 ; 15015
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-251645; OAI: oai:DiVA.org:uu-251645; DiVA: diva2:807192
Educational program
Master Programme in Engineering Physics
Supervisors
Examiners
Available from: 2015-05-04. Created: 2015-04-23. Last updated: 2015-05-04. Bibliographically approved

Open Access in DiVA

fulltext (12910 kB), 428 downloads
File information
File name: FULLTEXT01.pdf; File size: 12910 kB; Checksum: SHA-512
c8952031cbb82a82d7ad14b6478202870fd91b818673c3b95bb9d1dfbb00ad86fcd86a04fa3a9930e610fa623fe22437fcdcf92dcf72275288eab6524f182822
Type: fulltext; Mimetype: application/pdf

Total: 428 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 836 hits