uu.se Uppsala University Publications
Classification of Hate Tweets and Their Reasons using SVM
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computing Science.
2016 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
Abstract [sv]

This study focuses on classifying hate messages directed at the mobile operators Verizon, AT&T and Sprint. The main aim is to use the machine learning algorithm Support Vector Machines (SVM) to classify messages into four categories (Hate, Reason, Explicit and Other) in order to identify a hate message and its reason.

The study resulted in two methods: a "naive" method (the Naive Method, NM) and a more "advanced" method (the Partial Timeline Method, PTM). NM is a binary method in the sense that it asks the question: "Does this tweet belong to the class Hate?". PTM asks the same question, but of a limited set of tweets, i.e. only those posted within ± 30 min of the publication of the hate tweet.

In summary, the results of the study indicated that PTM is more accurate than NM. However, PTM does not take into account all tweets on the user's timeline. The choice of method therefore involves a trade-off: PTM offers a more accurate classification and NM offers a more comprehensive classification.

Abstract [en]

This study focused on finding hate tweets posted by customers of three mobile operators (Verizon, AT&T and Sprint) and identifying the reasons for their dissatisfaction. Timelines containing a hate tweet were collected and examined for the presence of an explanation.

A machine learning approach was employed using four categories: Hate, Reason, Explanatory and Other. The classification was conducted with a one-versus-all approach using the Support Vector Machines algorithm as implemented in the LIBSVM tool.
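
The one-versus-all scheme mentioned above can be illustrated with a minimal Python sketch. It is not the thesis's LIBSVM setup: the linear decision function, the word weights and the biases below are hypothetical, hand-set values chosen only to show how four binary "this class versus the rest" models are combined into one prediction.

```python
CLASSES = ["Hate", "Reason", "Explanatory", "Other"]

def one_vs_all_targets(labels, positive_class):
    """Relabel a multi-class problem as binary: +1 for positive_class, -1 for the rest."""
    return [1 if label == positive_class else -1 for label in labels]

def decision_value(weights, bias, features):
    """Linear decision function w . x + b for a sparse word-count dict."""
    return sum(weights.get(word, 0.0) * count for word, count in features.items()) + bias

def predict(models, features):
    """Pick the class whose binary model gives the largest decision value."""
    return max(models, key=lambda name: decision_value(*models[name], features))

# Hypothetical (weights, bias) pairs, one binary model per class.
# In practice each pair would come from training on the relabelled targets.
MODELS = {
    "Hate": ({"hate": 2.0, "worst": 1.5}, -1.0),
    "Reason": ({"dropped": 1.5, "bill": 1.5, "signal": 1.0}, -1.0),
    "Explanatory": ({"because": 2.0, "since": 1.5}, -1.0),
    "Other": ({}, -0.5),
}

# Training targets for the "Hate versus rest" model on four invented labels.
hate_targets = one_vs_all_targets(["Hate", "Other", "Reason", "Hate"], "Hate")

# Classify three invented bag-of-words vectors.
predictions = [
    predict(MODELS, {"hate": 1, "verizon": 1}),
    predict(MODELS, {"dropped": 1, "signal": 1}),
    predict(MODELS, {"lunch": 1}),
]
```

Each binary model only has to answer one yes/no question, and the final label is the class whose model is most confident.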

The study resulted in two methodologies: the Naive Method (NM) and the Partial Timeline Method (PTM). NM relied only on a feature space consisting of the most representative words, chosen with the Akaike Information Criterion. PTM exploited the fact that the majority of the explanations were posted within a one-hour time window around the posting of a hate tweet.
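
The abstract does not say how the Akaike Information Criterion was applied to choose words, so the following Python sketch shows only one common way to do it, under stated assumptions. For each word it compares a one-parameter model (the word is equally likely in every class) with a two-parameter model (the word has its own rate inside the target class), using AIC = 2k - 2 ln L. The function names and the counts are invented for illustration.

```python
import math

def bernoulli_log_likelihood(hits, total):
    """Maximised log-likelihood of `hits` successes in `total` Bernoulli trials."""
    if hits == 0 or hits == total:
        return 0.0  # p is estimated as 0 or 1, so every observation has probability 1
    p = hits / total
    return hits * math.log(p) + (total - hits) * math.log(1 - p)

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def aic_gain(hits_in_class, n_in_class, hits_elsewhere, n_elsewhere):
    """AIC of the class-independent model minus AIC of the class-dependent one.

    A positive value means the word's frequency is better explained by
    letting it depend on the class, i.e. the word is informative.
    """
    pooled = bernoulli_log_likelihood(
        hits_in_class + hits_elsewhere, n_in_class + n_elsewhere
    )
    split = bernoulli_log_likelihood(hits_in_class, n_in_class) + \
        bernoulli_log_likelihood(hits_elsewhere, n_elsewhere)
    return aic(pooled, 1) - aic(split, 2)

# Invented counts: (tweets containing the word in class Hate, in all other classes),
# with 10 tweets in each group.
word_counts = {"hate": (8, 1), "the": (5, 5)}
selected = [
    word for word, (in_hate, elsewhere) in word_counts.items()
    if aic_gain(in_hate, 10, elsewhere, 10) > 0
]
```

A word that is spread evenly across classes is penalised for the extra parameter and dropped, while a word concentrated in one class is kept.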

We found that the accuracy of PTM is higher than that of NM. In addition, PTM saves time and memory by analysing fewer tweets. At the same time, restricting the analysis in this way implies a trade-off between relevance and completeness.
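
The time-window restriction behind PTM can be sketched in a few lines of Python. This is an illustrative sketch, not the thesis implementation: the function name, the inclusive ± 30 minute boundary and the example tweets are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def partial_timeline(timeline, hate_time, half_window=timedelta(minutes=30)):
    """Return the (timestamp, text) pairs posted within half_window of hate_time."""
    return [(ts, text) for ts, text in timeline if abs(ts - hate_time) <= half_window]

# Invented timeline for one user; the third entry plays the role of the hate tweet.
hate_time = datetime(2015, 10, 1, 12, 0)
timeline = [
    (datetime(2015, 10, 1, 9, 15), "good morning"),
    (datetime(2015, 10, 1, 11, 40), "third dropped call today"),
    (hate_time, "i hate my carrier"),
    (datetime(2015, 10, 1, 12, 30), "no signal downtown again"),
    (datetime(2015, 10, 1, 14, 5), "lunch was great"),
]

# Only the tweets inside the one-hour window are passed on to the classifier.
window = partial_timeline(timeline, hate_time)
window_texts = [text for _, text in window]
```

Only the tweets in `window` need to be classified, which is where the saving in time and memory comes from. Any explanation posted outside the window is never seen, which is the completeness side of the trade-off.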

Place, publisher, year, edition, pages
2016. 37 p.
Series
UPTEC F, ISSN 1401-5757 ; 16001
Keyword [en]
Support Vector Machines, classification, Akaike Information Criteria, machine learning, Twitter, hate tweets
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-275782; OAI: oai:DiVA.org:uu-275782; DiVA: diva2:901098
Educational program
Master Programme in Engineering Physics
Presentation
2016-01-18, Å4006, Lägerhydsvägen 1, 752 37, Uppsala, 13:15 (English)
Supervisors
Examiners
Note

Opponent: Kristina Wettainen

Available from: 2016-02-10. Created: 2016-02-06. Last updated: 2016-02-10. Bibliographically approved.

Open Access in DiVA

fulltext (623 kB), 302 downloads
File information
File name: FULLTEXT02.pdf; File size: 623 kB; Checksum: SHA-512
00a2e2a7cac97d9793646c58bb18b0cfd7d57013b0db0250c513c211f19eac3508bdf6691653692b36afccef000081d074980559221c9a7ca55751321e1d68d9
Type: fulltext; Mimetype: application/pdf

Total: 302 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
Total: 1668 hits