uu.se · Uppsala University Publications
Micro Architecture Independent Data Locality Analysis of Multi Threaded Applications on Multi Core Processors
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. (Uppsala Architecture Research Team)
2016 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

In today's computing, a significant amount of energy is spent on data movement and data-related stalls. To understand how this energy is spent and how it can be reduced, we need efficient models of the cache hierarchy. This thesis builds on previous work to create a tool that helps developers quickly estimate the cache behavior of multi-threaded programs on multi-core architectures.

The tool consists of a profiler that sparsely samples a program's memory references and thread interactions, and an analyzer that uses the collected data to estimate cache miss ratios. The tool can model both shared and separate caches. For shared caches, a new way of modeling constructive coherence is introduced.
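As a rough illustration of the profiler/analyzer split described above, the sketch below samples a memory trace sparsely and estimates a miss ratio from the sampled accesses. It is not the thesis's method: the abstract does not specify the analytical model, so the sketch assumes a single fully associative LRU cache and uses exact stack distances. The function names, the `sample_rate` parameter, and the trace representation (a list of cache-line identifiers) are all hypothetical, and thread interactions are not modeled.

```python
import random
from typing import Optional, Sequence


def stack_distance(trace: Sequence[int], i: int) -> Optional[int]:
    """Number of distinct cache lines touched between access i and the
    previous access to the same line. Returns None for a first access."""
    line = trace[i]
    seen = set()
    for j in range(i - 1, -1, -1):
        if trace[j] == line:
            return len(seen)
        seen.add(trace[j])
    return None


def estimate_miss_ratio(trace: Sequence[int], cache_lines: int,
                        sample_rate: float = 0.1, seed: int = 0) -> float:
    """Estimate the miss ratio of a fully associative LRU cache
    (an assumption of this sketch) from a sparse sample of accesses."""
    rng = random.Random(seed)
    # "Profiler" step: pick a sparse random subset of accesses.
    sampled = [i for i in range(len(trace)) if rng.random() < sample_rate]
    if not sampled:
        raise ValueError("no accesses were sampled; raise sample_rate")
    # "Analyzer" step: a sampled access misses if it is a first access
    # or if its stack distance does not fit in the cache.
    misses = 0
    for i in sampled:
        d = stack_distance(trace, i)
        if d is None or d >= cache_lines:
            misses += 1
    return misses / len(sampled)


if __name__ == "__main__":
    # Hypothetical trace: a loop over four cache lines, repeated 100 times.
    trace = [0, 1, 2, 3] * 100
    print(estimate_miss_ratio(trace, cache_lines=4, sample_rate=1.0))
    print(estimate_miss_ratio(trace, cache_lines=3, sample_rate=1.0))
```

With every access sampled, the four-line cache sees only the four cold misses, while the three-line cache misses on every access because the loop touches one line more than fits. Lower sample rates trade accuracy for profiling overhead.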

Micro-benchmarks and benchmarks from the SPLASH-2 and PARSEC benchmark suites are used to validate the functionality of the tool. The tool estimates the miss ratios accurately in most configurations, but is less accurate for small cache sizes. This is consistent with single-threaded studies of the analytical modeling technique on which the tool is based. The new constructive coherence modeling shows a significant improvement for shared caches, where it detects when threads with a shared data set help keep data inside the cache.

The performance of the profiler scales well with both the number of threads and the size of the benchmark, while the analyzer faces scaling difficulties as the number of threads grows.

Place, publisher, year, edition, pages
2016. 61 p.
Series
UPTEC F, ISSN 1401-5757 ; 16051
Keyword [en]
Cache modeling, Data locality, Multi-threaded, Multi-core
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:uu:diva-311742
OAI: oai:DiVA.org:uu-311742
DiVA: diva2:1061251
Educational program
Master Programme in Engineering Physics
Supervisors
Examiners
Available from: 2017-01-03. Created: 2017-01-01. Last updated: 2017-01-03. Bibliographically approved.

