Fix the code. Don't tweak the hardware: A new compiler approach to Voltage–Frequency scaling
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
2014 (English). In: Proc. 12th International Symposium on Code Generation and Optimization, New York: ACM Press, 2014, pp. 262-272. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
New York: ACM Press, 2014. 262-272 p.
National Category
Computer Science
Identifiers
URN: urn:nbn:se:uu:diva-212778
ISBN: 978-1-4503-2670-4 (print)
OAI: oai:DiVA.org:uu-212778
DiVA: diva2:679211
Conference
CGO 2014, February 15-19, Orlando, FL
Projects
UPMARC
Available from: 2014-02-19 Created: 2013-12-13 Last updated: 2016-09-02 Bibliographically approved
In thesis
1. Efficient Execution Paradigms for Parallel Heterogeneous Architectures
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis proposes novel, efficient execution paradigms for parallel heterogeneous architectures. The end of Dennard scaling threatens the effectiveness of dynamic voltage and frequency scaling (DVFS) in future nodes; new execution paradigms are therefore required to exploit the non-linear relationship between performance and energy efficiency in memory-bound application regions. To attack this problem, we propose the decoupled access-execute (DAE) paradigm. DAE transforms regions of interest (at program level) into two coarse-grained phases, the access phase and the execute phase, whose voltage and frequency we can scale independently. The access phase is intended to prefetch data into the cache and is therefore expected to be predominantly memory-bound. The execute phase runs immediately after the access phase has warmed up the cache and is therefore expected to be compute-bound.
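The two phases described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the thesis's generated code: `set_frequency`, `access_phase`, `execute_phase`, `dae_region`, the chunk size, and the indirect-sum workload are all hypothetical names and choices made for this example, and the frequency hook is a stub that only records the requested level. `__builtin_prefetch` is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook: a real system would request a new voltage-frequency
   point here. This stub only records the last requested level. */
enum freq_level { FREQ_LOW, FREQ_HIGH };
static enum freq_level current_level = FREQ_HIGH;
static void set_frequency(enum freq_level level) { current_level = level; }

/* Access phase: prefetch the data the execute phase will read.
   Expected to be memory-bound, so the sketch runs it at low frequency. */
static void access_phase(const double *a, const size_t *idx, size_t n) {
    for (size_t i = 0; i < n; i++)
        __builtin_prefetch(&a[idx[i]], 0 /* read */, 3 /* high locality */);
}

/* Execute phase: the original computation (here, a sum of squares over
   indirectly addressed elements), expected to hit in the warmed-up cache. */
static double execute_phase(const double *a, const size_t *idx, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[idx[i]] * a[idx[i]];
    return sum;
}

/* Run the region chunk by chunk (chunking is an assumption of this sketch):
   each chunk gets an access phase immediately followed by its execute phase. */
double dae_region(const double *a, const size_t *idx, size_t n, size_t chunk) {
    assert(n == 0 || (a != NULL && idx != NULL));
    if (chunk == 0)
        chunk = 1; /* avoid an infinite loop on a zero chunk size */
    double total = 0.0;
    for (size_t start = 0; start < n; start += chunk) {
        size_t len = (n - start < chunk) ? n - start : chunk;
        set_frequency(FREQ_LOW);   /* memory-bound phase */
        access_phase(a, idx + start, len);
        set_frequency(FREQ_HIGH);  /* compute-bound phase */
        total += execute_phase(a, idx + start, len);
    }
    return total;
}
```

The result is identical to running the execute loop alone; the access phase only changes where the data resides when the execute phase starts, which is what lets the two phases be assigned different frequencies.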

DAE achieves good energy savings (on average 25% lower EDP) without performance degradation, unlike other DVFS techniques. Furthermore, DAE increases the memory-level parallelism (MLP) of memory-bound regions, which improves the performance of memory-bound applications. To transform application regions to DAE automatically, we propose compiler techniques that generate the access phase(s) and incorporate them in the application. Our work targets affine, non-affine, and even complex, general-purpose codes. We also explore the benefits of software multi-versioning to optimize DAE in dynamic environments and to handle codes with statically unknown access-phase overheads. In general, applications automatically transformed to DAE by our compiler maintain (and in some cases exceed) the good performance and energy efficiency of manually optimized DAE codes.
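As a reading aid for the EDP figure above: EDP is the energy-delay product, the product of energy $E$ and execution time $T$. The worked relation below is a sketch under one assumption that is not in the abstract, namely that execution time is exactly unchanged; the abstract says only "without performance degradation" and reports an average.

```latex
\mathrm{EDP} = E \cdot T, \qquad
\frac{\mathrm{EDP}_{\mathrm{DAE}}}{\mathrm{EDP}_{\mathrm{base}}}
  = \frac{E_{\mathrm{DAE}}\, T_{\mathrm{DAE}}}{E_{\mathrm{base}}\, T_{\mathrm{base}}}
  = 0.75
\;\Longrightarrow\;
\frac{E_{\mathrm{DAE}}}{E_{\mathrm{base}}} = 0.75
\quad \text{if } T_{\mathrm{DAE}} = T_{\mathrm{base}}.
```

Under that assumption, a 25% lower EDP corresponds to a 25% reduction in energy; if DAE also shortens execution time, part of the EDP reduction comes from the delay term instead.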

Finally, to simplify the programming of heterogeneous systems (with integrated GPUs), we propose a novel system architecture that provides unified virtual memory with low overhead. The underlying insight is that existing data-parallel programming models are a good fit for relaxed memory consistency models (e.g., the heterogeneous race-free model). This allows us to simplify the coherence protocol between the CPU and the GPU, as well as the GPU memory management unit. On average, we achieve a 45% speedup and 45% lower EDP over the corresponding SC implementation.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2016. 54 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1405
Keyword
Decoupled Execution, Performance, Energy, DVFS, Compiler Optimizations, Heterogeneous Coherence
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:uu:diva-300831 (URN)
978-91-554-9654-8 (ISBN)
Public defence
2016-09-30, ITC/1111, Lägerhyddsvägen 2, Uppsala, 13:00 (English)
Funder
EU, FP7, Seventh Framework Programme, FP7-ICT-288653
Swedish Research Council
Available from: 2016-09-07 Created: 2016-08-15 Last updated: 2016-09-13

Authors
Jimborean, Alexandra; Koukos, Konstantinos; Spiliopoulos, Vasileios; Black-Schaffer, David; Kaxiras, Stefanos