Uppsala University Publications (uu.se)
Clairvoyance: Look-ahead compile-time scheduling
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART) ORCID iD: 0000-0002-9460-1290
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART) ORCID iD: 0000-0003-4232-6976
2017 (English). In: Proc. 15th International Symposium on Code Generation and Optimization, Piscataway, NJ: IEEE Press, 2017, p. 171-184. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Piscataway, NJ: IEEE Press, 2017. p. 171-184
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-316480
ISI: 000402548700015
ISBN: 978-1-5090-4931-8 (print)
OAI: oai:DiVA.org:uu-316480
DiVA, id: diva2:1077941
Conference
CGO 2017, February 4–8, Austin, TX
Projects
UPMARC
Funder
Swedish Research Council, 2010-4741
Available from: 2017-02-04 Created: 2017-03-01 Last updated: 2020-01-17 Bibliographically approved
In thesis
1. Static instruction scheduling for high performance on energy-efficient processors
2018 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

New trends such as the internet of things and smart homes increase the demand for energy efficiency. Choosing energy-efficient hardware, however, often comes at the cost of high performance. To strike a good balance between the two, we propose software solutions that tackle the performance bottlenecks of small, energy-efficient processors.

One of the main performance bottlenecks of processors is the discrepancy between processor and memory speed, known as the memory wall. While the processor executes instructions at a high pace, the memory is too slow to provide data in a timely manner if the data has not been cached in advance. Load instructions that require an access to memory are referred to as long-latency or delinquent loads. The long latencies caused by delinquent loads put a strain on small processors, which have few or no resources to hide them effectively. As a result, the processor may stall.

In this thesis we propose compile-time transformation techniques to mitigate the penalties of delinquent loads on small out-of-order processors, with the ultimate goal of avoiding processor stalls as much as possible. Our code transformation is applicable to general-purpose code, including code with unknown memory dependencies, complex control flow and pointers. We further propose a software-hardware co-design that combines the code transformation technique with lightweight hardware support to hide latencies on a stall-on-use in-order processor.

Place, publisher, year, edition, pages
Uppsala University, 2018
Series
Information technology licentiate theses: Licentiate theses from the Department of Information Technology, ISSN 1404-5117 ; 2018-001
National Category
Computer Engineering
Research subject
Computer Science
Identifiers
urn:nbn:se:uu:diva-349420 (URN)
Supervisors
Projects
UPMARC
Available from: 2017-12-18 Created: 2018-04-26 Last updated: 2019-02-25 Bibliographically approved
2. Finding and Exploiting Memory-Level-Parallelism in Constrained Speculative Architectures
2020 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

One of the main performance bottlenecks of processors today is the discrepancy between processor and memory speed, known as the memory wall. While the processor executes instructions at a high pace, the memory is too slow to provide data in a timely manner. Load instructions that require an access to memory are referred to as long-latency or delinquent loads. To prevent the processor from stalling, independent instructions past the load may execute, including independent loads. Overlapping load operations, and thus their latencies, is referred to as memory-level parallelism (MLP). MLP can significantly improve performance. Today's out-of-order processors are therefore equipped with complex hardware that allows them to look into the future and to select independent loads that can be overlapped. However, the ability to choose future instructions and speculatively execute them in advance introduces complexity, increased power consumption and potential security risks.

In this thesis we look at constrained speculative architectures that struggle to hide memory latencies because they are constrained by design, by their resources, or by security. We investigate ways for the compiler to help them find MLP, with the ultimate goal of avoiding processor stalls as much as possible. This includes small energy-efficient processors that cannot look ahead far enough to find independent loads, but also large processors that are not allowed to speculatively execute independent loads because of security measures enforced to thwart side-channel attacks. We identify the reasons for their limitations and propose software transformations and hardware extensions to overcome them.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2020. p. 50
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1897
Keywords
Memory-level-parallelism, Energy-efficiency, Performance, Compiler, Instruction Scheduling, SW/HW Co-Design
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:uu:diva-402642 (URN)
978-91-513-0860-9 (ISBN)
Opponent
Supervisors
Available from: 2020-02-19 Created: 2020-01-17 Last updated: 2020-03-13

Open Access in DiVA

fulltext (938 kB), 667 downloads
File information
File name: FULLTEXT01.pdf
File size: 938 kB
Checksum SHA-512:
2852e8fdcbc13e029a3f99298a1644ee7068e81638a1f26dc5dbac0ecd446623b480bcc58b7604ccc95c069ea9577d23833f651b4d21844d48496afc0ba934b3
Type: fulltext
Mimetype: application/pdf

Authority records BETA

Tran, Kim-Anh; Carlson, Trevor E.; Koukos, Konstantinos; Själander, Magnus; Spiliopoulos, Vasileios; Kaxiras, Stefanos; Jimborean, Alexandra

Total: 667 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
Total: 2846 hits