SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
National University of Singapore, Singapore.
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation. ORCID iD: 0000-0002-9460-1290
2018 (English). In: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, Association for Computing Machinery (ACM), 2018, pp. 328-343. Conference paper, Published paper (Refereed)
Abstract [en]

Increasing demands for energy efficiency constrain emerging hardware. These new hardware trends challenge the established assumptions in code generation and force us to rethink existing software optimization techniques. We propose a cross-layer redesign of the way compilers and the underlying microarchitecture are built and interact, to achieve both performance and high energy efficiency.

In this paper, we address one of the main performance bottlenecks, last-level cache misses, through a software-hardware co-design. Our approach is able to hide memory latency and attain increased memory and instruction level parallelism by orchestrating a non-speculative, execute-ahead paradigm in software (SWOOP). While out-of-order (OoO) architectures attempt to hide memory latency by dynamically reordering instructions, they do so through expensive, power-hungry, speculative mechanisms. We aim to shift this complexity into software, and we build upon compilation techniques inherited from VLIW, software pipelining, modulo scheduling, decoupled access-execution, and software prefetching. In contrast to previous approaches, we do not rely on either software or hardware speculation that can be detrimental to efficiency. Our SWOOP compiler is enhanced with lightweight architectural support, thus being able to transform applications that include highly complex control-flow and indirect memory accesses.
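
The following is a minimal sketch of the general idea behind two of the techniques the abstract names, decoupled access-execution and software prefetching, applied to a loop with indirect memory accesses. It is not the SWOOP compiler's output and does not model the architectural support the paper describes. The function names, the `CHUNK` size, and the summation kernel are hypothetical, and `__builtin_prefetch` is assumed to be available (GCC or Clang).

```c
#include <stddef.h>

/* Hypothetical chunk size: number of iterations covered by one access phase. */
enum { CHUNK = 64 };

/* Baseline: each data[idx[i]] may miss in the last-level cache, and an
   in-order core stalls when the loaded value is first used. */
long sum_indirect_baseline(const long *data, const size_t *idx, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[idx[i]];
    return sum;
}

/* Decoupled sketch: an access phase issues prefetches for a chunk of
   iterations, then an execute phase performs the original computation. */
long sum_indirect_decoupled(const long *data, const size_t *idx, size_t n)
{
    long sum = 0;
    for (size_t base = 0; base < n; base += CHUNK) {
        size_t end = (n - base < CHUNK) ? n : base + CHUNK;

        /* Access phase: address computation and prefetch hints only.
           The addresses are the real ones, so nothing is guessed. */
        for (size_t i = base; i < end; i++)
            __builtin_prefetch(&data[idx[i]], 0 /* read */, 3 /* high locality */);

        /* Execute phase: identical work to the baseline loop. */
        for (size_t i = base; i < end; i++)
            sum += data[idx[i]];
    }
    return sum;
}
```

Because prefetches are only hints, both functions return the same result; the decoupled version simply gives the memory system a chance to overlap the misses of one chunk before the values are consumed.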

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2018. pp. 328-343
Identifiers
URN: urn:nbn:se:uu:diva-361359
DOI: 10.1145/3192366.3192393
ISI: 000452469600023
ISBN: 978-1-4503-5698-5 (digital)
OAI: oai:DiVA.org:uu-361359
DiVA, id: diva2:1250305
Conference
PLDI 2018, the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 18-22 2018, Philadelphia, USA
Projects
UPMARC
Funder
Swedish Research Council, 2016-05086
Available from: 2018-09-23 Created: 2018-09-23 Last updated: 2020-01-17. Bibliographically approved
In thesis
1. Finding and Exploiting Memory-Level-Parallelism in Constrained Speculative Architectures
2020 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

One of the main performance bottlenecks of processors today is the discrepancy between processor and memory speed, known as the memory wall. While the processor executes instructions at a high pace, the memory is too slow to provide data in a timely manner. Load instructions that require an access to memory are referred to as long-latency or delinquent loads. To prevent the processor from stalling, independent instructions past the load may execute, including independent loads. Overlapping load operations, and thus their latency, is referred to as memory-level parallelism. Memory-level parallelism (MLP) can significantly improve performance. Today's out-of-order processors are therefore equipped with complex hardware that allows them to look into the future and to select independent loads that can be overlapped. However, the ability to choose future instructions and speculatively execute them in advance introduces complexity, increased power consumption and potential security risks. In this thesis we look at constrained speculative architectures that struggle to hide memory latencies as they are constrained by design, by their resources, or by security. We investigate ways for the compiler to help them in finding MLP, with the ultimate goal of avoiding processor stalls as much as possible. This includes small energy-efficient processors that lack the ability to look ahead far enough to find independent loads, but also large processors that are not allowed to speculatively execute independent loads because of security measures enforced against side-channel attacks. We identify the reasons for their limitations and propose software transformations and hardware extensions to overcome their restrictions.
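
As a rough back-of-the-envelope model, not taken from the thesis, the benefit of overlapping loads can be written down directly. Assume $n$ independent long-latency loads, each with miss latency $L$ cycles, on a core that can keep at most $k$ misses in flight; all three symbols are illustrative, and every other source of delay is ignored.

```latex
T_{\text{serial}} \approx n \cdot L,
\qquad
T_{\text{overlapped}} \approx \left\lceil \frac{n}{k} \right\rceil \cdot L,
\qquad
\text{speedup} \approx \frac{n}{\lceil n/k \rceil} \le k
```

For instance, with the hypothetical values $n = 8$, $L = 200$ and $k = 4$, the serial estimate is $1600$ cycles and the overlapped estimate is $400$ cycles.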

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2020. p. 50
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1897
Keywords
Memory-level-parallelism, Energy-efficiency, Performance, Compiler, Instruction Scheduling, SW/HW Co-Design
Research subject
Computer Science
Identifiers
urn:nbn:se:uu:diva-402642 (URN)
978-91-513-0860-9 (ISBN)
Available from: 2020-02-19 Created: 2020-01-17 Last updated: 2020-03-13

Open Access in DiVA

fulltext (665 kB), 207 downloads
File information
File: FULLTEXT01.pdf. File size: 665 kB. Checksum (SHA-512):
d296fb7543f68078c92c6a7bc5ba2028bf8aea9c5c18c75bca3bf2c57e0beed145a4d61034bf557a7eb114670c651bc63bc79fb1baf8df234f1bc3eca4b2e0d8
Type: fulltext. Mimetype: application/pdf

Authors

Tran, Kim-Anh; Jimborean, Alexandra; Koukos, Konstantinos; Kaxiras, Stefanos