Dependence-aware Slice Execution to Boost MLP in Slice-out-of-order Cores
Norwegian Univ Sci & Technol NTNU, Sem Saelands Vei 9, N-7034 Trondheim, Norway.
Ericsson Res, Mobilvagen 12, S-22362 Lund, Sweden.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. ORCID iD: 0000-0001-5375-4058
2022 (English). In: ACM Transactions on Architecture and Code Optimization (TACO), ISSN 1544-3566, E-ISSN 1544-3973, Vol. 19, no 2, article id 25. Article in journal (Refereed), Published
Abstract [en]

Exploiting memory-level parallelism (MLP) is crucial to hide long memory and last-level cache access latencies. While out-of-order (OoO) cores, and techniques building on them, are effective at exploiting MLP, they deliver poor energy efficiency due to their complex and energy-hungry hardware. This work revisits slice-out-of-order (sOoO) cores as an energy-efficient alternative for MLP exploitation. sOoO cores achieve energy efficiency by constructing and executing slices of MLP-generating instructions out-of-order only with respect to the rest of instructions; the slices and the remaining instructions, by themselves, execute in-order. However, we observe that existing sOoO cores miss significant MLP opportunities due to their dependence-oblivious in-order slice execution, which causes dependent slices to frequently block MLP generation. To boost MLP generation, we introduce Freeway, a sOoO core based on a new dependence-aware slice execution policy that tracks dependent slices and keeps them from blocking subsequent independent slices and MLP extraction. The proposed core incurs minimal area and power overheads, yet approaches the MLP benefits of fully OoO cores. Our evaluation shows that Freeway delivers 12% better performance than the state-of-the-art sOoO core and is within 7% of the MLP limits of full OoO execution.
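
The blocking effect described in the abstract can be illustrated with a toy model. This is a minimal sketch, not Freeway's actual microarchitecture. It assumes that every slice is a single long-latency load, that at most one slice issues per cycle, and that every load takes a fixed 100 cycles. The names `total_cycles`, `producers`, and `parked` are hypothetical. With `dependence_aware=False`, a slice whose producer load has not yet returned stalls everything behind it. With `dependence_aware=True`, the blocked slice is moved to a side queue so that later independent slices can still issue.

```python
from collections import deque

MISS_LATENCY = 100  # assumed fixed latency (cycles) of every long-latency load


def total_cycles(producers, dependence_aware, latency=MISS_LATENCY):
    """Toy model: cycles until every load slice has returned.

    producers[i] is the index of the earlier slice whose load result slice i
    needs, or None if slice i is independent. At most one slice issues per cycle.
    """
    done = {}                            # slice index -> cycle its load returns
    main = deque(range(len(producers)))  # in-order slice queue
    parked = deque()                     # blocked dependent slices (aware mode only)
    cycle = 0

    def ready(i):
        p = producers[i]
        return p is None or (p in done and done[p] <= cycle)

    while main or parked:
        if parked and ready(parked[0]):
            # A parked slice's producer has returned: issue it first.
            done[parked.popleft()] = cycle + latency
        elif main and ready(main[0]):
            done[main.popleft()] = cycle + latency
        elif main and dependence_aware:
            # Head is blocked: step it aside so younger slices can proceed.
            parked.append(main.popleft())
        # Otherwise (in-order mode with a blocked head) the queue simply stalls.
        cycle += 1
    return max(done.values(), default=0)


# Three producer/consumer pairs of loads, in program order.
pairs = [None, 0, None, 2, None, 4]
print(total_cycles(pairs, dependence_aware=False))  # 402
print(total_cycles(pairs, dependence_aware=True))   # 204
```

In this model the in-order policy serialises the three independent producer loads behind their consumers, while the dependence-aware policy overlaps them. The cycle counts are properties of the toy model only and say nothing about the paper's measured results.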

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022. Vol. 19, no 2, article id 25
Keywords [en]
Microarchitecture, memory level parallelism, instruction scheduling
National Category
Computer Engineering; Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-473200
DOI: 10.1145/3506704
ISI: 000775454600010
OAI: oai:DiVA.org:uu-473200
DiVA, id: diva2:1654253
Funder
Knut and Alice Wallenberg Foundation
EU, Horizon 2020, 715283

Available from: 2022-04-26 Created: 2022-04-26 Last updated: 2024-01-15 Bibliographically approved

Authority records

Black-Schaffer, David