Publications from Uppsala University (uu.se)
Faster Functional Warming with Cache Merging
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. (Uppsala Architecture Research Group) ORCID iD: 0000-0001-7833-4412
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. ORCID iD: 0000-0002-1527-734X
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. (Uppsala Architecture Research Group) ORCID iD: 0000-0001-5375-4058
2022 (English) Report (Other academic)
Abstract [en]

SMARTS-like sampled hardware simulation techniques achieve good accuracy by simulating many small portions of an application in detail. However, while this reduces the detailed simulation time, it results in extensive cache warming times, as each of the many simulation points requires warming the whole memory hierarchy. Adaptive Cache Warming reduces this time by iteratively increasing warming until achieving sufficient accuracy. Unfortunately, each time the warming increases, the previous warming must be redone, nearly doubling the required warming.
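The "nearly doubling" claim can be checked with a short worked calculation. The abstract does not state how the warming grows between iterations, so the sketch below assumes, purely for illustration, that the warming window doubles each time (w, 2w, 4w, ...) over k increases. The symbols w, k, W_redo and W_needed are introduced here and do not come from the report.

```latex
% Assumption (not from the report): the warming window doubles per iteration.
% Redoing all earlier warming at every iteration costs
W_{\text{redo}} = \sum_{i=0}^{k} 2^{i} w = \left(2^{k+1} - 1\right) w .
% Warming only the newly added part at each iteration costs
W_{\text{needed}} = w + \sum_{i=1}^{k} \left(2^{i} - 2^{i-1}\right) w = 2^{k} w .
% The ratio approaches 2 as k grows:
\frac{W_{\text{redo}}}{W_{\text{needed}}} = 2 - 2^{-k} .
```

Under this assumed schedule, three increases (k = 3) already give a ratio of 1.875, which is consistent with the ideal 2× warming speedup that the abstract expects from removing the redundancy.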

We address re-warming by developing a technique to merge the cache states from the previous and additional warming iterations. We demonstrate our merging approach on a multi-level LRU cache hierarchy, and we evaluate and address the errors it introduces. By removing warming redundancy, we expect an ideal 2× warming speedup when using our Cache Merging solution together with Adaptive Cache Warming. Experiments show that Cache Merging delivers an average speedup of 1.44×, 1.84×, and 1.87× for 128 kB, 2 MB, and 8 MB L2 caches, respectively, with 95th-percentile absolute IPC errors of only 0.029, 0.015, and 0.006, respectively. These results demonstrate that Cache Merging yields significantly higher simulation speed with minimal loss of accuracy.
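To make the idea of merging cache states concrete, here is a minimal Python sketch for a single-level, set-associative LRU cache. It is not the algorithm from the report: the function names `warm_lru` and `merge_lru`, the list-based state layout, and the 64-byte line size are all assumptions made for illustration. The sketch assumes each warming window starts from a cold cache, and it covers one cache level only, whereas the report targets a multi-level hierarchy and discusses the errors that arise there.

```python
def warm_lru(accesses, num_sets, ways, line_size=64):
    """Warm a cold set-associative LRU cache with a list of byte addresses.

    Returns one list per set, ordered from most to least recently used.
    (Hypothetical helper; the state layout is an illustrative assumption.)
    """
    sets = [[] for _ in range(num_sets)]
    for addr in accesses:
        line = addr // line_size
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.remove(line)   # hit: drop the old recency position
        cache_set.insert(0, line)    # the accessed line becomes MRU
        del cache_set[ways:]         # evict beyond the associativity
    return sets


def merge_lru(earlier, later, ways):
    """Merge the state of a chronologically earlier warming window into a later one.

    Lines from the later window keep their recency order; lines from the
    earlier window fill the remaining ways in their own order, skipping
    lines that are already present.
    """
    merged = []
    for old_set, new_set in zip(earlier, later):
        combined = list(new_set)
        for line in old_set:
            if len(combined) == ways:
                break
            if line not in combined:
                combined.append(line)
        merged.append(combined)
    return merged
```

For this simplified single-level case, merging the states of two back-to-back windows should give the same state as warming over both windows in one pass, because an LRU set holds exactly its most recently used distinct lines. Calling `merge_lru(warm_lru(first_half, 4, 2), warm_lru(second_half, 4, 2), 2)` and comparing it with `warm_lru(first_half + second_half, 4, 2)` is a quick way to check this on any trace.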

Place, publisher, year, edition, pages
2022. p. 22
Keywords [en]
functional warming, cache warming, cache merging
National Category
Computer Sciences
Research subject
Computer Systems Sciences
Identifiers
URN: urn:nbn:se:uu:diva-484367
ISRN: 2022-007
OAI: oai:DiVA.org:uu-484367
DiVA, id: diva2:1694733
Funder
Knut and Alice Wallenberg Foundation, 2015.0153
EU, Horizon 2020, 715283
National Supercomputer Centre (NSC), Sweden, 2021/22-435
Swedish National Infrastructure for Computing (SNIC), 2021/23-626
Swedish Research Council, 2018-05973
Available from: 2022-09-10 Created: 2022-09-10 Last updated: 2022-12-16 Bibliographically approved
In thesis
1. Making Sampled Simulations Faster by Minimizing Warming Time
2022 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

A computer system simulator is a fundamental tool for computer architects to try out brand new ideas or explore the system’s response to different configurations when executing different program codes. However, even simulating the CPU core in detail is time-consuming as the execution rate slows down by several orders of magnitude compared to native execution. To solve this problem, previous work, namely SMARTS, demonstrates a statistical sampling methodology that records measurements only from tiny samples throughout the simulation. It spends only a fraction of the full simulation time on these sample measurements. In-between detailed sample simulations, SMARTS fast-forwards in the simulation using a greatly simplified and much faster simulation model (compared to full detail), which maintains only necessary parts of the architecture, such as cache memory. This maintenance process is called warming. While warming is mandatory to keep the simulation accuracy high, caches may be sufficiently warm for an accurate simulation long before reaching the sample. In other words, much time may be wasted on warming in SMARTS.

In this work, we show that caches can be kept in an accurate state with much less time spent on warming. The first paper presents Adaptive Cache Warming, a methodology that uses an iterative process to identify the minimum amount of warming for every SMARTS sample. The rest of the simulation time, previously spent on warming, can be skipped by fast-forwarding between samples using native hardware execution of the code. This results in significantly faster statistically sampled simulation while maintaining accuracy. The second paper presents Cache Merging, which mitigates the redundant warming introduced by Adaptive Cache Warming. We solve this issue by going back in time and merging the existing warming with a cache warming session that comes chronologically before it. Removing the redundant warming yields a further speedup. Together, Adaptive Cache Warming and Cache Merging are a powerful boost for statistically sampled simulations.

Place, publisher, year, edition, pages
Uppsala: Uppsala University, 2022. p. 107
Series
Information technology licentiate theses: Licentiate theses from the Department of Information Technology, ISSN 1404-5117
National Category
Computer Sciences
Research subject
Computer Systems Sciences
Identifiers
urn:nbn:se:uu:diva-484368 (URN)
Presentation
2022-10-28, 10:15 (English)
Available from: 2022-10-07 Created: 2022-09-10 Last updated: 2022-10-07 Bibliographically approved

Open Access in DiVA

Faster_Functional_Warming_with_Cache_Merging.2022.Technical-Report (857 kB), 189 downloads
File information
File name: FULLTEXT01.pdf
File size: 857 kB
Checksum (SHA-512): 1906220359b8350e5d2c2bf77b41d9a2fb18a8961a04d718f3276b884acfde3a15e61699efe184e98487a978b73457f782486d9c677f2a21c06334ecf0a80a8d
Type: fulltext
Mimetype: application/pdf

Authority records

Borgström, Gustaf; Rohner, Christian; Black-Schaffer, David
