Publications from Uppsala University

Every Walk's a Hit: Making Page Walks Single-Access Cache Hits
Park, Chang Hyun. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems; Computer Architecture and Computer Communication (Uppsala Architecture Research Team). ORCID iD: 0000-0002-8250-8574
Vougioukas, Ilias. Arm Research. ORCID iD: 0000-0003-1444-4326
Sandberg, Andreas. Arm Research. ORCID iD: 0000-0001-9349-5791
Black-Schaffer, David. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication; Division of Computer Systems (Uppsala Architecture Research Team). ORCID iD: 0000-0001-5375-4058
2022 (English). In: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’22), February 28 – March 4, 2022, Lausanne, Switzerland. Association for Computing Machinery (ACM), 2022. Conference paper, Published paper (Refereed).
Abstract [en]

As memory capacity has outstripped TLB coverage, large data applications suffer from frequent page table walks. We investigate two complementary techniques for addressing this cost: reducing the number of accesses required and reducing the latency of each access. The first approach is accomplished by opportunistically "flattening" the page table: merging two levels of traditional 4 KB page table nodes into a single 2 MB node, thereby reducing the table's depth and the number of indirections required to traverse it. The second is accomplished by biasing the cache replacement algorithm to keep page table entries during periods of high TLB miss rates, as these periods also see high data miss rates and are therefore more likely to benefit from having the smaller page table in the cache than to suffer from increased data cache misses.
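
How the flattening step changes the walk can be made concrete with a small sketch. The following C fragment is a hypothetical illustration, not the paper's kernel or hardware implementation: it shows how merging two adjacent 9-bit levels of an x86-64-style radix table into a single 18-bit index over a 2 MB node halves the number of dependent memory accesses per walk. The function names and exact bit layout are assumptions for illustration, and physical addresses are treated as directly dereferenceable for brevity.

#include <stdint.h>

typedef uint64_t pte_t;

/* Bits of a page-table entry that hold the next node's (physical) address. */
#define PTE_ADDR_MASK 0x000ffffffffff000ULL

/* Conventional 4-level radix walk: four 9-bit indices into 4 KB nodes,
 * i.e. four dependent memory accesses per translation. */
static uint64_t walk_4level(const pte_t *l4, uint64_t va)
{
    const pte_t *l3 = (const pte_t *)(l4[(va >> 39) & 0x1ff] & PTE_ADDR_MASK);
    const pte_t *l2 = (const pte_t *)(l3[(va >> 30) & 0x1ff] & PTE_ADDR_MASK);
    const pte_t *l1 = (const pte_t *)(l2[(va >> 21) & 0x1ff] & PTE_ADDR_MASK);
    return l1[(va >> 12) & 0x1ff] & PTE_ADDR_MASK;           /* frame address */
}

/* Flattened walk: a 2 MB node holds 512 * 512 = 262,144 entries, so two
 * adjacent 9-bit indices collapse into one 18-bit index and the same
 * translation needs only two dependent memory accesses. */
static uint64_t walk_flattened(const pte_t *flat_upper, uint64_t va)
{
    const pte_t *flat_lower =
        (const pte_t *)(flat_upper[(va >> 30) & 0x3ffff] & PTE_ADDR_MASK);
    return flat_lower[(va >> 12) & 0x3ffff] & PTE_ADDR_MASK; /* frame address */
}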

We evaluate these approaches for both native and virtualized systems and across a range of realistic memory fragmentation scenarios, describe the limited changes needed in our kernel implementation and hardware design, identify and address challenges related to self-referencing page tables and kernel memory allocation, and compare results across server and mobile systems using both academic and industrial simulators for robustness.

We find that flattening does reduce the number of accesses required on a page walk (to 1.0), but its performance impact (+2.3%) is small due to Page Walker Caches (already 1.5 accesses). Prioritizing caching has a larger effect (+6.8%), and the combination improves performance by +9.2%. Flattening is more effective on virtualized systems (4.4 to 2.8 accesses, +7.1% performance), due to 2D page walks. By combining the two techniques we demonstrate a state-of-the-art +14.0% performance gain and -8.7% dynamic cache energy and -4.7% dynamic DRAM energy for virtualized execution with very simple hardware and software changes.
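
As background for the virtualization numbers above, the usual worst-case access count for nested ("2D") translation can be worked out directly. This is textbook arithmetic rather than a result from the paper, and the reported 4.4 and 2.8 accesses are averages that already include page walker caches. With g guest levels and h host levels, each guest-table pointer is a guest-physical address that must itself be translated by an h-step host walk before it can be read, and the final guest physical address needs one more host walk:

\[
  N_{\mathrm{2D}} = g\,(h + 1) + h = (g + 1)(h + 1) - 1 .
\]

For conventional four-level tables (g = h = 4) this gives 5 × 5 − 1 = 24 accesses, whereas flattening both dimensions to two levels each (g = h = 2, assuming here that both the guest and host tables are flattened) gives 3 × 3 − 1 = 8, which is why flattening has more room to help under virtualization.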

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022.
Keywords [en]
Flattened page table, page table cache prioritization
National Category
Computer Sciences
Research subject
Computer Systems Sciences
Identifiers
URN: urn:nbn:se:uu:diva-466738
DOI: 10.1145/3503222.3507718
ISI: 000810486300010
OAI: oai:DiVA.org:uu-466738
DiVA id: diva2:1633977
Conference
27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’22), February 28 – March 4, 2022, Lausanne, Switzerland
Funder
Knut and Alice Wallenberg Foundation, 2015.0153
EU, Horizon 2020, 715283
Available from: 2022-02-01. Created: 2022-02-01. Last updated: 2024-01-15. Bibliographically approved.

Open Access in DiVA

fulltext (1302 kB), 1684 downloads
File name: FULLTEXT02.pdf. File size: 1302 kB. Type: fulltext. Mimetype: application/pdf.
Checksum (SHA-512): 3ce4bd0965ad27ae274f1e7b9b6839515d370d9d399ae47841eacdbed9ca8536ff3c7c5b32d63111ca79b9c15263e6ab2544e8de7f9336de0c48360324243d21

fulltext (4004 kB), 311 downloads
File name: FULLTEXT03.pdf. File size: 4004 kB. Type: fulltext. Mimetype: application/pdf.
Checksum (SHA-512): 2900d8f0a9d41129070e184e80875bac8436997c4d162736339f2a622f96dbdd16a9f5b02dd950cffdbdf05bce836a691f91c1728e62ed4a716d12832a650fa4

Other links

Publisher's full text

Authority records

Park, Chang Hyun; Black-Schaffer, David

Total: 2004 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 1074 hits