Logotyp: till Uppsala universitets webbplats

uu.sePublikationer från Uppsala universitet
Ändra sökning
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Cache Pirating: Measuring the Curse of the Shared Cache
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik. (UART)
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik. (UART)
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik. (UART)
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik. (UART)
2011 (Engelska)Ingår i: Proc. 40th International Conference on Parallel Processing, IEEE Computer Society, 2011, s. 165-175Konferensbidrag, Publicerat paper (Refereegranskat)
Ort, förlag, år, upplaga, sidor
IEEE Computer Society, 2011. s. 165-175
Nationell ämneskategori
Datorteknik
Identifikatorer
URN: urn:nbn:se:uu:diva-181254DOI: 10.1109/ICPP.2011.15ISBN: 978-1-4577-1336-1 (tryckt)OAI: oai:DiVA.org:uu-181254DiVA, id: diva2:555540
Konferens
ICPP 2011
Projekt
UPMARCCoDeR-MPTillgänglig från: 2011-10-17 Skapad: 2012-09-20 Senast uppdaterad: 2018-12-14Bibliografiskt granskad
Ingår i avhandling
1. Profiling Methods for Memory Centric Software Performance Analysis
Öppna denna publikation i ny flik eller fönster >>Profiling Methods for Memory Centric Software Performance Analysis
2012 (Engelska)Doktorsavhandling, sammanläggning (Övrigt vetenskapligt)
Abstract [en]

To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with deep memory hierarchies including several levels of caches. For such microprocessors, both the latency and the bandwidth to off-chip memory are typically about two orders of magnitude worse than the latency and bandwidth to the fastest on-chip cache. Consequently, the performance of many applications is largely determined by how well they utilize the caches and bandwidths in the memory hierarchy. For such applications, there are two principal approaches to improve performance: optimize the memory hierarchy and optimize the software. In both cases, it is important to both qualitatively and quantitatively understand how the software utilizes and interacts with the resources (e.g., cache and bandwidths) in the memory hierarchy.

This thesis presents several novel profiling methods for memory-centric software performance analysis. The goal of these profiling methods is to provide general, high-level, quantitative information describing how the profiled applications utilize the resources in the memory hierarchy, and thereby help software and hardware developers identify opportunities for memory related hardware and software optimizations. For such techniques to be broadly applicable the data collection should have minimal impact on the profiled application, while not being dependent on custom hardware and/or operating system extensions. Furthermore, the resulting profiling information should be accurate and easy to interpret.

While several use cases are presented, the main focus of this thesis is the design and evaluation of the core profiling methods. These core profiling methods measure and/or estimate how high-level performance metrics, such as miss-and fetch ratio; off-chip bandwidth demand; and execution rate are affected by the amount of resources the profiled applications receive. This thesis shows that such high-level profiling information can be accurately obtained with very little impact on the profiled applications and without requiring costly simulations or custom hardware support.

Ort, förlag, år, upplaga, sidor
Uppsala: Acta Universitatis Upsaliensis, 2012. s. 51
Serie
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1000
Nationell ämneskategori
Datorteknik
Forskningsämne
Datavetenskap
Identifikatorer
urn:nbn:se:uu:diva-182594 (URN)978-91-554-8541-2 (ISBN)
Disputation
2012-12-21, Room 2446, Polacksbacken, Lägerhyddsvägen 2, Uppsala, 13:00 (Engelska)
Opponent
Handledare
Projekt
UPMARC
Tillgänglig från: 2012-11-29 Skapad: 2012-10-11 Senast uppdaterad: 2018-01-12Bibliografiskt granskad
2. Efficient Memory Modeling During Simulation and Native Execution
Öppna denna publikation i ny flik eller fönster >>Efficient Memory Modeling During Simulation and Native Execution
2019 (Engelska)Doktorsavhandling, sammanläggning (Övrigt vetenskapligt)
Abstract [en]

Application performance on computer processors depends on a number of complex architectural and microarchitectural design decisions. Consequently, computer architects rely on performance modeling to improve future processors without building prototypes. This thesis focuses on performance modeling and proposes methods that quantify the impact of the memory system on application performance.

Detailed architectural simulation, a common approach to performance modeling, can be five orders of magnitude slower than execution on the actual processor. At this rate, simulating realistic workloads requires years of CPU time. Prior research uses sampling to speed up simulation. Using sampled simulation, only a number of small but representative portions of the workload are evaluated in detail. To fully exploit the speed potential of sampled simulation, the simulation method has to efficiently reconstruct the architectural and microarchitectural state prior to the simulation samples. Practical approaches to sampled simulation use either functional simulation at the expense of performance or checkpoints at the expense of flexibility. This thesis proposes three approaches that use statistical cache modeling to efficiently address the problem of cache warm up and speed up sampled simulation, without compromising flexibility. The statistical cache model uses sparse memory reuse information obtained with native techniques to model the performance of the cache. The proposed sampled simulation framework evaluates workloads 150 times faster than approaches that use functional simulation to warm up the cache.

Other approaches to performance modeling use analytical models based on data obtained from execution on native hardware. These native techniques allow for better understanding of the performance bottlenecks on existing hardware. Efficient resource utilization in modern multicore processors is necessary to exploit their peak performance. This thesis proposes native methods that characterize shared resource utilization in modern multicores. These methods quantify the impact of cache sharing and off-chip memory sharing on overall application performance. Additionally, they can quantify scalability bottlenecks for data-parallel, symmetric workloads.

Ort, förlag, år, upplaga, sidor
Uppsala: Acta Universitatis Upsaliensis, 2019. s. 73
Serie
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1756
Nyckelord
performance analysis, cache performance, multicore performance, memory system, memory bandwidth, memory contention, performance prediction, multi-threading, multiprocessing systems, program diagnostics, commodity multicores, multithreaded program resource requirements, performance counters, scalability bottleneck, scalability improvement
Nationell ämneskategori
Datorsystem
Forskningsämne
Datavetenskap
Identifikatorer
urn:nbn:se:uu:diva-369490 (URN)978-91-513-0538-7 (ISBN)
Disputation
2019-02-15, Sal VIII, Universitetshuset, Biskopsgatan 3, Uppsala, 09:15 (Engelska)
Opponent
Handledare
Projekt
UPMARC
Tillgänglig från: 2019-01-23 Skapad: 2018-12-14 Senast uppdaterad: 2019-12-02

Open Access i DiVA

Fulltext saknas i DiVA

Övriga länkar

Förlagets fulltext

Person

Eklöv, DavidNikoleris, NikosBlack-Schaffer, DavidHagersten, Erik

Sök vidare i DiVA

Av författaren/redaktören
Eklöv, DavidNikoleris, NikosBlack-Schaffer, DavidHagersten, Erik
Av organisationen
Datorteknik
Datorteknik

Sök vidare utanför DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetricpoäng

doi
isbn
urn-nbn
Totalt: 1594 träffar
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf