uu.se: Publications from Uppsala University
Low-Overhead Memory Access Sampler: An Efficient Method for Data-Locality Profiling
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2011 (English). Independent thesis, advanced level (professional degree), 20 credits / 30 HE credits. Student thesis (degree project).
Abstract [en]

There is an ever-widening performance gap between processors and main memory. The gap is bridged by cache memories: small intermediate memories that store recently referenced data. A miss in the cache is an expensive operation because it requires data to be fetched from main memory. It is therefore crucial to understand application cache behavior. Caches work well only for applications with good data locality; insufficient data locality leads to poor cache utilization, which quickly becomes a major performance bottleneck. Analysing and understanding the cache behavior helps in improving data locality and identifying such bottlenecks.

In this thesis, we study a method for efficiently analysing application cache behavior. We implement the method in a cache analysis tool. The method uses a statistical cache model that only requires a sparse data locality fingerprint as input. The input is based on reuse distances between cache lines. By adjusting architecture-specific parameters, such as cache line size, the tool can output working-set graphs for a wide range of architectures. Readily available hardware performance counters combined with intelligent sampling are used to enable an implementation with low overhead.
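
As a rough illustration of the kind of input described above, the following sketch computes a reuse-distance histogram from a short address trace. It is not the thesis implementation: the function name `reuse_distances`, the trace values, and the definition used here (the number of memory references between two consecutive accesses to the same cache line) are assumptions made for this example, and it processes every reference rather than a sparse sample.

```python
from collections import Counter


def reuse_distances(addresses, line_size=64):
    """Return a histogram of reuse distances, measured in memory references.

    Assumed definition (not taken from the abstract): the reuse distance of an
    access is the number of references that occur between it and the previous
    access to the same cache line. First-time accesses are not counted.
    """
    last_seen = {}         # cache line number -> index of its latest access
    histogram = Counter()  # reuse distance -> number of occurrences
    for i, addr in enumerate(addresses):
        line = addr // line_size  # map a byte address to its cache line
        if line in last_seen:
            histogram[i - last_seen[line] - 1] += 1
        last_seen[line] = i
    return histogram


# Hypothetical byte addresses, chosen only to exercise the function.
trace = [0, 8, 64, 4, 128, 72]
hist64 = reuse_distances(trace, line_size=64)
hist128 = reuse_distances(trace, line_size=128)
```

Running the same trace with a different `line_size` changes which addresses share a line and therefore changes the histogram, which hints at how one profile could be reinterpreted for architectures with different cache line sizes.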

We evaluate our cache analysis tool using the SPEC CPU2006 benchmarks, and our results show good accuracy and performance. The difference between the cache miss ratio estimated by our tool and that of a reference tool was nearly always below one percentage point. The run-time overhead was on average 17%. We also analyse the overhead to identify the components of our implementation that are most costly and should be the focus of optimization.
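
The two metrics quoted above can be made concrete with a small sketch. The helper names and all input values below are hypothetical and are not measurements from the thesis; the timings are simply chosen so that the overhead comes out at the reported average.

```python
def percentage_point_diff(estimated_ratio, reference_ratio):
    """Absolute difference between two miss ratios, in percentage points.

    Both ratios are fractions in [0, 1], e.g. 0.05 for a 5% miss ratio.
    """
    return abs(estimated_ratio - reference_ratio) * 100.0


def runtime_overhead_percent(profiled_seconds, baseline_seconds):
    """Extra run time caused by profiling, relative to the unprofiled run."""
    return (profiled_seconds - baseline_seconds) / baseline_seconds * 100.0


# Hypothetical values for illustration only.
diff = percentage_point_diff(0.052, 0.045)      # about 0.7 percentage points
overhead = runtime_overhead_percent(117.0, 100.0)  # 17% overhead
```

Note that a difference in percentage points is absolute: miss ratios of 5.2% and 4.5% differ by 0.7 percentage points, even though the relative difference between them is much larger.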

We propose a number of optimizations that could reduce the overhead further. Phase-guided sampling is proposed as a key optimization where application phase behavior is used to determine when to sample memory references. We also build a prototype implementation of this optimization and the preliminary results were promising.

Place, publisher, year, edition, pages
2011.
Series
UPTEC IT, ISSN 1401-5749 ; 11 003
Identifiers
URN: urn:nbn:se:uu:diva-146664
OAI: oai:DiVA.org:uu-146664
DiVA, id: diva2:398696
Uppsök
Technology
Supervisors
Examiners
Available from: 2011-02-18. Created: 2011-02-18. Last updated: 2011-02-18. Bibliographically approved.

Open Access in DiVA

fulltext (782 kB), 841 downloads
File information
File name: FULLTEXT01.pdf
File size: 782 kB
Checksum (SHA-512):
1e01b7f93b52c148245a4691e26c66296dd15df661a29d2c2382d7cb5e8f3b45d66b1da875ab2f4391127861609e9876bb41b00c6b7f16b91c80de90a2ca94f3
Type: fulltext
Mimetype: application/pdf

Total: 846 downloads
The number of downloads is the sum of downloads for all full texts. It may include, for example, earlier versions that are no longer available.

Total: 1546 hits