Uppsala University Publications (uu.se)
SIP: Performance Tuning through Source Code Interdependence
Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Mathematics and Computer Science, Department of Information Technology.
2002. In: Proceedings of the 8th International Euro-Par Conference. Article in journal (Refereed), Published
URN: urn:nbn:se:uu:diva-93582
OAI: oai:DiVA.org:uu-93582
DiVA: diva2:167106
Available from: 2005-10-19. Created: 2005-10-19. Bibliographically approved
In thesis
1. Efficient and Flexible Characterization of Data Locality through Native Execution Sampling
2005 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Data locality is central to modern computer designs. The widening gap between processor speed and memory latency has introduced the need for a deep hierarchy of caches. Thus, the performance of an application depends to a large extent on the amount of data locality the caches can exploit. Some data locality comes naturally from the way most programs are written and the way their data is allocated in memory. Compilers further try to create data locality through loop transformations and optimized data layout. Different ways of writing a program and/or laying out its data may improve an application’s locality even more. However, it is far from obvious how such a locality optimization can be achieved, especially since the optimizing compiler may have left the optimization job half done. Thus, efficient tools are needed to guide software developers in their quest for data locality.

The main contribution of this dissertation is a novel sample-based method for analyzing the data locality of an application. Very sparse data is collected during a single execution of the studied application. The sparse sampling adds minimal overhead to the execution time, which enables complex applications running realistic data sets to be studied. The architecture-independent information collected during the execution is fed to a mathematical cache model for predicting the cache miss ratio. The sparsely collected data can be used to characterize the application’s data locality with respect to almost any possible cache hierarchy, such as complicated multiprocessor memory systems with multilevel cache hierarchies. Any combination of cache size, cache line size and degree of sharing can be modeled. Each new modeled design point takes only a fraction of a second to evaluate, even though the application from which the sampled data was collected may have executed for hours. This makes the tool usable not just for software developers, but also for hardware developers who need to evaluate a huge memory-system design space.
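The abstract does not spell out the sampling scheme or the cache model, so the following is only a minimal sketch of how such a pipeline could look, under loudly stated assumptions: samples are reuse distances (the number of memory references between two accesses to the same cache line), the modeled cache uses random replacement, and the miss ratio is found by fixed-point iteration. The function names `sample_reuse_distances` and `estimate_miss_ratio` are hypothetical and are not taken from the dissertation.

```python
import random


def sample_reuse_distances(line_trace, sample_rate, seed=0):
    """Sparsely sample reuse distances from a trace of cache-line addresses.

    Assumption: a reuse distance is the number of references that occur
    between two consecutive accesses to the same cache line.
    Returns (distances, cold), where `cold` counts samples whose line was
    never accessed again before the trace ended.
    """
    rng = random.Random(seed)
    watched = {}      # cache line -> index at which the sample started
    distances = []
    for i, line in enumerate(line_trace):
        start = watched.pop(line, None)
        if start is not None:
            distances.append(i - start - 1)
        # Start watching only a small random fraction of the references.
        if rng.random() < sample_rate:
            watched[line] = i
    return distances, len(watched)


def estimate_miss_ratio(distances, cold, num_lines, iterations=200):
    """Estimate the miss ratio of a cache holding `num_lines` lines.

    Assumption: random replacement. Each miss evicts a given line with
    probability 1/num_lines, so after d references at miss ratio R a line
    survives with probability (1 - 1/num_lines) ** (R * d). The miss ratio
    must equal the average miss probability over all samples, which is
    solved here by fixed-point iteration starting from R = 1.
    """
    total = len(distances) + cold
    if total == 0:
        raise ValueError("no samples collected")
    keep = 1.0 - 1.0 / num_lines
    ratio = 1.0
    for _ in range(iterations):
        expected_misses = cold + sum(1.0 - keep ** (ratio * d) for d in distances)
        ratio = expected_misses / total
    return ratio


if __name__ == "__main__":
    # Synthetic trace: a loop that repeatedly sweeps over 64 cache lines.
    trace = [i % 64 for i in range(64_000)]
    distances, cold = sample_reuse_distances(trace, sample_rate=0.01)
    # The same sparse samples are reused for every modeled cache size.
    for num_lines in (8, 64, 1024):
        print(num_lines, estimate_miss_ratio(distances, cold, num_lines))
```

The point of the sketch is the separation of concerns described above: the trace is traversed once to collect sparse samples, and each additional cache size is then evaluated from those samples alone, without rerunning the application.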

We also discuss different ways of presenting data-locality information to a programmer in an intuitive and easily interpreted way. Some of the locality metrics we introduce exploit the flexibility of our algorithm and its ability to vary cache parameters using data from a single run. The dissertation also presents several prototype implementations of tools for profiling the memory system.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2005. 30 p.
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 101
National Category
Computer Science
urn:nbn:se:uu:diva-6012 (URN)
91-554-6363-0 (ISBN)
Public defence
2005-11-10, Häggsalen 10132, Ångströmlaboratoriet, Lägerhyddsvägen 1, Polacksbacken, Uppsala, 13:15
Available from: 2005-10-19. Created: 2005-10-19. Last updated: 2011-02-18. Bibliographically approved
