TMA: A Trap-Based Memory Architecture
Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Mathematics and Computer Science, Department of Information Technology.
2006. In: Proceedings of the 20th ACM International Conference on Supercomputing, 2006, 259-268 p. Chapter in book (Other academic), Published
Place, publisher, year, edition, pages
2006. 259-268 p.
URN: urn:nbn:se:uu:diva-94836
OAI: oai:DiVA.org:uu-94836
DiVA: diva2:168832
Available from: 2006-09-21. Created: 2006-09-21. Bibliographically approved.
In thesis
1. Towards Low-Complexity Scalable Shared-Memory Architectures
2006 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Much research has addressed low-complexity software-based shared-memory systems since the idea was first introduced more than two decades ago. However, software-coherent systems have not been very successful in the commercial marketplace. We believe there are two main reasons for this: lack of performance and lack of binary compatibility, either alone or in combination.

This thesis studies multiple aspects of how to design future binary-compatible high-performance scalable shared-memory servers while keeping the hardware complexity at a minimum. It starts with a software-based distributed shared-memory system relying on no specific hardware support and gradually moves towards architectures with simple hardware support.

The evaluation is made in a modern chip-multiprocessor environment with both high-performance compute workloads and commercial applications. It shows that implementing the coherence-violation detection in hardware while solving the interchip coherence in software allows for high-performing binary-compatible systems with very low hardware complexity. Our second-generation hardware-software hybrid performs on par with, and often better than, traditional hardware-only designs.

Based on our results, we conclude that it is not only possible to design simple systems while maintaining performance and staying within the binary-compatibility envelope, but often also possible to get better performance than in traditional, more complex designs.

Throughout this work, we also explore two new techniques for evaluating new shared-memory designs: adjustable simulation fidelity and statistical multiprocessor cache modeling.

Place, publisher, year, edition, pages
Uppsala: Universitetsbiblioteket, 2006. 48 p.
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 217
Keywords
shared memory, distributed shared memory, hardware-software trade-off, software coherence, coherence profiling, remote access cache, chip multiprocessor, simultaneous multithreading, simulation, workload characterization, statistical cache model
National Category
Computer Engineering
urn:nbn:se:uu:diva-7135 (URN)
91-554-6647-8 (ISBN)
Public defence
2006-10-13, Auditorium Minus, Museum Gustavianum, Akademigatan 3, Uppsala, 14:15
Available from: 2006-09-21. Created: 2006-09-21. Last updated: 2011-02-18. Bibliographically approved.
