A hybrid static–dynamic classification for dual-consistency cache coherence
Univ Murcia, Dept Comp Engn, E-30100 Murcia, Spain.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART)
2016 (English). In: IEEE Transactions on Parallel and Distributed Systems, ISSN 1045-9219, E-ISSN 1558-2183, Vol. 27, no. 11, pp. 3101-3115. Article in journal (Refereed), Published
Resource type
Text
Abstract [en]

Traditional cache coherence protocols treat all memory accesses equally and enforce the strongest memory model, namely sequential consistency. Recent cache coherence protocols based on self-invalidation instead advocate the model of sequential consistency for data-race-free programs, which enables powerful optimizations for race-free code. For racy code, however, these protocols perform worse than traditional protocols. This paper proposes SPEL++, a dual-consistency cache coherence protocol that supports two execution modes: a traditional sequentially consistent protocol and a protocol that provides weak consistency (or sequential consistency for data-race-free). SPEL++ exploits a hybrid static-dynamic classification of memory accesses based on (i) a compile-time identification of extended data-race-free code regions for OpenMP applications and (ii) a runtime classification of accesses based on the operating system's memory page management. By executing racy code under the sequentially consistent protocol and race-free code under the protocol that provides sequential consistency for data-race-free, applications execute efficiently while sequential consistency is still provided. Compared to a traditional protocol, we show performance improvements of 19 to 38 percent and reductions in energy consumption of 47 to 53 percent, on average across different benchmark suites, on a 64-core chip multiprocessor.
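
The dual-mode idea in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the SPEL++ mechanism itself: the abstract does not say how the compile-time and runtime classifications are combined, so the rule below (use the weaker protocol if either classification marks the access as race-free) is an assumption for illustration. The names `Mode`, `HybridClassifier`, `mark_page_race_free`, `select_mode`, and the 4096-byte page size are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    SC = "sequentially consistent protocol"
    SC_FOR_DRF = "SC-for-DRF protocol"


PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the abstract)


class HybridClassifier:
    """Toy model of a static-dynamic access classification (hypothetical)."""

    def __init__(self):
        # Pages that a runtime, page-granularity classification
        # currently treats as race-free (assumed representation).
        self.race_free_pages = set()

    def mark_page_race_free(self, address):
        """Record the page containing `address` as race-free at runtime."""
        self.race_free_pages.add(address // PAGE_SIZE)

    def select_mode(self, address, in_static_drf_region):
        """Pick the protocol mode for one access.

        Assumed rule: the weaker mode is used if the compiler marked the
        enclosing region as data-race-free, or if the runtime marked the
        page as race-free; everything else runs under the SC protocol.
        """
        if in_static_drf_region:
            return Mode.SC_FOR_DRF
        if address // PAGE_SIZE in self.race_free_pages:
            return Mode.SC_FOR_DRF
        return Mode.SC


classifier = HybridClassifier()
classifier.mark_page_race_free(0x2000)
print(classifier.select_mode(0x2010, in_static_drf_region=False).name)
print(classifier.select_mode(0x5000, in_static_drf_region=False).name)
```

In this model, an access that neither classification can prove race-free falls back to the sequentially consistent mode, which mirrors the abstract's statement that racy code runs under the traditional protocol.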

Place, publisher, year, edition, pages
2016. Vol. 27, no 11, 3101-3115 p.
Keyword [en]
Multiprocessors; cache coherence; classification of accesses; runtime; compiler; consistency model; data races
National Category
Computer Science
Identifiers
URN: urn:nbn:se:uu:diva-283201
DOI: 10.1109/TPDS.2016.2528241
ISI: 000386247000002
OAI: oai:DiVA.org:uu-283201
DiVA: diva2:918768
Projects
UPMARC
Funder
Swedish Research Council, 106201305/C0533201
EU, FP7, Seventh Framework Programme, EU ICT-287759
Available from: 2016-02-11; Created: 2016-04-11; Last updated: 2017-11-30; Bibliographically approved

Open Access in DiVA

No full text

Authority records

Jimborean, Alexandra