Uppsala University Publications (DiVA)
A case for resource efficient prefetching in multicores
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems (UART) (all three authors)
2014 (English). In: Proc. International Symposium on Performance Analysis of Systems and Software (ISPASS 2014), IEEE Computer Society, 2014, pp. 137-138.
Conference paper, poster (Refereed)
Abstract [en]

Hardware prefetching has proven very effective for hiding memory latency and can speed up some applications by more than 40%. However, this speedup comes at the cost of often prefetching a significant volume of useless data which wastes shared last level cache space and off-chip bandwidth. This directly impacts the performance of co-scheduled applications which compete for shared resources in multicores. This paper explores how a resource-efficient prefetching scheme can benefit performance by conserving shared resources in multicores. We present a framework that uses fast cache modeling to accurately identify memory instructions that benefit most from prefetching. The framework inserts software prefetches in the application only when they benefit performance, and employs cache bypassing whenever possible. These properties help reduce off-chip bandwidth consumption and last-level cache pollution. While single-thread performance remains comparable to hardware prefetching, the full advantage of the scheme is realized when several cores are used and demand for shared resources grows.
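
As a rough, illustrative sketch of what "inserting software prefetches" and "cache bypassing" can look like at the source level (not the paper's actual framework or its output), the C fragment below prefetches ahead of a streaming loop using the x86 non-temporal hint _MM_HINT_NTA. The loop, the function name, and the prefetch distance PREFETCH_AHEAD are assumptions made for this example only.

    /* Illustrative only: software prefetching with a non-temporal
       (cache-bypassing) hint, in the spirit of the abstract above.
       PREFETCH_AHEAD and the use of _MM_HINT_NTA are assumptions for
       this sketch, not parameters taken from the paper. */
    #include <stddef.h>
    #include <xmmintrin.h>        /* _mm_prefetch, _MM_HINT_NTA (x86/SSE) */

    #define PREFETCH_AHEAD 16     /* elements fetched ahead of the current index */

    double sum_streaming(const double *data, size_t n)
    {
        double sum = 0.0;
        for (size_t i = 0; i < n; i++) {
            if (i + PREFETCH_AHEAD < n) {
                /* Non-temporal hint: bring the line close to the core while
                   minimizing pollution of the shared last-level cache. */
                _mm_prefetch((const char *)&data[i + PREFETCH_AHEAD], _MM_HINT_NTA);
            }
            sum += data[i];       /* the memory access the prefetch targets */
        }
        return sum;
    }

The non-temporal hint asks the hardware to keep the prefetched line out of, or quickly evictable from, the shared last-level cache, which is how streaming data that will not be reused can avoid displacing co-runners' working sets and consuming shared cache space.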

Place, publisher, year, edition, pages
IEEE Computer Society, 2014. 137-138 p.
National Category
Computer Science
Identifiers
URN: urn:nbn:se:uu:diva-234546
DOI: 10.1109/ISPASS.2014.6844473
ISBN: 978-1-4799-3604-5 (print)
OAI: oai:DiVA.org:uu-234546
DiVA: diva2:757003
Conference
ISPASS 2014, March 23-25, Monterey, CA
Projects
UPMARC
Available from: 2014-05-06. Created: 2014-10-20. Last updated: 2016-01-26. Bibliographically approved.

Open Access in DiVA
No full text

Other links
Publisher's full text

Authority records
Khan, Muneeb; Sandberg, Andreas; Hagersten, Erik
