Behind the Scenes: Memory Analysis of Graphical Workloads on Tile-based GPUs
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART) ORCID iD: 0000-0003-2314-7307
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. (UART)
2018 (English). In: Proc. International Symposium on Performance Analysis of Systems and Software: ISPASS 2018, IEEE Computer Society, 2018, p. 1-11. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE Computer Society, 2018. p. 1-11
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:uu:diva-361214
DOI: 10.1109/ISPASS.2018.00009
ISBN: 978-1-5386-5010-3 (electronic)
OAI: oai:DiVA.org:uu-361214
DiVA, id: diva2:1250103
Conference
ISPASS 2018, April 2–4, Belfast, UK
Projects
UPMARC
Available from: 2018-09-21 Created: 2018-09-21 Last updated: 2018-11-16 Bibliographically approved
In thesis
1. Understanding Task Parallelism: Providing insight into scheduling, memory, and performance for CPUs and Graphics
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Maximizing the performance of computer systems while making them more energy efficient is vital for future developments in engineering, medicine, entertainment, etc. However, the increasing complexity of software, hardware, and their interactions makes this task difficult. Software developers must cope with complex memory architectures, such as the multilevel caches of modern CPUs, and must keep thousands of GPU cores busy, both of which make programming harder.

Task-based programming provides high-level abstractions to simplify the development process. In this model, independent tasks (functions) are submitted to a runtime system, which orchestrates their execution across hardware resources. This approach has become popular and successful because the runtime can distribute the workload across hardware resources automatically, and has the potential to optimize the execution to minimize data movement (e.g., by being aware of the cache hierarchy).

However, to build better runtime systems, we now need to understand bottlenecks in the performance of current and future multicore architectures. Unfortunately, since most current work was designed for sequential or thread-based workloads, there is an overall lack of tools and methods for gaining insight into the execution of task-based applications, insight that would allow both the runtime and programmers to detect potential optimizations.

In this thesis, we address this lack of tools by providing fast, accurate, and mathematically sound models to understand the execution of task-based applications. In particular, we center these models around three key aspects of the execution: memory behavior (data locality), scheduling, and performance. Our contributions provide insight into the interplay between the schedule's behavior, data reuse through the cache hierarchy, and the resulting performance. These contributions lay the groundwork for improving runtime systems. We first apply these methods to analyze a diverse set of CPU applications, and then apply them to one of the most common workloads in current systems: graphics rendering on GPUs.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2018. p. 67
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1737
Keywords
Task-based programming, Task Scheduling, Analytical Cache Model, Scheduling, Runtime Systems, Computer Graphics (rendering)
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:uu:diva-363924 (URN)
978-91-513-0485-4 (ISBN)
Public defence
2018-12-04, 2446, ITC, Lägerhyddsvägen 2, Uppsala, 09:15 (English)
Opponent
Supervisors
Available from: 2018-11-15 Created: 2018-10-21 Last updated: 2018-11-30 Bibliographically approved

Authors

Ceballos, Germán; Sembrant, Andreas; Carlson, Trevor E.; Black-Schaffer, David
