Uppsala University Publications
Hagersten, Erik
Publications (10 of 135)
Ceballos, G., Hagersten, E. & Black-Schaffer, D. (2018). Tail-PASS: Resource-based Cache Management for Tiled Graphics Rendering Hardware. In: Proc. 16th International Conference on Parallel and Distributed Processing with Applications. Paper presented at ISPA 2018, December 11–13, Melbourne, Australia. IEEE
2018 (English). In: Proc. 16th International Conference on Parallel and Distributed Processing with Applications, IEEE, 2018. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE, 2018
National Category
Computer Systems Computer Sciences
Identifiers
urn:nbn:se:uu:diva-363920 (URN)
Conference
ISPA 2018, December 11–13, Melbourne, Australia
Funder
EU, European Research Council, 715283
Available from: 2018-10-21 Created: 2018-10-21 Last updated: 2018-11-16. Bibliographically approved
Sembrant, A., Carlson, T. E., Hagersten, E. & Black-Schaffer, D. (2017). A graphics tracing framework for exploring CPU+GPU memory systems. In: Proc. 20th International Symposium on Workload Characterization. Paper presented at IISWC 2017, October 1–3, Seattle, WA (pp. 54-65). IEEE
2017 (English). In: Proc. 20th International Symposium on Workload Characterization, IEEE, 2017, p. 54-65. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE, 2017
National Category
Computer Engineering
Identifiers
urn:nbn:se:uu:diva-357055 (URN); 10.1109/IISWC.2017.8167756 (DOI); 000428206700006 (); 978-1-5386-1233-0 (ISBN)
Conference
IISWC 2017, October 1–3, Seattle, WA
Available from: 2017-12-07 Created: 2018-08-17 Last updated: 2018-09-24. Bibliographically approved
Sembrant, A., Hagersten, E. & Black-Schaffer, D. (2017). A split cache hierarchy for enabling data-oriented optimizations. In: Proc. 23rd International Symposium on High Performance Computer Architecture. Paper presented at HPCA 2017, February 4–8, Austin, TX (pp. 133-144). IEEE Computer Society
2017 (English). In: Proc. 23rd International Symposium on High Performance Computer Architecture, IEEE Computer Society, 2017, p. 133-144. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE Computer Society, 2017
National Category
Computer Engineering
Identifiers
urn:nbn:se:uu:diva-306368 (URN); 10.1109/HPCA.2017.25 (DOI); 000403330300012 (); 978-1-5090-4985-1 (ISBN)
Conference
HPCA 2017, February 4–8, Austin, TX
Available from: 2017-05-08 Created: 2016-10-27 Last updated: 2018-01-14. Bibliographically approved
Ceballos, G., Hugo, A., Hagersten, E. & Black-Schaffer, D. (2017). Exploring scheduling effects on task performance with TaskInsight. Supercomputing frontiers and innovations, 4(3), 91-98
2017 (English). In: Supercomputing frontiers and innovations, ISSN 2214-3270, E-ISSN 2313-8734, Vol. 4, no 3, p. 91-98. Article in journal (Refereed), Published
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-335528 (URN); 10.14529/jsfi170306 (DOI)
Projects
UPMARC
Funder
Swedish Foundation for Strategic Research, FFL12-0051
Available from: 2017-12-06 Created: 2017-12-06 Last updated: 2018-11-16. Bibliographically approved
Sembrant, A., Carlson, T. E., Hagersten, E. & Black-Schaffer, D. (2017). POSTER: Putting the G back into GPU/CPU Systems Research. In: 2017 26TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT). Paper presented at 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), SEP 09-13, 2017, Portland, OR, USA (pp. 130-131).
2017 (English). In: 2017 26TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT), 2017, p. 130-131. Conference paper, Published paper (Refereed)
Abstract [en]

Modern SoCs contain several CPU cores and many GPU cores to execute both general-purpose and highly parallel graphics workloads. In many SoCs, more area is dedicated to graphics than to general-purpose compute. Despite this, the micro-architecture research community primarily focuses on GPGPU and CPU-only research, and not on graphics (the primary workload for many SoCs). The main reason for this is the lack of efficient tools and simulators for modern graphics applications. This work focuses on the GPU's memory traffic generated by graphics. We describe a new graphics tracing framework and use it to study both graphics applications' memory behavior and how CPUs and GPUs affect system performance. Our results show that graphics applications exhibit a wide range of memory behavior between applications and across time, and slow down co-running SPEC applications by 59% on average.

Series
International Conference on Parallel Architectures and Compilation Techniques, ISSN 1089-795X
National Category
Computer Systems Computer Engineering
Identifiers
urn:nbn:se:uu:diva-347752 (URN); 10.1109/PACT.2017.60 (DOI); 000417411300011 (); 978-1-5090-6764-0 (ISBN)
Conference
26th International Conference on Parallel Architectures and Compilation Techniques (PACT), SEP 09-13, 2017, Portland, OR, USA.
Available from: 2018-04-17 Created: 2018-04-17 Last updated: 2018-04-17. Bibliographically approved
Davari, M., Hagersten, E. & Kaxiras, S. (2017). Scope-Aware Classification: Taking the hierarchical private/shared data classification to the next level.
2017 (English). Report (Other academic)
Series
Technical report / Department of Information Technology, Uppsala University, ISSN 1404-3203 ; 2017-008
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-320324 (URN)
Available from: 2017-04-27 Created: 2017-04-19 Last updated: 2017-07-03. Bibliographically approved
Davari, M., Hagersten, E. & Kaxiras, S. (2017). The best of both works: A hybrid data-race-free cache coherence scheme.
2017 (English). Report (Other academic)
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-320320 (URN)
Available from: 2017-04-19 Created: 2017-04-19 Last updated: 2017-11-15
Ceballos, G., Hagersten, E. & Black-Schaffer, D. (2017). Understanding the interplay between task scheduling, memory and performance. In: Proc. Companion 8th ACM International Conference on Systems, Programming, Languages, and Applications: Software for Humanity. Paper presented at SPLASH 2017, October 22–27, Vancouver, Canada (pp. 21-23). New York: ACM Press
2017 (English). In: Proc. Companion 8th ACM International Conference on Systems, Programming, Languages, and Applications: Software for Humanity, New York: ACM Press, 2017, p. 21-23. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
New York: ACM Press, 2017
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-335556 (URN); 10.1145/3135932.3135942 (DOI); 978-1-4503-5514-8 (ISBN)
Conference
SPLASH 2017, October 22–27, Vancouver, Canada
Projects
UPMARC
Funder
Swedish Foundation for Strategic Research, FFL12-0051
Available from: 2017-10-22 Created: 2017-12-06 Last updated: 2018-11-16. Bibliographically approved
Spiliopoulos, V., Sembrant, A., Keramidas, G., Hagersten, E. & Kaxiras, S. (2016). A unified DVFS-cache resizing framework.
2016 (English). Report (Other academic)
Series
Technical report / Department of Information Technology, Uppsala University, ISSN 1404-3203 ; 2016-014
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-300840 (URN)
Available from: 2016-08-15 Created: 2016-08-15 Last updated: 2018-01-10. Bibliographically approved
Van den Steen, S., Eyerman, S., De Pestel, S., Mechri, M., Carlson, T. E., Black-Schaffer, D., . . . Eeckhout, L. (2016). Analytical Processor Performance and Power Modeling Using Micro-Architecture Independent Characteristics. I.E.E.E. transactions on computers (Print), 65(12), 3537-3551
2016 (English). In: I.E.E.E. transactions on computers (Print), ISSN 0018-9340, E-ISSN 1557-9956, Vol. 65, no 12, p. 3537-3551. Article in journal (Refereed), Published
Abstract [en]

Optimizing processors for (a) specific application(s) can substantially improve energy-efficiency. With the end of Dennard scaling, and the corresponding reduction in energy-efficiency gains from technology scaling, such approaches may become increasingly important. However, designing application-specific processors requires fast design space exploration tools to optimize for the targeted application(s). Analytical models can be a good fit for such design space exploration as they provide fast performance and power estimates and insight into the interaction between an application's characteristics and the micro-architecture of a processor. Unfortunately, prior analytical models for superscalar out-of-order processors require micro-architecture dependent inputs, such as cache miss rates, branch miss rates and memory-level parallelism. This requires profiling the applications for each cache and branch predictor configuration of interest, which is far more time-consuming than evaluating the analytical performance models. In this work we present a micro-architecture independent profiler and associated analytical models that allow us to produce performance and power estimates across a large superscalar out-of-order processor design space almost instantaneously. We show that using a micro-architecture independent profile leads to a speedup of 300x compared to detailed simulation for our evaluated design space. Over a large design space, the model has a 9.3 percent average error for performance and a 4.3 percent average error for power, compared to detailed cycle-level simulation. The model is able to accurately determine the optimal processor configuration for different applications under power or performance constraints, and provides insight into performance through cycle stacks.

Keywords
Modeling, micro-architecture, performance, power
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-311173 (URN); 10.1109/TC.2016.2547387 (DOI); 000388498600003 ()
Available from: 2016-12-22 Created: 2016-12-22 Last updated: 2018-01-13. Bibliographically approved