Uppsala University Publications
Publications (10 of 14)
Sakalis, C., Alipour, M., Ros, A., Jimborean, A., Kaxiras, S. & Själander, M. (2019). Ghost Loads: What is the cost of invisible speculation? In: Proceedings of the 16th ACM International Conference on Computing Frontiers. Paper presented at CF 2019, April 30 – May 2, Alghero, Sardinia, Italy (pp. 153-163). New York: ACM Press
2019 (English) In: Proceedings of the 16th ACM International Conference on Computing Frontiers, New York: ACM Press, 2019, p. 153-163. Conference paper, Published paper (Refereed)
Abstract [en]

Speculative execution is necessary for achieving high performance on modern general-purpose CPUs but, starting with Spectre and Meltdown, it has also been proven to cause severe security flaws. In case of a misspeculation, the architectural state is restored to assure functional correctness, but a multitude of microarchitectural changes (e.g., cache updates), caused by the speculatively executed instructions, are commonly left in the system. These changes can be used to leak sensitive information, which has led to a frantic search for solutions that can eliminate such security flaws. The contribution of this work is an evaluation of the cost of hiding speculative side-effects in the cache hierarchy, making them visible only after the speculation has been resolved. For this, we compare (for the first time) two broad approaches: i) waiting for loads to become non-speculative before issuing them to the memory system, and ii) eliminating the side-effects of speculation, a solution consisting of invisible loads (Ghost loads) and performance optimizations (Ghost Buffer and Materialization). While previous work, InvisiSpec, has proposed a similar solution to our latter approach, it has done so with only a minimal evaluation and at a significant performance cost. The detailed evaluation of our solutions shows that: i) waiting for loads to become non-speculative is no more costly than the previously proposed InvisiSpec solution, albeit much simpler, non-invasive in the memory system, and stronger security-wise; ii) hiding speculation with Ghost loads (in the context of a relaxed memory model) can be achieved at the cost of 12% performance degradation and 9% energy increase, which is significantly better than the previous state-of-the-art solution.

Place, publisher, year, edition, pages
New York: ACM Press, 2019
Keywords
speculation, security, side-channel attacks, caches
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-383173 (URN); 10.1145/3310273.3321558 (DOI); 000474686400019 (); 978-1-4503-6685-4 (ISBN)
Conference
CF 2019, April 30 – May 2, Alghero, Sardinia, Italy
Funder
Swedish Research Council, 2015-05159; Swedish National Infrastructure for Computing (SNIC)
Available from: 2019-05-10 Created: 2019-05-10 Last updated: 2019-08-23 Bibliographically approved
Tran, K.-A., Carlson, T. E., Koukos, K., Själander, M., Spiliopoulos, V., Kaxiras, S. & Jimborean, A. (2018). Static instruction scheduling for high performance on limited hardware. IEEE Transactions on Computers, 67(4), 513-527
2018 (English) In: IEEE Transactions on Computers, ISSN 0018-9340, Vol. 67, no 4, p. 513-527. Article in journal (Refereed) Published
Abstract [en]

Complex out-of-order (OoO) processors have been designed to overcome the restrictions of outstanding long-latency misses at the cost of increased energy consumption. Simple, limited OoO processors are a compromise in terms of energy consumption and performance, as they have fewer hardware resources to tolerate the penalties of long-latency loads. In the worst case, these loads may stall the processor entirely. We present Clairvoyance, a compiler-based technique that generates code able to hide memory latency and better utilize simple OoO processors. By clustering loads found across basic block boundaries, Clairvoyance overlaps the outstanding latencies to increase memory-level parallelism. We show that these simple OoO processors, equipped with the appropriate compiler support, can effectively hide long-latency loads and achieve performance improvements for memory-bound applications. To this end, Clairvoyance tackles (i) statically unknown dependencies, (ii) insufficient independent instructions, and (iii) register pressure. Clairvoyance achieves a geomean execution time improvement of 14 percent for memory-bound applications, on top of standard O3 optimizations, while maintaining the high performance of compute-bound applications.

National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-334011 (URN); 10.1109/TC.2017.2769641 (DOI); 000427420800005 ()
Projects
UPMARC
Funder
Swedish Research Council, 2016-05086
Available from: 2017-11-03 Created: 2017-11-20 Last updated: 2018-05-17 Bibliographically approved
Tran, K.-A., Carlson, T. E., Koukos, K., Själander, M., Spiliopoulos, V., Kaxiras, S. & Jimborean, A. (2017). Clairvoyance: Look-ahead compile-time scheduling. In: Proc. 15th International Symposium on Code Generation and Optimization. Paper presented at CGO 2017, February 4–8, Austin, TX (pp. 171-184). Piscataway, NJ: IEEE Press
2017 (English) In: Proc. 15th International Symposium on Code Generation and Optimization, Piscataway, NJ: IEEE Press, 2017, p. 171-184. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Piscataway, NJ: IEEE Press, 2017
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-316480 (URN); 000402548700015 (); 978-1-5090-4931-8 (ISBN)
Conference
CGO 2017, February 4–8, Austin, TX
Projects
UPMARC
Funder
Swedish Research Council, 2010-4741
Available from: 2017-02-04 Created: 2017-03-01 Last updated: 2018-04-26 Bibliographically approved
Carlson, T. E., Tran, K.-A., Jimborean, A., Koukos, K., Själander, M. & Kaxiras, S. (2017). Transcending hardware limits with software out-of-order processing. IEEE Computer Architecture Letters, 16(2), 162-165
2017 (English) In: IEEE Computer Architecture Letters, ISSN 1556-6056, Vol. 16, no 2, p. 162-165. Article in journal (Refereed) Published
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-334012 (URN); 10.1109/LCA.2017.2672559 (DOI); 000418870500018 ()
Projects
UPMARC
Available from: 2017-02-22 Created: 2017-11-20 Last updated: 2018-04-26 Bibliographically approved
Voigt, T., Själander, M., Hermans, F., Jimborean, A., Hagersten, E., Gunningberg, P. & Kaxiras, S. (2016). Approximation: A New Paradigm also for Wireless Sensing. Paper presented at International Conference on Embedded Wireless Systems and Networks.
2016 (English) Conference paper, Poster (with or without abstract) (Refereed)
National Category
Computer Engineering
Identifiers
urn:nbn:se:uu:diva-287456 (URN)
Conference
International Conference on Embedded Wireless Systems and Networks
Available from: 2016-04-25 Created: 2016-04-25 Last updated: 2018-01-10
Moreau, D., Bardizbanyan, A., Själander, M., Whalley, D. & Larsson-Edefors, P. (2016). Practical way halting by speculatively accessing halt tags. In: Proc. 19th Conference on Design, Automation and Test in Europe. Paper presented at DATE 2016, March 14–18, Dresden, Germany (pp. 1375-1380). Piscataway, NJ: IEEE
2016 (English) In: Proc. 19th Conference on Design, Automation and Test in Europe, Piscataway, NJ: IEEE, 2016, p. 1375-1380. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2016
National Category
Computer Engineering
Identifiers
urn:nbn:se:uu:diva-306077 (URN); 000382679200253 (); 978-3-9815-3707-9 (ISBN)
Conference
DATE 2016, March 14–18, Dresden, Germany
Available from: 2016-04-28 Created: 2016-10-24 Last updated: 2018-01-14 Bibliographically approved
Sanchez, C., Gavin, P., Moreau, D., Själander, M., Whalley, D., Larsson-Edefors, P. & McKee, S. A. (2016). Redesigning a tagless access buffer to require minimal ISA changes. In: Proc. 19th International Conference on Compilers, Architectures and Synthesis for Embedded Systems. Paper presented at CASES 2016, October 1–7, Pittsburgh, PA. Article ID 19.
2016 (English) In: Proc. 19th International Conference on Compilers, Architectures and Synthesis for Embedded Systems, 2016, article id 19. Conference paper, Published paper (Refereed)
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-310317 (URN); 10.1145/2968455.2968504 (DOI); 000390612900019 (); 9781450344821 (ISBN)
Conference
CASES 2016, October 1–7, Pittsburgh, PA
Projects
UPMARC
Available from: 2016-10-01 Created: 2016-12-13 Last updated: 2017-02-08 Bibliographically approved
Själander, M., Borgström, G., Klymenko, M. V., Remacle, F. & Kaxiras, S. (2016). Techniques for modulating error resilience in emerging multi-value technologies. In: Proc. 13th International Conference on Computing Frontiers. Paper presented at CF 2016, May 16–19, Como, Italy (pp. 55-63). New York: ACM Press
2016 (English) In: Proc. 13th International Conference on Computing Frontiers, New York: ACM Press, 2016, p. 55-63. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
New York: ACM Press, 2016
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-287672 (URN); 10.1145/2903150.2903154 (DOI); 978-1-4503-4128-8 (ISBN)
Conference
CF 2016, May 16–19, Como, Italy
Funder
EU, FP7, Seventh Framework Programme, 318397
Available from: 2016-05-16 Created: 2016-04-26 Last updated: 2016-07-01 Bibliographically approved
Bardizbanyan, A., Själander, M., Whalley, D. & Larsson-Edefors, P. (2015). Improving data access efficiency by using context-aware loads and stores. In: Proc. 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems. Paper presented at LCTES 2015, June 18–19, Portland, OR (pp. 27-36). New York: ACM Press
2015 (English) In: Proc. 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, New York: ACM Press, 2015, p. 27-36. Conference paper, Published paper (Refereed)
Abstract [en]

Memory operations have a significant impact on both performance and energy usage even when an access hits in the level-one data cache (L1 DC). Load instructions in particular affect performance as they frequently result in stalls since the register to be loaded is often referenced before the data is available in the pipeline. L1 DC accesses also impact energy usage as they typically require significantly more energy than a register file access. Despite their impact on performance and energy usage, L1 DC accesses on most processors are performed in a general fashion without regard to the context in which the load or store operation is performed. We describe a set of techniques where the compiler enhances load and store instructions so that they can be executed with fewer stalls and/or enable the L1 DC to be accessed in a more energy-efficient manner. We show that using these techniques can simultaneously achieve a 6% gain in performance and a 43% reduction in L1 DC energy usage.

Place, publisher, year, edition, pages
New York: ACM Press, 2015
Keywords
Algorithms; Measurements; Performance; Energy; Data Caches; Compiler Optimizations
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-260543 (URN); 10.1145/2670529.2754960 (DOI); 000370875500003 (); 978-1-4503-3257-6 (ISBN)
Conference
LCTES 2015, June 18–19, Portland, OR
Funder
Swedish Research Council, 2009-4566
Available from: 2015-06-04 Created: 2015-08-20 Last updated: 2016-04-05 Bibliographically approved
Baird, R., Gavin, P., Själander, M., Whalley, D. & Uh, G.-R. (2015). Optimizing transfers of control in the static pipeline architecture. In: Proc. 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems. Paper presented at LCTES 2015, June 18–19, Portland, OR (pp. 7-16). New York: ACM Press
2015 (English) In: Proc. 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, New York: ACM Press, 2015, p. 7-16. Conference paper, Published paper (Refereed)
Abstract [en]

Statically pipelined processors offer a new way to improve the performance beyond that of a traditional in-order pipeline while simultaneously reducing energy usage by enabling the compiler to control more fine-grained details of the program execution. This paper describes how a compiler can exploit the features of the static pipeline architecture to apply optimizations on transfers of control that are not possible on a conventional architecture. The optimizations presented in this paper include hoisting the target address calculations for branches, jumps, and calls out of loops, performing branch chaining between calls and jumps, hoisting the setting of return addresses out of loops, and exploiting conditional calls and returns. The benefits of performing these transfer of control optimizations include a 6.8% reduction in execution time and a 3.6% decrease in estimated energy usage.

Place, publisher, year, edition, pages
New York: ACM Press, 2015
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-260544 (URN); 10.1145/2670529.2754952 (DOI); 000370875500001 (); 978-1-4503-3257-6 (ISBN)
Conference
LCTES 2015, June 18–19, Portland, OR
Available from: 2015-06-04 Created: 2015-08-20 Last updated: 2016-04-05 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-4232-6976