uu.seUppsala universitets publikasjoner
Endre søk
Begrens søket
1 - 14 of 14
RefereraExporteraLink til resultatlisten
Permanent link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Treff pr side
  • 5
  • 10
  • 20
  • 50
  • 100
  • 250
Sortering
  • Standard (Relevans)
  • Forfatter A-Ø
  • Forfatter Ø-A
  • Tittel A-Ø
  • Tittel Ø-A
  • Type publikasjon A-Ø
  • Type publikasjon Ø-A
  • Eldste først
  • Nyeste først
  • Skapad (Eldste først)
  • Skapad (Nyeste først)
  • Senast uppdaterad (Eldste først)
  • Senast uppdaterad (Nyeste først)
  • Disputationsdatum (tidligste først)
  • Disputationsdatum (siste først)
  • Standard (Relevans)
  • Forfatter A-Ø
  • Forfatter Ø-A
  • Tittel A-Ø
  • Tittel Ø-A
  • Type publikasjon A-Ø
  • Type publikasjon Ø-A
  • Eldste først
  • Nyeste først
  • Skapad (Eldste først)
  • Skapad (Nyeste først)
  • Senast uppdaterad (Eldste først)
  • Senast uppdaterad (Nyeste først)
  • Disputationsdatum (tidligste først)
  • Disputationsdatum (siste først)
Merk
Maxantalet träffar du kan exportera från sökgränssnittet är 250. Vid större uttag använd dig av utsökningar.
  • 1.
    Forsberg, Bjoern
    et al.
    Swiss Fed Inst Technol Zurich, D ITET, Zurich, Switzerland..
    Lampka, Kai
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    An Online Overclocking Scheme for Bursty Real-time Tasks and an Evaluation of its Thermal Impact2016Inngår i: 14Th ACM/IEEE Symposium On Embedded Systems For Real-Time Multimedia (ESTIMEDIA 2016), 2016, s. 104-113Konferansepaper (Fagfellevurdert)
    Abstract [en]

    This paper proposes a scheme which drives a processor beyond its rated operation frequency, e.g., by exploiting Intel's boost technology, to digest the peak workload of the system in time. In the setting of deadline constrained workloads, this is far from trivial: the boost mode can only be used during short time spans, therefore it can only help to digest the peak workload, rather than serving the normal case. A lowered processor frequency, used outside the peak workload time, yields a backlog of not completed jobs. This backlog may result in deadline violations or buffer overflows, if the next burst of job arrivals appears too early. To overcome the above problem, we propose a peak workload aware speed assignment strategy, which only allows the system to build up computation backlog if the absence of high computation demands is assured. Contrasting the existing body of work, we take advantage of bursty arrival patterns of compute jobs, thereby progressing over the standard (non-bursty sporadic) job release model. Together with our scheme, we also present a tool chain and simulations of synthetic workloads for investigating the thermal effects of different speed assignment strategies.

  • 2.
    Jimborean, Alexandra
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datalogi.
    Koukos, Konstantinos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Black-Schaffer, David
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Fix the code. Don't tweak the hardware: A new compiler approach to Voltage–Frequency scaling2014Inngår i: Proc. 12th International Symposium on Code Generation and Optimization, New York: ACM Press, 2014, s. 262-272Konferansepaper (Fagfellevurdert)
    Fulltekst (pdf)
    fulltext
  • 3.
    Koukos, Konstantinos
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Black-Schaffer, David
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Towards more efficient execution: a decoupled access-execute approach2013Inngår i: Proc. 27th ACM International Conference on Supercomputing, New York: ACM Press, 2013, s. 253-262Konferansepaper (Fagfellevurdert)
    Abstract [en]

    The end of Dennard scaling is expected to shrink the range of DVFS in future nodes, limiting the energy savings of this technique. This paper evaluates how much we can increase the effectiveness of DVFS by using a software decoupled access-execute approach. Decoupling the data access from execution allows us to apply optimal voltage-frequency selection for each phase and therefore improve energy efficiency over standard coupled execution.

    The underlying insight of our work is that by decoupling access and execute we can take advantage of the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency, while maintaining good performance. To demonstrate this we built a task based parallel execution infrastructure consisting of: (1) a runtime system to orchestrate the execution, (2) power models to predict optimal voltage-frequency selection at runtime, (3) a modeling infrastructure based on hardware measurements to simulate zero-latency, per-core DVFS, and (4) a hardware measurement infrastructure to verify our model's accuracy.

    Based on real hardware measurements we project that the combination of decoupled access-execute and DVFS has the potential to improve EDP by 25% without hurting performance. On memory-bound applications we significantly improve performance due to increased MLP in the access phase and ILP in the execute phase. Furthermore we demonstrate that our method can achieve high performance both in presence or absence of a hardware prefetcher.

    Fulltekst (pdf)
    fulltext
  • 4.
    Koukos, Konstantinos
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Black-Schaffer, David
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Towards Power Efficiency on Task-Based, Decoupled Access-Execute Models2013Inngår i: PARMA 2013, 4th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures, 2013Konferansepaper (Fagfellevurdert)
    Abstract [en]

    This work demonstrates the potential of hardware and software optimization to improve theeffectiveness of dynamic voltage and frequency scaling (DVFS). For software, we decouple data prefetch (access) and computation (execute) to enable optimal DVFS selectionfor each phase. For hardware, we use measurements from state-of-the-art multicore processors to accurately model the potential of per-core, zero-latency DVFS. We demonstrate that the combinationof decoupled access-execute and precise DVFS has the potential to decrease EDP by 25-30% without reducing performance.

    The underlying insight in this work is that by decoupling access and execute we can take advantageof the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency. For the memory-bound access phase, where we prefetch data into the cachefrom main memory, we can run at a reduced frequency and voltage without hurting performance. Thereafter, the execute phase can run much faster, thanks to the prefetching of the access phase, and achieve higher performance. This decoupled program behavior allows us to achieve more effective use of DVFS than standard coupled executions which mix data access and compute.

    To understand the potential of this approach, we measure application performance and power consumption on a modern multicore system across a range of frequencies and voltages. From this data we build a model that allows us to analyze the effects of per-core, zero-latency DVFS. The results of this work demonstrate the significant potential for finer-grain DVFS in combination with DVFS-optimized software.

    Fulltekst (pdf)
    fulltext
  • 5.
    Koukos, Konstantinos
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Ekemark, Per
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi.
    Zacharopoulos, Georgios
    Switzerland Univ Svizzera Italiana, Lugano, Switzerland.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Jimborean, Alexandra
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Multiversioned decoupled access-execute: The key to energy-efficient compilation of general-purpose programs2016Inngår i: Proc. 25th International Conference on Compiler Construction, New York: ACM Press, 2016, s. 121-131Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Computer architecture design faces an era of great challenges in an attempt to simultaneously improve performance and energy efficiency. Previous hardware techniques for energy management become severely limited, and thus, compilers play an essential role in matching the software to the more restricted hardware capabilities. One promising approach is software decoupled access-execute (DAE), in which the compiler transforms the code into coarse-grain phases that are well-matched to the Dynamic Voltage and Frequency Scaling (DVFS) capabilities of the hardware. While this method is proved efficient for statically analyzable codes, general purpose applications pose significant challenges due to pointer aliasing, complex control flow and unknown runtime events. We propose a universal compile-time method to decouple general-purpose applications, using simple but efficient heuristics. Our solutions overcome the challenges of complex code and show that automatic decoupled execution significantly reduces the energy expenditure of irregular or memory-bound applications and even yields slight performance boosts. Overall, our technique achieves over 20% on average energy-delay-product (EDP) improvements (energy over 15% and performance over 5%) across 14 bench-marks from SPEC CPU 2006 and Parboil benchmark suites, with peak EDP improvements surpassing 70%.

    Fulltekst (pdf)
    fulltext
  • 6.
    Lampka, Kai
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Forsberg, Björn
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Keep it cool and in time: With runtime monitoring to thermal-aware execution speeds for deadline constrained systems2016Inngår i: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, Vol. 95, s. 79-91Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    The Dynamic Power and Thermal Management (DPTM) system of Dynamic Voltage Frequency Scaling (DVFS) enabled processors compensates peak temperatures by slowing or even powering parts of the system down. While ensuring the integrity of computations, this comes with the drawback of losing performance. In the context of hard real-time systems, such unpredictable losses in performance are unacceptable, as they may lead to deadline misses which may yet compromise the integrity of the system. To safely execute hard real-time workloads on such systems, this article presents an online scheme for assigning speeds in such a way that (a) the system executes at low clock speed as often as possible, while (b) deadline violations are strictly ruled out. The proposed scheme is compared with an offline scheme which has complete knowledge about arrival times and execution demands of the workload. The benchmarking shows that for a workload which is always very close to the modelled maximum, our approach performs on-par with the offline scheme. In case of a workload which diverges from the modelled maximum more often, the speed assignments produced by our scheme become more pessimistic, as to ensure that all deadlines are met.

  • 7.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för datorteknik.
    Improving Energy-Efficiency of Multicores using First-Order Modeling2016Doktoravhandling, med artikler (Annet vitenskapelig)
    Abstract [en]

    In the recent decades, power consumption has evolved to one of the most critical resources in a computer system. In the form of electricity bill in data centers, battery life in mobile devices, or thermal constraints in desktops and laptops, power consumption imposes several limitations in today’s processors and improving power and energy efficiency is one of the most urgent research topics of Computer Architecture.

    Dynamic Voltage and Frequency Scaling (DVFS) and Cache Resizing are among the most popular energy saving techniques. Previous work, however, has focused on developing heuristics and trial-and-error methods that yield acceptable savings, but fail to provide insight and understanding of how these techniques affect power and performance of a computer system. In contrast, this Thesis proposes the use of first-order modeling to improve the energy efficiency of computer systems. A first-order model needs to be (i) accurate enough to efficiently drive DVFS and Cache Resizing decisions, and (ii) simple enough to eliminate the overhead of collecting the required inputs to the model. We show that such models can be constructed and successfully applied in modern systems.

    For DVFS, we propose to scale frequency down to exploit applications’ memory slack, i.e., periods that the processor spends waiting for data to be fetched from the main memory. In such cases, the processor frequency can be scaled down to save energy without inordinate performance penalty. Our DVFS models can detect slack and predict the impact of DVFS in both power and performance with great accuracy. Cache Resizing, on the other hand, relies on the fact that many applications do not benefit from the vast amount of cache that modern processors are equipped with. In such cases, the cache can be resized to save static energy consumption at limited performance cost. Since both techniques are related with the memory behavior of applications, we propose a unified model to manage the two techniques in tandem and maximize energy efficiency through synergistic DVFS and Cache Resizing.

    Finally, our experience with DVFS in real systems motivated us to contribute to the integration of DVFS into the gem5 simulator. Unlike other simulators that ignore the role of OS in DVFS, we extend the gem5 simulator by developing the hardware and software components that allow existing Linux DVFS infrastructure to be seamlessly integrated in the simulator.

    Delarbeid
    1. Interval-based models for run-time DVFS orchestration in superscalar processors
    Åpne denne publikasjonen i ny fane eller vindu >>Interval-based models for run-time DVFS orchestration in superscalar processors
    2010 (engelsk)Inngår i: Proc. 7th International Conference on Computing Frontiers, New York: ACM Press , 2010, s. 287-296Konferansepaper, Publicerat paper (Fagfellevurdert)
    sted, utgiver, år, opplag, sider
    New York: ACM Press, 2010
    HSV kategori
    Identifikatorer
    urn:nbn:se:uu:diva-142687 (URN)10.1145/1787275.1787338 (DOI)978-1-4503-0044-5 (ISBN)
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2010-05-19 Laget: 2011-01-14 Sist oppdatert: 2018-01-12bibliografisk kontrollert
    2. Green governors: A framework for continuously adaptive DVFS
    Åpne denne publikasjonen i ny fane eller vindu >>Green governors: A framework for continuously adaptive DVFS
    2011 (engelsk)Inngår i: Proc. International Green Computing Conference and Workshops: IGCC 2011, Piscataway, NJ: IEEE , 2011, s. 1-8Konferansepaper, Publicerat paper (Fagfellevurdert)
    sted, utgiver, år, opplag, sider
    Piscataway, NJ: IEEE, 2011
    HSV kategori
    Identifikatorer
    urn:nbn:se:uu:diva-165805 (URN)10.1109/IGCC.2011.6008552 (DOI)978-1-4577-1222-7 (ISBN)
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2012-01-10 Laget: 2012-01-09 Sist oppdatert: 2018-01-12bibliografisk kontrollert
    3. Power-Sleuth: A Tool for Investigating your Program's Power Behavior
    Åpne denne publikasjonen i ny fane eller vindu >>Power-Sleuth: A Tool for Investigating your Program's Power Behavior
    2012 (engelsk)Inngår i: International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'12), 2012, s. 241-250Konferansepaper, Publicerat paper (Fagfellevurdert)
    Abstract [en]

    Modern processors support aggressive power saving techniques to reduce energy consumption. However, traditional profiling techniques have mainly focused on performance, which does not accurately reflect the power behavior of applications. For example, the longest running function is not always the most energy-hungry function. Thus software developers cannot always take full advantage of these power-saving features.

    We present \powersleuth, a power/performance estimation tool which is able to provide a full description of an application's behavior for any frequency from a single profiling run. The tool combines three techniques: a power and a performance estimation model with a program phase detection technique to deliver accurate, per-phase, per-frequency analysis.

    Our evaluation (against real power measurements) shows that we can accurately predict power and performance across different frequencies with average errors of 3.5% and 3.9% respectively.

    HSV kategori
    Forskningsprogram
    Datavetenskap; Datorteknik
    Identifikatorer
    urn:nbn:se:uu:diva-180147 (URN)10.1109/MASCOTS.2012.36 (DOI)978-1-4673-2453-3 (ISBN)
    Konferanse
    International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), August 2012, Washington, DC, USA
    Prosjekter
    CoDeR-MPUPMARCVRERA
    Tilgjengelig fra: 2012-08-30 Laget: 2012-08-30 Sist oppdatert: 2018-01-12bibliografisk kontrollert
    4. Introducing DVFS-Management in a Full-System Simulator
    Åpne denne publikasjonen i ny fane eller vindu >>Introducing DVFS-Management in a Full-System Simulator
    Vise andre…
    2013 (engelsk)Inngår i: Proc. 21st International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, IEEE Computer Society, 2013Konferansepaper, Publicerat paper (Fagfellevurdert)
    Abstract [en]

    Dynamic Voltage and Frequency Scaling (DVFS) is an essential part of controlling the power consumption of any computer system, ranging from mobile phones to servers. DVFS efficiency relies on hardware-software co-optimization, thus using existing hardware cannot reveal the full optimization potential beyond the current implementation’s characteristics. To explore the vast design space for DVFS efficiency, that straddles software and hardware, a simulation infrastructure must provide features that are not readily available today, for example: software controllable clock and voltage domains, support for the OS and the frequency scaling module of it, and an online power estimation methodology. As the main contribution,this work enables DVFS studies in a full-system simulator. We extend the gem5 simulator to support full-system DVFS modeling. By doing so, we enable energy-efficiency experiments to be performed in gem5 and we showcase such studies. Finally, we show that both existing and novel frequency governors for Linux and Android can be effortlessly integrated in the framework, and we evaluate the efficiency of different DVFS schemes.

    sted, utgiver, år, opplag, sider
    IEEE Computer Society, 2013
    HSV kategori
    Identifikatorer
    urn:nbn:se:uu:diva-212809 (URN)
    Konferanse
    MASCOTS 2013
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2013-12-15 Laget: 2013-12-15 Sist oppdatert: 2016-09-02
    5. A unified DVFS-cache resizing framework
    Åpne denne publikasjonen i ny fane eller vindu >>A unified DVFS-cache resizing framework
    Vise andre…
    2016 (engelsk)Rapport (Annet vitenskapelig)
    Serie
    Technical report / Department of Information Technology, Uppsala University, ISSN 1404-3203 ; 2016-014
    HSV kategori
    Identifikatorer
    urn:nbn:se:uu:diva-300840 (URN)
    Tilgjengelig fra: 2016-08-15 Laget: 2016-08-15 Sist oppdatert: 2018-01-10bibliografisk kontrollert
    Fulltekst (pdf)
    fulltext
    Download (jpg)
    preview image
  • 8.
    Spiliopoulos, Vasileios
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Bagdia, Akash
    Hansson, Andreas
    Aldworth, Peter
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Introducing DVFS-Management in a Full-System Simulator2013Inngår i: Proc. 21st International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, IEEE Computer Society, 2013Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Dynamic Voltage and Frequency Scaling (DVFS) is an essential part of controlling the power consumption of any computer system, ranging from mobile phones to servers. DVFS efficiency relies on hardware-software co-optimization, thus using existing hardware cannot reveal the full optimization potential beyond the current implementation’s characteristics. To explore the vast design space for DVFS efficiency, that straddles software and hardware, a simulation infrastructure must provide features that are not readily available today, for example: software controllable clock and voltage domains, support for the OS and the frequency scaling module of it, and an online power estimation methodology. As the main contribution,this work enables DVFS studies in a full-system simulator. We extend the gem5 simulator to support full-system DVFS modeling. By doing so, we enable energy-efficiency experiments to be performed in gem5 and we showcase such studies. Finally, we show that both existing and novel frequency governors for Linux and Android can be effortlessly integrated in the framework, and we evaluate the efficiency of different DVFS schemes.

  • 9.
    Spiliopoulos, Vasileios
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Keramidas, Georgios
    Green governors: A framework for continuously adaptive DVFS2011Inngår i: Proc. International Green Computing Conference and Workshops: IGCC 2011, Piscataway, NJ: IEEE , 2011, s. 1-8Konferansepaper (Fagfellevurdert)
  • 10.
    Spiliopoulos, Vasileios
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Keramidas, Georgios
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Efstathiou, Konstantinos
    Power-performance adaptation in Intel core i72011Inngår i: Proc. 2nd Workshop on Computer Architecture and Operating System co-design, Cambridge, MA: Computer Science and Artificial Intelligence Laboratory, MIT , 2011, s. 10-Konferansepaper (Annet vitenskapelig)
  • 11.
    Spiliopoulos, Vasileios
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Sembrant, Andreas
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Power-Sleuth: A Tool for Investigating your Program's Power Behavior2012Inngår i: International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'12), 2012, s. 241-250Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Modern processors support aggressive power saving techniques to reduce energy consumption. However, traditional profiling techniques have mainly focused on performance, which does not accurately reflect the power behavior of applications. For example, the longest running function is not always the most energy-hungry function. Thus software developers cannot always take full advantage of these power-saving features.

    We present \powersleuth, a power/performance estimation tool which is able to provide a full description of an application's behavior for any frequency from a single profiling run. The tool combines three techniques: a power and a performance estimation model with a program phase detection technique to deliver accurate, per-phase, per-frequency analysis.

    Our evaluation (against real power measurements) shows that we can accurately predict power and performance across different frequencies with average errors of 3.5% and 3.9% respectively.

  • 12.
    Spiliopoulos, Vasileios
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Sembrant, Andreas
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Keramidas, Georgios
    Hagersten, Erik
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    A unified DVFS-cache resizing framework2016Rapport (Annet vitenskapelig)
  • 13.
    Tran, Kim-Anh
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Carlson, Trevor E.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Koukos, Konstantinos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Själander, Magnus
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Jimborean, Alexandra
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Clairvoyance: Look-ahead compile-time scheduling2017Inngår i: Proc. 15th International Symposium on Code Generation and Optimization, Piscataway, NJ: IEEE Press, 2017, s. 171-184Konferansepaper (Fagfellevurdert)
    Fulltekst (pdf)
    fulltext
  • 14.
    Tran, Kim-Anh
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Carlson, Trevor E.
    Koukos, Konstantinos
    Själander, Magnus
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation. Norwegian Univ Sci & Technol, S-75236 Uppsala, Sweden.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Jimborean, Alexandra
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Static instruction scheduling for high performance on limited hardware2018Inngår i: IEEE Transactions on Computers, ISSN 0018-9340, Vol. 67, nr 4, s. 513-527Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    Complex out-of-order (OoO) processors have been designed to overcome the restrictions of outstanding long-latency misses at the cost of increased energy consumption. Simple, limited OoO processors are a compromise in terms of energy consumption and performance, as they have fewer hardware resources to tolerate the penalties of long-latency loads. In worst case, these loads may stall the processor entirely. We present Clairvoyance, a compiler based technique that generates code able to hide memory latency and better utilize simple OoO processors. By clustering loads found across basic block boundaries, Clairvoyance overlaps the outstanding latencies to increases memory-level parallelism. We show that these simple OoO processors, equipped with the appropriate compiler support, can effectively hide long-latency loads and achieve performance improvements for memory-bound applications. To this end, Clairvoyance tackles (i) statically unknown dependencies, (ii) insufficient independent instructions, and (iii) register pressure. Clairvoyance achieves a geomean execution time improvement of 14 percent for memory-bound applications, on top of standard O3 optimizations, while maintaining compute-bound applications' high-performance.

1 - 14 of 14
RefereraExporteraLink til resultatlisten
Permanent link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf