Uppsala University Publications
  • 301. Piskac, Ruzica
    et al.
    Rümmer, Philipp
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Verified Software. Theories, Tools, and Experiments: Revised Selected Papers. 2018. Conference proceedings (editor) (Refereed)
  • 302.
    Popov, Mihail
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Jimborean, Alexandra
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Black-Schaffer, David
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Efficient thread/page/parallelism autotuning for NUMA systems. 2019. In: International Conference on Supercomputing / [ed] ACM, New York, NY, USA: Association for Computing Machinery (ACM), 2019, p. 12. Conference paper (Refereed)
    Abstract [en]

    Current multi-socket systems have complex memory hierarchies with significant Non-Uniform Memory Access (NUMA) effects: memory performance depends on the location of the data and the thread. This complexity means that thread- and data-mappings have a significant impact on performance. However, it is hard to find efficient data mappings and thread configurations due to the complex interactions between applications and systems.

    In this paper we explore the combined search space of thread mappings, data mappings, number of NUMA nodes, and degree-of-parallelism, per application phase, and across multiple systems. We show that there are significant performance benefits from optimizing this wide range of parameters together. However, such an optimization presents two challenges: accurately modeling the performance impact of configurations across applications and systems, and exploring the vast space of configurations. To overcome the modeling challenge, we use native execution of small, representative codelets, which reproduce the system and application interactions. To make the search practical, we build a search space by combining a range of state-of-the-art thread- and data-mapping policies.

    Combining these two approaches results in a tractable search space that can be quickly and accurately evaluated without sacrificing significant performance. This search finds non-intuitive configurations that perform significantly better than those of previous work. With this approach we are able to achieve an average speedup of 1.97× on a four-node NUMA system.

  • 303. Porter, Leo
    et al.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Member spotlight part 2. 2017. In: ACM SIGCSE Bulletin, ISSN 0097-8418, Vol. 49, no 2, p. 11-14. Article in journal (Other (popular science, discussion, etc.))
  • 304.
    Pérez-Penichet, Carlos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Daglaridis, Georgios Theodoros
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Piumwardane, Dilushi
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Voigt, Thiemo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. RISE SICS, Sweden.
    Modelling Battery-free Communications for the Cooja Simulator. 2019. In: Proceedings of the 2019 International Conference on Embedded Wireless Systems and Networks, 2019, p. 47-58. Conference paper (Refereed)
    Abstract [en]

    Recent progress on backscatter communications enables devices that, assisted by an unmodulated carrier, receive and transmit standard wireless protocols such as IEEE 802.15.4 with sub-milliwatt power consumption. This paradigm, which we call carrier-assisted communications, enables battery-free devices due to its reduced power consumption. To develop at scale, and integrate seamlessly into networks of unmodified conventional nodes, we need novel protocols at the MAC layer and above that can coordinate the carrier generators with receivers and transmitters while maintaining energy and spectral efficiency. A highly effective tool to develop such protocols is a network simulator. We introduce models for the communication range, energy consumption and other characteristics of carrier-assisted links based on parameters gathered from real-world experiments. We implement the models in Cooja, a well-known simulator, creating the first carrier-assisted communications framework to simulate interoperable battery-free devices alongside conventional sensor nodes. We illustrate how such a tool can offer valuable insights in the development and evaluation of efficient protocols for carrier-assisted communications.

  • 305.
    Pérez-Penichet, Carlos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Piumwardane, Dilushi
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Rohner, Christian
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Voigt, Thiemo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    A Fast Carrier Scheduling Algorithm for Battery-free Sensor Tags in Commodity Wireless Networks. 2020. Conference paper (Refereed)
    Abstract [en]

    New battery-free sensor tags that interoperate with unmodified standard IoT devices and protocols can extend a sensor network’s capabilities in a scalable and cost-effective manner. The tags achieve battery-free operation through backscatter-related techniques, while the standard IoT devices avoid additional dedicated infrastructure by providing the unmodulated carrier that tags need to communicate. However, this approach requires coordination between devices transmitting, receiving and generating carrier, adds extra latency and energy consumption to already constrained devices, and increases interference and contention in the shared spectrum. We present a scheduling mechanism that optimizes the use of carrier generators, minimizing any disruptions to the regular nodes. We employ timeslots to coordinate the unmodulated carrier while minimizing latency, energy consumption and overhead radio emissions. We propose an efficient scheduling algorithm that parallelizes communications with battery-free tags when possible and shares carriers among multiple tags concurrently. In our evaluation we demonstrate the feasibility and reliability of our approach in testbed experiments. We find that we can significantly reduce the excess latency and energy consumption caused by the addition of sensor tags when compared to sequential interrogation. We show that the gains tend to improve with the network size and that our solution is close to optimal on average.

  • 306.
    Pérez-Penichet, Carlos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala Univ, Uppsala, Sweden.
    Voigt, Thiemo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. RISE SICS, Lulea, Sweden.
    Carrier Scheduling in IoT Networks with Interoperable Battery-free Backscatter Tags. 2019. In: IPSN '19: Proceedings of the 2019 International Conference on Information Processing in Sensor Networks, Association for Computing Machinery (ACM), 2019, p. 329-330. Conference paper (Refereed)
    Abstract [en]

    New battery-free backscatter tags that integrate with unmodified standard IoT devices can extend the latter's sensing capabilities in a scalable and cost-effective way. Existing IoT nodes can provide the unmodulated carrier needed by the new nodes, avoiding the need for additional infrastructure. This, however, puts extra energy demands on constrained IoT nodes while increasing interference and contention in the network. We use a slotted MAC protocol to guarantee synchronization between transmitters, receivers and carrier generators. We then express the slot allocation problem as a Constraint Optimization Problem (COP) that parallelizes interrogations of battery-free tags when they do not collide with each other and reuses carriers for multiple tags, with the aim of minimizing the total time and the number of carrier generators needed to interrogate a set of tags. In networks with sufficient battery-free nodes we obtain a 25% reduction in the number of necessary carriers and a 50% decrease in interrogation time in most cases, leading to significant energy savings, reduced collisions and improved latency.

  • 307.
    Rezine, Othmane
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Verification of networks of communicating processes: Reachability problems and decidability issues. 2017. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Computer systems are used in almost all aspects of our lives and our dependency on them keeps on increasing. When computer systems are used to handle critical tasks, any software failure can cause severe human and/or material losses. Therefore, for such applications, it is important to detect software errors at an early stage of software development. Furthermore, the growing use of concurrent and distributed programs exponentially increases the complexity of computer systems, making the problem of detecting software errors even harder (if not impossible). This calls for defining systematic and efficient techniques to evaluate the safety and the correctness of programs. The aim of Model-Checking is to analyze automatically whether a given program satisfies its specification. Early applications of Model-Checking were restricted to systems whose behaviors can be captured by finite graphs, so-called finite-state systems. Since many computer systems cannot be modeled as finite-state machines, there has been a growing interest in extending the applicability of Model-Checking to infinite-state systems.

    The goal of this thesis is to extend the applicability of Model-Checking for three instances of infinite-state systems: Ad-Hoc Networks, Dynamic Register Automata and Multi Pushdown Systems. Each one of these instances models challenging types of networks of communicating processes. In both Ad-Hoc Networks and Dynamic Register Automata, communication is carried through message passing. In each type of network, a graph topology models the communication links between processes in the network. The graph topology is static in the case of Ad-Hoc Networks while it is dynamic in the case of Dynamic Register Automata. The number of processes in both types of networks is unbounded. Finally, we consider Multi Pushdown Systems, a model used to study the behaviors of concurrent programs composed of recursive sequential programs communicating through a shared memory.

    List of papers
    1. Budget-bounded model-checking pushdown systems
    2014 (English). In: Formal methods in system design, ISSN 0925-9856, E-ISSN 1572-8102, Vol. 45, no 2, p. 273-301. Article in journal (Refereed). Published
    Abstract [en]

    We address the verification problem for concurrent programs modeled as multi-pushdown systems (MPDS). In general, MPDS are Turing powerful and hence come along with undecidability of all basic decision problems. Because of this, several subclasses of MPDS have been proposed and studied in the literature (Atig et al. in LNCS, Springer, Berlin, 2005; La Torre et al. in LICS, IEEE, 2007; Lange and Lei in Inf Didact 8, 2009; Qadeer and Rehof in TACAS, LNCS, Springer, Berlin, 2005). In this paper, we propose the class of bounded-budget MPDS, which are restricted in the sense that each stack can perform an unbounded number of context switches only if its depth is below a given bound, and a bounded number of context switches otherwise. We show that the reachability problem for this subclass is Pspace-complete and that LTL-model-checking is Exptime-complete. Furthermore, we propose a code-to-code translation that inputs a concurrent program and produces a sequential program such that running the concurrent program under the budget-bounded restriction yields the same set of reachable states as running the sequential program. Moreover, detecting (fair) non-terminating executions in the concurrent program can be reduced to LTL-Model-Checking of the sequential program. By leveraging standard sequential analysis tools, we have implemented a prototype tool and applied it on a set of benchmarks, showing the feasibility of our translation.

    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-234422 (URN); 10.1007/s10703-014-0207-y (DOI); 000343210700007 ()
    Projects
    UPMARC; Concurrent recursive programs
    Funder
    Swedish Research Council
    Available from: 2014-04-25 Created: 2014-10-17 Last updated: 2018-01-11
    2. Verification of Directed Acyclic Ad Hoc Networks
    2013 (English). In: Formal Techniques for Distributed Systems: FORTE 2013, Springer Berlin/Heidelberg, 2013, p. 193-208. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2013
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743 ; 7892
    National Category
    Computer Systems
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-211409 (URN); 10.1007/978-3-642-38592-6_14 (DOI); 978-3-642-38591-9 (ISBN)
    Conference
    Formal Techniques for Distributed Systems (FORTE 2013), June 3-5, 2013, Florence, Italy
    Projects
    ProFuN
    Funder
    Swedish Foundation for Strategic Research
    Available from: 2013-11-27 Created: 2013-11-22 Last updated: 2017-11-27. Bibliographically approved
    3. Verification of Dynamic Register Automata
    2014 (English). In: Leibniz International Proceedings in Informatics: IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2014), 2014. Conference paper, Published paper (Refereed)
    Abstract [en]

    We consider the verification problem for Dynamic Register Automata (Dra). Dra extend classical register automata by process creation. In this setting, each process is equipped with a finite number of registers in which the process IDs of other processes can be stored. A process can communicate with processes whose IDs are stored in its registers and can send them the content of its registers. The state reachability problem asks whether a Dra reaches a configuration where at least one process is in an error state. We first show that this problem is in general undecidable. This result holds even when we restrict the analysis to configurations where the maximal length of the simple paths in their underlying (un)directed communication graphs is bounded by some constant. Then we introduce the model of degenerative Dra, which allows non-deterministic reset of the registers. We prove that for every given Dra, its corresponding degenerative one has the same set of reachable states. While the state reachability of a degenerative Dra remains undecidable, we show that the problem becomes decidable with nonprimitive-recursive complexity when we restrict the analysis to strongly bounded configurations, i.e. configurations whose underlying undirected graphs have bounded simple paths. Finally, we consider the class of strongly safe Dra, where all the reachable configurations are assumed to be strongly bounded. We show that for strongly safe Dra, the state reachability problem becomes decidable.

    Keywords
    Register Automata, State Reachability, Formal Verification
    National Category
    Computer Sciences
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-237854 (URN)
    Conference
    IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, New Delhi, India, December 15–17 2014.
    Projects
    ProFuN; UPMARC
    Available from: 2014-12-05 Created: 2014-12-05 Last updated: 2018-01-11. Bibliographically approved
    4. Verification of buffered dynamic register automata
    2015 (English). In: Networked Systems: NETYS 2015, Springer, 2015, p. 15-31. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Springer, 2015
    Series
    Lecture Notes in Computer Science ; 9466
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-247828 (URN); 10.1007/978-3-319-26850-7_2 (DOI); 978-3-319-26849-1 (ISBN)
    Conference
    NETYS 2015, May 13–15, Agadir, Morocco
    Projects
    ProFuN; UPMARC
    Funder
    Swedish Foundation for Strategic Research, RIT08-0065
    Available from: 2016-03-23 Created: 2015-03-24 Last updated: 2018-01-11. Bibliographically approved
  • 308.
    Ros, Alberto
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    The Superfluous Load Queue. 2018. In: 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), IEEE, 2018, p. 95-107. Conference paper (Refereed)
    Abstract [en]

    In an out-of-order core, the load queue (LQ), the store queue (SQ), and the store buffer (SB) are responsible for ensuring: i) correct forwarding of stores to loads and ii) correct ordering among loads (with respect to external stores). The first requirement safeguards the sequential semantics of program execution and applies to both serial and parallel code; the second requirement safeguards the semantics of coherence and consistency (e.g., TSO). In particular, loads search the SQ/SB for the latest value that may have been produced by a store, and stores and invalidations search the LQ to find speculative loads in case they violate uniprocessor or multiprocessor ordering. To meet timing constraints, the LQ and SQ/SB system is composed of CAM structures that are frequently searched. This results in high complexity, cost, and significant difficulty in scaling, but is the current state of the art. Prior research demonstrated the feasibility of a non-associative LQ by replaying loads at commit. There is a steep cost, however: a significant increase in L1 accesses and contention for L1 ports. This is because prior work assumes Sequential Consistency and completely ignores the existence of an SB in the system. In contrast, we intentionally delay stores in the SB to achieve a total management of stores and loads in a core, while still supporting TSO. Our main result is that we eliminate the LQ without burdening the L1 with extra accesses. Store forwarding is achieved by delaying our own stores until speculatively issued loads are validated on commit, entirely in-core; TSO load -> load ordering is preserved by delaying remote external stores in their SB until our own speculative reordered loads commit. While the latter is inspired by recent work on non-speculative load reordering, our contribution here is to show that this can be accomplished without having a load queue. Eliminating the LQ results in both energy savings and performance improvement from the elimination of LQ-induced stalls.

  • 309. Ros, Alberto
    et al.
    Leonardsson, Carl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Sakalis, Christos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Efficient Self-Invalidation/Self-Downgrade for Critical Sections with Relaxed Semantics. 2017. In: IEEE Transactions on Parallel and Distributed Systems, ISSN 1045-9219, E-ISSN 1558-2183, Vol. 28, no 12, p. 3413-3425. Article in journal (Refereed)
  • 310. Ros, Alberto
    et al.
    Leonardsson, Carl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Sakalis, Christos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Efficient Self-Invalidation/Self-Downgrade for Critical Sections with Relaxed Semantics. 2016. In: Proc. International Conference on Parallel Architectures and Compilation: PACT 2016, New York: ACM Press, 2016, p. 433-434. Conference paper (Refereed)
  • 311.
    Rümmer, Philipp
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    JayHorn: A Java Model Checker. 2019. In: Proceedings of the 21st Workshop on Formal Techniques for Java-like Programs (FTfJP 2019), Association for Computing Machinery, 2019, article id 1. Conference paper (Refereed)
    Abstract [en]

    This talk will give an overview of the JayHorn verification tool, a model checker for sequential Java programs annotated with assertions expressing safety conditions. JayHorn is fully automatic and based to a large degree on standard infrastructure for compilation and verification: it uses the Soot library as front-end to read Java bytecode and translate it to the Jimple three-address format, and the state-of-the-art Horn solvers SPACER and Eldarica as back-ends that infer loop invariants, object and class invariants, and method contracts. Since JayHorn uses an invariant-based representation of heap data-structures, it is particularly useful for analysing programs with unbounded data-structures and unbounded run-time, while at the same time avoiding the use of logical theories, like the theory of arrays, often considered hard for Horn solvers. The development of JayHorn is ongoing, and the talk will also cover some of the future features of JayHorn, in particular the handling of strings.

  • 312.
    Rümmer, Philipp
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hojjat, Hossein
    Kuncak, Viktor
    On recursion-free Horn clauses and Craig interpolation. 2015. In: Formal methods in system design, ISSN 0925-9856, E-ISSN 1572-8102, Vol. 47, no 1, p. 1-25. Article in journal (Refereed)
  • 313.
    Rümmer, Philipp
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Characterization of simulation by probabilistic testing. 2016. In: Theory and Practice of Formal Methods, Springer, 2016, p. 360-372. Chapter in book (Refereed)
  • 314.
    Sakalis, Christos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Securing the Memory Hierarchy from Speculative Side-Channel Attack. 2020. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Modern high-performance CPUs depend on speculative out-of-order execution in order to offer high performance while also remaining energy efficient. However, with the introduction of Meltdown and Spectre in the beginning of 2018, speculative execution has been under attack. These exploits, and the many that followed, take advantage of the unchecked nature of speculative execution and the microarchitectural changes it causes in order to mount speculative side-channel attacks. Such attacks can bypass software and hardware barriers and gain access to sensitive information while remaining invisible to the application. In this thesis we will describe our work on preventing speculative side-channel attacks that exploit the memory hierarchy as their side-channel. Specifically, we will discuss two different approaches, one where we do not restrict speculative execution but try to keep its microarchitectural side-effects hidden, and one where we delay speculative memory accesses if we determine that they might lead to information leakage. We will discuss the advantages and disadvantages of both approaches, compare them against other state-of-the-art solutions, and show that it is possible to achieve secure, invisible speculation while at the same time maintaining high performance and efficiency.

    List of papers
    1. Ghost Loads: What is the cost of invisible speculation?
    2019 (English). In: Proceedings of the 16th ACM International Conference on Computing Frontiers, New York: ACM Press, 2019, p. 153-163. Conference paper, Published paper (Refereed)
    Abstract [en]

    Speculative execution is necessary for achieving high performance on modern general-purpose CPUs but, starting with Spectre and Meltdown, it has also been proven to cause severe security flaws. In case of a misspeculation, the architectural state is restored to ensure functional correctness, but a multitude of microarchitectural changes (e.g., cache updates), caused by the speculatively executed instructions, are commonly left in the system. These changes can be used to leak sensitive information, which has led to a frantic search for solutions that can eliminate such security flaws. The contribution of this work is an evaluation of the cost of hiding speculative side-effects in the cache hierarchy, making them visible only after the speculation has been resolved. For this, we compare (for the first time) two broad approaches: i) waiting for loads to become non-speculative before issuing them to the memory system, and ii) eliminating the side-effects of speculation, a solution consisting of invisible loads (Ghost loads) and performance optimizations (Ghost Buffer and Materialization). While previous work, InvisiSpec, has proposed a similar solution to our latter approach, it has done so with only a minimal evaluation and at a significant performance cost. The detailed evaluation of our solutions shows that: i) waiting for loads to become non-speculative is no more costly than the previously proposed InvisiSpec solution, albeit much simpler, non-invasive in the memory system, and stronger security-wise; ii) hiding speculation with Ghost loads (in the context of a relaxed memory model) can be achieved at the cost of 12% performance degradation and 9% energy increase, which is significantly better than the previous state-of-the-art solution.

    Place, publisher, year, edition, pages
    New York: ACM Press, 2019
    Keywords
    speculation, security, side-channel attacks, caches
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-383173 (URN); 10.1145/3310273.3321558 (DOI); 000474686400019 (); 978-1-4503-6685-4 (ISBN)
    Conference
    CF 2019, April 30 – May 2, Alghero, Sardinia, Italy
    Funder
    Swedish Research Council, 2015-05159; Swedish National Infrastructure for Computing (SNIC)
    Available from: 2019-05-10 Created: 2019-05-10 Last updated: 2020-01-30. Bibliographically approved
    2. Efficient invisible speculative execution through selective delay and value prediction
    2019 (English) In: Proc. 46th International Symposium on Computer Architecture, New York: ACM Press, 2019, p. 723-735. Conference paper, Published paper (Refereed)
    Abstract [en]

    Speculative execution, the base on which modern high-performance general-purpose CPUs are built, has recently been shown to enable a slew of security attacks. All these attacks are centered around a common set of behaviors: During speculative execution, the architectural state of the system is kept unmodified, until the speculation can be verified. In the event that a misspeculation occurs, anything that can affect the architectural state is reverted (squashed) and re-executed correctly. However, the same is not true for the microarchitectural state. Normally invisible to the user, changes to the microarchitectural state can be observed through various side-channels, with timing differences caused by the memory hierarchy being one of the most common and easy to exploit. The speculative side-channels can then be exploited to perform attacks that can bypass software and hardware checks in order to leak information. These attacks, out of which the most infamous are perhaps Spectre and Meltdown, have led to a frantic search for solutions. In this work, we present our own solution for reducing the microarchitectural state-changes caused by speculative execution in the memory hierarchy. It is based on the observation that if we only allow accesses that hit in the L1 data cache to proceed, then we can easily hide any microarchitectural changes until after the speculation has been verified. At the same time, we propose to prevent stalls by value predicting the loads that miss in the L1. Value prediction, though speculative, constitutes an invisible form of speculation, not seen outside the core. We evaluate our solution and show that we can prevent observable microarchitectural changes in the memory hierarchy while keeping the performance and energy costs at 11% and 7%, respectively. In comparison, the current state of the art solution, InvisiSpec, incurs a 46% performance loss and a 51% energy increase.
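The policy summarized in this abstract (let speculative L1 hits proceed, value-predict L1 misses, and touch the memory hierarchy only once the load is non-speculative) can be illustrated with a toy model. This is a minimal sketch and not the authors' implementation: the names `speculative_load`, `resolve`, and `LoadOutcome` are hypothetical, the cache and memory are plain dictionaries, and the value predictor is keyed by address purely for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class LoadOutcome:
    action: str            # "hit", "predicted", or "delayed"
    value: Optional[int]   # value forwarded to dependents, if any


def speculative_load(addr: int, l1: Dict[int, int],
                     predictor: Dict[int, int]) -> LoadOutcome:
    """Handle a speculative load without changing any cache contents."""
    if addr in l1:
        # L1 hit: the access may proceed while still speculative.
        return LoadOutcome("hit", l1[addr])
    guess = predictor.get(addr)
    if guess is not None:
        # L1 miss: forward a predicted value instead of fetching the line.
        return LoadOutcome("predicted", guess)
    # No prediction available (an assumption of this sketch): wait.
    return LoadOutcome("delayed", None)


def resolve(addr: int, outcome: LoadOutcome, memory: Dict[int, int],
            l1: Dict[int, int]) -> Tuple[int, bool]:
    """Once the load is non-speculative, do the real access and validate."""
    actual = memory[addr]
    l1[addr] = actual  # the cache fill happens only now
    must_squash = outcome.action == "predicted" and outcome.value != actual
    return actual, must_squash
```

In this model, a mispredicted value is detected in `resolve`, and the dependents are squashed and re-executed, much like any other misspeculation.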

    Place, publisher, year, edition, pages
    New York: ACM Press, 2019
    Keywords
    caches, side-channel attacks, speculative execution
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-387329 (URN)10.1145/3307650.3322216 (DOI)000521059600056 ()978-1-4503-6669-4 (ISBN)
    Conference
    ISCA 2019, June 22–26, Phoenix, AZ, USA
    Funder
    Swedish Research Council, 2015-05159; Swedish Foundation for Strategic Research, SM17-0064
    Note

    Available from: 2019-06-22 Created: 2019-06-21 Last updated: 2020-04-27 Bibliographically approved
    3. Understanding Selective Delay as a Method for Efficient Secure Speculative Execution
    (English) In: Article in journal (Refereed), Submitted
    Abstract [en]

    Since the introduction of Meltdown and Spectre, the academic and industry research communities have been tirelessly working on speculative side-channel attacks and on how to shield computer systems from them. To ensure that a system is protected not only from all the currently known attacks but also from future, yet to be discovered, attacks, the solutions developed need to be general in nature, covering a wide array of system components, while at the same time keeping the performance, energy, area, and implementation complexity costs at a minimum. One such solution is our own delay-on-miss, which efficiently protects the memory hierarchy by i) selectively delaying speculative load instructions and ii) utilizing value prediction as an invisible form of speculation. In this work we dive deeper into delay-on-miss, offering insights into why and how it affects the performance of the system. We also reevaluate value prediction as an invisible form of speculation. Specifically, we focus on the implications that delaying memory loads has on the memory-level parallelism of the system and how this affects the value predictor and the overall performance of the system. We present new, updated results but, more importantly, we also offer deeper insight into why delay-on-miss works so well and what this means for the future of secure speculative execution.

    National Category
    Computer Systems
    Identifiers
    urn:nbn:se:uu:diva-404312 (URN)
    Note

    Under submission

    Available from: 2020-02-17 Created: 2020-02-17 Last updated: 2020-02-17
  • 315.
    Sakalis, Christos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Jimborean, Alexandra
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Själander, Magnus
    Norwegian University of Science and Technology.
    Evaluating the Potential Applications of Quaternary Logic for Approximate Computing. 2019. In: ACM Journal on Emerging Technologies in Computing Systems (JETC), ISSN 1550-4832, Vol. 16, no 1, article id 5. Article in journal (Refereed)
    Abstract [en]

    There exist extensive ongoing research efforts on emerging atomic-scale technologies that have the potential to become an alternative to today’s complementary metal-oxide-semiconductor technologies. A common feature among the investigated technologies is that of multi-level devices, particularly the possibility of implementing quaternary logic gates and memory cells. However, for such multi-level devices to be used reliably, an increase in energy dissipation and operation time is required. Building on the principle of approximate computing, we present a set of combinational logic circuits and memory based on multi-level logic gates in which we can trade reliability against energy efficiency. Keeping the energy and timing constraints constant, important data are encoded in a more robust binary format while error-tolerant data are encoded in a quaternary format. We analyze the behavior of the logic circuits when exposed to transient errors caused as a side effect of this encoding. We also evaluate the potential benefit of the logic circuits and memory by embedding them in a conventional computer system on which we execute jpeg, sobel, and blackscholes approximately. We demonstrate that blackscholes is not suitable for such a system and explain why. However, we also achieve dynamic energy reductions of 10% and 13% for jpeg and sobel, respectively, and improve execution time by 38% for sobel, while maintaining adequate output quality.

  • 316.
    Sakalis, Christos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Ros, Alberto
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Jimborean, Alexandra
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Själander, Magnus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Understanding Selective Delay as a Method for Efficient Secure Speculative Execution. In: Article in journal (Refereed)
    Abstract [en]

    Since the introduction of Meltdown and Spectre, the academic and industry research communities have been tirelessly working on speculative side-channel attacks and on how to shield computer systems from them. To ensure that a system is protected not only from all the currently known attacks but also from future, yet to be discovered, attacks, the solutions developed need to be general in nature, covering a wide array of system components, while at the same time keeping the performance, energy, area, and implementation complexity costs at a minimum. One such solution is our own delay-on-miss, which efficiently protects the memory hierarchy by i) selectively delaying speculative load instructions and ii) utilizing value prediction as an invisible form of speculation. In this work we dive deeper into delay-on-miss, offering insights into why and how it affects the performance of the system. We also reevaluate value prediction as an invisible form of speculation. Specifically, we focus on the implications that delaying memory loads has on the memory-level parallelism of the system and how this affects the value predictor and the overall performance of the system. We present new, updated results but, more importantly, we also offer deeper insight into why delay-on-miss works so well and what this means for the future of secure speculative execution.

  • 317.
    Sakalis, Christos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Leonardsson, Carl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Ros, Alberto
    Splash-3: A properly synchronized benchmark suite for contemporary research. 2016. In: Proc. International Symposium on Performance Analysis of Systems and Software: ISPASS 2016, IEEE Computer Society, 2016, p. 101-111. Conference paper (Refereed)
    Abstract [en]

    Benchmarks are indispensable in evaluating the performance implications of new research ideas. However, their usefulness is compromised if they do not work correctly on a system under evaluation or, in general, if they cannot be used consistently to compare different systems. A well-known benchmark suite of parallel applications is the Splash-2 suite. Since its creation in the context of the DASH project, Splash-2 benchmarks have been widely used in research. However, Splash-2 was released over two decades ago and does not adhere to the recent C memory consistency model. This leads to unexpected and often incorrect behavior when some Splash-2 benchmarks are used in conjunction with contemporary compilers and hardware (simulated or real). Most importantly, we discovered critical performance bugs that may call into question some of the reported benchmark results. In this work, we analyze the Splash-2 benchmarks and expose data races and related performance bugs. We rectify the problematic benchmarks and evaluate the resulting performance. Our work contributes to the community a new sanitized version of the Splash-2 benchmarks, called the Splash-3 benchmark suite.

  • 318.
    Sathyamoorthy, Peramanathan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Ngai, Edith C.-H.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hu, Xiping
    Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China.; Chinese Univ Hong Kong, Shatin, Hong Kong, Peoples R China.
    Leung, Victor C. M.
    Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada.
    Profiling energy efficiency and data communications for mobile Internet of Things. 2017. In: Wireless Communications & Mobile Computing, ISSN 1530-8669, E-ISSN 1530-8677, Vol. 17, article id 6562915. Article in journal (Refereed)
  • 319.
    Sathyamoorthy, Peramanathan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Ngai, Edith
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hu, Xiping
    Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V5Z 1M9, Canada.
    Leung, Victor
    Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V5Z 1M9, Canada.
    Energy Efficiency as an Orchestration Service for Mobile Internet of Things. 2015. Conference paper (Refereed)
    Abstract [en]

    This paper proposes a novel power management solution for resource-constrained devices in the context of Internet of Things (IoT). We focus on smartphones in the IoT, as they are getting increasingly popular and equipped with strong sensing capabilities. Smartphones have complex and asynchronous power consumption incurred by heterogeneous components including their on-board sensors. Their interaction with the cloud allows them to offload computation tasks and access remote data storage. In this work, we aim at monitoring the power consumption behaviours of the smartphones, profiling both individual applications and the system as a whole, to make better decisions in power management. We design a cloud orchestration architecture as an epic predictor of behaviours of smart devices by extracting their application characteristics and resource utilization. We design and implement this architecture to perform energy profiling and data analysis on massive data logs. This cloud orchestration architecture coordinates a number of cloud-based services and supports dynamic workflows between service components, which can reduce energy consumption in the energy profiling process itself. Experimental results showed that a small portion of applications dominates the energy consumption of smartphones. Heuristic profiling can effectively reduce energy consumption in data logging and communications without sacrificing the accuracy of power monitoring.

  • 320.
    Schwartz-Narbonne, Daniel
    et al.
    NYU, New York, NY USA..
    Schaef, Martin
    SRI Int, Menlo Pk, CA 94025 USA..
    Jovanovic, Dejan
    SRI Int, Menlo Pk, CA 94025 USA..
    Rümmer, Philipp
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Wies, Thomas
    NYU, New York, NY USA..
    Conflict-Directed Graph Coverage. 2015. In: NASA FORMAL METHODS (NFM 2015), 2015, p. 327-342. Conference paper (Refereed)
    Abstract [en]

    Many formal method tools for increasing software reliability apply Satisfiability Modulo Theories (SMT) solvers to enumerate feasible paths in a program subject to certain coverage criteria. Examples include inconsistent code detection tools and concolic test case generators. These tools have in common that they typically treat the SMT solver as a black box, relying on its ability to efficiently search through large search spaces. However, in practice the performance of SMT solvers often degrades significantly if the search involves reasoning about complex control-flow. In this paper, we open the black box and devise a new algorithm for this problem domain that we call conflict-directed graph coverage. Our algorithm relies on two core components of an SMT solver, namely conflict-directed learning and deduction by propagation, and applies domain-specific modifications for reasoning about control-flow graphs. We implemented conflict-directed coverage and used it for detecting code inconsistencies in several large Java open-source projects with over one million lines of code in total. The new algorithm yields significant performance gains on average compared to previous algorithms and reduces the running times on hard search instances from hours to seconds.

  • 321.
    Sembrant, Andreas
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hiding and Reducing Memory Latency: Energy-Efficient Pipeline and Memory System Techniques. 2016. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Memory accesses in modern processors are both far slower and vastly more energy-expensive than the actual computations. To improve performance, processors spend a significant amount of energy and resources trying to hide and reduce the memory latency. To hide the latency, processors use out-of-order execution to overlap memory accesses with independent work and aggressive speculative instruction scheduling to execute dependent instructions back-to-back. To reduce the latency, processors use several levels of caching that keep frequently used data closer to the processor. However, these optimizations are not free. Out-of-order execution requires expensive processor resources, speculative scheduling must re-execute instructions on incorrect speculations, and multi-level caching requires extra energy and latency to search the cache hierarchy. This thesis investigates several energy-efficient techniques for: 1) hiding the latency in the processor pipeline, and 2) reducing the latency in the memory hierarchy.

    Many of the inefficiencies of hiding latency in the processor come from two sources. First, processors need several large and expensive structures to do out-of-order execution (instruction queue, register file, etc.). These resources are typically allocated in program order, effectively giving all instructions equal priority. To reduce the size of these expensive resources without hurting performance, we propose Long Term Parking (LTP). LTP parks non-critical instructions before they allocate resources, thereby making room for critical memory accessing instructions to continue and expose more memory-level parallelism. This enables us to save energy by shrinking the resource sizes without hurting performance. Second, when a load's data returns, the load's dependent instructions need to be scheduled and executed. To execute the dependent instructions back-to-back, the processor will speculatively schedule instructions before the processor knows if the input data will be available at execution time. To save energy, we investigate different scheduling techniques that reduce the number of re-executions due to misspeculation.

    The inefficiencies of traditional memory hierarchies come from the need to do level-by-level searches to locate data. The search starts at the L1 cache, then proceeds level by level until the data is found, or determined not to be in any cache, at which point the processor has to fetch the data from main memory. This wastes time and energy for every level that is searched. To reduce the latency, we propose tracking the location of the data directly in a separate metadata hierarchy. This allows us to directly access the data without needing to search. The processor simply queries the metadata hierarchy for the location information about where the data is stored. Separating metadata into its own hierarchy brings a wide range of additional benefits, including flexibility in how we place data storages in the hierarchy, the ability to intelligently store data in the hierarchy, direct access to remote cores, and many other data-oriented optimizations that can leverage our precise knowledge of where data are located.

    List of papers
    1. Long Term Parking (LTP): Criticality-aware Resource Allocation in OOO Processors
    2015 (English) In: Proc. 48th International Symposium on Microarchitecture, 2015. Conference paper, Published paper (Refereed)
    Abstract [en]

    Modern processors employ large structures (IQ, LSQ, register file, etc.) to expose instruction-level parallelism (ILP) and memory-level parallelism (MLP). These resources are typically allocated to instructions in program order. This wastes resources by allocating resources to instructions that are not yet ready to be executed and by eagerly allocating resources to instructions that are not part of the application’s critical path.

    This work explores the possibility of allocating pipeline resources only when needed to expose MLP, and thereby enabling a processor design with significantly smaller structures, without sacrificing performance. First we identify the classes of instructions that should not reserve resources in program order and evaluate the potential performance gains we could achieve by delaying their allocations. We then use this information to “park” such instructions in a simpler, and therefore more efficient, Long Term Parking (LTP) structure. The LTP stores instructions until they are ready to execute, without allocating pipeline resources, and thereby keeps the pipeline available for instructions that can generate further MLP.

    LTP can accurately and rapidly identify which instructions to park, park them before they execute, wake them when needed to preserve performance, and do so using a simple queue instead of a complex IQ. We show that even a very simple queue-based LTP design allows us to significantly reduce IQ (64→32) and register file (128→96) sizes while retaining MLP performance and improving energy efficiency.

    National Category
    Computer Engineering
    Identifiers
    urn:nbn:se:uu:diva-272468 (URN)
    Conference
    MICRO 2015, December 5–9, Waikiki, HI
    Projects
    UPMARC; UART
    Available from: 2016-01-14 Created: 2016-01-14 Last updated: 2018-01-10
    2. Cost-effective speculative scheduling in high performance processors
    2015 (English) In: Proc. 42nd International Symposium on Computer Architecture, New York: ACM Press, 2015, p. 247-259. Conference paper, Published paper (Refereed)
    Abstract [en]

    To maximize performance, out-of-order execution processors sometimes issue instructions without having the guarantee that operands will be available in time; e.g. loads are typically assumed to hit in the L1 cache and dependent instructions are issued accordingly. This form of speculation - that we refer to as speculative scheduling - has been used for two decades in real processors, but has received little attention from the research community. In particular, as pipeline depth grows, and the distance between the Issue and the Execute stages increases, it becomes critical to issue instructions dependent on variable-latency instructions as soon as possible rather than wait for the actual cycle at which the result becomes available. Unfortunately, due to the uncertain nature of speculative scheduling, the scheduler may wrongly issue an instruction that will not have its source(s) available on the bypass network when it reaches the Execute stage. In that event, the instruction is canceled and replayed, potentially impairing performance and increasing energy consumption. In this work, we do not present a new replay mechanism. Rather, we focus on ways to reduce the number of replays that are agnostic of the replay scheme. First, we propose an easily implementable, low-cost solution to reduce the number of replays caused by L1 bank conflicts. Schedule shifting always assumes that, given a dual-load issue capacity, the second load issued in a given cycle will be delayed because of a bank conflict. Its dependents are thus always issued with the corresponding delay. Second, we also improve on existing L1 hit/miss prediction schemes by taking into account instruction criticality. That is, for some criterion of criticality and for loads whose hit/miss behavior is hard to predict, we show that it is more cost-effective to stall dependents if the load is not predicted critical.
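The schedule-shifting idea in this abstract can be made concrete with a small timing calculation. The sketch below is only illustrative: the function name `dependent_wakeup_cycle`, the 4-cycle L1 hit latency, and the 1-cycle bank-conflict delay are assumptions chosen for the example, not values taken from the paper.

```python
def dependent_wakeup_cycle(issue_cycle: int, load_slot: int,
                           l1_hit_latency: int = 4,
                           bank_conflict_delay: int = 1) -> int:
    """Cycle at which dependents of a load are scheduled to execute.

    load_slot is 0 for the first load issued in a cycle and 1 for the
    second. Under schedule shifting, the second load is always assumed
    to suffer a bank conflict, so its dependents are always issued with
    the extra delay instead of being replayed when a conflict occurs.
    """
    if load_slot not in (0, 1):
        raise ValueError("dual-load issue: load_slot must be 0 or 1")
    shift = bank_conflict_delay if load_slot == 1 else 0
    return issue_cycle + l1_hit_latency + shift


# Two loads issued together in cycle 10 (latencies are assumed values).
first = dependent_wakeup_cycle(10, load_slot=0)
second = dependent_wakeup_cycle(10, load_slot=1)
```

The trade-off is a fixed small delay for dependents of the second load in exchange for never replaying them because of a bank conflict.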

    Place, publisher, year, edition, pages
    New York: ACM Press, 2015
    National Category
    Computer Systems
    Identifiers
    urn:nbn:se:uu:diva-272467 (URN); 10.1145/2749469.2749470 (DOI); 000380455700020 (); 9781450334020 (ISBN)
    Conference
    ISCA 2015, June 13–17, Portland, OR
    Projects
    UPMARC; UART
    Available from: 2015-06-13 Created: 2016-01-14 Last updated: 2016-12-05 Bibliographically approved
    3. TLC: A tag-less cache for reducing dynamic first level cache energy
    2013 (English) In: Proceedings of the 46th International Symposium on Microarchitecture, New York: ACM Press, 2013, p. 49-61. Conference paper, Published paper (Refereed)
    Abstract [en]

    First level caches are performance-critical and are therefore optimized for speed. To do so, modern processors reduce the miss ratio by using set-associative caches and optimize latency by reading all ways in parallel with the TLB and tag lookup. However, this wastes energy since only data from one way is actually used.

    To reduce energy, phased-caches and way-prediction techniques have been proposed wherein only data of the matching/predicted way is read. These optimizations increase latency and complexity, making them less attractive for first level caches.

    Instead of adding new functionality on top of a traditional cache, we propose a new cache design that adds way index information to the TLB. This allows us to: 1) eliminate extra data array reads (by reading the right way directly), 2) avoid tag comparisons (by eliminating the tag array), 3) filter out misses (by checking the TLB), and 4) amortize the TLB lookup energy (by integrating it with the way information). In addition, the new cache can directly replace existing caches without any modification to the processor core or software.

    This new Tag-Less Cache (TLC) reduces the dynamic energy for a 32 kB, 8-way cache by 60% compared to a VIPT cache without affecting performance.

    Place, publisher, year, edition, pages
    New York: ACM Press, 2013
    National Category
    Computer Engineering Computer Systems
    Identifiers
    urn:nbn:se:uu:diva-213236 (URN); 10.1145/2540708.2540714 (DOI); 978-1-4503-2638-4 (ISBN)
    Conference
    MICRO-46; December 7-11, 2013; Davis, CA, USA
    Projects
    UPMARC; CoDeR-MP
    Available from: 2013-12-07 Created: 2013-12-19 Last updated: 2018-01-11 Bibliographically approved
    4. The Direct-to-Data (D2D) Cache: Navigating the cache hierarchy with a single lookup
    2014 (English) In: Proc. 41st International Symposium on Computer Architecture, Piscataway, NJ: IEEE Press, 2014, p. 133-144. Conference paper, Published paper (Refereed)
    Abstract [en]

    Modern processors optimize for cache energy and performance by employing multiple levels of caching that address bandwidth, low-latency and high-capacity. A request typically traverses the cache hierarchy, level by level, until the data is found, thereby wasting time and energy in each level. In this paper, we present the Direct-to-Data (D2D) cache that locates data across the entire cache hierarchy with a single lookup.

    To navigate the cache hierarchy, D2D extends the TLB with per cache-line location information that indicates in which cache and way the cache line is located. This allows the D2D cache to: 1) skip levels in the hierarchy (by accessing the right cache level directly), 2) eliminate extra data array reads (by reading the right way directly), 3) avoid tag comparisons (by eliminating the tag arrays), and 4) go directly to DRAM on cache misses (by checking the TLB). This reduces the L2 latency by 40% and saves 5-17% of the total cache hierarchy energy.

    D2D's lower L2 latency directly improves L2 sensitive applications' performance by 5-14%. More significantly, we can take advantage of the L2 latency reduction to optimize other parts of the microarchitecture. For example, we can reduce the ROB size for the L2 bound applications by 25%, or we can reduce the L1 cache size, delivering an overall 21% energy savings across all benchmarks, without hurting performance.
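To make the contrast in this abstract concrete, the following toy model compares a conventional level-by-level search with a single lookup in a location table. It is a rough sketch under simplifying assumptions: caches are dictionaries, the per-cache-line location information that D2D keeps in the extended TLB is modeled as a plain `location` dictionary, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def level_by_level(addr: int,
                   caches: List[Dict[int, str]]) -> Tuple[Optional[str], int]:
    """Conventional search: probe L1, then L2, ... until the data is found.

    Returns the data (or None on a miss in every level) and the number
    of cache levels that were probed.
    """
    probes = 0
    for cache in caches:
        probes += 1
        if addr in cache:
            return cache[addr], probes
    return None, probes  # every level was searched before going to memory


def direct_lookup(addr: int, location: Dict[int, int],
                  caches: List[Dict[int, str]]) -> Tuple[Optional[str], int]:
    """Single lookup: consult location metadata, then access one level.

    location maps an address to the index of the cache level holding it.
    An address absent from location is known to be a miss, so no cache
    level is probed at all.
    """
    level = location.get(addr)
    if level is None:
        return None, 0  # go directly to memory
    return caches[level][addr], 1


caches = [{1: "a"}, {2: "b"}, {3: "c"}]   # L1, L2, L3
location = {1: 0, 2: 1, 3: 2}             # must be kept consistent with caches
```

In this model, data in the last level costs three probes with the conventional search but one access with the location table, and a miss costs three probes versus none.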

    Place, publisher, year, edition, pages
    Piscataway, NJ: IEEE Press, 2014
    National Category
    Computer Engineering Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-235362 (URN); 10.1145/2678373.2665694 (DOI); 000343652800012 (); 978-1-4799-4394-4 (ISBN)
    Conference
    ISCA 2014, June 14–18, Minneapolis, MN
    Projects
    UPMARC; CoDeR-MP
    Available from: 2014-06-14 Created: 2014-10-31 Last updated: 2018-01-11 Bibliographically approved
    5. A split cache hierarchy for enabling data-oriented optimizations
    2017 (English) In: Proc. 23rd International Symposium on High Performance Computer Architecture, IEEE Computer Society, 2017, p. 133-144. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE Computer Society, 2017
    National Category
    Computer Engineering
    Identifiers
    urn:nbn:se:uu:diva-306368 (URN); 10.1109/HPCA.2017.25 (DOI); 000403330300012 (); 978-1-5090-4985-1 (ISBN)
    Conference
    HPCA 2017, February 4–8, Austin, TX
    Projects
    UPMARC
    Available from: 2017-05-08 Created: 2016-10-27 Last updated: 2019-03-08
  • 322.
    Sembrant, Andreas
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Carlson, Trevor E.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Black-Schaffer, David
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    POSTER: Putting the G back into GPU/CPU Systems Research. 2017. In: 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2017, p. 130-131. Conference paper (Refereed)
    Abstract [en]

    Modern SoCs contain several CPU cores and many GPU cores to execute both general-purpose and highly parallel graphics workloads. In many SoCs, more area is dedicated to graphics than to general-purpose compute. Despite this, the micro-architecture research community primarily focuses on GPGPU and CPU-only research, and not on graphics (the primary workload for many SoCs). The main reason for this is the lack of efficient tools and simulators for modern graphics applications. This work focuses on the GPU's memory traffic generated by graphics. We describe a new graphics tracing framework and use it to study both graphics applications' memory behavior and how CPUs and GPUs affect system performance. Our results show that graphics applications exhibit a wide range of memory behavior between applications and across time, and slow down co-running SPEC applications by 59% on average.

  • 323.
    Shrestha, Amendra
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems.
    Techniques for analyzing digital environments from a security perspective. 2019. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    The development of the Internet and social media has exploded in the last couple of years. Digital environments such as social media and discussion forums provide an effective method of communication and are used by various groups in our societies. For example, violent extremist groups use social media platforms for recruiting, training, and communicating with their followers, supporters, and donors. Analyzing social media is an important task for law enforcement agencies in order to detect activity and individuals that might pose a threat to the security of society.

    In this thesis, a set of different technologies that can be used to analyze digital environments from a security perspective is presented. Due to the nature of the problems that are studied, the research is interdisciplinary, and knowledge from terrorism research, psychology, and computer science is required. The research is divided into three different themes. Each theme summarizes the research that has been done in a specific area.

    The first theme focuses on analyzing digital environments and phenomena. The theme consists of three different studies. The first study is about the possibility of detecting propaganda from the Islamic State on Twitter. The second study focuses on identifying references to a narrative containing xenophobic and conspiratorial stereotypes in immigration-critical alternative media. In the third study, we define a set of linguistic features that we view as markers of radicalization.

    A group consists of a set of individuals, and in some cases, individuals might be a threat to the security of society. The second theme focuses on the risk assessment of individuals based on their written communication. We use different technologies, including machine learning, to explore the possibility of detecting potential lone offenders. Our risk assessment approach is implemented in the tool PRAT (Profile Risk Assessment Tool).

    Internet users can use different aliases when they communicate, since this offers a degree of anonymity. In the third theme, we present a set of techniques that can be used to identify users with multiple aliases. Our research focuses on solving two different problems: author identification and alias matching. The technologies that we use are based on the idea that each author has a fairly unique writing style and that we can construct a writeprint that represents the author. In a similar manner, we also use information about when a user communicates to create a timeprint. By combining the writeprint and the timeprint, we can obtain a set of powerful features that can be used to identify users with multiple aliases.

    To ensure that the technologies can be used in real scenarios, we have implemented and tested the techniques on data from social media. Several of the results are promising, but more studies are needed to determine how well they work in reality.

    List of papers
    1. A Machine Learning Approach Towards Detecting Extreme Adopters in Digital Communities
    2017 (English). In: 2017 28th International Workshop on Database and Expert Systems Applications (DEXA) / [ed] Tjoa, AM; Wagner, RR, IEEE, 2017, p. 1-5. Conference paper, Published paper (Other academic)
    Abstract [en]

    In this study we try to identify extreme adopters on a discussion forum using machine learning. An extreme adopter is a user who has adopted a high level of community-specific jargon and can therefore be seen as a user with a high degree of identification with the community. The dataset that we consider consists of a Swedish xenophobic discussion forum, where we use a machine learning approach to identify extreme adopters using a number of linguistic features that are independent of the dataset and the community. The results indicate that it is possible to separate these extreme adopters from the rest of the discussants on the discussion forum with more than 80% accuracy. Since the linguistic features that we use are highly domain independent, the results indicate that this kind of technique could be used to identify extreme adopters within other communities as well.

    Place, publisher, year, edition, pages
    IEEE, 2017
    Series
    International Workshop on Database and Expert Systems Applications-DEXA, ISSN 1529-4188
    Keywords
    Discussion forums, Support vector machines, Pragmatics, Manuals, Radio frequency, Electronic mail, Social network services
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-351187 (URN), 10.1109/DEXA.2017.17 (DOI), 000426078300001 (), 978-1-5386-1051-0 (ISBN)
    Conference
    28th International Workshop on Database and Expert Systems Applications (DEXA), Aug 28-31, 2017, Lyon3 Univ, Lyon, France
    Available from: 2018-05-23 Created: 2018-05-23 Last updated: 2019-03-22. Bibliographically approved
    2. Identifying warning behaviors of violent lone offenders in written communication
    2016 (English). In: Proc. 16th ICDM Workshops, IEEE Computer Society, 2016, p. 1053-1060. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE Computer Society, 2016
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-306943 (URN), 10.1109/ICDMW.2016.0152 (DOI), 978-1-5090-5910-2 (ISBN)
    Conference
    ICDM Workshop on Social Media and Risk, SOMERIS 2016, December 12, Barcelona, Spain
    Available from: 2017-02-02 Created: 2016-11-07 Last updated: 2019-03-22. Bibliographically approved
    3. Automatic detection of xenophobic narratives: A case study on Swedish alternative media
    2016 (English). In: Proc. 14th International Conference on Intelligence and Security Informatics, IEEE, 2016, p. 121-126. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE, 2016
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-306903 (URN), 10.1109/ISI.2016.7745454 (DOI), 000390129600021 (), 978-1-5090-3865-7 (ISBN)
    Conference
    ISI 2016, September 28–30, Tucson, AZ
    Available from: 2016-11-17 Created: 2016-11-04 Last updated: 2019-03-22. Bibliographically approved
    4. Linguistic analysis of lone offender manifestos
    2016 (English). In: Proc. 4th International Conference on Cybercrime and Computer Forensics, IEEE, 2016. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE, 2016
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-306941 (URN), 10.1109/ICCCF.2016.7740427 (DOI), 000390123800007 (), 978-1-5090-6096-2 (ISBN)
    Conference
    ICCCF 2016, June 12–14, Vancouver, Canada
    Available from: 2016-11-17 Created: 2016-11-07 Last updated: 2019-03-22. Bibliographically approved
    5. Detecting multipliers of jihadism on twitter
    2015 (English). In: Proc. 15th ICDM Workshops, IEEE Computer Society, 2015, p. 954-960. Conference paper, Published paper (Refereed)
    Abstract [en]

    Detecting terrorist-related content on social media is a problem for law enforcement agencies due to the large amount of information that is available. In this paper we describe a first step towards automatically classifying Twitter user accounts (tweeps) as supporters of jihadist groups who disseminate propaganda content online. We use a machine learning approach with two sets of features: data dependent features and data independent features. The data dependent features are heavily influenced by the specific dataset, while the data independent features are independent of the dataset and can be used on other datasets with similar results. By using this approach we hope that our method can be used as a baseline to classify violent extremist content from different kinds of sources, since data dependent features from various domains can be added.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2015
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-272243 (URN), 10.1109/ICDMW.2015.9 (DOI), 000380556700127 (), 9781467384926 (ISBN)
    Conference
    ICDM Workshop on Intelligence and Security Informatics, ISI-ICDM 2015, November 14, Atlantic City, NJ
    Available from: 2015-11-14 Created: 2016-01-12 Last updated: 2019-03-22. Bibliographically approved
    6. Detecting multiple aliases in social media
    2013 (English). In: Proc. 5th International Conference on Advances in Social Networks Analysis and Mining, New York: ACM Press, 2013, p. 1004-1011. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    New York: ACM Press, 2013
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-216568 (URN), 10.1145/2492517.2500261 (DOI), 978-1-4503-2240-9 (ISBN)
    Conference
    ASONAM 2013, August 25-29, Niagara Falls, Canada
    Funder
    Vinnova
    Available from: 2013-08-29 Created: 2014-01-23 Last updated: 2019-03-22. Bibliographically approved
    7. Timeprints for identifying social media users with multiple aliases
    2015 (English). In: Security Informatics, ISSN 2190-8532, Vol. 4, p. 7:1-11, article id 7. Article in journal (Refereed) Published
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-272242 (URN), 10.1186/s13388-015-0022-z (DOI)
    Available from: 2015-09-24 Created: 2016-01-12 Last updated: 2019-03-22. Bibliographically approved
    8. Multi-domain alias matching using machine learning
    2016 (English). In: Proc. 3rd European Network Intelligence Conference, IEEE, 2016, p. 77-84. Conference paper, Published paper (Refereed)
    Abstract [en]

    We describe a methodology for linking aliases belonging to the same individual based on a user's writing style (stylometric features extracted from the user generated content) and her time patterns (time-based features extracted from the publishing times of the user generated content). While most previous research on social media identity linkage relies on matching usernames, our methodology can also be used for users who actively try to choose dissimilar usernames when creating their aliases. In our experiments on a discussion forum dataset and a Twitter dataset, we evaluate the performance of three different classifiers. We use the best classifier (AdaBoost) to evaluate how well it works on different datasets using different features. Experiments show that combining stylometric and time-based features yields good results on our synthetic datasets, and a small-scale evaluation on real-world blog data confirms these results, yielding a precision over 95%. The use of emotion-related and Twitter-related features has no significant impact on the results.
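
    The time-based features described above can be illustrated with a minimal sketch. It assumes a timeprint is a normalized hour-of-day histogram of posting times and compares two aliases with an L1 distance; both choices, and the names `timeprint` and `timeprint_distance`, are assumptions made for this example, since the abstract does not specify the actual features or classifier inputs.

```python
from collections import Counter
from datetime import datetime


def timeprint(timestamps):
    """Return a 24-bin normalized histogram of posting hours.

    Hypothetical feature: bin h holds the fraction of posts made
    during hour h of the day.
    """
    total = len(timestamps)
    if total == 0:
        return [0.0] * 24
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(h, 0) / total for h in range(24)]


def timeprint_distance(a, b):
    """L1 distance between two timeprints (0 = identical, 2 = no overlap)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two aliases with invented posting times, purely for illustration.
alias_a = [datetime(2016, 1, 1, 9, 0), datetime(2016, 1, 2, 9, 30),
           datetime(2016, 1, 3, 21, 0), datetime(2016, 1, 4, 9, 15)]
alias_b = [datetime(2016, 2, 1, 3, 0)]

tp_a = timeprint(alias_a)
tp_b = timeprint(alias_b)
```

    In a setup like the one the abstract describes, such a vector would be one group of inputs alongside stylometric features; a distance on its own is only a rough similarity signal.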

    Place, publisher, year, edition, pages
    IEEE, 2016
    National Category
    Computer and Information Sciences
    Identifiers
    urn:nbn:se:uu:diva-306944 (URN), 10.1109/ENIC.2016.019 (DOI), 000399097600011 (), 9781509034550 (ISBN)
    Conference
    ENIC 2016, September 5–7, Wroclaw, Poland
    Available from: 2017-02-02 Created: 2016-11-07 Last updated: 2019-03-22. Bibliographically approved
    9. Assessment of risk in written communication: Introducing the Profile Risk Assessment Tool (PRAT)
    2018 (English). Report (Other academic)
    Place, publisher, year, edition, pages
    Belgium: EUROPOL, 2018. p. 24
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:uu:diva-367346 (URN)
    Note

    This paper was presented at the 2nd European Counter-Terrorism Centre (ECTC) Advisory Group conference, 17-18 April 2018, at Europol Headquarters, The Hague.

    Available from: 2018-11-30 Created: 2018-11-30 Last updated: 2019-03-22. Bibliographically approved
    10. Linguistic markers of a radicalized mind-set among extreme adopters
    2017 (English). In: Proc. 10th ACM International Conference on Web Search and Data Mining, New York: ACM Press, 2017, p. 823-824. Conference paper, Published paper (Refereed)
    Abstract [en]

    The words that we use when communicating in social media can reveal how we relate to ourselves and to others. For instance, within many online communities, the degree of adaptation to a community-specific jargon can serve as a marker of identification with the community. In this paper we single out a group of so-called extreme adopters of community-specific jargon from the whole group of users of a Swedish discussion forum devoted to the topics of immigration and integration. The forum is characterized by a certain xenophobic jargon, and we hypothesize that extreme adopters of this jargon also exhibit certain linguistic features that we view as markers of a radicalized mind-set. We use a Swedish translation of LIWC (linguistic inquiry word count) and find that the group of extreme adopters differs significantly from the whole group of forum users regarding six out of seven linguistic markers of a radicalized mind-set.

    Place, publisher, year, edition, pages
    New York: ACM Press, 2017
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-379919 (URN), 10.1145/3018661.3022760 (DOI), 978-1-4503-4675-7 (ISBN)
    Conference
    WSDM 2017, 1st International Workshop on Cyber Deviance Detection
    Available from: 2017-02-02 Created: 2019-03-21 Last updated: 2019-04-08. Bibliographically approved
  • 324.
    Shrestha, Amendra
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaati, Lisa
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. FOI, Stockholm, Sweden..
    Cohen, Katie
    FOI, Stockholm, Sweden..
    A Machine Learning Approach Towards Detecting Extreme Adopters in Digital Communities. 2017. In: 2017 28th International Workshop on Database and Expert Systems Applications (DEXA) / [ed] Tjoa, AM; Wagner, RR, IEEE, 2017, p. 1-5. Conference paper (Other academic)
    Abstract [en]

    In this study we try to identify extreme adopters on a discussion forum using machine learning. An extreme adopter is a user who has adopted a high level of community-specific jargon and can therefore be seen as a user with a high degree of identification with the community. The dataset that we consider consists of a Swedish xenophobic discussion forum, where we use a machine learning approach to identify extreme adopters using a number of linguistic features that are independent of the dataset and the community. The results indicate that it is possible to separate these extreme adopters from the rest of the discussants on the discussion forum with more than 80% accuracy. Since the linguistic features that we use are highly domain independent, the results indicate that this kind of technique could be used to identify extreme adopters within other communities as well.

  • 325.
    Sigurgeirsson, Daniel
    et al.
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Lárusdóttir, Marta
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Hamdaga, Mohammad
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Jónsson, Björn
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Learning Outcome Outcomes: An Evaluation of Quality. 2018. In: Proc. 48th ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE Press, 2018. Conference paper (Refereed)
    Abstract [en]

    Learning outcomes are a standard specification of knowledge, skills and capabilities that a student is expected to acquire by attending a course or a degree program. While, in theory, the process of evaluating learning outcomes appears to be trivial, in practice it is a complicated and daunting process. In this study, we evaluate how learning outcomes can be effectively applied. The work focuses on the quality of both the specification of the learning outcomes and the assessment of whether these outcomes are reached. We discuss different abstraction levels for learning outcomes and the issue of alignment between high-level and low-level learning outcomes. We also address the criteria for assessing whether a student is meeting a learning outcome.

    Our work is focused on project-oriented courses, where assessing learning outcomes is seen as particularly challenging. In particular, we draw on an empirical study focused on systematically collecting key performance indicators of the progress towards achieving learning outcomes. The data gathering was done during the course through in-class questionnaires and individual diary notes, as a complementary process to the traditional observations made by the teacher running the course. This data serves as the basis for understanding how individual students advance towards the stated learning goals. We also conducted a focus group discussion after the course to better understand how to interpret the data collected during the course.

    An important result of our work is forming an understanding and vocabulary regarding learning outcomes and the assessment of how well students meet these learning goals in project-based educational settings. In addition to this, we make the following major contributions:

    • We present a systematic methodology to gauge how well students meet learning outcomes through in-class self-evaluation.
    • We present the results of an empirical study of a process-oriented evaluation of the students' development towards stated learning outcomes.

    We state some lessons learned from this process that are applicable to designers of project-based courses.

  • 326. Singh, Abhishek
    et al.
    Ekberg, Pontus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Baruah, Sanjoy
    Applying Real-Time Scheduling Theory to the Synchronous Data Flow Model of Computation. 2017. Conference paper (Refereed)
  • 327. Singh, Abhishek
    et al.
    Ekberg, Pontus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Baruah, Sanjoy
    Uniprocessor scheduling of real-time synchronous dataflow tasks. 2019. In: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 55, no 1, p. 1-31. Article in journal (Refereed)
    Abstract [en]

    The synchronous dataflow graph (SDFG) model is widely used today for modeling real-time applications in safety-critical application domains. Schedulability analysis techniques that are well understood within the real-time scheduling community are applied to the analysis of recurrent real-time workloads that are represented using this model. An enhancement to the standard SDFG model is proposed, which supports the specification of a real-time latency constraint between a specified input and a specified output of an SDFG. A polynomial-time algorithm is derived for representing the computational requirement of each such enhanced SDFG task in terms of the notion of the demand bound function (dbf), which is widely used in real-time scheduling theory for characterizing computational requirements of recurrent processes represented by, e.g., the sporadic task model. By so doing, the extensive dbf-centered machinery that has been developed in real-time scheduling theory for the hard-real-time schedulability analysis of systems of recurrent tasks may be applied to the analysis of systems represented using the SDFG model as well. The applicability of this approach is illustrated by applying prior results from real-time scheduling theory to construct an exact preemptive uniprocessor schedulability test for collections of independent recurrent processes that are each represented using the enhanced SDFG model.
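
    As a minimal illustration of the demand bound function mentioned in the abstract above, the sketch below computes the textbook dbf of a sporadic task and checks that total demand does not exceed the interval length at each absolute deadline up to a caller-chosen horizon. It does not reproduce the paper's construction for enhanced SDFG tasks, which is not given here. The function names, the (wcet, deadline, period) tuple layout, and the horizon parameter are assumptions made for this example.

```python
def dbf(wcet, deadline, period, t):
    """Demand bound function of one sporadic task for an interval of length t.

    Counts the jobs that can have both their release and their deadline
    inside the interval, multiplied by the worst-case execution time.
    """
    if t < deadline:
        return 0
    return ((t - deadline) // period + 1) * wcet


def demand_ok(tasks, horizon):
    """Check sum of dbf(t) <= t at every absolute deadline up to horizon.

    tasks is a list of (wcet, deadline, period) integer tuples (assumed
    layout). The horizon is supplied by the caller; this sketch does not
    derive a bound that would make the check exact.
    """
    points = set()
    for _wcet, deadline, period in tasks:
        t = deadline
        while t <= horizon:
            points.add(t)
            t += period
    return all(
        sum(dbf(c, d, p, t) for c, d, p in tasks) <= t
        for t in sorted(points)
    )


# Invented task parameters, purely for illustration.
light = [(1, 2, 4), (2, 4, 6)]
overloaded = [(3, 3, 4), (2, 3, 4)]
```

    Only the absolute deadlines are tested because the dbf is a step function that changes value only at those points.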

  • 328.
    Stigge, Martin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Combinatorial abstraction refinement for feasibility analysis of static priorities. 2015. In: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 51, no 6, p. 639-674. Article in journal (Refereed)
  • 329.
    Stigge, Martin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Graph-based models for real-time workload: a survey. 2015. In: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 51, no 5, p. 602-636. Article in journal (Refereed)
  • 330.
    Sun, Jinghao
    et al.
    Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China;Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China.
    Guan, Nan
    Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China.
    Jiang, Xu
    Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China.
    Chang, Shuangshuang
    Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China.
    Guo, Zhishan
    Univ Florida, Dept Elect & Comp Engn, Gainesville, FL 32611 USA;Missouri Univ Sci & Technol, Dept Comp Sci, Rolla, MO 65409 USA.
    Deng, Qingxu
    Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China.
    Wang, Yi
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China.
    A Capacity Augmentation Bound for Real-Time Constrained-Deadline Parallel Tasks Under GEDF. 2018. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 37, no 11, p. 2200-2211. Article in journal (Refereed)
    Abstract [en]

    Capacity augmentation bound is a widely used quantitative metric in theoretical studies of schedulability analysis for directed acyclic graph (DAG) parallel real-time tasks, which not only quantifies the suboptimality of the scheduling algorithms, but also serves as a simple linear-time schedulability test. Earlier studies on capacity augmentation bounds of the sporadic DAG task model were either restricted to a single DAG task or a set of tasks with implicit deadlines. In this paper, we consider parallel tasks with constrained deadlines under global earliest deadline first policy. We first show that it is impossible to obtain a constant bound for our problem setting, and derive both lower and upper bounds of the capacity augmentation bound as a function with respect to the maximum ratio of task period to deadline. Our upper bound is at most 1.47 times larger than the optimal one. We conduct experiments to compare the acceptance ratio of our capacity augmentation bound with the existing schedulability test also having linear-time complexity. The results show that our capacity augmentation bound significantly outperforms the existing linear-time schedulability test under different parameter settings.

  • 331. Sun, Jinghao
    et al.
    Guan, Nan
    Wang, Yang
    Deng, Qingxu
    Zeng, Peng
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Feasibility of fork-join real-time task graph models: Hardness and algorithms. 2016. In: ACM Transactions on Embedded Computing Systems, ISSN 1539-9087, E-ISSN 1558-3465, Vol. 15, no 1, article id 14. Article in journal (Refereed)
  • 332.
    Sun, Jinghao
    et al.
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China.;Northeastern Univ, Shenyang, Liaoning, Peoples R China..
    Guan, Nan
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China..
    Wang, Yang
    Northeastern Univ, Shenyang, Liaoning, Peoples R China..
    He, Qingqiang
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China..
    Wang, Yi
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Northeastern Univ, Shenyang, Liaoning, Peoples R China..
    Real-Time Scheduling and Analysis of OpenMP Task Systems with Tied Tasks. 2017. In: 2017 IEEE Real-Time Systems Symposium (RTSS), IEEE, 2017, p. 92-103. Conference paper (Refereed)
    Abstract [en]

    OpenMP is a promising framework for developing parallel real-time software on multi-cores. Although similar to the DAG task model, OpenMP task systems are significantly more difficult to analyze due to constraints posed by the OpenMP specification. An important feature in OpenMP is tied tasks, which must execute on the same thread during their whole life cycle. Although tied tasks offer benefits in simplicity and efficiency, they have been considered unsuitable for real-time systems due to their complex behavior. In this paper, we study the real-time scheduling and analysis of OpenMP task systems with tied tasks. First, we show that under the existing scheduling algorithms in OpenMP, tied tasks may indeed lead to extremely bad timing behaviors where the parallel workload is executed completely sequentially. To solve this problem, we propose a new scheduling algorithm and develop two response time bounds for it, with different trade-offs between simplicity and analysis precision. Experiments with both randomly generated OpenMP task systems and realistic OpenMP programs show that the response time bounds obtained by our approach for tied task systems are very close to those of untied tasks.

  • 333.
    Svensson, Maria
    et al.
    Göteborgs universitet.
    Ingerman, Åke
    Göteborgs universitet.
    Berglund, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Teaching and learning system thinking in technology. 2015. In: Plurality and Complementarity of Approaches in Design and Technology Education / [ed] Chatoney, Marjolaine, Marseille: Presses Universitaires de Provence, 2015, p. 404-409. Conference paper (Refereed)
    Abstract [en]

    Complex technological systems have emerged during the last decade as an important strand in technology teaching in several national curricula for compulsory school. However, even though understanding the systemic aspects and connected nature of contemporary society is important, it remains unclear what such understanding entails in detail, and even more unclear what may constitute good teaching. We present the results from a teaching-learning design project on the topic of large societal and complex technological systems, which are seen as constituted of transformation and transport, acting on matter, energy and information. The main results are a suggested and evaluated plan of teaching developed in collaboration with a team of technology teachers, as well as descriptions of how pupils' system thinking is constituted in terms of four basic aspects: Resource and intention of the system; System component constitution; Process and transformation in components and system; Network character. In total, a teaching plan spanning four lessons was realised in four different classrooms, with class sizes ranging from 15 to 25 pupils aged 14 and 15. The teaching design progresses through focusing on specific parts of various systems, for example the transformation of polluted water to clean water in a water purification plant as part of the water supply system. There is an emphasis on the function of the part in relation to the system on the one hand, and on how the part is and can be realised technically, taking care to relate the latter to what is taken up in other curricular strands of technology. The last part focuses on the examination of technological systems as constituted by interacting and meaningful parts, where their network nature may emerge.

  • 334.
    Tang, Yue
    et al.
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China.
    Guan, Nan
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China.
    Liu, Weichen
    Nanyang Technol Univ, Singapore, Singapore.
    Phan, Linh Thi Xuan
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Northeastern Univ, Shenyang, Liaoning, Peoples R China.
    Revisiting GPC and AND Connector in Real-Time Calculus. 2017. In: 2017 IEEE Real-Time Systems Symposium (RTSS), IEEE, 2017, p. 255-265. Conference paper (Refereed)
    Abstract [en]

    Real-Time Calculus (RTC) is a powerful framework for modeling and worst-case performance analysis of networked systems. GPC and AND are two fundamental components in RTC, which model priority-based resource arbitration and synchronization operations, respectively. In this paper, we revisit GPC and AND. For GPC, we develop tighter output arrival curves to more precisely characterize the output event streams. For AND, we first identify a problem in the existing analysis method that may lead to negative values in the output curves, and present corrections to the problem. Then we generalize AND to synchronize more than two input event streams. We implement our new theoretical results and conduct experiments to evaluate their performance. Experiment results show significant improvement of our new methods in analysis precision and efficiency.

  • 335.
    Thota, Neena
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Connectivism and the use of technology/media in collaborative teaching and learning. 2015. In: From the Confucian Way to Collaborative Knowledge Co-Construction / [ed] van Schalkwyk, Gertina J.; D'Amato, Rik Carl, Hoboken, NJ: John Wiley & Sons, 2015, p. 81-96. Chapter in book (Refereed)
  • 336.
    Thota, Neena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Berglund, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    A structured approach to master thesis writing. 2015. In: Conference for University Pedagogical Development, Uppsala, Sweden: Uppsala University, 2015. Conference paper (Other academic)
  • 337.
    Thota, Neena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Berglund, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Integrating international students into CS programs. 2015. In: Proc. 1st Al Baha University and Uppsala University Symposium on Quality in Computing Education, 2015, p. 6-8. Conference paper (Refereed)
    Abstract [en]

    In recent years there has been a rapid increase in the intake of international students at universities. Integrating foreign students into the disciplinary and social culture prevalent at the university is a challenging task. In this paper, first, we summarize the findings of three of our studies on the experiences of Chinese students studying computer science at the Department of Information Technology, Uppsala University, Sweden. Then, based on our findings we make recommendations on how to integrate international students into academic life at Computer Science departments. We focus on the program and course levels, and also at the level of individual students in their new social and cultural environment.

  • 338.
    Thota, Neena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Berglund, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Learning Computer Science: Dimensions of variation within what Chinese students learn. 2016. In: ACM Transactions on Computing Education, ISSN 1946-6226, E-ISSN 1946-6226, Vol. 16, no 3, article id 10. Article in journal (Refereed)
  • 339.
    Thota, Neena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Estadieu, Gerald
    Ferrao, Antonio
    Wong, Kai Meng
    Engaging school students with tangible devices: Pilot project with .NET gadgeteer. 2015. In: Proc. 3rd International Conference on Learning and Teaching in Computing and Engineering, Los Alamitos, CA: IEEE Computer Society, 2015, p. 112-119. Conference paper (Refereed)
  • 340.
    Thota, Neena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. University of Saint Joseph, Macau, China.
    Negreiros, João G. M.
    Introducing Educational Technologies to Teachers: Experience Report. 2015. In: Journal of University Teaching and Learning Practice, ISSN 1449-9789, E-ISSN 1449-9789, Vol. 12, no 1, p. 5:1-13. Article in journal (Refereed)
  • 341. Tian, Ye
    et al.
    Li, Xiong
    Sangaiah, Arun Kumar
    Ngai, Edith
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Song, Zheng
    Zhang, Lanshan
    Wang, Wendong
    Privacy-preserving scheme in social participatory sensing based on Secure Multi-party Cooperation. 2018. In: Computer Communications, ISSN 0140-3664, E-ISSN 1873-703X, Vol. 119, p. 167-178. Article in journal (Refereed)
    Abstract [en]

    Social participant sensing has been widely used to collect location-related sensory data for various applications. In order to improve the Quality of Information (QoI) of the collected data with a constrained budget, the application server needs to coordinate participants with different data collection capabilities and various incentive requirements. However, existing participant coordination methods either require participants to reveal their trajectories to the server, which causes privacy leakage, or trade off the location accuracy of participants for privacy, thereby leading to lower QoI. In this paper, we propose a privacy-preserving scheme, which allows the application server to provide quasi-optimal QoI for social sensing tasks without knowing participants’ trajectories and identities. More specifically, we first suggest a Secure Multi-party Cooperation (SMC) based approach to evaluate each participant’s contribution in terms of QoI without disclosing any individual’s trajectory. Second, a fuzzy decision based approach, which aims to finely balance data utility gain, incentive budget and inferable privacy protection ability, is adopted to coordinate participants in an incremental way. Third, sensory data and incentives are encrypted and then transferred along the participant chain in a perturbed way to protect user privacy throughout the data uploading and incentive distribution procedure. Simulation results show that our proposed method can efficiently select appropriate participants to achieve better QoI than other methods, and can protect each participant’s privacy effectively.

  • 342. Tian, Ye
    et al.
    Wang, Wendong
    Wu, Jie
    Kou, Qinli
    Song, Zheng
    Ngai, Edith C.-H.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Privacy-preserving social tie discovery based on cloaked human trajectories. 2017. In: IEEE Transactions on Vehicular Technology, ISSN 0018-9545, E-ISSN 1939-9359, Vol. 66, no 2, p. 1619-1630. Article in journal (Refereed)
  • 343.
    Tran, Kim-Anh
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Jimborean, Alexandra
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Carlson, Trevor E.
    National University of Singapore, Singapore.
    Koukos, Konstantinos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Själander, Magnus
    NTNU, Norway.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores. 2018. In: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, Association for Computing Machinery (ACM), 2018, p. 328-343. Conference paper (Refereed)
    Abstract [en]

    Increasing demands for energy efficiency constrain emerging hardware. These new hardware trends challenge the established assumptions in code generation and force us to rethink existing software optimization techniques. We propose a cross-layer redesign of the way compilers and the underlying microarchitecture are built and interact, to achieve both performance and high energy efficiency.

    In this paper, we address one of the main performance bottlenecks, last-level cache misses, through a software-hardware co-design. Our approach is able to hide memory latency and attain increased memory and instruction level parallelism by orchestrating a non-speculative, execute-ahead paradigm in software (SWOOP). While out-of-order (OoO) architectures attempt to hide memory latency by dynamically reordering instructions, they do so through expensive, power-hungry, speculative mechanisms. We aim to shift this complexity into software, and we build upon compilation techniques inherited from VLIW, software pipelining, modulo scheduling, decoupled access-execution, and software prefetching. In contrast to previous approaches, we do not rely on either software or hardware speculation that can be detrimental to efficiency. Our SWOOP compiler is enhanced with lightweight architectural support, thus being able to transform applications that include highly complex control-flow and indirect memory accesses.

  • 344.
    Tran, Kim-Anh
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Sakalis, Christos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Själander, Magnus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology. Norwegian University of Science and Technology.
    Ros, Alberto
    University of Murcia.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Jimborean, Alexandra
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Clearing the Shadows: Recovering Lost Performance for Invisible Speculative Execution through HW/SW Co-Design. Article in journal (Other academic)
  • 345.
    Tshering, Phurpa
    et al.
    Royal University of Bhutan, Bhutan.
    Lhamo, Dekar
    Royal University of Bhutan, Bhutan.
    Yu, Lu
    Tongji University, China.
    Berglund, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    How do first year students learn C programming in Bhutan? 2017. In: Proc. 5th International Conference on Learning and Teaching in Computing and Engineering, IEEE Computer Society, 2017, p. 25-29. Conference paper (Refereed)
  • 346.
    Tsiftes, Nicolas
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. SICS.
    Storage-Centric System Architectures for Networked, Resource-Constrained Devices. 2016. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    The emergence of the Internet of Things (IoT) has increased the demand for networked, resource-constrained devices tremendously. Many of the devices used for IoT applications are designed to be resource-constrained, as they typically must be small, inexpensive, and powered by batteries. In this dissertation, we consider a number of challenges pertaining to these constraints: system support for energy efficiency; flash-based storage systems; programming, testing, and debugging; and safe and secure application execution. The contributions of this dissertation are made through five research papers addressing these challenges.

    Firstly, to enhance the system support for energy-efficient storage in resource-constrained devices, we present the design, implementation, and evaluation of the Coffee file system and the Antelope DBMS. Coffee provides a sequential write throughput that is over 92% of the attainable flash driver throughput, and has a constant memory footprint for open files. Antelope is the first full-fledged relational DBMS for sensor networks, and it provides two novel indexing algorithms to enable fast and energy-efficient database queries.

    Secondly, we contribute a framework that extends the functionality and increases the performance of sensornet checkpointing, a debugging and testing technique. Furthermore, we evaluate how different data compression algorithms can be used to decrease the energy consumption and data dissemination time when reprogramming sensor networks.

    Lastly, we present Velox, a virtual machine for IoT applications. Velox can enforce application-specific resource policies. Through its policy framework and its support for high-level programming languages, Velox helps to secure IoT applications. Our experiments show that Velox monitors applications' resource usage and enforces policies with an energy overhead below 3%.

    The experimental systems research conducted in this dissertation has had a substantial impact both in the academic community and the open-source software community. Several of the produced software systems and components are included in Contiki, one of the premier open-source operating systems for the IoT and sensor networks, and they are being used both in research projects and commercial products.

    List of papers
    1. Efficient Sensor Network Reprogramming through Compression of Executable Modules
    2008 (English). In: Proceedings of Fifth Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON 2008): June 16-20, 2008, San Francisco, California, USA. 2008, 2008. Conference paper, Published paper (Refereed)
    National Category
    Computer Engineering
    Identifiers
    urn:nbn:se:uu:diva-142776 (URN)
    Conference
    Fifth Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON 2008): June 16-20, 2008, San Francisco, California, USA. 2008
    Projects
    wisenet
    Available from: 2011-01-17 Created: 2011-01-17 Last updated: 2018-01-12
    2. Enabling Large-Scale Storage in Sensor Networks with the Coffee File System
    2009 (English). In: Proceedings of the 8th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN 2009), San Francisco, USA, April 2009, 2009. Conference paper, Published paper (Refereed)
    Identifiers
    urn:nbn:se:uu:diva-142686 (URN)
    Conference
    8th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN 2009), San Francisco, USA, April 2009
    Projects
    wisenet
    Available from: 2011-01-14 Created: 2011-01-14 Last updated: 2016-02-12
    3. A Database in Every Sensor
    2011 (English). Conference paper, Published paper (Refereed)
    Keywords
    Antelope, database, energy-efficiency, sensor network
    National Category
    Computer Sciences
    Research subject
    Computer Science with specialization in Database Technology
    Identifiers
    urn:nbn:se:uu:diva-267634 (URN)
    Conference
    The 9th ACM Conference on Embedded Networked Sensor Systems (SenSys 2011)
    Funder
    EU, FP7, Seventh Framework Programme, FP7-ICT-224282; EU, FP7, Seventh Framework Programme, FP7-2007-2-224053; Swedish Foundation for Strategic Research
    Available from: 2015-11-25 Created: 2015-11-25 Last updated: 2018-01-10
    4. Efficient and Flexible Sensornet Checkpointing
    2014 (English). In: Wireless Sensor Networks, volume 8354, 2014, p. -65. Conference paper, Published paper (Refereed)
    Abstract [en]

    Developing sensornet software is difficult partly because of the limited visibility of the system state of deployed nodes. Sensornet checkpointing is a method that allows developers to save and restore the full system state of nodes. We present four extensions to sensornet checkpointing (compression, binary diffs, selective checkpointing, and checkpoint inspection) that reduce the time required for checkpointing operations considerably, and improve the granularity at which system state can be examined and manipulated down to the variable level. We show through an experimental evaluation that the checkpoint sizes can be reduced by 70%-93%, and the time can be reduced by at least 50% because of these improvements. The reduced time and increased granularity benefit multiple checkpointing use cases, including automated testing, network visualization, and software debugging.

    National Category
    Computer Systems
    Identifiers
    urn:nbn:se:uu:diva-211145 (URN); 0.1007/978-3-319-04651-8_4 (DOI); 000340395900004 (); 978-3-319-04650-1 (ISBN); 978-3-319-04651-8 (ISBN)
    Conference
    EWSN 2014: The European Conference on Wireless Sensor Networks; 17-19 February 2014; University of Oxford; Oxford, UK
    Projects
    ProFuN
    Available from: 2013-11-20 Created: 2013-11-20 Last updated: 2016-02-12. Bibliographically approved
    5. Velox: A Virtual Machine for IoT Software Security and Resource Protection
    (English). Manuscript (preprint) (Other academic)
    National Category
    Computer Sciences
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-268870 (URN)
    Funder
    VINNOVA; Knowledge Foundation
    Available from: 2015-12-10 Created: 2015-12-10 Last updated: 2018-01-10
  • 347.
    Tsiftes, Nicolas
    et al.
    RISE SICS, Box 1263, SE-16429 Kista, Sweden.
    Voigt, Thiemo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. RISE SICS, Box 1263, SE-16429 Kista, Sweden.
    Velox VM: A safe execution environment for resource-constrained IoT applications. 2018. In: Journal of Network and Computer Applications, ISSN 1084-8045, E-ISSN 1095-8592, Vol. 118, p. 61-73. Article in journal (Refereed)
    Abstract [en]

    We present Velox, a virtual machine architecture that provides a safe execution environment for applications in resource-constrained IoT devices. Our goal with this architecture is to support developers in writing and deploying safe IoT applications, in a manner similar to smartphones with application stores. To this end, we provide a resource and security policy framework that enables fine-grained control of the execution environment of IoT applications. This framework allows device owners to configure, e.g., the amount of bandwidth, energy, and memory that each IoT application can use. Velox's features also include support for high-level programming languages, a compact bytecode format, and preemptive multi-threading. In the context of IoT devices, there are typically severe energy, memory, and processing constraints that make the design and implementation of a virtual machine with such features challenging. We elaborate on how Velox is implemented in a resource-efficient manner, and describe our port of Velox to the Contiki OS. Our experimental evaluation shows that we can control the resource usage of applications with a low overhead. We further show that, for typical I/O-driven IoT applications, the CPU and energy overhead of executing Velox bytecode is as low as 1-5% compared to corresponding applications compiled to machine code. Lastly, we demonstrate how application policies can be used to mitigate the possibility of exploiting vulnerable applications.

  • 348.
    Tu, Wei
    et al.
    Wuhan Univ, Coll Comp, Wuhan, Peoples R China.
    Wei, Lei
    Hunan Univ, Coll Comp Sci & Elect Engn, Changsha, Hunan, Peoples R China.
    Hu, Wenyan
    Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA.
    Sheng, Zhengguo
    Univ Sussex, Dept Engn & Design, Brighton, E Sussex, England.
    Nicanfar, Hasen
    Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC, Canada.
    Hu, Xiping
    Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC, Canada.
    Ngai, Edith C.-H.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Leung, Victor C. M.
    Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC, Canada.
    A Survey on Mobile Sensing Based Mood-Fatigue Detection for Drivers. 2016. In: SMART CITY 360 / [ed] LeonGarcia, A; Lenort, R; Holman, D; Stas, D; Krutilova, V; Wicher, P; Caganova, D; Spirkova, D; Golej, J; Nguyen, K, SPRINGER INT PUBLISHING AG, 2016, p. 3-15. Conference paper (Refereed)
    Abstract [en]

    The rapid development of the Internet of Things (IoT) has provided innovative solutions to reduce traffic accidents caused by fatigue driving. When drivers are in a bad mood or tired, their vigilance level decreases, which may prolong the reaction time to emergency situations and lead to serious accidents. With the help of mobile sensing and mood-fatigue detection, drivers' mood-fatigue status can be detected while driving, and then appropriate measures can be taken to eliminate the fatigue or negative mood to increase the level of vigilance. This paper presents the basic concepts and current solutions of mood-fatigue detection and some common solutions like mobile sensing and cloud computing techniques. After that, we introduce some emerging platforms designed to promote safe driving. Finally, we summarize the major challenges in mood-fatigue detection of drivers, and outline the future research directions.

  • 349.
    Ulander, David
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Software Architectural Metrics for the Scania Internet of Things Platform: From a Microservice Perspective. 2017. Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    There are limited tools to evaluate a microservice architecture and no common definition of how the architecture should be designed. Moreover, developing systems with microservices introduces additional complexity to the software architecture. That, together with the fact that systems are becoming more complex, has led to a desire for architecture evaluation methods.

    In this thesis, a set of quality attributes measured by structural metrics is used to evaluate Scania's IoT Offboard platform. By implementing a metrics evaluation program, the quality of the software architecture can be improved. Also, metrics can help developers and architects become more efficient, since they better understand how performance is measured, i.e. which quality attributes are the most important and how these are measured.

    For Scania's IoT Offboard platform, the studied quality attributes are, in order of decreasing importance: flexibility, reusability and understandability. All the microservices in the platform are loosely coupled, which results in a loosely coupled architecture. This indicates a flexible, reusable and understandable system, in terms of coupling. Furthermore, the architecture is decentralized, i.e. the system is inflexible and difficult to change. The other metrics lacked a reference scale, hence they will act as a point of reference for future measurements as the architecture evolves.

    To improve the flexibility, reusability and understandability of the architecture, the large microservices should be divided into several smaller microservices. Also, aggregators should be utilized more to make the system more flexible.

  • 350.
    Umuroglu, Yaman
    et al.
    Xilinx Res Labs, Dublin, Ireland.
    Conficconi, Davide
    Xilinx Res Labs, Dublin, Ireland;Politecn Milan, Milan, Italy.
    Rasnayake, Lahiru
    Norwegian Univ Sci & Technol, Trondheim, Norway.
    Preusser, Thomas B.
    Accem Technol GmbH, Dresden, Germany.
    Själander, Magnus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Norwegian Univ Sci & Technol, Trondheim, Norway.
    Optimizing Bit-Serial Matrix Multiplication for Reconfigurable Computing. 2019. In: ACM Transactions on Reconfigurable Technology and Systems, ISSN 1936-7406, E-ISSN 1936-7414, Vol. 12, no 3, article id 15. Article in journal (Refereed)
    Abstract [en]

    Matrix-matrix multiplication is a key computational kernel for numerous applications in science and engineering, with ample parallelism and data locality that lend themselves well to high-performance implementations. Many matrix multiplication-dependent applications can use reduced-precision integer or fixed-point representations to increase their performance and energy efficiency while still offering adequate quality of results. However, precision requirements may vary between different application phases or depend on input data, rendering constant-precision solutions ineffective. BISMO, a vectorized bit-serial matrix multiplication overlay for reconfigurable computing, previously utilized the excellent binary-operation performance of FPGAs to offer a matrix multiplication performance that scales with required precision and parallelism. We show how BISMO can be scaled up on Xilinx FPGAs using an arithmetic architecture that better utilizes six-input LUTs. The improved BISMO achieves a peak performance of 15.4 binary TOPS on the Ultra96 board with a Xilinx UltraScale+ MPSoC.
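
    The general idea of bit-serial matrix multiplication mentioned in the abstract above can be illustrated with a small software sketch. This is not BISMO's implementation; it assumes unsigned integer operands only, and the function names (bit_plane, binary_matmul, bit_serial_matmul) are hypothetical. Each operand is split into binary bit planes, every pair of planes is multiplied with AND plus a population count, and the partial products are combined with shifts, so the amount of work grows with the product of the two bit widths.

```python
def bit_plane(matrix, bit):
    """Return the 0/1 matrix holding bit number `bit` of every entry."""
    return [[(x >> bit) & 1 for x in row] for row in matrix]


def binary_matmul(a_bits, b_bits):
    """Multiply two 0/1 matrices; each dot product is AND followed by a count."""
    cols = list(zip(*b_bits))
    return [[sum(x & y for x, y in zip(row, col)) for col in cols]
            for row in a_bits]


def bit_serial_matmul(a, b, a_width, b_width):
    """Multiply unsigned integer matrices one pair of bit planes at a time.

    a_width and b_width are the assumed operand precisions in bits;
    entries must fit in those widths for the result to be exact.
    """
    rows, cols = len(a), len(b[0])
    result = [[0] * cols for _ in range(rows)]
    for i in range(a_width):
        a_i = bit_plane(a, i)
        for j in range(b_width):
            partial = binary_matmul(a_i, bit_plane(b, j))
            for r in range(rows):
                for c in range(cols):
                    # Plane i of A times plane j of B carries weight 2^(i+j).
                    result[r][c] += partial[r][c] << (i + j)
    return result


if __name__ == "__main__":
    a = [[1, 2], [3, 0]]
    b = [[2, 1], [1, 3]]
    print(bit_serial_matmul(a, b, a_width=2, b_width=2))
```

    With 2-bit operands the loop performs 2 x 2 = 4 binary matrix products; with 8-bit operands it would perform 64, which is the sense in which cost tracks the required precision in this sketch.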
