Uppsala University Publications
251 - 300 of 351
  • 251.
    Ngo, Tuan-Phong
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems.
    Model Checking of Software Systems under Weak Memory Models (2019). Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    When a program is compiled and run on a modern architecture, different optimizations may be applied to gain efficiency. In particular, access operations (e.g., reads and writes) to the shared memory may be performed out of order, i.e., in a different order than the one in which they were issued by the program. Reordering memory access operations leads to efficient use of instruction pipelines and thus to improved program execution times. However, this gain in efficiency comes at a price: programs running on modern architectures may exhibit behaviors that programmers do not expect. Out-of-order execution has led to new program semantics, called weak memory models (WMMs). A crucial problem is to ensure the correctness of concurrent programs running under weak memory models.
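The classic store-buffering litmus test makes this reordering concrete (the test and the toy TSO model below are illustrative assumptions, not taken from the thesis). Thread 0 runs `x = 1; r1 = y` and thread 1 runs `y = 1; r2 = x`. Under SC at least one thread observes the other's write, so `r1 = r2 = 0` is impossible; under TSO each store may sit in a store buffer and reach memory only after the thread's own load has executed, making `r1 = r2 = 0` observable. A small enumerator can compute both outcome sets:

```python
def interleavings(t0, t1):
    """All interleavings of two event lists, preserving each thread's order."""
    if not t0:
        yield list(t1)
        return
    if not t1:
        yield list(t0)
        return
    for rest in interleavings(t0[1:], t1):
        yield [t0[0]] + rest
    for rest in interleavings(t0, t1[1:]):
        yield [t1[0]] + rest

def run(schedule):
    """Execute a schedule of ('w', var) / ('r', var, reg) events on shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for ev in schedule:
        if ev[0] == "w":
            mem[ev[1]] = 1          # every write in this test writes 1
        else:
            regs[ev[2]] = mem[ev[1]]
    return (regs["r1"], regs["r2"])

w0, l0 = ("w", "x"), ("r", "y", "r1")   # thread 0: x = 1; r1 = y
w1, l1 = ("w", "y"), ("r", "x", "r2")   # thread 1: y = 1; r2 = x

# SC: program order is preserved within each thread.
sc = {run(s) for s in interleavings([w0, l0], [w1, l1])}

# TSO: the store sits in a store buffer, so its commit to memory may happen
# after the thread's own (later) load. Model this by also allowing the
# load-before-store-commit order inside each thread.
tso = set()
for t0 in ([w0, l0], [l0, w0]):
    for t1 in ([w1, l1], [l1, w1]):
        for s in interleavings(t0, t1):
            tso.add(run(s))

print(sorted(sc))    # (0, 0) never appears under SC
print(sorted(tso))   # (0, 0) is reachable under TSO
```

Because the two accesses in each thread target different variables, store-to-load forwarding plays no role in this test, so modeling the buffered store as a commit event that may drift past the load suffices here.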

    The thesis proposes three techniques for reasoning about and analyzing concurrent programs running under WMMs. The first is a sound and complete analysis technique for finite-state programs running under the TSO semantics (Paper II). It is based on a novel, equivalent semantics for TSO, called the Dual TSO semantics, and on the framework of well-structured transition systems. The second is an under-approximation technique that can be used to detect bugs under the POWER semantics (Paper III). It is based on bounding the number of contexts in an explored execution, where in each context only one process is active. The third is also an under-approximation technique, based on systematic testing (a.k.a. stateless model checking); it has been used to develop an optimal and efficient systematic testing approach for concurrent programs running under the Release-Acquire semantics (Paper IV).

    The thesis also considers the problem of effectively finding a minimal set of fences that guarantees the correctness of a concurrent program running under WMMs (Paper I). A fence (a.k.a. barrier) is an operation that can be inserted into the program to prohibit certain reorderings between operations issued before and after it. Since fences are expensive, it is crucial to find a minimal set of fences that ensures program correctness automatically. The thesis presents a method for automatic fence insertion in programs running under the TSO semantics that offers the best known trade-off between the efficiency and the optimality of the algorithm. The technique is based on a novel notion of correctness, called persistence, that compares the behaviors of a program under a WMM with its behaviors under the SC semantics.

    List of papers
    1. The Best of Both Worlds: Trading efficiency and optimality in fence insertion for TSO
    2015 (English). In: Programming Languages and Systems: ESOP 2015, Springer Berlin/Heidelberg, 2015, p. 308-332. Conference paper, Published paper (Refereed)
    Abstract [en]

    We present a method for automatic fence insertion in concurrent programs running under weak memory models that provides the best known trade-off between efficiency and optimality. On the one hand, the method can efficiently handle complex aspects of program behavior such as unbounded buffers and large numbers of processes. On the other hand, it is able to find small sets of fences needed for ensuring correctness of the program. To this end, we propose a novel notion of correctness, called persistence, that compares the behavior of the program under the weak memory semantics with that under the classical interleaving (SC) semantics. We instantiate our framework for the Total Store Ordering (TSO) memory model, and give an algorithm that reduces the fence insertion problem under TSO to the reachability problem for programs running under SC. Furthermore, we provide an abstraction scheme that substantially increases scalability to large numbers of processes. Based on our method, we have implemented a tool and run it successfully on a wide range of benchmarks.
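What a fence buys can be illustrated on the store-buffering litmus test with a toy TSO model (a sketch under stated assumptions, not the paper's algorithm, which computes minimal fence sets via reachability analysis under SC). Each thread is `x = 1; [fence;] r = y` (symmetrically); the fence is modeled as draining the store buffer, i.e., forbidding the store's commit to memory from drifting past the thread's subsequent load:

```python
def interleavings(a, b):
    """All interleavings of two event lists, preserving per-thread order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def outcomes(fenced):
    """Outcome set of the store-buffering test under a toy TSO model.

    Without a fence, each thread's store commit may be reordered after its
    own load (the store waits in the store buffer); with a fence, program
    order is enforced, which here coincides with SC behavior.
    """
    w0, l0 = ("w", "x"), ("r", "y", "r1")
    w1, l1 = ("w", "y"), ("r", "x", "r2")
    orders0 = [[w0, l0]] if fenced else [[w0, l0], [l0, w0]]
    orders1 = [[w1, l1]] if fenced else [[w1, l1], [l1, w1]]
    results = set()
    for t0 in orders0:
        for t1 in orders1:
            for s in interleavings(t0, t1):
                mem, regs = {"x": 0, "y": 0}, {}
                for ev in s:
                    if ev[0] == "w":
                        mem[ev[1]] = 1
                    else:
                        regs[ev[2]] = mem[ev[1]]
                results.add((regs["r1"], regs["r2"]))
    return results

print(sorted(outcomes(fenced=False)))  # includes (0, 0), the non-SC outcome
print(sorted(outcomes(fenced=True)))   # (0, 0) is gone: fences restore SC here
```

The fence-minimization problem the paper addresses is precisely to insert as few such drains as possible while eliminating all non-SC (non-persistent) behaviors.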

    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2015
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743 ; 9032
    Keywords
    weak memory, correctness, verification, TSO, concurrent program
    National Category
    Computer Sciences
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-253645 (URN)
    10.1007/978-3-662-46669-8_13 (DOI)
    000361751400013
    978-3-662-46668-1 (ISBN)
    Conference
    24th European Symposium on Programming, ESOP 2015, April 11–18, London, UK
    Projects
    UPMARC
    Available from: 2015-05-29 Created: 2015-05-29 Last updated: 2018-11-21
    2. A load-buffer semantics for total store ordering
    2018 (English). In: Logical Methods in Computer Science, ISSN 1860-5974, E-ISSN 1860-5974, Vol. 14, no 1, article id 9. Article in journal (Refereed), Published
    Abstract [en]

    We address the problem of verifying safety properties of concurrent programs running over the Total Store Order (TSO) memory model. Known decision procedures for this model are based on complex encodings of store buffers as lossy channels. These procedures assume that the number of processes is fixed. However, it is important in general to prove the correctness of a system/algorithm in a parametric way with an arbitrarily large number of processes. 

    In this paper, we introduce an alternative (yet equivalent) semantics to the classical TSO semantics that is more amenable to efficient algorithmic verification and to extension to parametric verification. To that end, we adopt a dual view in which load buffers are used instead of store buffers, so that information flows from the memory to the load buffers. We show that this new semantics allows us to (1) drastically simplify the safety analysis under TSO, (2) obtain a spectacular gain in efficiency and scalability compared to existing procedures, and (3) easily extend the decision procedure to the parametric case, which yields a new decidability result and, more importantly, a verification algorithm that is more general and more efficient in practice than the one for bounded instances.

    Keywords
    Verification, TSO, concurrent program, safety property, well-structured transition system
    National Category
    Computer Sciences
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-337278 (URN)
    000426512000008
    Projects
    UPMARC
    Available from: 2018-01-23 Created: 2017-12-21 Last updated: 2018-11-21
    3. Context-bounded analysis for POWER
    2017 (English). In: Tools and Algorithms for the Construction and Analysis of Systems: Part II, Springer, 2017, p. 56-74. Conference paper, Published paper (Refereed)
    Abstract [en]

    We propose an under-approximate reachability analysis algorithm for programs running under the POWER memory model, in the spirit of the work on context-bounded analysis initiated by Qadeer et al. in 2005 for detecting bugs in concurrent programs (assumed to run under the classical SC model). To that end, we first introduce a new notion of context-bounding that is suitable for reasoning about computations under POWER and that generalizes the one defined by Atig et al. in 2011 for the TSO memory model. Then, we provide a polynomial-size reduction of the context-bounded state reachability problem under POWER to the same problem under SC: given an input concurrent program P, our method produces a concurrent program P' such that, for a fixed number of context switches, running P' under SC yields the same set of reachable states as running P under POWER. The generated program P' contains the same number of processes as P and operates on the same data domain. By leveraging the standard model checker CBMC, we have implemented a prototype tool and applied it to a set of benchmarks, showing the feasibility of our approach.
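The effect of context-bounding can be sketched on plain interleavings (a hypothetical illustration; the paper's contribution is the code-to-code translation to SC checked with CBMC, not explicit enumeration). A context is a maximal segment of a schedule in which a single process is active, and bounding the number of context switches shrinks the explored schedule space:

```python
from itertools import permutations

def context_switches(schedule):
    """Number of points in a schedule where control passes between threads."""
    return sum(1 for a, b in zip(schedule, schedule[1:]) if a != b)

def bounded_schedules(events_per_thread, k):
    """Distinct interleavings (as thread-id tuples) with at most k context switches."""
    ids = [t for t, n in enumerate(events_per_thread) for _ in range(n)]
    return {s for s in permutations(ids) if context_switches(s) <= k}

# Two threads with two events each: 6 distinct interleavings in total,
# but only 2 with a single context switch and 4 with at most two.
for k in (1, 2, 3):
    print(k, len(bounded_schedules([2, 2], k)))  # 1 -> 2, 2 -> 4, 3 -> 6
```

The growth with k is what makes a small bound an effective under-approximation for bug hunting: most concurrency bugs manifest within few context switches, while the bounded space stays tractable.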

    Place, publisher, year, edition, pages
    Springer, 2017
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10206
    Keywords
    POWER, weak memory model, under approximation, translation, concurrent program, testing
    National Category
    Computer Systems
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-314901 (URN)
    10.1007/978-3-662-54580-5_4 (DOI)
    000440733400004
    978-3-662-54579-9 (ISBN)
    Conference
    23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2017, April 22–29, Uppsala, Sweden
    Projects
    UPMARC
    Available from: 2017-03-31 Created: 2017-02-07 Last updated: 2018-11-21. Bibliographically approved
    4. Optimal Stateless Model Checking under the Release-Acquire Semantics
    2018 (English). In: SPLASH OOPSLA 2018, Boston, Nov 4-9, 2018, ACM Digital Library, 2018. Conference paper, Published paper (Refereed)
    Abstract [en]

    We present a framework for efficient application of stateless model checking (SMC) to concurrent programs running under the Release-Acquire (RA) fragment of the C/C++11 memory model. Our approach is based on exploring the possible program orders, which define the order in which instructions of a thread are executed, and read-from relations, which define how reads obtain their values from writes. This is in contrast to previous approaches, which in addition explore the possible coherence orders, i.e., orderings between conflicting writes. Since unexpected test results such as program crashes or assertion violations depend only on the read-from relation, we avoid a potentially large source of redundancy. Our framework is based on a novel technique for determining whether a particular read-from relation is feasible under the RA semantics. We define an SMC algorithm which is provably optimal in the sense that it explores each program order and read-from relation exactly once. This optimality result is strictly stronger than previous analogous optimality results, which also take coherence order into account. We have implemented our framework in the tool Tracer. Experiments show that Tracer can be significantly faster than state-of-the-art tools that can handle the RA semantics.
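A tiny hypothetical program shows why exploring only read-from relations can be much coarser, and therefore cheaper, than distinguishing coherence orders as well: with two single-instruction writer threads and one reader of a shared variable x, six interleavings collapse to just three read-from classes. (This sketch omits the RA-consistency check for a candidate read-from relation, which is the technical core of the paper's method.)

```python
from itertools import permutations

# Three threads, one instruction each: two writers of x (values 1 and 2)
# and one reader of x. Every permutation of the three events is therefore
# a valid interleaving.
events = [("w", 1), ("w", 2), ("r", None)]

interleaving_outcomes = set()   # (value read, final value of x): rf + coherence
rf_only = set()                 # value read: the read-from relation alone
for order in permutations(events):
    x, read = 0, None
    for kind, val in order:
        if kind == "w":
            x = val
        else:
            read = x
    interleaving_outcomes.add((read, x))
    rf_only.add(read)

print(len(interleaving_outcomes))  # distinguishing coherence: 6 classes
print(len(rf_only))                # read-from only: 3 classes
```

Since assertion violations and crashes depend only on the values reads observe, the three rf classes already cover every observable behavior, which is the redundancy the paper's optimal algorithm avoids.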

    Place, publisher, year, edition, pages
    ACM Digital Library, 2018
    Keywords
    Software model checking, C/C++11, Release-Acquire, Concurrent program
    National Category
    Computer Systems
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-358241 (URN)
    Conference
    SPLASH OOPSLA 2018
    Projects
    UPMARC
    Available from: 2018-08-26 Created: 2018-08-26 Last updated: 2019-01-09. Bibliographically approved
  • 252.
    Ngo, Tuan-Phong
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Abdulla, Parosh
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Jonsson, Bengt
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
    Atig, Mohamed Faouzi
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Optimal Stateless Model Checking under the Release-Acquire Semantics (2018). In: SPLASH OOPSLA 2018, Boston, Nov 4-9, 2018, ACM Digital Library, 2018. Conference paper (Refereed)
    Abstract [en]

    We present a framework for efficient application of stateless model checking (SMC) to concurrent programs running under the Release-Acquire (RA) fragment of the C/C++11 memory model. Our approach is based on exploring the possible program orders, which define the order in which instructions of a thread are executed, and read-from relations, which define how reads obtain their values from writes. This is in contrast to previous approaches, which in addition explore the possible coherence orders, i.e., orderings between conflicting writes. Since unexpected test results such as program crashes or assertion violations depend only on the read-from relation, we avoid a potentially large source of redundancy. Our framework is based on a novel technique for determining whether a particular read-from relation is feasible under the RA semantics. We define an SMC algorithm which is provably optimal in the sense that it explores each program order and read-from relation exactly once. This optimality result is strictly stronger than previous analogous optimality results, which also take coherence order into account. We have implemented our framework in the tool Tracer. Experiments show that Tracer can be significantly faster than state-of-the-art tools that can handle the RA semantics.

  • 253.
    Nikoleris, Nikos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Efficient Memory Modeling During Simulation and Native Execution (2019). Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Application performance on computer processors depends on a number of complex architectural and microarchitectural design decisions. Consequently, computer architects rely on performance modeling to improve future processors without building prototypes. This thesis focuses on performance modeling and proposes methods that quantify the impact of the memory system on application performance.

    Detailed architectural simulation, a common approach to performance modeling, can be five orders of magnitude slower than execution on the actual processor. At this rate, simulating realistic workloads requires years of CPU time. Prior research uses sampling to speed up simulation. Using sampled simulation, only a number of small but representative portions of the workload are evaluated in detail. To fully exploit the speed potential of sampled simulation, the simulation method has to efficiently reconstruct the architectural and microarchitectural state prior to the simulation samples. Practical approaches to sampled simulation use either functional simulation at the expense of performance or checkpoints at the expense of flexibility. This thesis proposes three approaches that use statistical cache modeling to efficiently address the problem of cache warm up and speed up sampled simulation, without compromising flexibility. The statistical cache model uses sparse memory reuse information obtained with native techniques to model the performance of the cache. The proposed sampled simulation framework evaluates workloads 150 times faster than approaches that use functional simulation to warm up the cache.
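The quantity underlying such statistical cache models can be sketched with LRU stack distances (an exhaustive, illustrative computation; the thesis's model works from sparse, sampled reuse information rather than a full trace). The stack distance of an access, i.e., the number of distinct addresses touched since the previous access to the same address, determines whether a fully associative LRU cache of a given size would hit:

```python
def stack_distances(trace):
    """LRU stack distance of each access: the number of distinct addresses
    touched since the previous access to the same address (inf on first use)."""
    stack = []            # most recently used address at the end
    dists = []
    for addr in trace:
        if addr in stack:
            dists.append(len(stack) - stack.index(addr))
            stack.remove(addr)
        else:
            dists.append(float("inf"))
        stack.append(addr)
    return dists

def miss_ratio(trace, cache_lines):
    """Miss ratio of a fully associative LRU cache with `cache_lines` entries:
    an access hits iff its stack distance is at most the cache size."""
    d = stack_distances(trace)
    return sum(1 for x in d if x > cache_lines) / len(d)

trace = ["A", "B", "C", "A", "B", "C"]   # hypothetical address trace
print(miss_ratio(trace, 2))   # cache too small for the reuse pattern
print(miss_ratio(trace, 4))   # large enough: only cold misses remain
```

Because the distances are a property of the program's reuse behavior, not of any particular cache, one profile can predict miss ratios for many cache sizes, which is what makes warm-up-free sampled simulation possible.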

    Other approaches to performance modeling use analytical models based on data obtained from execution on native hardware. These native techniques allow for better understanding of the performance bottlenecks on existing hardware. Efficient resource utilization in modern multicore processors is necessary to exploit their peak performance. This thesis proposes native methods that characterize shared resource utilization in modern multicores. These methods quantify the impact of cache sharing and off-chip memory sharing on overall application performance. Additionally, they can quantify scalability bottlenecks for data-parallel, symmetric workloads.

    List of papers
    1. Extending statistical cache models to support detailed pipeline simulators
    2014 (English). In: 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), IEEE Computer Society, 2014, p. 86-95. Conference paper, Published paper (Refereed)
    Abstract [en]

    Simulators are widely used in computer architecture research. While detailed cycle-accurate simulations provide useful insights, studies using modern workloads typically require days or weeks. Evaluating many design points only exacerbates the simulation overhead. Recent works propose methods with good accuracy that reduce the simulation overhead either by sampling the execution (e.g., SMARTS and SimPoint) or by using fast analytical models of the simulated designs (e.g., interval simulation). While these techniques significantly reduce the simulation overhead, modeling processor components with large state, such as the last-level cache, requires costly simulation to warm them up. Statistical simulation methods, such as SMARTS, report that warm-up accounts for 99% of the simulation overhead, while only 1% of the time is spent simulating the target design. This paper proposes WarmSim, a method that eliminates the need to warm up the cache. WarmSim builds on a statistical cache modeling technique and extends it to model accurately not only the miss ratio but also the outcome of every cache request. WarmSim uses as input an application's memory reuse information, which is hardware-independent; therefore, different cache configurations can be simulated using the same input data. We demonstrate that this approach can be used to estimate the CPI of the SPEC CPU2006 benchmarks with an average error of 1.77%, reducing the overhead compared to a simulation with a 10M-instruction warm-up by a factor of 50x.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2014
    Series
    IEEE International Symposium on Performance Analysis of Systems and Software-ISPASS
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-224221 (URN)
    10.1109/ISPASS.2014.6844464 (DOI)
    000364102000010
    978-1-4799-3604-5 (ISBN)
    Conference
    ISPASS 2014, March 23-25, Monterey, CA
    Projects
    UPMARC
    Available from: 2014-05-06 Created: 2014-05-06 Last updated: 2018-12-14. Bibliographically approved
    2. CoolSim: Statistical Techniques to Replace Cache Warming with Efficient, Virtualized Profiling
    2016 (English). In: Proceedings of 2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) / [ed] Najjar, W.; Gerstlauer, A., IEEE, 2016, p. 106-115. Conference paper, Published paper (Refereed)
    Abstract [en]

    Simulation is an important part of the evaluation of next-generation computing systems. Detailed, cycle-accurate simulation, however, can be very slow when evaluating realistic workloads on modern microarchitectures. Sampled simulation (e.g., SMARTS and SimPoint) improves simulation performance by an order of magnitude or more through the reduction of large workloads into a small but representative sample. Additionally, the execution state just prior to a simulation sample can be stored into checkpoints, allowing for fast restoration and evaluation. Unfortunately, changes in software, architecture or fundamental pieces of the microarchitecture (e.g., hardware-software co-design) require checkpoint regeneration. For co-design, the end result degenerates to creating checkpoints for each modification, a task checkpointing was designed to eliminate. Therefore, a solution is needed that allows for fast and accurate simulation, without the need for checkpoints. Virtualized fast-forwarding (VFF), an alternative to using checkpoints, allows for execution at near-native speed between simulation points. Warming the microarchitectural state prior to each simulation point, however, requires functional simulation, a costly operation for large caches (e.g., 8 MB). Simulating future systems with caches of many MBs can require warming of billions of instructions, dominating simulation time. This paper proposes CoolSim, an efficient simulation framework that eliminates cache warming. CoolSim uses VFF to advance between simulation points while at the same time collecting sparse memory reuse information (MRI). The MRI is collected more than an order of magnitude faster than with functional simulation. At the simulation point, detailed simulation with a statistical cache model is used to evaluate the design. The previously acquired MRI is used to estimate whether each memory request hits in the cache. The MRI is an architecturally independent metric, and a single profile can be used in simulations of any size cache. We describe a prototype implementation of CoolSim based on KVM and gem5 running 19x faster than state-of-the-art sampled simulation, while it estimates the CPI of the SPEC CPU2006 benchmarks with 3.62% error on average, across a wide range of cache sizes.

    Place, publisher, year, edition, pages
    IEEE, 2016
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-322061 (URN)
    000399143000015
    9781509030767 (ISBN)
    Conference
    International Conference on Embedded Computer Systems - Architectures, Modeling and Simulation (SAMOS), JUL 17-21, 2016, Samos, GREECE
    Funder
    Swedish Foundation for Strategic Research; EU, FP7, Seventh Framework Programme, 610490
    Available from: 2017-05-16 Created: 2017-05-16 Last updated: 2018-12-14. Bibliographically approved
    3. Delorean: Virtualized Directed Profiling for Cache Modeling in Sampled Simulation
    2018 (English). Report (Other academic)
    Abstract [en]

    Current practice for accurate and efficient simulation (e.g., SMARTS and SimPoint) makes use of sampling to significantly reduce the time needed to evaluate new research ideas. By evaluating a small but representative portion of the original application, sampling can allow for both fast and accurate performance analysis. However, as cache sizes of modern architectures grow, simulation time is dominated by warming microarchitectural state and not by detailed simulation, reducing overall simulation efficiency. While checkpoints can significantly reduce cache warming, improving efficiency, they limit the flexibility of the system under evaluation, requiring new checkpoints for software updates (such as changes to the compiler and compiler flags) and many types of hardware modifications. An ideal solution would allow for accurate cache modeling for each simulation run without the need to generate rigid checkpointing data a priori.

    Enabling this new direction for fast and flexible simulation requires a combination of (1) a methodology that allows for hardware and software flexibility and (2) the ability to quickly and accurately model arbitrarily-sized caches. Current approaches that rely on checkpointing or statistical cache modeling require rigid, up-front state to be collected, which needs to be amortized over a large number of simulation runs. These earlier methodologies are insufficient for our goal of improved flexibility. In contrast, our proposed methodology, Delorean, outlines a unique solution to this problem. The Delorean simulation methodology enables both flexibility and accuracy by quickly generating a targeted cache model for the next detailed region on the fly, without the need for up-front simulation or modeling. More specifically, we propose a new, more accurate statistical cache modeling method that takes advantage of hardware virtualization to precisely determine the memory regions accessed and to minimize the time needed for data collection while maintaining accuracy.

    Delorean uses a multi-pass approach to understand the memory regions accessed by the next, upcoming detailed region. Our methodology collects the entire set of key memory accesses and, through fast virtualization techniques, progressively scans larger, earlier regions to learn more about these key accesses in an efficient way. Using these techniques, we demonstrate that Delorean allows for the fast evaluation of systems and their software through the generation of accurate cache models on the fly. Delorean outperforms previous proposals by an order of magnitude, with a simulation speed of 150 MIPS and a similar average CPI error (below 4%).

    Publisher
    p. 12
    Series
    Technical report / Department of Information Technology, Uppsala University, ISSN 1404-3203
    National Category
    Computer Systems
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-369320 (URN)
    Available from: 2018-12-12 Created: 2018-12-12 Last updated: 2019-01-08. Bibliographically approved
    4. Cache Pirating: Measuring the Curse of the Shared Cache
    2011 (English). In: Proc. 40th International Conference on Parallel Processing, IEEE Computer Society, 2011, p. 165-175. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE Computer Society, 2011
    National Category
    Computer Engineering
    Identifiers
    urn:nbn:se:uu:diva-181254 (URN)
    10.1109/ICPP.2011.15 (DOI)
    978-1-4577-1336-1 (ISBN)
    Conference
    ICPP 2011
    Projects
    UPMARC, CoDeR-MP
    Available from: 2011-10-17 Created: 2012-09-20 Last updated: 2018-12-14. Bibliographically approved
    5. Bandwidth Bandit: Quantitative Characterization of Memory Contention
    2013 (English). In: Proc. 11th International Symposium on Code Generation and Optimization: CGO 2013, IEEE Computer Society, 2013, p. 99-108. Conference paper, Published paper (Refereed)
    Abstract [en]

    On multicore processors, co-executing applications compete for shared resources, such as cache capacity and memory bandwidth. This leads to suboptimal resource allocation and can cause substantial performance loss, which makes it important to effectively manage these shared resources. This, however, requires insights into how the applications are impacted by such resource sharing. While there are several methods to analyze the performance impact of cache contention, less attention has been paid to general, quantitative methods for analyzing the impact of contention for memory bandwidth. To this end, we introduce the Bandwidth Bandit, a general, quantitative profiling method for analyzing the performance impact of contention for memory bandwidth on multicore machines. The profiling data captured by the Bandwidth Bandit is presented in a bandwidth graph. This graph accurately captures the measured application's performance as a function of its available memory bandwidth, and enables us to determine how much the application suffers when its available bandwidth is reduced. To demonstrate the value of this data, we present a case study in which we use the bandwidth graph to analyze the performance impact of memory contention when co-running multiple instances of a single-threaded application.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2013
    Keywords
    bandwidth, memory, caches
    National Category
    Computer Sciences
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-194101 (URN)
    10.1109/CGO.2013.6494987 (DOI)
    000318700200010
    978-1-4673-5524-7 (ISBN)
    Conference
    CGO 2013, 23-27 February, Shenzhen, China
    Projects
    UPMARC
    Funder
    Swedish Research Council
    Available from: 2013-04-18 Created: 2013-02-08 Last updated: 2018-12-14. Bibliographically approved
    6. A software based profiling method for obtaining speedup stacks on commodity multi-cores
    2014 (English). In: 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS): ISPASS 2014, IEEE Computer Society, 2014, p. 148-157. Conference paper, Published paper (Refereed)
    Abstract [en]

    A key quality metric for multi-threaded programs is how their execution times scale when increasing the number of threads. However, several bottlenecks can limit the scalability of a multi-threaded program, e.g., contention for shared cache capacity and off-chip memory bandwidth, as well as synchronization overheads. In order to improve the scalability of a multi-threaded program, it is vital to be able to quantify how the program is impacted by these scalability bottlenecks. We present a software profiling method for obtaining speedup stacks. A speedup stack reports how much each scalability bottleneck limits the scalability of a multi-threaded program; it thereby quantifies how much scalability can be improved by eliminating a given bottleneck. A software developer can use this information to determine which optimizations are most likely to improve scalability, while a computer architect can use it to analyze the resource demands of emerging workloads. The proposed method profiles the program on real commodity multicores (i.e., no simulation required) using existing performance counters. Consequently, the obtained speedup stacks accurately account for all idiosyncrasies of the machine on which the program is profiled. While the main contribution of this paper is the profiling method for obtaining speedup stacks, we present several examples of how speedup stacks can be used to analyze the resource requirements of multi-threaded programs. Furthermore, we discuss how their scalability can be improved by both software developers and computer architects.
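The shape of a speedup stack can be sketched with simple arithmetic (all numbers and the proportional attribution below are hypothetical assumptions; the paper derives its components from hardware performance counters rather than this simplified split). The gap between the ideal speedup n and the achieved speedup T1/Tn is divided across the measured bottlenecks:

```python
def speedup_stack(t1, tn, n, stall_cycles):
    """Toy speedup-stack decomposition.

    t1: single-thread execution time; tn: measured time with n threads;
    stall_cycles: hypothetical per-bottleneck cycle losses measured at n
    threads. The gap between ideal speedup (n) and achieved speedup (t1/tn)
    is attributed to each bottleneck in proportion to its stall cycles, so
    the achieved speedup plus all components sums to the ideal speedup.
    """
    achieved = t1 / tn
    gap = n - achieved
    total = sum(stall_cycles.values())
    return achieved, {k: gap * v / total for k, v in stall_cycles.items()}

# Hypothetical measurements for an 8-thread run of some workload.
achieved, stack = speedup_stack(
    t1=100.0, tn=20.0, n=8,
    stall_cycles={"cache contention": 30.0,
                  "memory bandwidth": 50.0,
                  "synchronization": 20.0})
print(round(achieved, 2))                     # 5.0 out of an ideal 8
for bottleneck, lost in sorted(stack.items()):
    print(bottleneck, round(lost, 2))
```

Reading such a stack top-down tells a developer which bottleneck to attack first: here the hypothetical bandwidth component costs the most speedup, so reducing off-chip traffic would pay off most.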

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2014
    Series
    IEEE International Symposium on Performance Analysis of Systems and Software-ISPASS
    National Category
    Computer Sciences
    Identifiers
urn:nbn:se:uu:diva-224230 (URN), 10.1109/ISPASS.2014.6844479 (DOI), 000364102000025, 978-1-4799-3604-5 (ISBN)
    Conference
    ISPASS 2014, March 23-25, Monterey, CA
    Projects
    UPMARC
Available from: 2014-05-06. Created: 2014-05-06. Last updated: 2018-12-14. Bibliographically approved
  • 254.
    Nikoleris, Nikos
    et al.
    Arm Research, Cambridge UK.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Carlson, Trevor E.
    Department of Computer Science, National University of Singapore.
Delorean: Virtualized Directed Profiling for Cache Modeling in Sampled Simulation, 2018. Report (Other academic)
    Abstract [en]

    Current practice for accurate and efficient simulation (e.g., SMARTS and Simpoint) makes use of sampling to significantly reduce the time needed to evaluate new research ideas. By evaluating a small but representative portion of the original application, sampling can allow for both fast and accurate performance analysis. However, as cache sizes of modern architectures grow, simulation time is dominated by warming microarchitectural state and not by detailed simulation, reducing overall simulation efficiency. While checkpoints can significantly reduce cache warming, improving efficiency, they limit the flexibility of the system under evaluation, requiring new checkpoints for software updates (such as changes to the compiler and compiler flags) and many types of hardware modifications. An ideal solution would allow for accurate cache modeling for each simulation run without the need to generate rigid checkpointing data a priori.

Enabling this new direction for fast and flexible simulation requires a combination of (1) a methodology that allows for hardware and software flexibility and (2) the ability to quickly and accurately model arbitrarily-sized caches. Current approaches that rely on checkpointing or statistical cache modeling require rigid, up-front state to be collected, which needs to be amortized over a large number of simulation runs. These earlier methodologies are insufficient for our goals for improved flexibility. In contrast, our proposed methodology, Delorean, outlines a unique solution to this problem. The Delorean simulation methodology enables both flexibility and accuracy by quickly generating a targeted cache model for the next detailed region on the fly without the need for up-front simulation or modeling. More specifically, we propose a new, more accurate statistical cache modeling method that takes advantage of hardware virtualization to precisely determine the memory regions accessed and to minimize the time needed for data collection while maintaining accuracy.

Delorean uses a multi-pass approach to understand the memory regions accessed by the next, upcoming detailed region. Our methodology collects the entire set of key memory accesses and, through fast virtualization techniques, progressively scans larger, earlier regions to learn more about these key accesses in an efficient way. Using these techniques, we demonstrate that Delorean allows for the fast evaluation of systems and their software through the generation of accurate cache models on the fly. Delorean outperforms previous proposals by an order of magnitude, with a simulation speed of 150 MIPS and a similar average CPI error (below 4%).

  • 255.
    Nikoleris, Nikos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Sandberg, Andreas
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Carlson, Trevor E.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
CoolSim: Eliminating Traditional Cache Warming with Fast, Virtualized Profiling, 2016. In: 2016 IEEE International Symposium On Performance Analysis Of Systems And Software ISPASS 2016, 2016, p. 149-150. Conference paper (Refereed)
    Abstract [en]

Sampling (e.g., SMARTS and SimPoint) improves simulation performance by an order of magnitude or more through the reduction of large workloads into a small but representative sample. Virtualized fast-forwarding (e.g., FSA) speeds up simulation further by advancing execution at near-native speed between simulation points, making cache warming the critical limiting factor for simulation performance. CoolSim is an efficient simulation framework that eliminates cache warming. It collects sparse memory reuse information (MRI) while advancing between simulation points using virtualized fast-forwarding. During detailed simulation, a statistical cache model uses the previously acquired MRI to estimate the performance of the caches. CoolSim builds upon KVM and gem5 and runs 19x faster than the state-of-the-art sampled simulation. It estimates the CPI of the SPEC CPU2006 benchmarks with 3.62% error on average, across a wide range of cache sizes.

  • 256.
    Nordén, Lars-Åke
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Mannila, Linda
    Linkoping Univ, Dept Comp & Informat Sci, Linkoping, Sweden.
    Pears, Arnold
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
Development of a self-efficacy scale for digital competences in schools, 2017. In: 2017 IEEE Frontiers in Education Conference (FIE): Proc. 47th ASEE/IEEE Frontiers in Education Conference, IEEE Press, 2017. Conference paper (Refereed)
    Abstract [en]

As computer science enters the school curricula in an increasing number of countries, teachers must prepare to integrate digital competences into their teaching. This integration is a moving target where new methods, tools and applications appear and disappear at such rates that teachers must have confidence to independently and continuously explore what is new, what is relevant and how to plan their pedagogic activities to include digital competences. In this context, approaches which can be used to study self-efficacy in digital competences among school teachers are desperately needed. With such a tool in place, we can make a baseline study and then follow teachers over time to measure changes in their self-efficacy, identify the causes of these changes, and learn how to build their digital competence self-efficacy in different ways. The same tool can also be used to measure self-efficacy in other populations, e.g., students in teacher training programs, to ensure that they obtain an adequate self-efficacy in digital competences during their studies. This paper describes the development of a self-efficacy scale in digital competences, based on the DigComp 2.0 framework definition of digital competence. The tool focuses predominantly on digital competences relevant for teachers in school years K-9.

  • 257.
    Nylén, Aletta
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
    Cajander, Åsa
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Pears, Arnold
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    McDermott, Roger
Why are we here?: Student perspectives on the goal of STEM higher education, 2017. In: Proc. 47th ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE Press, 2017. Conference paper (Refereed)
  • 258.
    Nylén, Aletta
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Isomöttönen, Ville
    McDermott, Roger
Open-ended projects opened up – aspects of openness, 2017. In: Proc. 47th ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE Press, 2017. Conference paper (Refereed)
  • 259.
    Nylén, Aletta
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Pears, Arnold
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Cajander, Åsa
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
    McDermott, Roger
School of Computer Science and Digital Media, Robert Gordon University, Aberdeen, UK.
    Isomöttönen, Ville
    Faculty of Information Technology, University of Jyväskylä, Finland.
Why are we here?: The educational value model (EVM) as a framework to investigate the role of students’ professional identity development, 2018. In: 2018 IEEE Frontiers in Education Conference (FIE), Piscataway, NJ: IEEE, 2018. Conference paper (Refereed)
    Abstract [en]

Education can be seen as preparation for a future profession, and some educational programs very clearly prepare their students for a certain profession, e.g. plumber, nurse or architect. The possible professions for students following education programs in computing are quite varied and thus difficult to cater for, but educating towards a professional life is still a stated goal in most higher education settings. We argue that this goal is typically far from being reached, and provide an analysis indicating factors that explain this situation. The analysis is based on the concept of professional identity. In earlier work [1], a framework was developed for reasoning about student interactions with teachers and with the regulatory structure of higher education. That compound model not only relates these players to one another, but also provides approaches to reasoning about the misalignments that arise when students and teachers approach their shared learning context from different perspectives. In this paper, the framework is applied to different aspects of professional identity, with the intent of bringing forth deeper insights into the challenges of educating towards professions. The issue is highly complex, and the framework provides a structure that supports analysing its different aspects in a more holistic manner.

  • 260.
    Nylén, Aletta
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
    Thota, Neena
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Eckerdal, Anna
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Kinnunen, Päivi
    Butler, Matthew
    Morgan, Michael
Multidimensional analysis of creative coding MOOC forums: a methodological discussion, 2015. In: Proc. 15th International Conference on Computing Education Research: Koli Calling, New York: ACM Press, 2015, p. 137-141. Conference paper (Refereed)
  • 261.
    Pan, Xiaoyue
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
Performance Modeling of Multi-core Systems: Caches and Locks, 2016. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

Performance is an important aspect of computer systems since it directly affects user experience. One way to analyze and predict performance is via performance modeling. In recent years, multi-core systems have made processors more powerful while keeping power consumption relatively low. However, the complicated design of these systems makes it difficult to analyze performance. This thesis presents performance modeling techniques for cache performance and synchronization cost on multi-core systems.

    A cache can be designed in many ways with different configuration parameters including cache size, associativity and replacement policy. Understanding cache performance under different configurations is useful to explore the design choices. We propose a general modeling framework for estimating the cache miss ratio under different cache configurations, based on the reuse distance distribution. On multi-core systems, each core usually has a private cache. Keeping shared data in private caches coherent has an extra cost. We propose three models to estimate this cost, based on information that can be gathered when running the program on a single core.

Locks are widely used as a synchronization primitive in multi-threaded programs on multi-core systems. While they are often necessary for protecting shared data, they also introduce lock contention, which causes performance issues. We present a model to predict how much contention a lock has on multi-core systems, based on information obtainable from profiling a run on a single core. If lock contention is shown to be a performance bottleneck, one way to mitigate it is to use another lock implementation. However, it is costly to investigate whether adopting another lock implementation would reduce lock contention, since doing so requires reimplementation and measurement. We present a model for forecasting lock contention under another lock implementation without replacing the current one.
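As a flavor of analytic lock modeling (a generic M/M/1 queueing approximation written for illustration, not the model developed in the thesis), the mean waiting time for a lock can be estimated from its acquisition rate and critical-section length, both of which are obtainable from a profiling run:

```python
def lock_wait_time(arrival_rate, service_time):
    """M/M/1-style estimate of the mean time a thread waits for a lock.

    arrival_rate: lock acquisitions per second, summed over all threads
    service_time: mean time the lock is held per acquisition (seconds)
    """
    rho = arrival_rate * service_time  # fraction of time the lock is held
    if rho >= 1.0:
        raise ValueError("lock is saturated: requests arrive faster "
                         "than critical sections complete")
    # Expected queueing delay of an M/M/1 queue: rho * s / (1 - rho).
    return rho * service_time / (1.0 - rho)

# A lock acquired 50,000 times/s with a 10 microsecond critical section
# is held half the time; each acquisition then queues behind earlier ones.
wait = lock_wait_time(50_000, 10e-6)
```

The nonlinear (1 - rho) denominator captures why contention explodes as a lock approaches saturation, which is also why replacing the lock implementation (changing the effective service time) can matter so much.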

    List of papers
    1. A Modeling Framework for Reuse Distance-based Estimation of Cache Performance
2015 (English). In: Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium on, IEEE, 2015, p. 62-71. Conference paper, Published paper (Refereed)
    Abstract [en]

We develop an analytical modeling framework for efficient prediction of cache miss ratios based on reuse distance distributions. The only input needed for our predictions is the reuse distance distribution of a program execution: previous work has shown that it can be obtained with very small overhead by sampling from native executions. This should be contrasted with previous approaches that base predictions on stack distance distributions, whose collection needs significantly larger overhead or additional hardware support. The predictions are based on a uniform modeling framework which can be specialized for a variety of cache replacement policies, including Random, LRU, PLRU, and MRU (aka. bit-PLRU), and for arbitrary values of cache size and cache associativity. We evaluate our modeling framework with the SPEC CPU 2006 benchmark suite over a set of cache configurations with varying cache size, associativity and replacement policy. The introduced inaccuracies were generally below 1% for the model of the policy, with an additional roughly 2% when set-local reuse distances must be estimated from global reuse distance distributions. The inaccuracy introduced by sampling is significantly smaller.

    Place, publisher, year, edition, pages
IEEE, 2015
    National Category
    Computer Systems
    Research subject
    Computer Science
    Identifiers
urn:nbn:se:uu:diva-260767 (URN), 10.1109/ISPASS.2015.7095785 (DOI), 000380554200007, 9781479919574 (ISBN)
    Conference
2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
    Projects
    UPMARC
Available from: 2015-08-24. Created: 2015-08-24. Last updated: 2016-09-15. Bibliographically approved
    2. Modeling cache coherence misses on multicores
2014 (English). In: 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), IEEE, 2014, p. 96-105. Conference paper, Published paper (Refereed)
    Abstract [en]

    While maintaining the coherency of private caches, invalidation-based cache coherence protocols introduce cache coherence misses. We address the problem of predicting the number of cache coherence misses in the private cache of a parallel application when running on a multicore system with an invalidation-based cache coherence protocol. We propose three new performance models (uniform, phased and symmetric) for estimating the number of coherence misses from information about inter-core data sharing patterns and the individual core's data reuse patterns. The inputs to the uniform and phased models are the write frequency and reuse distance distribution of shared data from different cores. This input can be obtained either from profiling the target application on a single core or by analyzing the data access pattern statically, and does not need a detailed simulation of the pattern of interleaving accesses to shared data. The output of the models is an estimated number of coherence misses of the target application. The output can be combined with the number of other kinds of misses to estimate the total number of misses in each core's private cache. This output can also be used to guide program optimization to improve cache performance. We evaluate our models with a set of benchmarks from the PARSEC benchmark suite on real hardware.
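The flavor of such an estimate can be sketched as follows. This is a simplified uniform-style model written for illustration, not the paper's exact formulation: a reuse of a shared line becomes a coherence miss if some other core wrote, and thus invalidated, the line during the reuse window, with remote writes assumed independent and uniformly likely per access.

```python
def coherence_miss_ratio(reuse_dist, remote_write_prob):
    """Estimate the fraction of reuses that become coherence misses.

    reuse_dist:        dict reuse distance d (number of intervening
                       accesses) -> probability of that distance
    remote_write_prob: per-access probability that another core writes
                       (and thus invalidates) the line, assumed uniform
    """
    # A reuse at distance d survives only if none of the d intervening
    # accesses coincided with a remote write to the line.
    return sum(p * (1.0 - (1.0 - remote_write_prob) ** d)
               for d, p in reuse_dist.items())

# Short reuses mostly survive; long reuses are likely invalidated.
ratio = coherence_miss_ratio({10: 0.5, 1000: 0.5}, remote_write_prob=0.001)
```

The inputs (a reuse distance distribution and a write frequency) match what the abstract says can be gathered from a single-core profiling run, without simulating the interleaving of accesses.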

    Place, publisher, year, edition, pages
IEEE, 2014
    Series
    IEEE International Symposium on Performance Analysis of Systems and Software-ISPASS
    National Category
    Computer Systems
    Research subject
    Computer Science
    Identifiers
urn:nbn:se:uu:diva-238139 (URN), 10.1109/ISPASS.2014.6844465 (DOI), 000364102000011, 978-1-4799-3604-5 (ISBN)
    Conference
    2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Monterey, CA
    Projects
    UPMARC
Available from: 2014-12-09. Created: 2014-12-09. Last updated: 2016-02-22. Bibliographically approved
    3. Predicting the Cost of Lock Contention in Parallel Applications on Multicores using Analytic Modeling
2012 (English). In: Proc. 5th Swedish Workshop on Multi-Core Computing, 2012. Conference paper, Published paper (Other academic)
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-270130 (URN)
    Conference
    MCC12
    Projects
    UPMARC
Available from: 2012-11-23. Created: 2015-12-21. Last updated: 2018-01-10. Bibliographically approved
    4. Forecasting Lock Contention Before Adopting Another Lock Algorithm
2015 (English). Report (Other academic)
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-270716 (URN)
Available from: 2016-01-03. Created: 2016-01-03. Last updated: 2018-01-10. Bibliographically approved
  • 262.
    Pan, Xiaoyue
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Jonsson, Bengt
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
A Modeling Framework for Reuse Distance-based Estimation of Cache Performance, 2015. In: Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium on, IEEE, 2015, p. 62-71. Conference paper (Refereed)
    Abstract [en]

We develop an analytical modeling framework for efficient prediction of cache miss ratios based on reuse distance distributions. The only input needed for our predictions is the reuse distance distribution of a program execution: previous work has shown that it can be obtained with very small overhead by sampling from native executions. This should be contrasted with previous approaches that base predictions on stack distance distributions, whose collection needs significantly larger overhead or additional hardware support. The predictions are based on a uniform modeling framework which can be specialized for a variety of cache replacement policies, including Random, LRU, PLRU, and MRU (aka. bit-PLRU), and for arbitrary values of cache size and cache associativity. We evaluate our modeling framework with the SPEC CPU 2006 benchmark suite over a set of cache configurations with varying cache size, associativity and replacement policy. The introduced inaccuracies were generally below 1% for the model of the policy, with an additional roughly 2% when set-local reuse distances must be estimated from global reuse distance distributions. The inaccuracy introduced by sampling is significantly smaller.
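To convey the flavor of a reuse-distance-based model (a toy sketch for one policy, not the paper's full framework), the following fixed-point computation estimates the miss ratio of a fully associative cache with Random replacement, using only a reuse distance distribution as input:

```python
def miss_ratio_random(reuse_dist, cache_lines, iterations=200):
    """Fixed-point miss ratio for a fully associative Random cache.

    reuse_dist:  dict reuse distance d (memory accesses between two uses
                 of the same line) -> probability; float('inf') stands
                 for cold accesses that always miss
    cache_lines: number of lines in the cache

    With miss ratio m, each of the d intervening accesses misses with
    probability m, and a miss evicts the tracked line with probability
    1/cache_lines, so the line survives its reuse window with
    probability roughly (1 - m / cache_lines) ** d.
    """
    m = 1.0  # start from "everything misses" and iterate to convergence
    for _ in range(iterations):
        m = sum(p if d == float("inf")
                else p * (1.0 - (1.0 - m / cache_lines) ** d)
                for d, p in reuse_dist.items())
    return m

# Toy distribution: tight reuses, medium reuses, long reuses, cold misses.
dist = {1: 0.5, 100: 0.3, 10_000: 0.15, float("inf"): 0.05}
small = miss_ratio_random(dist, cache_lines=256)
large = miss_ratio_random(dist, cache_lines=1024)  # bigger cache, fewer misses
```

The fixed point arises because the eviction pressure on any one line depends on the overall miss ratio, which is itself what we are solving for; specializing the survival probability to other policies is the kind of step the abstract's uniform framework generalizes.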

  • 263.
    Pan, Xiaoyue
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Klaftenegger, David
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
    Jonsson, Bengt
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
Forecasting Lock Contention Before Adopting Another Lock Algorithm, 2015. Report (Other academic)
  • 264.
    Panda, Anmol
    et al.
    BITS Pilani KK Birla Goa Campus, Dept Comp Sci & Informat Syst, Sancoale, Goa, India..
    Rümmer, Philipp
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Goveas, Neena
    BITS Pilani KK Birla Goa Campus, Dept Comp Sci & Informat Syst, Sancoale, Goa, India..
A Comparative Study of GPUVerify and GKLEE, 2016. In: 2016 Fourth International Conference On Parallel, Distributed And Grid Computing (PDGC) / [ed] Kumar, P Singh, AK, IEEE, 2016, p. 112-117. Conference paper (Refereed)
    Abstract [en]

Use of Graphics Processing Unit (GPU) software is increasing due to the need for data intensive operations and availability of GPUs. This has led to a need for effective GPU software verification tools. These tools have to satisfy requirements such as accuracy, reliability and ease of use. In this work, we have considered two such tools: GPUVerify and GKLEE. Our objectives were to learn about the common challenges developers face in GPU programming, to understand the specific bugs that these two tools report, and to compare their scope and scalability. We have also considered usability and learnability aspects. In order to test the tools, twenty-six benchmarks were selected from open-source applications. These benchmarks were then verified using the tools and the results documented and analysed. The conclusions have been included in the final section.

  • 265.
    Pears, Arnold
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
Assuring the Quality of Engineering Education, 2015. In: 2015 International Conference on Learning and Teaching in Computing and Engineering, 2015, p. 108-111. Conference paper (Refereed)
    Abstract [en]

    Only the foolhardy would question the importance and status that quality assurance holds in the current political climate that prevails for higher education. This paper critiques the culture of higher education quality assurance and accreditation that has emerged over the last twenty years. An underlying trend that influences quality systems for higher education is the focus on process and service. This results in quality systems that focus on documenting educational processes, and which often assume that students are a type of customer, and education a type of service. This paper argues that this educational epistemology oversimplifies the nature and role of higher education. We argue that a more holistic quality evaluation model is needed, including often overlooked aspects of quality such as curriculum, pedagogical aspects of instructional design combined with industrial and educational quality assurance processes that can better meet the needs of practitioners and institutions.

  • 266.
    Pears, Arnold
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Cajander, Åsa
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
The Archetype Learning Method: Scaffolding teamwork competences in the engineering classroom, 2017. In: Proc. 47th ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE Press, 2017. Conference paper (Refereed)
  • 267.
    Pears, Arnold
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Nylén, Aletta
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
Implications of anonymous assessment, 2015. In: Proc. 45th ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE Press, 2015, p. 1404-1408. Conference paper (Refereed)
    Abstract [en]

The role of anonymous assessment in ensuring fair and equitable outcomes for students has been one of the major tenets of educational reform over the last few decades [1]. One of the major goals of these efforts is to reduce the impact of subconscious discriminatory behaviour in assigning grades based on the examiner's perceptions of the ability of gender or minority groups. Recently, however, research has been emerging which challenges the widespread assumptions about the benefits of anonymity drawn from Newstead's work. Contrary results include the work of Dorsey and Colliver, 1995, in medical education, and Batten et al., 2013, who explore the impact of student reputation on assessment. These and many other studies conclude that anonymous assessment resulted in no apparent changes in assessment outcomes. In this paper we explore the implications of anonymity, taking examples from educational settings where student anonymity is already an adopted practice. We discuss the positive and negative implications of student anonymity, and identify areas for future research.

  • 268.
    Pears, Arnold
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Nylén, Aletta
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
A critical analysis of trends in student-centric engineering education and their implications for learning, 2016. In: Proc. 46th ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE Press, 2016. Conference paper (Refereed)
    Abstract [en]

Student-centric education has emerged as a dominant aspect of Higher Education policy over the last two decades. Much has been written about the benefits of student-active educational approaches, and applied educational research, for instance the meta-study of Hattie, places emphasis on student-centric learning practices that enhance achieved learning outcomes.

Most existing studies have been evaluations of single courses. In contrast, this study focusses on the complete study context of the learner, who is typically taking two or three courses simultaneously.

    Our primary goal in this paper is to explore potential challenges as we attempt to scale up active learning to encompass the full curricula. We use a mixture of interview and survey data collected from staff, combined with course schedules and student input to explore some of the potential implications of mandating a student-centric approach over an entire curriculum.

  • 269.
    Peters, Anne-Kathrin
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
Learning Computing at University: Participation and Identity: A Longitudinal Study, 2017. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Computing education has struggled with student engagement and diversity in the student population for a long time. Research in science, technology, engineering, and mathematics (STEM) education suggests that taking a social, long-term perspective on learning is a fruitful approach to resolving some of these persistent challenges.

A longitudinal study has been conducted, following students from two computing study programmes (CS/IT) over a three-year period. The students reflected on their experiences with CS/IT in a series of interviews. Drawing on social identity theory, the analysis has focused on describing participation in CS/IT (doing, thinking, and feeling in relation to CS/IT) as negotiated among different people.

    Phenomenographic analysis yields an outcome space that describes increasingly broad ways in which the students experience participation in CS/IT over the years. Two further outcome spaces provide nuanced insights into experiences that are of increasing relevance as the students advance in their studies: participation as problem solving and problem solving for others. Problem solving, defined as solving difficult (technical) problems, seems to predominate in the learning environment. Problem solving for others brings the user into perspective, but not until the human-computer interaction (HCI) course in year three. Students react with scepticism to HCI, excluding it from computing; some of them are students who commenced their studies with broader interests in computing.

    Demonstrating (technical) problem-solving competence is the most vital indicator of competence in the two study programmes, and the students adapt their reflections on who they are as computing students and professionals accordingly. People showing broader interests in computing risk being marginalised. I identify a gap between conceptions of computing as interdisciplinary and important for society and constructions of computing as technical. Closing the gap could improve retention and diversity, and result in graduates who are better prepared to contribute to societal development.

    List of papers
    1. Students' experiences and attitudes towards learning Computer Science
    2012 (English)In: Proc. 42nd ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE , 2012, p. 88-93Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Piscataway, NJ: IEEE, 2012
    National Category
    Computer Sciences Educational Sciences
    Identifiers
    urn:nbn:se:uu:diva-184605 (URN)10.1109/FIE.2012.6462238 (DOI)000356489200033 ()978-1-4673-1353-7 (ISBN)
    Available from: 2012-10-06 Created: 2012-11-09 Last updated: 2018-01-12. Bibliographically approved
    2. Engagement in Computer Science and IT — What!: A matter of identity?
    2013 (English)In: Proc. 1st International Conference on Learning and Teaching in Computing and Engineering, Los Alamitos, CA: IEEE Computer Society, 2013, p. 114-121Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Los Alamitos, CA: IEEE Computer Society, 2013
    National Category
    Computer Sciences Educational Sciences
    Identifiers
    urn:nbn:se:uu:diva-202686 (URN)10.1109/LaTiCE.2013.42 (DOI)000324484000016 ()978-1-4673-5627-5 (ISBN)
    Conference
    LaTiCE 2013
    Available from: 2013-06-22 Created: 2013-06-25 Last updated: 2018-01-11. Bibliographically approved
    3. First year Computer Science and IT students' experience of participation in the discipline
    2014 (English)In: Proc. 2nd International Conference on Learning and Teaching in Computing and Engineering, Los Alamitos, CA: IEEE Computer Society, 2014, p. 1-8Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Los Alamitos, CA: IEEE Computer Society, 2014
    National Category
    Computer Sciences Educational Sciences
    Identifiers
    urn:nbn:se:uu:diva-224653 (URN)10.1109/LaTiCE.2014.9 (DOI)000355978500001 ()978-1-4799-3591-8 (ISBN)
    Conference
    LaTiCE 2014
    Available from: 2014-06-10 Created: 2014-05-16 Last updated: 2018-01-11. Bibliographically approved
    4. Second year Computer Science and IT students' experience of participation in the discipline
    2015 (English)In: Proc. 15th International Conference on Computing Education Research: Koli Calling, New York: ACM Press, 2015, p. 68-76Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    New York: ACM Press, 2015
    National Category
    Computer Sciences Educational Sciences
    Identifiers
    urn:nbn:se:uu:diva-269178 (URN)10.1145/2828959.2828962 (DOI)978-1-4503-4020-5 (ISBN)
    Available from: 2015-11-21 Created: 2015-12-14 Last updated: 2018-01-10. Bibliographically approved
    5. Students' experience of participation in a discipline: A longitudinal study of computer science and IT engineering students
    2018 (English)In: ACM Transactions on Computing Education, ISSN 1946-6226, E-ISSN 1946-6226, Vol. 19, no 1, article id 5Article in journal (Refereed) Published
    National Category
    Computer and Information Sciences Educational Sciences
    Identifiers
    urn:nbn:se:uu:diva-331401 (URN)10.1145/3230011 (DOI)000456615300005 ()
    Available from: 2018-09-28 Created: 2017-10-13 Last updated: 2019-04-03. Bibliographically approved
  • 270.
    Peters, Anne-Kathrin
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Students' experience of participation in a discipline: A longitudinal study of computer science and IT engineering students2018In: ACM Transactions on Computing Education, ISSN 1946-6226, E-ISSN 1946-6226, Vol. 19, no 1, article id 5Article in journal (Refereed)
  • 271.
    Peters, Anne-Kathrin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Berglund, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Eckerdal, Anna
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Pears, Arnold
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Second year Computer Science and IT students' experience of participation in the discipline2015In: Proc. 15th International Conference on Computing Education Research: Koli Calling, New York: ACM Press, 2015, p. 68-76Conference paper (Refereed)
  • 272.
    Peters, Anne-Kathrin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hussain, Waqar
    Cajander, Åsa
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
    Clear, Tony
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Preparing the global software engineer2015In: Proc. 10th International Conference on Global Software Engineering, Los Alamitos, CA: IEEE Computer Society, 2015, p. 61-70Conference paper (Refereed)
  • 273.
    Peters, Anne-Kathrin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Oden Choi, Judeth
    The making of a computer scientist2018In: XRDS, ISSN 1528-4972, Vol. 25, no 1, p. 7-8Article in journal (Refereed)
  • 274.
    Pinol, Oriol Pinol
    et al.
    Yanzi Networks AB, Stockholm, Sweden..
    Raza, Shahid
    SICS Swedish ICT, Stockholm, Sweden..
    Eriksson, Joakim
    Yanzi Networks AB, Stockholm, Sweden.;SICS Swedish ICT, Stockholm, Sweden..
    Voigt, Thiemo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. SICS Swedish ICT, Stockholm, Sweden..
    BSD-based Elliptic Curve Cryptography for the Open Internet of Things2015In: 2015 7th International Conference on New Technologies, Mobility and Security (NTMS), 2015Conference paper (Refereed)
    Abstract [en]

    The Internet of Things (IoT) is the interconnection of everyday physical objects with the Internet and their representation in the digital world. Because physical objects are connected to the untrusted Internet, security has become an important pillar for the success of IoT-based services. Things in the IoT are resource-constrained devices with limited processing and storage capabilities. Often, these things are battery-powered and connected through lossy wireless links. Therefore, lightweight and efficient ways of providing secure communication in the IoT are needed. In this context, Elliptic Curve Cryptography (ECC) is considered a strong candidate to provide security in the IoT while being able to function in constrained environments. In this paper we present a lightweight implementation and evaluation of ECC for the Contiki OS. For fast, secure and cost-effective mass development of IoT-based services by different vendors, it is important that IoT protocols are implemented and released as open source under open licenses. To the best of our knowledge, our ECC is the first lightweight BSD-licensed ECC for IoT devices. We show the feasibility of our implementation through a thorough performance analysis using several implementations and optimization algorithms. Moreover, we evaluate it on a real IoT hardware platform.
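
    As a rough illustration of the group operations such an ECC library must provide, here is a toy sketch of our own (not the paper's Contiki implementation) using the textbook curve y^2 = x^3 + 2x + 2 over F_17; production IoT code would use a standardized curve such as NIST P-256 with constant-time field arithmetic.

```python
# Toy elliptic-curve arithmetic over a small prime field. Curve and generator
# are the classic textbook example; real deployments use standardized curves.

P, A, B = 17, 2, 2          # field prime and curve coefficients
O = None                    # point at infinity (group identity)

def add(p1, p2):
    """Add two affine points on the curve y^2 = x^3 + A*x + B (mod P)."""
    if p1 is O: return p2
    if p2 is O: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return O                                    # p2 is the inverse of p1
    if p1 == p2:                                    # point doubling
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:                                           # chord through two points
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Double-and-add scalar multiplication, the core of ECDH/ECDSA."""
    acc = O
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

G = (5, 1)                  # generator; the group it generates has order 19
```

    Because scalar multiplication commutes, the same double-and-add routine underlies ECDH key agreement: both parties compute the same shared point from their own secret scalar and the other party's public point.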

  • 275. Piskac, Ruzica
    et al.
    Rümmer, PhilippUppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Verified Software. Theories, Tools, and Experiments: Revised Selected Papers2018Conference proceedings (editor) (Refereed)
  • 276. Porter, Leo
    et al.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Member spotlight part 22017In: ACM SIGCSE Bulletin, ISSN 0097-8418, Vol. 49, no 2, p. 11-14Article in journal (Other (popular science, discussion, etc.))
  • 277.
    Rezine, Othmane
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Verification of networks of communicating processes: Reachability problems and decidability issues2017Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Computer systems are used in almost all aspects of our lives and our dependency on them keeps on increasing. When computer systems are used to handle critical tasks, any software failure can cause severe human and/or material losses. Therefore, for such applications, it is important to detect software errors at an early stage of software development. Furthermore, the growing use of concurrent and distributed programs exponentially increases the complexity of computer systems, making the problem of detecting software errors even harder (if not impossible). This calls for defining systematic and efficient techniques to evaluate the safety and the correctness of programs. The aim of Model-Checking is to analyze automatically whether a given program satisfies its specification. Early applications of Model-Checking were restricted to systems whose behaviors can be captured by finite graphs, so-called finite-state systems. Since many computer systems cannot be modeled as finite-state machines, there has been a growing interest in extending the applicability of Model-Checking to infinite-state systems.

    The goal of this thesis is to extend the applicability of Model Checking for three instances of infinite-state systems: Ad-Hoc Networks, Dynamic Register Automata and Multi Pushdown Systems. Each one of these instances models challenging types of networks of communicating processes. In both Ad-Hoc Networks and Dynamic Register Automata, communication is carried through message passing. In each type of network, a graph topology models the communication links between processes in the network. The graph topology is static in the case of Ad-Hoc Networks while it is dynamic in the case of Dynamic Register Automata. The number of processes in both types of networks is unbounded. Finally, we consider Multi Pushdown Systems, a model used to study the behaviors of concurrent programs composed of recursive sequential programs communicating through a shared memory.

    List of papers
    1. Budget-bounded model-checking pushdown systems
    2014 (English)In: Formal methods in system design, ISSN 0925-9856, E-ISSN 1572-8102, Vol. 45, no 2, p. 273-301Article in journal (Refereed) Published
    Abstract [en]

    We address the verification problem for concurrent programs modeled as multi-pushdown systems (MPDS). In general, MPDS are Turing powerful and hence come with undecidability of all basic decision problems. Because of this, several subclasses of MPDS have been proposed and studied in the literature (Atig et al. in LNCS, Springer, Berlin, 2005; La Torre et al. in LICS, IEEE, 2007; Lange and Lei in Inf Didact 8, 2009; Qadeer and Rehof in TACAS, LNCS, Springer, Berlin, 2005). In this paper, we propose the class of bounded-budget MPDS, which are restricted in the sense that each stack can perform an unbounded number of context switches only if its depth is below a given bound, and a bounded number of context switches otherwise. We show that the reachability problem for this subclass is Pspace-complete and that LTL-model-checking is Exptime-complete. Furthermore, we propose a code-to-code translation that takes a concurrent program as input and produces a sequential program such that running the original program under the budget-bounded restriction yields the same set of reachable states as running the sequential program. Moreover, detecting (fair) non-terminating executions in the original program can be reduced to LTL-Model-Checking of the sequential program. By leveraging standard sequential analysis tools, we have implemented a prototype tool and applied it on a set of benchmarks, showing the feasibility of our translation.
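
    The idea of restricting context switches can be illustrated with a toy two-stack system. The sketch below is our own illustration, not the paper's construction: it explores reachable configurations while counting switches of the active stack, in the spirit of the context-bounded analysis of Qadeer and Rehof cited above; the budget-bounded class refines such bounds by tying the permitted switches to stack depth.

```python
from collections import deque

# A toy two-stack pushdown system. Transitions are tuples
# (state, stack_index, op, symbol, next_state) with op "push" or "pop".
TRANS = [
    ("q0", 0, "push", "a", "q0"),
    ("q0", 1, "push", "b", "q0"),
    ("q0", 0, "pop",  "a", "q1"),
    ("q1", 1, "pop",  "b", "target"),
]

def reachable(target, max_switches, depth_cap=3):
    """BFS over configurations (state, stacks, active stack, switches used).
    depth_cap bounds stack growth so the exploration is finite."""
    init = ("q0", ((), ()), 0, 0)
    seen, queue = {init}, deque([init])
    while queue:
        state, stacks, active, used = queue.popleft()
        if state == target:
            return True
        # fire a transition on the active stack
        for (s, i, op, sym, t) in TRANS:
            if s != state or i != active:
                continue
            stk = stacks[i]
            if op == "push" and len(stk) < depth_cap:
                new_stk = stk + (sym,)
            elif op == "pop" and stk and stk[-1] == sym:
                new_stk = stk[:-1]
            else:
                continue
            new = list(stacks); new[active] = new_stk
            cfg = (t, tuple(new), active, used)
            if cfg not in seen:
                seen.add(cfg); queue.append(cfg)
        # or switch the active stack, if the switch budget allows
        if used < max_switches:
            cfg = (state, stacks, 1 - active, used + 1)
            if cfg not in seen:
                seen.add(cfg); queue.append(cfg)
    return False
```

    In this toy system, reaching "target" requires alternating between the two stacks three times, so the target is reachable with a budget of three switches but not with two.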

    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-234422 (URN)10.1007/s10703-014-0207-y (DOI)000343210700007 ()
    Projects
    UPMARC, Concurrent recursive programs
    Funder
    Swedish Research Council
    Available from: 2014-04-25 Created: 2014-10-17 Last updated: 2018-01-11
    2. Verification of Directed Acyclic Ad Hoc Networks
    2013 (English)In: Formal Techniques for Distributed Systems: FORTE 2013, Springer Berlin/Heidelberg, 2013, p. 193-208Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Springer Berlin/Heidelberg, 2013
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743 ; 7892
    National Category
    Computer Systems
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-211409 (URN)10.1007/978-3-642-38592-6_14 (DOI)978-3-642-38591-9 (ISBN)
    Conference
    Formal Techniques for Distributed Systems (FORTE 2013), June 3-5, 2013, Florence, Italy
    Projects
    ProFuN
    Funder
    Swedish Foundation for Strategic Research
    Available from: 2013-11-27 Created: 2013-11-22 Last updated: 2017-11-27. Bibliographically approved
    3. Verification of Dynamic Register Automata
    2014 (English)In: Leibniz International Proceedings in Informatics: IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2014), 2014Conference paper, Published paper (Refereed)
    Abstract [en]

    We consider the verification problem for Dynamic Register Automata (Dra). Dra extend classical register automata by process creation. In this setting, each process is equipped with a finite number of registers in which the process IDs of other processes can be stored. A process can communicate with processes whose IDs are stored in its registers and can send them the content of its registers. The state reachability problem asks whether a Dra reaches a configuration where at least one process is in an error state. We first show that this problem is in general undecidable. This result holds even when we restrict the analysis to configurations where the maximal length of the simple paths in their underlying (un)directed communication graphs is bounded by some constant. Then we introduce the model of degenerative Dra which allows non-deterministic reset of the registers. We prove that for every given Dra, its corresponding degenerative one has the same set of reachable states. While the state reachability of a degenerative Dra remains undecidable, we show that the problem becomes decidable with nonprimitive-recursive complexity when we restrict the analysis to strongly bounded configurations, i.e. configurations whose underlying undirected graphs have bounded simple paths. Finally, we consider the class of strongly safe Dra, where all the reachable configurations are assumed to be strongly bounded. We show that for strongly safe Dra, the state reachability problem becomes decidable.

    Keywords
    Register Automata, State Reachability, Formal Verification
    National Category
    Computer Sciences
    Research subject
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-237854 (URN)
    Conference
    IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, New Delhi, India, December 15–17 2014.
    Projects
    ProFuN, UPMARC
    Available from: 2014-12-05 Created: 2014-12-05 Last updated: 2018-01-11. Bibliographically approved
    4. Verification of buffered dynamic register automata
    2015 (English)In: Networked Systems: NETYS 2015, Springer, 2015, p. 15-31Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Springer, 2015
    Series
    Lecture Notes in Computer Science ; 9466
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-247828 (URN)10.1007/978-3-319-26850-7_2 (DOI)978-3-319-26849-1 (ISBN)
    Conference
    NETYS 2015, May 13–15, Agadir, Morocco
    Projects
    ProFuN, UPMARC
    Funder
    Swedish Foundation for Strategic Research, RIT08-0065
    Available from: 2016-03-23 Created: 2015-03-24 Last updated: 2018-01-11. Bibliographically approved
  • 278.
    Ros, Alberto
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    The Superfluous Load Queue2018In: 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), IEEE, 2018, p. 95-107Conference paper (Refereed)
    Abstract [en]

    In an out-of-order core, the load queue (LQ), the store queue (SQ), and the store buffer (SB) are responsible for ensuring: i) correct forwarding of stores to loads and ii) correct ordering among loads (with respect to external stores). The first requirement safeguards the sequential semantics of program execution and applies to both serial and parallel code; the second requirement safeguards the semantics of coherence and consistency (e.g., TSO). In particular, loads search the SQ/SB for the latest value that may have been produced by a store, and stores and invalidations search the LQ to find speculative loads in case they violate uniprocessor or multiprocessor ordering. To meet timing constraints the LQ and SQ/SB system is composed of CAM structures that are frequently searched. This results in high complexity, cost, and significant difficulty to scale, but is the current state of the art. Prior research demonstrated the feasibility of a non-associative LQ by replaying loads at commit. There is a steep cost however: a significant increase in L1 accesses and contention for L1 ports. This is because prior work assumes Sequential Consistency and completely ignores the existence of a SB in the system. In contrast, we intentionally delay stores in the SB to achieve a total management of stores and loads in a core, while still supporting TSO. Our main result is that we eliminate the LQ without burdening the L1 with extra accesses. Store forwarding is achieved by delaying our own stores until speculatively issued loads are validated on commit, entirely in-core; TSO load-to-load ordering is preserved by delaying remote external stores in their SB until our own speculative reordered loads commit. While the latter is inspired by recent work on non-speculative load reordering, our contribution here is to show that this can be accomplished without having a load queue. Eliminating the LQ results in both energy savings and performance improvement from the elimination of LQ-induced stalls.
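
    The TSO behaviour at the heart of this design, loads bypassing the thread's own buffered stores, can be reproduced with a small explicit-state sketch of our own (not the paper's mechanism) that enumerates all executions of the classic store-buffering litmus test with and without per-thread FIFO store buffers:

```python
# Store-buffering litmus test:  T0: x=1; r0=y      T1: y=1; r1=x
# Under sequential consistency (SC) the outcome r0=r1=0 is impossible; with
# per-thread FIFO store buffers (TSO) each load can bypass the thread's own
# buffered store, so r0=r1=0 becomes observable.

PROGS = [[("st", "x", 1), ("ld", "y", "r0")],
         [("st", "y", 1), ("ld", "x", "r1")]]

def outcomes(tso):
    """Enumerate all final (r0, r1) values by explicit-state exploration."""
    results = set()
    # configuration: (pc0, pc1, buffer0, buffer1, memory, registers)
    init = (0, 0, (), (), (("x", 0), ("y", 0)), ())
    stack, seen = [init], {init}
    while stack:
        pc0, pc1, b0, b1, mem_t, regs_t = stack.pop()
        pcs, bufs = [pc0, pc1], [list(b0), list(b1)]
        mem, regs = dict(mem_t), dict(regs_t)
        if pcs == [2, 2] and not bufs[0] and not bufs[1]:
            results.add((regs["r0"], regs["r1"]))
            continue
        nexts = []
        for t in (0, 1):
            if pcs[t] < 2:                       # step thread t's next op
                op, var, val = PROGS[t][pcs[t]]
                m, r = dict(mem), dict(regs)
                b = [list(bufs[0]), list(bufs[1])]
                if op == "st":
                    if tso:
                        b[t].append((var, val))  # store enters the buffer
                    else:
                        m[var] = val             # SC: store hits memory at once
                else:                            # load: forward from own buffer
                    hits = [v for (w, v) in b[t] if w == var]
                    r[val] = hits[-1] if hits else m[var]
                p = list(pcs); p[t] += 1
                nexts.append((p[0], p[1], b, m, r))
            if tso and bufs[t]:                  # drain oldest buffered store
                m = dict(mem)
                b = [list(bufs[0]), list(bufs[1])]
                var, val = b[t].pop(0)
                m[var] = val
                nexts.append((pcs[0], pcs[1], b, m, dict(regs)))
        for (p0, p1, b, m, r) in nexts:
            cfg = (p0, p1, tuple(b[0]), tuple(b[1]),
                   tuple(sorted(m.items())), tuple(sorted(r.items())))
            if cfg not in seen:
                seen.add(cfg)
                stack.append(cfg)
    return results
```

    The nondeterministic drain step models the SB emptying to memory at an arbitrary later time, which is exactly the freedom the paper exploits by deliberately delaying stores.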

  • 279. Ros, Alberto
    et al.
    Leonardsson, Carl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Sakalis, Christos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Efficient Self-Invalidation/Self-Downgrade for Critical Sections with Relaxed Semantics2017In: IEEE Transactions on Parallel and Distributed Systems, ISSN 1045-9219, E-ISSN 1558-2183, Vol. 28, no 12, p. 3413-3425Article in journal (Refereed)
  • 280. Ros, Alberto
    et al.
    Leonardsson, Carl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Sakalis, Christos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Efficient Self-Invalidation/Self-Downgrade for Critical Sections with Relaxed Semantics2016In: Proc. International Conference on Parallel Architectures and Compilation: PACT 2016, New York: ACM Press, 2016, p. 433-434Conference paper (Refereed)
  • 281.
    Rümmer, Philipp
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hojjat, Hossein
    Kuncak, Viktor
    On recursion-free Horn clauses and Craig interpolation2015In: Formal methods in system design, ISSN 0925-9856, E-ISSN 1572-8102, Vol. 47, no 1, p. 1-25Article in journal (Refereed)
  • 282.
    Rümmer, Philipp
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Characterization of simulation by probabilistic testing2016In: Theory and Practice of Formal Methods, Springer, 2016, p. 360-372Chapter in book (Refereed)
  • 283.
    Sakalis, Christos
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Leonardsson, Carl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaxiras, Stefanos
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Architecture and Computer Communication.
    Ros, Alberto
    Splash-3: A properly synchronized benchmark suite for contemporary research2016In: Proc. International Symposium on Performance Analysis of Systems and Software: ISPASS 2016, IEEE Computer Society, 2016, p. 101-111Conference paper (Refereed)
    Abstract [en]

    Benchmarks are indispensable in evaluating the performance implications of new research ideas. However, their usefulness is compromised if they do not work correctly on a system under evaluation or, in general, if they cannot be used consistently to compare different systems. A well-known benchmark suite of parallel applications is the Splash-2 suite. Since its creation in the context of the DASH project, Splash-2 benchmarks have been widely used in research. However, Splash-2 was released over two decades ago and does not adhere to the recent C memory consistency model. This leads to unexpected and often incorrect behavior when some Splash-2 benchmarks are used in conjunction with contemporary compilers and hardware (simulated or real). Most importantly, we discovered critical performance bugs that may call into question some of the reported benchmark results. In this work, we analyze the Splash-2 benchmarks and expose data races and related performance bugs. We rectify the problematic benchmarks and evaluate the resulting performance. Our work contributes to the community a new sanitized version of the Splash-2 benchmarks, called the Splash-3 benchmark suite.
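
    To illustrate the property the sanitized benchmarks restore, here is a minimal Eraser-style lockset check, a hypothetical sketch of ours (the Splash-3 authors analyzed and repaired the benchmarks themselves; this is not their methodology): a variable is flagged as potentially racy when two threads access it, at least one access is a write, and no single lock is held across all accesses.

```python
def lockset_races(trace):
    """trace: iterable of (thread, op, var, locks_held) events, op 'r' or 'w'.
    Returns the set of variables with a potential data race."""
    lockset, accessed_by, written = {}, {}, set()
    races = set()
    for thread, op, var, locks in trace:
        # refine the candidate lockset: locks held at *every* access so far
        lockset[var] = lockset.get(var, locks) & locks
        accessed_by.setdefault(var, set()).add(thread)
        if op == "w":
            written.add(var)
        # race: shared, written, and no common lock protects all accesses
        if not lockset[var] and len(accessed_by[var]) > 1 and var in written:
            races.add(var)
    return races
```

    On a toy trace where "counter" is always updated under lock "L" but "flag" is written and read with no lock held, only "flag" is reported.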

  • 284.
    Sathyamoorthy, Peramanathan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Ngai, Edith C.-H.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hu, Xiping
    Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China.; Chinese Univ Hong Kong, Shatin, Hong Kong, Peoples R China.
    Leung, Victor C. M.
    Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada.
    Profiling energy efficiency and data communications for mobile Internet of Things2017In: Wireless Communications & Mobile Computing, ISSN 1530-8669, E-ISSN 1530-8677, Vol. 17, article id 6562915Article in journal (Refereed)
  • 285.
    Sathyamoorthy, Peramanathan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Ngai, Edith
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hu, Xiping
    Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V5Z 1M9, Canada.
    Leung, Victor
    Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V5Z 1M9, Canada.
    Energy Efficiency as an Orchestration Service for Mobile Internet of Things2015Conference paper (Refereed)
    Abstract [en]

    This paper proposes a novel power management solution for resource-constrained devices in the context of Internet of Things (IoT). We focus on smartphones in the IoT, as they are getting increasingly popular and equipped with strong sensing capabilities. Smartphones have complex and asynchronous power consumption incurred by heterogeneous components including their on-board sensors. Their interaction with the cloud allows them to offload computation tasks and access remote data storage. In this work, we aim at monitoring the power consumption behaviours of the smartphones, profiling both individual applications and the system as a whole, to make better decisions in power management. We design a cloud orchestration architecture as an epic predictor of behaviours of smart devices by extracting their application characteristics and resource utilization. We design and implement this architecture to perform energy profiling and data analysis on massive data logs. This cloud orchestration architecture coordinates a number of cloud-based services and supports dynamic workflows between service components, which can reduce energy consumption in the energy profiling process itself. Experimental results showed that a small portion of applications dominates the energy consumption of smartphones. Heuristic profiling can effectively reduce energy consumption in data logging and communications without sacrificing the accuracy of power monitoring.

  • 286.
    Schwartz-Narbonne, Daniel
    et al.
    NYU, New York, NY USA..
    Schaef, Martin
    SRI Int, Menlo Pk, CA 94025 USA..
    Jovanovic, Dejan
    SRI Int, Menlo Pk, CA 94025 USA..
    Rümmer, Philipp
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Wies, Thomas
    NYU, New York, NY USA..
    Conflict-Directed Graph Coverage2015In: NASA FORMAL METHODS (NFM 2015), 2015, p. 327-342Conference paper (Refereed)
    Abstract [en]

    Many formal method tools for increasing software reliability apply Satisfiability Modulo Theories (SMT) solvers to enumerate feasible paths in a program subject to certain coverage criteria. Examples include inconsistent code detection tools and concolic test case generators. These tools have in common that they typically treat the SMT solver as a black box, relying on its ability to efficiently search through large search spaces. However, in practice the performance of SMT solvers often degrades significantly if the search involves reasoning about complex control-flow. In this paper, we open the black box and devise a new algorithm for this problem domain that we call conflict-directed graph coverage. Our algorithm relies on two core components of an SMT solver, namely conflict-directed learning and deduction by propagation, and applies domain-specific modifications for reasoning about control-flow graphs. We implemented conflict-directed coverage and used it for detecting code inconsistencies in several large Java open-source projects with over one million lines of code in total. The new algorithm yields significant performance gains on average compared to previous algorithms and reduces the running times on hard search instances from hours to seconds.
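
    The enumeration problem such tools face can be sketched in a few lines. This is our own simplification: feasibility of a path is decided here by brute-force search over a small input domain rather than by an SMT solver, and the paper's conflict-directed learning is not modeled.

```python
from itertools import product

# A tiny program with two conditionals over one integer input x:
#   branch 0:  x > 0
#   branch 1:  x < 10
# Each path corresponds to a tuple of branch outcomes; a path is feasible
# if some input drives every branch to the chosen outcome.
BRANCHES = [lambda x: x > 0, lambda x: x < 10]

def classify_paths(domain=range(-20, 21)):
    """Split all branch-outcome combinations into feasible and infeasible."""
    feasible, infeasible = [], []
    for decisions in product([True, False], repeat=len(BRANCHES)):
        if any(all(br(x) == d for br, d in zip(BRANCHES, decisions))
               for x in domain):
            feasible.append(decisions)
        else:
            infeasible.append(decisions)
    return feasible, infeasible
```

    Here the combination (False, False), i.e. x <= 0 and x >= 10, has no witness input, so a coverage tool must recognize and prune it; doing that pruning efficiently inside the solver is what the paper's algorithm addresses.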

  • 287.
    Sembrant, Andreas
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hiding and Reducing Memory Latency: Energy-Efficient Pipeline and Memory System Techniques2016Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Memory accesses in modern processors are both far slower and vastly more energy-expensive than the actual computations. To improve performance, processors spend a significant amount of energy and resources trying to hide and reduce the memory latency. To hide the latency, processors use out-of-order execution to overlap memory accesses with independent work and aggressive speculative instruction scheduling to execute dependent instructions back-to-back. To reduce the latency, processors use several levels of caching that keep frequently used data closer to the processor. However, these optimizations are not free. Out-of-order execution requires expensive processor resources, speculative scheduling must re-execute instructions on incorrect speculations, and multi-level caching requires extra energy and latency to search the cache hierarchy. This thesis investigates several energy-efficient techniques for: 1) hiding the latency in the processor pipeline, and 2) reducing the latency in the memory hierarchy.

    Much of the inefficiency of hiding latency in the processor comes from two sources. First, processors need several large and expensive structures to do out-of-order execution (instruction queue, register file, etc.). These resources are typically allocated in program order, effectively giving all instructions equal priority. To reduce the size of these expensive resources without hurting performance, we propose Long Term Parking (LTP). LTP parks non-critical instructions before they allocate resources, thereby making room for critical memory-accessing instructions to continue and expose more memory-level parallelism. This enables us to save energy by shrinking the resource sizes without hurting performance. Second, when a load's data returns, the load's dependent instructions need to be scheduled and executed. To execute the dependent instructions back-to-back, the processor will speculatively schedule instructions before the processor knows if the input data will be available at execution time. To save energy, we investigate different scheduling techniques that reduce the number of re-executions due to misspeculation.

    The inefficiencies of traditional memory hierarchies come from the need to do level-by-level searches to locate data. The search starts at the L1 cache, then proceeds level by level until the data is found, or determined not to be in any cache, at which point the processor has to fetch the data from main memory. This wastes time and energy for every level that is searched. To reduce the latency, we propose tracking the location of the data directly in a separate metadata hierarchy. This allows us to directly access the data without needing to search. The processor simply queries the metadata hierarchy for the location information about where the data is stored. Separating metadata into its own hierarchy brings a wide range of additional benefits, including flexibility in how we place data storages in the hierarchy, the ability to intelligently store data in the hierarchy, direct access to remote cores, and many other data-oriented optimizations that can leverage our precise knowledge of where data are located.

    List of papers
    1. Long Term Parking (LTP): Criticality-aware Resource Allocation in OOO Processors
    Open this publication in new window or tab >>Long Term Parking (LTP): Criticality-aware Resource Allocation in OOO Processors
    Show others...
    2015 (English)In: Proc. 48th International Symposium on Microarchitecture, 2015Conference paper, Published paper (Refereed)
    Abstract [en]

    Modern processors employ large structures (IQ, LSQ, register file, etc.) to expose instruction-level parallelism (ILP) and memory-level parallelism (MLP). These resources are typically allocated to instructions in program order. This wastes resources by allocating them to instructions that are not yet ready to be executed and by eagerly allocating them to instructions that are not part of the application’s critical path.

    This work explores the possibility of allocating pipeline resources only when needed to expose MLP, and thereby enabling a processor design with significantly smaller structures, without sacrificing performance. First we identify the classes of instructions that should not reserve resources in program order and evaluate the potential performance gains we could achieve by delaying their allocations. We then use this information to “park” such instructions in a simpler, and therefore more efficient, Long Term Parking (LTP) structure. The LTP stores instructions until they are ready to execute, without allocating pipeline resources, and thereby keeps the pipeline available for instructions that can generate further MLP.

    LTP can accurately and rapidly identify which instructions to park, park them before they execute, wake them when needed to preserve performance, and do so using a simple queue instead of a complex IQ. We show that even a very simple queue-based LTP design allows us to significantly reduce IQ (64→32) and register file (128→96) sizes while retaining MLP performance and improving energy efficiency.
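    The parking idea above can be sketched, very loosely, as a scheduler with a cheap FIFO side buffer: non-critical instructions whose operands are not yet ready are parked without consuming issue-queue slots and are woken only when ready. This is an illustrative toy model, not the paper's hardware design; the instruction tuples and the `schedule` function are invented for the sketch.

```python
from collections import deque

# Toy model of criticality-aware allocation with a Long Term Parking buffer.
# Instructions are (name, ready_at_cycle, is_critical) tuples; the cycle
# model and all names are invented for this sketch.

def schedule(instructions, iq_size):
    iq, ltp, issued = [], deque(), []
    pending = deque(instructions)
    cycle = 0
    while pending or iq or ltp:
        # Allocate in program order, but park non-critical instructions
        # whose operands are not ready: they take no issue-queue slot.
        while pending:
            name, ready_at, critical = pending[0]
            if not critical and ready_at > cycle:
                ltp.append(pending.popleft())   # park in the cheap FIFO
            elif len(iq) < iq_size:
                iq.append(pending.popleft())    # normal IQ allocation
            else:
                break                           # IQ full: stall allocation
        # Wake parked instructions once their operands are ready.
        while ltp and ltp[0][1] <= cycle and len(iq) < iq_size:
            iq.append(ltp.popleft())
        # Issue every IQ instruction whose operands are ready this cycle.
        for ins in [i for i in iq if i[1] <= cycle]:
            iq.remove(ins)
            issued.append((ins[0], cycle))
        cycle += 1
    return issued
```

    With a two-entry IQ, a non-critical add waiting on a load result is parked while the critical loads issue immediately; the add issues once its input is ready, so the small IQ never fills with waiting instructions.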

    National Category
    Computer Engineering
    Identifiers
    urn:nbn:se:uu:diva-272468 (URN)
    Conference
    MICRO 2015, December 5–9, Waikiki, HI
    Projects
    UPMARC, UART
    Available from: 2016-01-14 Created: 2016-01-14 Last updated: 2018-01-10
    2. Cost-effective speculative scheduling in high performance processors
    Open this publication in new window or tab >>Cost-effective speculative scheduling in high performance processors
    Show others...
    2015 (English)In: Proc. 42nd International Symposium on Computer Architecture, New York: ACM Press, 2015, p. 247-259Conference paper, Published paper (Refereed)
    Abstract [en]

    To maximize performance, out-of-order execution processors sometimes issue instructions without having the guarantee that operands will be available in time; e.g. loads are typically assumed to hit in the L1 cache and dependent instructions are issued accordingly. This form of speculation - that we refer to as speculative scheduling - has been used for two decades in real processors, but has received little attention from the research community. In particular, as pipeline depth grows, and the distance between the Issue and the Execute stages increases, it becomes critical to issue instructions dependent on variable-latency instructions as soon as possible rather than wait for the actual cycle at which the result becomes available. Unfortunately, due to the uncertain nature of speculative scheduling, the scheduler may wrongly issue an instruction that will not have its source(s) available on the bypass network when it reaches the Execute stage. In that event, the instruction is canceled and replayed, potentially impairing performance and increasing energy consumption. In this work, we do not present a new replay mechanism. Rather, we focus on ways to reduce the number of replays that are agnostic of the replay scheme. First, we propose an easily implementable, low-cost solution, schedule shifting, to reduce the number of replays caused by L1 bank conflicts. It always assumes that, given a dual-load issue capacity, the second load issued in a given cycle will be delayed because of a bank conflict. Its dependents are thus always issued with the corresponding delay. Second, we also improve on existing L1 hit/miss prediction schemes by taking into account instruction criticality. That is, for some criterion of criticality and for loads whose hit/miss behavior is hard to predict, we show that it is more cost-effective to stall dependents if the load is not predicted critical.
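    The schedule-shifting idea can be illustrated with a trivial timing calculation. All constants and names here are hypothetical, not taken from the paper: the second load issued in a cycle is pessimistically assumed to suffer a bank-conflict delay, so its dependents are issued correspondingly later instead of being replayed when the conflict actually occurs.

```python
# Toy illustration of schedule shifting (hypothetical constants).
L1_LATENCY = 3             # assumed load-to-use latency on an L1 hit
BANK_CONFLICT_PENALTY = 1  # assumed extra cycle on an L1 bank conflict

def dependent_issue_cycle(load_issue_cycle, load_slot):
    """Cycle at which a load's dependents are issued. Slot 1 is the second
    load issued that cycle and is always shifted by the conflict penalty."""
    delay = BANK_CONFLICT_PENALTY if load_slot == 1 else 0
    return load_issue_cycle + L1_LATENCY + delay
```

    The trade-off in this sketch: dependents of the second load always pay one extra cycle, but they never need to be cancelled and replayed when a bank conflict actually materializes.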

    Place, publisher, year, edition, pages
    New York: ACM Press, 2015
    National Category
    Computer Systems
    Identifiers
    urn:nbn:se:uu:diva-272467 (URN)10.1145/2749469.2749470 (DOI)000380455700020 ()9781450334020 (ISBN)
    Conference
    ISCA 2015, June 13–17, Portland, OR
    Projects
    UPMARC, UART
    Available from: 2015-06-13 Created: 2016-01-14 Last updated: 2016-12-05Bibliographically approved
    3. TLC: A tag-less cache for reducing dynamic first level cache energy
    Open this publication in new window or tab >>TLC: A tag-less cache for reducing dynamic first level cache energy
    2013 (English)In: Proceedings of the 46th International Symposium on Microarchitecture, New York: ACM Press, 2013, p. 49-61Conference paper, Published paper (Refereed)
    Abstract [en]

    First level caches are performance-critical and are therefore optimized for speed. To do so, modern processors reduce the miss ratio by using set-associative caches and optimize latency by reading all ways in parallel with the TLB and tag lookup. However, this wastes energy since only data from one way is actually used.

    To reduce energy, phased-caches and way-prediction techniques have been proposed wherein only data of the matching/predicted way is read. These optimizations increase latency and complexity, making them less attractive for first level caches.

    Instead of adding new functionality on top of a traditional cache, we propose a new cache design that adds way index information to the TLB. This allows us to: 1) eliminate extra data array reads (by reading the right way directly), 2) avoid tag comparisons (by eliminating the tag array), 3) filter out misses (by checking the TLB), and 4) amortize the TLB lookup energy (by integrating it with the way information). In addition, the new cache can directly replace existing caches without any modification to the processor core or software.

    This new Tag-Less Cache (TLC) reduces the dynamic energy for a 32 kB, 8-way cache by 60% compared to a VIPT cache without affecting performance.

    Place, publisher, year, edition, pages
    New York: ACM Press, 2013
    National Category
    Computer Engineering Computer Systems
    Identifiers
    urn:nbn:se:uu:diva-213236 (URN)10.1145/2540708.2540714 (DOI)978-1-4503-2638-4 (ISBN)
    Conference
    MICRO-46; December 7-11, 2013; Davis, CA, USA
    Projects
    UPMARC, CoDeR-MP
    Available from: 2013-12-07 Created: 2013-12-19 Last updated: 2018-01-11Bibliographically approved
    4. The Direct-to-Data (D2D) Cache: Navigating the cache hierarchy with a single lookup
    Open this publication in new window or tab >>The Direct-to-Data (D2D) Cache: Navigating the cache hierarchy with a single lookup
    2014 (English)In: Proc. 41st International Symposium on Computer Architecture, Piscataway, NJ: IEEE Press, 2014, p. 133-144Conference paper, Published paper (Refereed)
    Abstract [en]

    Modern processors optimize for cache energy and performance by employing multiple levels of caching that address bandwidth, low-latency and high-capacity. A request typically traverses the cache hierarchy, level by level, until the data is found, thereby wasting time and energy in each level. In this paper, we present the Direct-to-Data (D2D) cache that locates data across the entire cache hierarchy with a single lookup.

    To navigate the cache hierarchy, D2D extends the TLB with per cache-line location information that indicates in which cache and way the cache line is located. This allows the D2D cache to: 1) skip levels in the hierarchy (by accessing the right cache level directly), 2) eliminate extra data array reads (by reading the right way directly), 3) avoid tag comparisons (by eliminating the tag arrays), and 4) go directly to DRAM on cache misses (by checking the TLB). This reduces the L2 latency by 40% and saves 5-17% of the total cache hierarchy energy.

    D2D's lower L2 latency directly improves L2-sensitive applications' performance by 5-14%. More significantly, we can take advantage of the L2 latency reduction to optimize other parts of the microarchitecture. For example, we can reduce the ROB size for the L2 bound applications by 25%, or we can reduce the L1 cache size, delivering an overall 21% energy savings across all benchmarks, without hurting performance.
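    As a rough software analogy of the location-metadata idea (class and method names are invented for illustration; the real design is hardware), a per-line table mapping addresses to (level, way) lets a lookup go straight to the right data array, or straight to DRAM on a miss, with no level-by-level tag search:

```python
# Software analogy of D2D-style location metadata: one metadata lookup
# returns the (level, way) of a cached line, or reports a miss immediately.
# All names here are invented for the sketch.

MEMORY = "DRAM"

class LocationMetadata:
    def __init__(self):
        self.location = {}   # line address -> (level, way)
        self.data = {}       # (level, way, address) -> cached data

    def install(self, addr, level, way, value):
        self.location[addr] = (level, way)
        self.data[(level, way, addr)] = value

    def lookup(self, addr):
        loc = self.location.get(addr)
        if loc is None:
            return (MEMORY, None)   # miss known without probing any level
        level, way = loc
        return (level, self.data[(level, way, addr)])
```

    The analogy to the abstract's four points: the lookup reads exactly one data entry, compares no tags, and detects a miss from the metadata alone.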

    Place, publisher, year, edition, pages
    Piscataway, NJ: IEEE Press, 2014
    National Category
    Computer Engineering Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-235362 (URN)10.1145/2678373.2665694 (DOI)000343652800012 ()978-1-4799-4394-4 (ISBN)
    Conference
    ISCA 2014, June 14–18, Minneapolis, MN
    Projects
    UPMARC, CoDeR-MP
    Available from: 2014-06-14 Created: 2014-10-31 Last updated: 2018-01-11Bibliographically approved
    5. A split cache hierarchy for enabling data-oriented optimizations
    Open this publication in new window or tab >>A split cache hierarchy for enabling data-oriented optimizations
    2017 (English)In: Proc. 23rd International Symposium on High Performance Computer Architecture, IEEE Computer Society, 2017, p. 133-144Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE Computer Society, 2017
    National Category
    Computer Engineering
    Identifiers
    urn:nbn:se:uu:diva-306368 (URN)10.1109/HPCA.2017.25 (DOI)000403330300012 ()978-1-5090-4985-1 (ISBN)
    Conference
    HPCA 2017, February 4–8, Austin, TX
    Projects
    UPMARC
    Available from: 2017-05-08 Created: 2016-10-27 Last updated: 2019-03-08
  • 288.
    Sembrant, Andreas
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Carlson, Trevor E.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Black-Schaffer, David
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    POSTER: Putting the G back into GPU/CPU Systems Research2017In: 2017 26TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT), 2017, p. 130-131Conference paper (Refereed)
    Abstract [en]

    Modern SoCs contain several CPU cores and many GPU cores to execute both general purpose and highly-parallel graphics workloads. In many SoCs, more area is dedicated to graphics than to general purpose compute. Despite this, the micro-architecture research community primarily focuses on GPGPU and CPU-only research, and not on graphics (the primary workload for many SoCs). The main reason for this is the lack of efficient tools and simulators for modern graphics applications. This work focuses on the GPU's memory traffic generated by graphics. We describe a new graphics tracing framework and use it to both study graphics applications' memory behavior as well as how CPUs and GPUs affect system performance. Our results show that graphics applications exhibit a wide range of memory behavior between applications and across time, and slow down co-running SPEC applications by 59% on average.

  • 289.
    Shrestha, Amendra
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems.
    Techniques for analyzing digital environments from a security perspective2019Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    The development of the Internet and social media has exploded in the last couple of years. Digital environments such as social media and discussion forums provide an effective method of communication and are used by various groups in our societies. For example, violent extremist groups use social media platforms for recruiting, training, and communicating with their followers, supporters, and donors. Analyzing social media is an important task for law enforcement agencies in order to detect activity and individuals that might pose a threat to the security of society.

    In this thesis, a set of different technologies that can be used to analyze digital environments from a security perspective are presented. Due to the nature of the problems that are studied, the research is interdisciplinary, and knowledge from terrorism research, psychology, and computer science is required. The research is divided into three different themes. Each theme summarizes the research that has been done in a specific area.

    The first theme focuses on analyzing digital environments and phenomena. The theme consists of three different studies. The first study concerns the possibility of detecting propaganda from the Islamic State on Twitter. The second study focuses on identifying references to a narrative containing xenophobic and conspiratorial stereotypes in alternative immigration-critical media. In the third study, we define a set of linguistic features that we view as markers of radicalization.

    A group consists of a set of individuals, and in some cases an individual might be a threat to the security of society. The second theme focuses on risk assessment of individuals based on their written communication. We use different technologies, including machine learning, to explore the possibility of detecting potential lone offenders. Our risk assessment approach is implemented in the tool PRAT (Profile Risk Assessment Tool).

    Internet users have the ability to use different aliases when they communicate, since this offers a degree of anonymity. In the third theme, we present a set of techniques that can be used to identify users with multiple aliases. Our research focuses on solving two different problems: author identification and alias matching. The technologies that we use are based on the idea that each author has a fairly unique writing style and that we can construct a writeprint that represents the author. In a similar manner, we also use information about when a user communicates to create a timeprint. By combining the writeprint and the timeprint, we obtain a set of powerful features that can be used to identify users with multiple aliases.

    To ensure that the technologies can be used in real scenarios, we have implemented and tested the techniques on data from social media. Several of the results are promising, but more studies are needed to determine how well they work in reality.

    List of papers
    1. A Machine Learning Approach Towards Detecting Extreme Adopters in Digital Communities
    Open this publication in new window or tab >>A Machine Learning Approach Towards Detecting Extreme Adopters in Digital Communities
    2017 (English)In: 2017 28th International Workshop on Database and Expert Systems Applications (DEXA) / [ed] Tjoa, AM Wagner, RR, IEEE, 2017, p. 1-5Conference paper, Published paper (Other academic)
    Abstract [en]

    In this study we try to identify extreme adopters on a discussion forum using machine learning. An extreme adopter is a user who has adopted a high level of community-specific jargon and can therefore be seen as a user with a high degree of identification with the community. The dataset that we consider consists of a Swedish xenophobic discussion forum, where we use a machine learning approach to identify extreme adopters using a number of linguistic features that are independent of the dataset and the community. The results indicate that it is possible to separate these extreme adopters from the rest of the discussants on the discussion forum with more than 80% accuracy. Since the linguistic features that we use are highly domain independent, the results indicate that there is a possibility to use this kind of technique to identify extreme adopters within other communities as well.
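    For illustration only, dataset-independent text features of the general kind this abstract describes might be computed as below; this is not the paper's actual feature set, and every feature name here is invented for the sketch.

```python
# Illustrative, dataset-independent text features (NOT the paper's set):
# simple ratios that can be computed for any text in any community.

def text_features(text):
    words = text.lower().split()
    n = len(words) or 1  # avoid division by zero on empty input
    return {
        "avg_word_len": sum(len(w) for w in words) / n,
        "we_ratio": sum(w in ("we", "us", "our") for w in words) / n,
        "they_ratio": sum(w in ("they", "them", "their") for w in words) / n,
        "exclaim_ratio": text.count("!") / n,
    }
```

    Because such features make no reference to any community-specific vocabulary, the same extractor could in principle be applied to forums in other domains, which is the portability the abstract points to.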

    Place, publisher, year, edition, pages
    IEEE, 2017
    Series
    International Workshop on Database and Expert Systems Applications-DEXA, ISSN 1529-4188
    Keywords
    Discussion forums, Support vector machines, Pragmatics, Manuals, Radio frequency, Electronic mail, Social network services
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-351187 (URN)10.1109/DEXA.2017.17 (DOI)000426078300001 ()978-1-5386-1051-0 (ISBN)
    Conference
    28th International Workshop on Database and Expert Systems Applications (DEXA), AUG 28-31, 2017, Lyon3 Univ, Lyon, FRANCE
    Available from: 2018-05-23 Created: 2018-05-23 Last updated: 2019-03-22Bibliographically approved
    2. Identifying warning behaviors of violent lone offenders in written communication
    Open this publication in new window or tab >>Identifying warning behaviors of violent lone offenders in written communication
    2016 (English)In: Proc. 16th ICDM Workshops, IEEE Computer Society, 2016, p. 1053-1060Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE Computer Society, 2016
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-306943 (URN)10.1109/ICDMW.2016.0152 (DOI)978-1-5090-5910-2 (ISBN)
    Conference
    ICDM Workshop on Social Media and Risk, SOMERIS 2016, December 12, Barcelona, Spain
    Available from: 2017-02-02 Created: 2016-11-07 Last updated: 2019-03-22Bibliographically approved
    3. Automatic detection of xenophobic narratives: A case study on Swedish alternative media
    Open this publication in new window or tab >>Automatic detection of xenophobic narratives: A case study on Swedish alternative media
    2016 (English)In: Proc. 14th International Conference on Intelligence and Security Informatics, IEEE, 2016, p. 121-126Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE, 2016
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-306903 (URN)10.1109/ISI.2016.7745454 (DOI)000390129600021 ()978-1-5090-3865-7 (ISBN)
    Conference
    ISI 2016, September 28–30, Tucson, AZ
    Available from: 2016-11-17 Created: 2016-11-04 Last updated: 2019-03-22Bibliographically approved
    4. Linguistic analysis of lone offender manifestos
    Open this publication in new window or tab >>Linguistic analysis of lone offender manifestos
    2016 (English)In: Proc. 4th International Conference on Cybercrime and Computer Forensics, IEEE, 2016Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE, 2016
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-306941 (URN)10.1109/ICCCF.2016.7740427 (DOI)000390123800007 ()978-1-5090-6096-2 (ISBN)
    Conference
    ICCCF 2016, June 12–14, Vancouver, Canada
    Available from: 2016-11-17 Created: 2016-11-07 Last updated: 2019-03-22Bibliographically approved
    5. Detecting multipliers of jihadism on twitter
    Open this publication in new window or tab >>Detecting multipliers of jihadism on twitter
    2015 (English)In: Proc. 15th ICDM Workshops, IEEE Computer Society, 2015, p. 954-960Conference paper, Published paper (Refereed)
    Abstract [en]

    Detecting terrorist-related content on social media is a problem for law enforcement agencies due to the large amount of information that is available. In this paper we describe a first step towards automatically classifying Twitter user accounts (tweeps) as supporters of jihadist groups who disseminate propaganda content online. We use a machine learning approach with two sets of features: data-dependent features and data-independent features. The data-dependent features are heavily influenced by the specific dataset, while the data-independent features are independent of the dataset and can be used on other datasets with similar results. By using this approach we hope that our method can serve as a baseline for classifying violent extremist content from different kinds of sources, since data-dependent features from various domains can be added.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2015
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-272243 (URN)10.1109/ICDMW.2015.9 (DOI)000380556700127 ()9781467384926 (ISBN)
    External cooperation:
    Conference
    ICDM Workshop on Intelligence and Security Informatics, ISI-ICDM 2015, November 14, Atlantic City, NJ
    Available from: 2015-11-14 Created: 2016-01-12 Last updated: 2019-03-22Bibliographically approved
    6. Detecting multiple aliases in social media
    Open this publication in new window or tab >>Detecting multiple aliases in social media
    2013 (English)In: Proc. 5th International Conference on Advances in Social Networks Analysis and Mining, New York: ACM Press, 2013, p. 1004-1011Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    New York: ACM Press, 2013
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-216568 (URN)10.1145/2492517.2500261 (DOI)978-1-4503-2240-9 (ISBN)
    Conference
    ASONAM 2013, August 25-29, Niagara Falls, Canada
    Funder
    Vinnova
    Available from: 2013-08-29 Created: 2014-01-23 Last updated: 2019-03-22Bibliographically approved
    7. Timeprints for identifying social media users with multiple aliases
    Open this publication in new window or tab >>Timeprints for identifying social media users with multiple aliases
    2015 (English)In: Security Informatics, ISSN 2190-8532, Vol. 4, p. 7:1-11, article id 7Article in journal (Refereed) Published
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-272242 (URN)10.1186/s13388-015-0022-z (DOI)
    Available from: 2015-09-24 Created: 2016-01-12 Last updated: 2019-03-22Bibliographically approved
    8. Multi-domain alias matching using machine learning
    Open this publication in new window or tab >>Multi-domain alias matching using machine learning
    2016 (English)In: Proc. 3rd European Network Intelligence Conference, IEEE, 2016, p. 77-84Conference paper, Published paper (Refereed)
    Abstract [en]

    We describe a methodology for linking aliases belonging to the same individual based on a user's writing style (stylometric features extracted from the user-generated content) and time patterns (time-based features extracted from the publishing times of the user-generated content). While most previous research on social media identity linkage relies on matching usernames, our methodology can also be used for users who actively try to choose dissimilar usernames when creating their aliases. In our experiments on a discussion forum dataset and a Twitter dataset, we evaluate the performance of three different classifiers. We use the best classifier (AdaBoost) to evaluate how well it works on different datasets using different features. Experiments show that combining stylometric and time-based features yields good results on our synthetic datasets, and a small-scale evaluation on real-world blog data confirms these results, yielding a precision over 95%. The use of emotion-related and Twitter-related features has no significant impact on the results.
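    A minimal sketch of the writeprint/timeprint idea follows (illustrative feature choices, not the paper's exact features or its AdaBoost classifier): character n-gram frequencies stand in for writing style, an hour-of-day histogram stands in for time patterns, and cosine similarity combines them into a single matching score.

```python
import math
from collections import Counter

def writeprint(texts, n=3):
    """Character n-gram frequencies as a crude stylometric profile."""
    c = Counter()
    for t in texts:
        t = t.lower()
        c.update(t[i:i + n] for i in range(len(t) - n + 1))
    total = sum(c.values()) or 1
    return {g: v / total for g, v in c.items()}

def timeprint(hours):
    """Normalized histogram over posting hour of day (0-23)."""
    c = Counter(hours)
    total = sum(c.values()) or 1
    return {h: v / total for h, v in c.items()}

def cosine(a, b):
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def alias_similarity(user1, user2, w_style=0.5, w_time=0.5):
    """Combine style and time similarity into one matching score in [0, 1]."""
    s = cosine(writeprint(user1["texts"]), writeprint(user2["texts"]))
    t = cosine(timeprint(user1["hours"]), timeprint(user2["hours"]))
    return w_style * s + w_time * t
```

    Two aliases of the same person should share both trigram statistics and posting hours, so their combined score exceeds that of an unrelated pair; a real system would feed such similarities into a trained classifier rather than thresholding them directly.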

    Place, publisher, year, edition, pages
    IEEE, 2016
    National Category
    Computer and Information Sciences
    Identifiers
    urn:nbn:se:uu:diva-306944 (URN)10.1109/ENIC.2016.019 (DOI)000399097600011 ()9781509034550 (ISBN)
    Conference
    ENIC 2016, September 5–7, Wroclaw, Poland
    Available from: 2017-02-02 Created: 2016-11-07 Last updated: 2019-03-22Bibliographically approved
    9. Assessment of risk in written communication: Introducing the Profile Risk Assessment Tool (PRAT)
    Open this publication in new window or tab >>Assessment of risk in written communication: Introducing the Profile Risk Assessment Tool (PRAT)
    Show others...
    2018 (English)Report (Other academic)
    Place, publisher, year, edition, pages
    Belgium: EUROPOL, 2018. p. 24
    National Category
    Engineering and Technology
    Identifiers
    urn:nbn:se:uu:diva-367346 (URN)
    Note

    This paper was presented at the 2nd European Counter-Terrorism Centre (ECTC) Advisory Group conference, 17-18 April 2018, at Europol Headquarters, The Hague.

    Available from: 2018-11-30 Created: 2018-11-30 Last updated: 2019-03-22Bibliographically approved
    10. Linguistic markers of a radicalized mind-set among extreme adopters
    Open this publication in new window or tab >>Linguistic markers of a radicalized mind-set among extreme adopters
    2017 (English)In: Proc. 10th ACM International Conference on Web Search and Data Mining, New York: ACM Press, 2017, p. 823-824Conference paper, Published paper (Refereed)
    Abstract [en]

    The words that we use when communicating in social media can reveal how we relate to ourselves and to others. For instance, within many online communities, the degree of adaptation to a community-specific jargon can serve as a marker of identification with the community. In this paper we single out a group of so-called extreme adopters of community-specific jargon from the whole group of users of a Swedish discussion forum devoted to the topics of immigration and integration. The forum is characterized by a certain xenophobic jargon, and we hypothesize that extreme adopters of this jargon also exhibit certain linguistic features that we view as markers of a radicalized mind-set. We use a Swedish translation of LIWC (linguistic inquiry word count) and find that the group of extreme adopters differs significantly from the whole group of forum users regarding six out of seven linguistic markers of a radicalized mind-set.

    Place, publisher, year, edition, pages
    New York: ACM Press, 2017
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-379919 (URN)10.1145/3018661.3022760 (DOI)978-1-4503-4675-7 (ISBN)
    Conference
    WSDM 2017, 1st International Workshop on Cyber Deviance Detection
    Available from: 2017-02-02 Created: 2019-03-21 Last updated: 2019-04-08Bibliographically approved
  • 290.
    Shrestha, Amendra
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Kaati, Lisa
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. FOI, Stockholm, Sweden..
    Cohen, Katie
    FOI, Stockholm, Sweden..
    A Machine Learning Approach Towards Detecting Extreme Adopters in Digital Communities2017In: 2017 28th International Workshop on Database and Expert Systems Applications (DEXA) / [ed] Tjoa, AM Wagner, RR, IEEE, 2017, p. 1-5Conference paper (Other academic)
    Abstract [en]

    In this study we try to identify extreme adopters on a discussion forum using machine learning. An extreme adopter is a user who has adopted a high level of community-specific jargon and can therefore be seen as a user with a high degree of identification with the community. The dataset that we consider consists of a Swedish xenophobic discussion forum, where we use a machine learning approach to identify extreme adopters using a number of linguistic features that are independent of the dataset and the community. The results indicate that it is possible to separate these extreme adopters from the rest of the discussants on the discussion forum with more than 80% accuracy. Since the linguistic features that we use are highly domain independent, the results indicate that there is a possibility to use this kind of technique to identify extreme adopters within other communities as well.

  • 291.
    Sigurgeirsson, Daniel
    et al.
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Lárusdóttir, Marta
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Hamdaga, Mohammad
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Daniels, Mats
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Jónsson, Björn
    Reykjavik Univ, Sch Comp Sci, Reykjavik, Iceland.
    Learning Outcome Outcomes: An Evaluation of Quality2018In: Proc. 48th ASEE/IEEE Frontiers in Education Conference, Piscataway, NJ: IEEE Press, 2018Conference paper (Refereed)
    Abstract [en]

    Learning outcomes are a standard specification of knowledge, skills and capabilities that a student is expected to acquire by attending a course or a degree program. While, in theory, the process of evaluating learning outcomes appears to be trivial, in practice it is a complicated and daunting process. In this study, we evaluate how learning outcomes can be effectively applied. The work focuses on the quality of both the specification of the learning outcomes and the assessment of whether these outcomes are reached. We discuss different abstraction levels for learning outcomes and the issue of alignment between high-level and low-level learning outcomes. We also address the criteria for assessing whether a student is meeting a learning outcome.

    Our work is focused on project-oriented courses, where assessing learning outcomes is seen as particularly challenging. In particular, we draw on an empirical study focused on systematically collecting key performance indicators of the progress towards achieving learning outcomes. The data gathering was done during the course through in-class questionnaires and individual diary notes, as a complementary process to the traditional observations made by the teacher running the course. This data serves as the basis for understanding how individual students advance towards the stated learning goals. We also conducted a focus group discussion after the course to better understand how to interpret the data collected during the course.

    An important result of our work is forming an understanding and vocabulary regarding learning outcomes and the assessment of how well students meet these learning goals in project-based educational settings. In addition to this, we make the following major contributions:

    • We present a systematic methodology to gauge how well students meet learning outcomes through in-class self-evaluation.
    • We present the results of an empirical study of a process-oriented evaluation of the students' development towards stated learning outcomes.

    We also state some lessons learned from this process that are applicable to designers of project-based courses.

  • 292. Singh, Abhishek
    et al.
    Ekberg, Pontus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Baruah, Sanjoy
    Applying Real-Time Scheduling Theory to the Synchronous Data Flow Model of Computation2017Conference paper (Refereed)
  • 293. Singh, Abhishek
    et al.
    Ekberg, Pontus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Baruah, Sanjoy
    Uniprocessor scheduling of real-time synchronous dataflow tasks2019In: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 55, no 1, p. 1-31Article in journal (Refereed)
    Abstract [en]

    The synchronous dataflow graph (SDFG) model is widely used today for modeling real-time applications in safety-critical application domains. Schedulability analysis techniques that are well understood within the real-time scheduling community are applied to the analysis of recurrent real-time workloads that are represented using this model. An enhancement to the standard SDFG model is proposed, which supports the specification of a real-time latency constraint between a specified input and a specified output of an SDFG. A polynomial-time algorithm is derived for representing the computational requirement of each such enhanced SDFG task in terms of the notion of the demand bound function (dbf), which is widely used in real-time scheduling theory for characterizing computational requirements of recurrent processes represented by, e.g., the sporadic task model. By so doing, the extensive dbf-centered machinery that has been developed in real-time scheduling theory for the hard-real-time schedulability analysis of systems of recurrent tasks may be applied to the analysis of systems represented using the SDFG model as well. The applicability of this approach is illustrated by applying prior results from real-time scheduling theory to construct an exact preemptive uniprocessor schedulability test for collections of independent recurrent processes that are each represented using the enhanced SDFG model.
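    The demand bound function mentioned in the abstract is standard in real-time scheduling theory. As background (a sketch of the classical dbf for sporadic tasks, not the paper's polynomial-time construction for SDFG tasks), the dbf and the resulting EDF-style uniprocessor test look roughly like this:

    ```python
    import math

    def dbf_sporadic(t, C, D, T):
        """Demand bound function of a sporadic task with worst-case
        execution time C, relative deadline D, and period T: the maximum
        execution demand of jobs that arrive and must finish within any
        window of length t."""
        if t < D:
            return 0
        return (math.floor((t - D) / T) + 1) * C

    def edf_feasible(tasks, horizon):
        """Shape of a preemptive uniprocessor feasibility test: total
        demand must never exceed supply. For brevity this sketch checks
        only integer points up to a finite horizon."""
        return all(
            sum(dbf_sporadic(t, C, D, T) for C, D, T in tasks) <= t
            for t in range(1, horizon + 1)
        )
    ```

    The paper's contribution, per the abstract, is computing such a dbf for the enhanced SDFG model, after which this dbf-centered machinery applies unchanged.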

  • 294.
    Stigge, Martin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Combinatorial abstraction refinement for feasibility analysis of static priorities2015In: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 51, no 6, p. 639-674Article in journal (Refereed)
  • 295.
    Stigge, Martin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Graph-based models for real-time workload: a survey2015In: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 51, no 5, p. 602-636Article in journal (Refereed)
  • 296.
    Sun, Jinghao
    et al.
    Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China;Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China.
    Guan, Nan
    Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China.
    Jiang, Xu
    Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China.
    Chang, Shuangshuang
    Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China.
    Guo, Zhishan
    Univ Florida, Dept Elect & Comp Engn, Gainesville, FL 32611 USA;Missouri Univ Sci & Technol, Dept Comp Sci, Rolla, MO 65409 USA.
    Deng, Qingxu
    Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China.
    Wang, Yi
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Liaoning, Peoples R China.
    A Capacity Augmentation Bound for Real-Time Constrained-Deadline Parallel Tasks Under GEDF2018In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 37, no 11, p. 2200-2211Article in journal (Refereed)
    Abstract [en]

    Capacity augmentation bound is a widely used quantitative metric in theoretical studies of schedulability analysis for directed acyclic graph (DAG) parallel real-time tasks; it not only quantifies the suboptimality of the scheduling algorithms, but also serves as a simple linear-time schedulability test. Earlier studies on capacity augmentation bounds of the sporadic DAG task model were either restricted to a single DAG task or to a set of tasks with implicit deadlines. In this paper, we consider parallel tasks with constrained deadlines under the global earliest deadline first policy. We first show that it is impossible to obtain a constant bound for our problem setting, and derive both lower and upper bounds on the capacity augmentation bound as functions of the maximum ratio of task period to deadline. Our upper bound is at most 1.47 times larger than the optimal one. We conduct experiments to compare the acceptance ratio of our capacity augmentation bound with an existing schedulability test that also has linear-time complexity. The results show that our capacity augmentation bound significantly outperforms the existing linear-time schedulability test under different parameter settings.
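    To show why such a bound yields a linear-time test: a capacity augmentation bound b is commonly used in a test of roughly the following shape (a generic sketch, assuming the usual DAG task parameters of total work, critical path length, deadline, and period; the paper's actual bound for constrained deadlines depends on the maximum period-to-deadline ratio and is not reproduced here):

    ```python
    def cap_aug_test(tasks, m, b):
        """Generic linear-time test induced by a capacity augmentation
        bound b on m cores. Each task is (C, L, D, T): total work C,
        critical path length L, deadline D, period T. The concrete
        value of b is assumed given by the analysis."""
        total_util = sum(C / T for C, L, D, T in tasks)
        if total_util > m / b:
            return False
        # every task's critical path must leave slack factor b
        return all(L <= D / b for C, L, D, T in tasks)
    ```

    A single pass over the task set suffices, which is what makes the bound attractive as a schedulability test.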

  • 297. Sun, Jinghao
    et al.
    Guan, Nan
    Wang, Yang
    Deng, Qingxu
    Zeng, Peng
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Feasibility of fork-join real-time task graph models: Hardness and algorithms2016In: ACM Transactions on Embedded Computing Systems, ISSN 1539-9087, E-ISSN 1558-3465, Vol. 15, no 1, article id 14Article in journal (Refereed)
  • 298.
    Sun, Jinghao
    et al.
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China.;Northeastern Univ, Shenyang, Liaoning, Peoples R China..
    Guan, Nan
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China..
    Wang, Yang
    Northeastern Univ, Shenyang, Liaoning, Peoples R China..
    He, Qingqiang
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China..
    Wang, Yi
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Northeastern Univ, Shenyang, Liaoning, Peoples R China..
    Real-Time Scheduling and Analysis of OpenMP Task Systems with Tied Tasks2017In: 2017 IEEE Real-Time Systems Symposium (RTSS), IEEE, 2017, p. 92-103Conference paper (Refereed)
    Abstract [en]

    OpenMP is a promising framework for developing parallel real-time software on multi-cores. Although similar to the DAG task model, OpenMP task systems are significantly more difficult to analyze due to constraints posed by the OpenMP specification. An important feature in OpenMP is tied tasks, which must execute on the same thread during their whole life cycle. Although tied tasks enjoy benefits in simplicity and efficiency, they have been considered unsuitable for real-time systems due to their complex behavior. In this paper, we study the real-time scheduling and analysis of OpenMP task systems with tied tasks. First, we show that under the existing scheduling algorithms in OpenMP, tied tasks may indeed lead to extremely bad timing behavior, in which a parallel workload ends up being executed entirely sequentially. To solve this problem, we propose a new scheduling algorithm and develop two response time bounds for it, with different trade-offs between simplicity and analysis precision. Experiments with both randomly generated OpenMP task systems and realistic OpenMP programs show that the response time bounds obtained by our approach for tied task systems are very close to those for untied tasks.

  • 299.
    Svensson, Maria
    et al.
    Göteborgs universitet.
    Ingerman, Åke
    Göteborgs universitet.
    Berglund, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Teaching and learning system thinking in technology2015In: Plurality and Complementarity of Approaches in Design and Technology Education / [ed] Chatoney, Marjolaine, Marseille: Presses Universitaires de Provence , 2015, p. 404-409Conference paper (Refereed)
    Abstract [en]

    Complex technological systems have emerged during the last decade as an important strand in technology teaching in several national curricula for compulsory school. However, even though understanding the systemic aspects and connected nature of contemporary society is regarded as important, it remains unclear what such understanding entails in detail, and even more unclear what may constitute good teaching. We present the results from a teaching-learning design project on the topic of large societal and complex technological systems, which are seen as constituted of transformation and transport, acting on matter, energy and information. The main results are a suggested and evaluated teaching plan developed in collaboration with a team of technology teachers, as well as descriptions of how pupils’ system thinking is constituted in terms of four basic aspects: resource and intention of the system; system component constitution; process and transformation in components and system; and network character. In total, a teaching plan spanning four lessons was realised in four different classrooms, with class sizes ranging from 15 to 25 pupils aged 14 and 15. The teaching design progresses by focusing on specific parts of various systems, for example the transformation of polluted water into clean water in a water purification plant as part of the water supply system. There is an emphasis on the function of the part in relation to the system on the one hand, and on how the part is and can be realised technically on the other, taking care to relate the latter to what is taken up in other curricular strands of technology. The last part focuses on the examination of technological systems as constituted by interacting and meaningful parts, where their network nature may emerge.

  • 300.
    Tang, Yue
    et al.
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China..
    Guan, Nan
    Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China..
    Liu, Weichen
    Nanyang Technol Univ, Singapore, Singapore..
    Phan, Linh Thi Xuan
    Yi, Wang
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Northeastern Univ, Shenyang, Liaoning, Peoples R China..
    Revisiting GPC and AND Connector in Real-Time Calculus2017In: 2017 IEEE Real-Time Systems Symposium (RTSS), IEEE, 2017, p. 255-265Conference paper (Refereed)
    Abstract [en]

    Real-Time Calculus (RTC) is a powerful framework for modeling and worst-case performance analysis of networked systems. GPC and AND are two fundamental components in RTC, which model priority-based resource arbitration and synchronization operations, respectively. In this paper, we revisit GPC and AND. For GPC, we develop tighter output arrival curves to more precisely characterize the output event streams. For AND, we first identify a problem in the existing analysis method that may lead to negative values in the output curves, and present corrections to it. We then generalize AND to synchronize more than two input event streams. We implement our new theoretical results and conduct experiments to evaluate their performance. Experimental results show significant improvements of our new methods in analysis precision and efficiency.
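    RTC components such as GPC and AND are analyzed by manipulating arrival and service curves with min-plus algebra. As background only (this standard operator is not the paper's contribution), the discrete min-plus convolution underlying such curve computations can be sketched as:

    ```python
    def min_plus_conv(f, g, horizon):
        """Discrete min-plus convolution
        (f (x) g)(t) = min over 0 <= s <= t of f(s) + g(t - s),
        the basic operator of Real-Time Calculus and network calculus.
        Curves are given as lists of values at integer points 0..horizon."""
        return [min(f[s] + g[t - s] for s in range(t + 1))
                for t in range(horizon + 1)]
    ```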
