uu.se: Uppsala University publications
151–200 of 252
  • 151.
    Helmisaari, Marc
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    Det beroendeframkallande klicket: Engagerande och emotionella icke-spel (2015). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
    Abstract [sv]

    A new game genre has grown in popularity over the past five years: a genre that falls outside the classical definition of games, known as "idle games". The present study examines which elements in these games make the player keep playing, and how those elements can be analysed using the MDA and AARRR frameworks. Data was collected from three popular idle games: Cookie Clicker, Clicker Heroes and AdVenture Capitalist. A survey was also sent to players of these games to gain an understanding of why they are popular. The results were then analysed using various game design theories to investigate which game mechanics create the desire to play and why these games are popular.

  • 152.
    Henriksson, Michael
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi.
    A Cognitive Work Analysis as Basis for Development of a Compact C2 System to Support Air Surveillance Work (2012). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    This Master of Science thesis was produced at SAAB Security and Defence Solutions. The purpose of the thesis is to analyze how air surveillance work can be carried out. This information is then used to give suggestions for the design of a new system containing only the most essential functionality. This is done by examining the available frameworks which can inform interface design, and applying one framework to analyze work in a complete system used as the basis of the new Compact C2 system. The second part of the analysis is directed towards the stripped system (Compact C2), and both parts of the analysis are used to inform the interface design of the Compact C2 system. By using the full range of the chosen framework for analysis of the identification process in Swedish air surveillance work, some essential functions were identified that should also have support in a Compact C2 system.

  • 153.
    Hermans, Frederik
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Rensfelt, Olof
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Voigt, Thiemo
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Ngai, Edith
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Nordén, Lars-Åke
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Gunningberg, Per
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    SoNIC: Classifying interference in 802.15.4 sensor networks (2013). In: Proc. 12th International Conference on Information Processing in Sensor Networks, New York: ACM Press, 2013, pp. 55-66. Conference paper (Refereed).
  • 154.
    Hnich, Brahim
    et al.
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Samhällsvetenskapliga fakulteten, Institutionen för informationsvetenskap.
    Kiziltan, Zeynep
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Samhällsvetenskapliga fakulteten, Institutionen för informationsvetenskap.
    Walsh, Toby
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Samhällsvetenskapliga fakulteten, Institutionen för informationsvetenskap.
    Modelling a Balanced Academic Curriculum Problem (2002). In: Proceedings CPAIOR 2002, 2002. Conference paper (Refereed).
  • 155.
    Hojjat, Hossein
    et al.
    Cornell Univ, Ithaca, NY 14853 USA..
    Rümmer, Philipp
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    McClurg, Jedidiah
    CU Boulder, Boulder, CO USA..
    Cerny, Pavol
    CU Boulder, Boulder, CO USA..
    Foster, Nate
    Cornell Univ, Ithaca, NY 14853 USA..
    Optimizing Horn Solvers for Network Repair (2016). In: Proceedings of the 2016 16th Conference on Formal Methods in Computer-Aided Design (FMCAD 2016), ed. R. Piskac and M. Talupur, IEEE, 2016, pp. 73-80. Conference paper (Refereed).
    Abstract [en]

    Automatic program repair modifies a faulty program to make it correct with respect to a specification. Previous approaches have typically been restricted to specific programming languages and a fixed set of syntactical mutation techniques, e.g., changing the conditions of if statements. We present a more general technique based on repairing sets of unsolvable Horn clauses. Working with Horn clauses enables repairing programs from many different source languages, but also introduces challenges, such as navigating the large space of possible repairs. We propose a conservative semantic repair technique that only removes incorrect behaviors and does not introduce new behaviors. Our proposed framework allows the user to request the best repairs: it constructs an optimization lattice representing the space of possible repairs, and uses a novel local search technique that exploits heuristics to avoid searching through sub-lattices with no feasible repairs. To illustrate the applicability of our approach, we apply it to problems in software-defined networking (SDN), and illustrate how it is able to help network operators fix buggy configurations by properly filtering undesired traffic. We show that interval and Boolean lattices are effective choices of optimization lattices in this domain, and we enable optimization objectives such as modifying the minimal number of switches. We have implemented a prototype repair tool, and present preliminary experimental results on several benchmarks using real topologies and realistic repair scenarios in data centers and congested networks.

  • 156.
    Homewood, Thomas
    et al.
    Swedish Institute of Computer Science.
    Norström, Christer
    Swedish Institute of Computer Science.
    Gunningberg, Per
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Skitracker: Measuring skiing performance using a body-area network (2013). In: Proc. 12th International Conference on Information Processing in Sensor Networks, New York: ACM Press, 2013, pp. 319-320. Conference paper (Refereed).
  • 157. Hossain, Adnan
    Synliggörande av provfordonets elsystemstatus (2016). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
    Abstract [sv]

    The goal of the project is to develop one or more programs to meet the needs of the system developers. At present it is very difficult to determine a truck's electrical-system configuration: doing so requires substantial technical background and is very time-consuming. This in turn delays the system-testing process, and all testing of trucks takes longer. The trucks, which are prototypes, are built for future needs and require continuous testing. There is therefore strong demand for a program that minimizes the time lost and does not require such a deep technical background. Development took place in Microsoft Visual Studio, using C# and ASP.NET. Vehicle data was managed in databases, with the data flow controlled using the query language SQL. The result of the project was a web application and an upgrade of an existing program. At this stage the web application and the accompanying program are being tested by Scania employees, and future updates and adjustments are planned.

  • 158.
    Jacobsson, Martin
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Orfanidis, Charalampos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Using software-defined networking principles for wireless sensor networks (2015). In: Proc. 11th Swedish National Computer Networking Workshop, 2015. Conference paper (Refereed).
  • 159.
    Johansson, Magnus
    et al.
    Stockholms universitet, Institutionen för data- och systemvetenskap.
    Verhagen, Harko
    Stockholms universitet, Institutionen för data- och systemvetenskap.
    Massively multiple online role playing games as normative multiagent systems (2009). In: Normative Multi-Agent Systems, ed. Guido Boella, Pablo Noriega, Gabriella Pigozzi and Harko Verhagen, 2009. Conference paper (Other academic).
  • 160.
    Jonsson, Kristoffer
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Tekniska sektionen, Institutionen för teknikvetenskaper.
    Lundberg, David
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Tekniska sektionen, Institutionen för teknikvetenskaper.
    Digital Interface for Intelligent Sensors (2013). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
    Abstract [en]

    Digital Interface for Intelligent Sensors was a project whose goal was to create a digital network interface enabling easy distribution of data from different types of digital sensors to a central computer. The purpose was to replace the existing analogue data-collection system in order to benefit from the advantages of digital communication. This demanded a software protocol that could be satisfactorily implemented on a microcontroller. Along with the software implementation, the specific objective was to design, construct and build an intelligent hardware sensor device. This device was to measure temperature, humidity, wind direction and wind speed by collecting information from suitable digital transducers.

    The project involved research into bus protocols as well as practical circuit design and construction. A great deal of software programming was done during the project to get the device to work as expected. The Modbus protocol was found to be the best option for our specific software needs. On the hardware side, the core of the sensor device was based on an ATmega328 microcontroller. The ATmega328 proved to be a suitable hardware platform for implementing both the Modbus protocol and the code required to extract information from the transducers. By linking a computer to the system, acting as a master, weather data from the device could be logged.

    The device was successfully installed on the roof of Ångströmslaboratoriet, building 2. The complete system enables other digital, Modbus-enabled devices to connect and communicate with the central computer. Many devices can lead to rather complex systems; the system created in this project keeps track of all installed devices using addresses, making a complex system easy to manage.

    The project also involved a brief collaboration with another group constructing a different digital measuring device. This device was able to connect to the system using the same Modbus protocol and thereby communicate with the central computer.

  • 161.
    Jouet, Antoine
    et al.
    University of Angers.
    Gac, Pierre
    University of Angers.
    Hayashi, Masaki
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    Bachelder, Steven
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    Nakajima, Masayuki
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    When virtual reality meet television: Use of a motor Text-To-Vision adapted for the television (2015). In: Proceedings of Art and Science Forum 2015, 2015. Conference paper (Other academic).
    Abstract [en]

    This paper explains and details an automated TV news show program using the Text-To-Vision (T2V) technology. Today, 3D CG environments are used more and more often, even in classic media such as TV. However, no fully virtual TV news show has yet appeared that stars only virtual characters and is completely automated, using news sources available on the Internet. T2V made it possible to create this kind of automatic news show system, with interactive avatars, facial expressions, and multiple modular, dynamic scenes.

  • 162.
    Kameoka, Masahiro
    et al.
    Tokyo University of Science.
    Furukawa, Toshihiro
    Tokyo University of Science.
    Hayashi, Masaki
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    発見的手法によるWebニュースからのクイズ自動生成の試み [An attempt at automatic quiz generation from web news using a heuristic approach] (2015). In: Proceedings of Art and Science Forum 2015, 2015. Conference paper (Other academic).
    Abstract [ja]

    In this study, aiming to reuse web content as one form of content production technology, we attempt automatic generation of quiz content using web news articles as the information source. Previous work on automatic quiz generation has mainly used analytical methods; this study instead uses a heuristic approach, in which a computer simulates the thought process a human follows when creating a quiz from a source text. We show experimentally whether automatic quiz generation is possible with this heuristic approach, and discuss the results.

  • 163.
    Kameoka, Masahiro
    et al.
    Tokyo University of Science.
    Hayashi, Masaki
    Högskolan på Gotland, Institutionen för speldesign, teknik och lärande.
    Furukawa, Toshihiro
    Tokyo University of Science.
    Improvement of Automatic BBS Visualization in T2V: Animation considering dialogue structure (2013). Conference paper (Other academic).
    Abstract [en]

    T2V (Text-to-Vision) is a technology capable of automatic animated-movie generation, intended to assist individuals who have no special knowledge of animation production. This paper presents an improvement of the function that animates BBS (Bulletin Board System) content with this technology. T2V has a package (the 2ch converter) capable of animating "2channel" (the largest BBS in Japan). The present 2ch converter, however, does not support dialogue situations based on quotation marks, so it cannot produce animation with dialogue. In this paper we propose a method of animation production that takes the conversation structure in a BBS into account, to create more natural expression in the animation.

  • 164.
    Kaxiras, Stefanos
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Ros, Alberto
    A New Perspective for Efficient Virtual-Cache Coherence (2013). In: Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013, pp. 535-546. Conference paper (Refereed).
    Abstract [en]

    Coherent shared virtual memory (cSVM) is highly coveted for heterogeneous architectures as it will simplify programming across different cores and manycore accelerators. In this context, virtual L1 caches can be used to great advantage, e.g., saving energy consumption by eliminating address translation for hits. Unfortunately, multicore virtual-cache coherence is complex and costly because it requires reverse translation for any coherence request directed towards a virtual L1. The reason is the ambiguity of the virtual address due to the possibility of synonyms. In this paper, we take a radically different approach than all prior work, which is focused on reverse translation. We examine the problem from the perspective of the coherence protocol. We show that if a coherence protocol adheres to certain conditions, it operates effortlessly with virtual caches, without requiring reverse translations even in the presence of synonyms. We show that these conditions hold in a new class of simple and efficient request-response protocols that use both self-invalidation and self-downgrade. This results in a new solution for virtual-cache coherence, significantly less complex and more efficient than prior proposals. We study design choices for TLB placement under our proposal and compare them against those under a directory-MESI protocol. Our approach allows for choices that are particularly effective, for example combining all per-core TLBs in a single logical TLB in front of the last level cache. Significant area, energy, and performance benefits ensue as a result of simplifying the entire multicore memory organization.

  • 165.
    Khan, Muneeb
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Sembrant, Andreas
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Hagersten, Erik
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Low Overhead Instruction-Cache Modeling Using Instruction Reuse Profiles (2012). In: International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'12), IEEE Computer Society, 2012, pp. 260-269. Conference paper (Refereed).
    Abstract [en]

    Performance loss caused by L1 instruction cache misses varies between different architectures and cache sizes. For processors employing power-efficient in-order execution with small caches, performance can be significantly affected by instruction cache misses. The growing use of low-power multi-threaded CPUs (with shared L1 caches) in general purpose computing platforms requires new efficient techniques for analyzing application instruction cache usage. Such insight can be achieved using traditional simulation technologies modeling several cache sizes, but the overhead of simulators may be prohibitive for practical optimization usage. In this paper we present a statistical method to quickly model application instruction cache performance. Most importantly we propose a very low-overhead sampling mechanism to collect runtime data from the application's instruction stream. This data is fed to the statistical model which accurately estimates the instruction cache miss ratio for the sampled execution. Our sampling method is about 10x faster than previously suggested sampling approaches, with average runtime overhead as low as 25% over native execution. The architecturally-independent data collected is used to accurately model miss ratio for several cache sizes simultaneously, with average absolute error of 0.2%. Finally, we show how our tool can be used to identify program phases with large instruction cache footprint. Such phases can then be targeted to optimize for reduced code footprint.

  • 166.
    Koukos, Konstantinos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för datorteknik. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Efficient Execution Paradigms for Parallel Heterogeneous Architectures (2016). Doctoral thesis, comprising papers (Other academic).
    Abstract [en]

    This thesis proposes novel, efficient execution-paradigms for parallel heterogeneous architectures. The end of Dennard scaling is threatening the effectiveness of DVFS in future nodes; therefore, new execution paradigms are required to exploit the non-linear relationship between performance and energy efficiency of memory-bound application-regions. To attack this problem, we propose the decoupled access-execute (DAE) paradigm. DAE transforms regions of interest (at program-level) in two coarse-grain phases: the access-phase and the execute-phase, which we can independently DVFS. The access-phase is intended to prefetch the data in the cache, and is therefore expected to be predominantly memory-bound, while the execute-phase runs immediately after the access-phase (that has warmed-up the cache) and is therefore expected to be compute-bound.

    DAE achieves good energy savings (on average 25% lower EDP) without performance degradation, as opposed to other DVFS techniques. Furthermore, DAE increases the memory-level parallelism (MLP) of memory-bound regions, which results in performance improvements for memory-bound applications. To automatically transform application regions to DAE, we propose compiler techniques to automatically generate and incorporate the access-phase(s) in the application. Our work targets affine, non-affine, and even complex, general-purpose codes. Furthermore, we explore the benefits of software multi-versioning to optimize DAE in dynamic environments and to handle codes with statically unknown access-phase overheads. In general, applications automatically transformed to DAE by our compiler maintain (or in some cases even exceed) the good performance and energy efficiency of manually optimized DAE codes.

    Finally, to ease the programming environment of heterogeneous systems (with integrated GPUs), we propose a novel system-architecture that provides unified virtual memory with low overhead. The underlying insight behind our work is that existing data-parallel programming models are a good fit for relaxed memory consistency models (e.g., the heterogeneous race-free model). This allows us to simplify the coherency protocol between the CPU – GPU, as well as the GPU memory management unit. On average, we achieve 45% speedup and 45% lower EDP over the corresponding SC implementation.
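    The access/execute split described in the abstract can be sketched schematically. The snippet below is our own illustrative example, not code from the thesis or its compiler: the access phase merely touches the data (standing in for cache-warming prefetches, run at a low voltage-frequency setting), and the execute phase then performs the computation on the warmed cache (run at a high setting). All function names are hypothetical.

    ```python
    # Illustrative sketch of the decoupled access-execute (DAE) idea.
    # In a real DAE system the access phase issues prefetches and runs at low
    # frequency (memory-bound), while the execute phase runs at high frequency
    # (compute-bound); here the V/f switching is only indicated in comments.

    def coupled(data):
        # Original region: loads and computation interleaved.
        return [x * x + 1 for x in data]

    def access_phase(data):
        # Warm the cache: touch every element, discard the values.
        # (A DAE compiler would emit prefetch instructions here.)
        sink = 0
        for x in data:
            sink ^= x & 1  # side-effect-free touch of each element
        return sink

    def execute_phase(data):
        # Compute on now-cached data.
        return [x * x + 1 for x in data]

    def decoupled(data):
        access_phase(data)          # run at low V/f: memory-bound
        return execute_phase(data)  # run at high V/f: compute-bound

    data = list(range(8))
    assert coupled(data) == decoupled(data)  # same result either way
    ```

    The point of the transformation is that the two phases have sharply different memory/compute character, so each can be run at its locally optimal voltage-frequency point without changing program semantics.
    
    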

    List of papers
    1. Towards more efficient execution: a decoupled access-execute approach
    2013 (English). In: Proc. 27th ACM International Conference on Supercomputing, New York: ACM Press, 2013, pp. 253-262. Conference paper, published paper (Refereed).
    Abstract [en]

    The end of Dennard scaling is expected to shrink the range of DVFS in future nodes, limiting the energy savings of this technique. This paper evaluates how much we can increase the effectiveness of DVFS by using a software decoupled access-execute approach. Decoupling the data access from execution allows us to apply optimal voltage-frequency selection for each phase and therefore improve energy efficiency over standard coupled execution.

    The underlying insight of our work is that by decoupling access and execute we can take advantage of the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency, while maintaining good performance. To demonstrate this we built a task based parallel execution infrastructure consisting of: (1) a runtime system to orchestrate the execution, (2) power models to predict optimal voltage-frequency selection at runtime, (3) a modeling infrastructure based on hardware measurements to simulate zero-latency, per-core DVFS, and (4) a hardware measurement infrastructure to verify our model's accuracy.

    Based on real hardware measurements, we project that the combination of decoupled access-execute and DVFS has the potential to improve EDP by 25% without hurting performance. On memory-bound applications we significantly improve performance due to increased MLP in the access phase and ILP in the execute phase. Furthermore, we demonstrate that our method can achieve high performance in both the presence and absence of a hardware prefetcher.

    Place, publisher, year, edition, pages
    New York: ACM Press, 2013
    Keywords
    Task-Based Execution, Decoupled Execution, Performance, Energy, DVFS
    HSV category
    Research subject
    Datorteknik
    Identifiers
    urn:nbn:se:uu:diva-203239 (URN); 10.1145/2464996.2465012 (DOI); 978-1-4503-2130-3 (ISBN)
    Conference
    ICS 2013, June 10-14, Eugene, OR
    Projects
    LPGPU FP7-ICT-288653; UPMARC
    Research funder
    EU, FP7, Seventh Framework Programme, ICT-288653; Swedish Research Council
    Available from: 2013-07-06 Created: 2013-07-05 Last updated: 2016-09-02. Bibliographically checked.
    2. Fix the code. Don't tweak the hardware: A new compiler approach to Voltage–Frequency scaling
    2014 (English). In: Proc. 12th International Symposium on Code Generation and Optimization, New York: ACM Press, 2014, pp. 262-272. Conference paper, published paper (Refereed).
    Place, publisher, year, edition, pages
    New York: ACM Press, 2014
    HSV category
    Identifiers
    urn:nbn:se:uu:diva-212778 (URN); 978-1-4503-2670-4 (ISBN)
    Conference
    CGO 2014, February 15-19, Orlando, FL
    Projects
    UPMARC
    Available from: 2014-02-19 Created: 2013-12-13 Last updated: 2018-01-11. Bibliographically checked.
    3. Multiversioned decoupled access-execute: The key to energy-efficient compilation of general-purpose programs
    2016 (English). In: Proc. 25th International Conference on Compiler Construction, New York: ACM Press, 2016, pp. 121-131. Conference paper, published paper (Refereed).
    Abstract [en]

    Computer architecture design faces an era of great challenges in an attempt to simultaneously improve performance and energy efficiency. Previous hardware techniques for energy management have become severely limited, and thus compilers play an essential role in matching the software to the more restricted hardware capabilities. One promising approach is software decoupled access-execute (DAE), in which the compiler transforms the code into coarse-grain phases that are well-matched to the Dynamic Voltage and Frequency Scaling (DVFS) capabilities of the hardware. While this method has proved efficient for statically analyzable codes, general-purpose applications pose significant challenges due to pointer aliasing, complex control flow and unknown runtime events. We propose a universal compile-time method to decouple general-purpose applications, using simple but efficient heuristics. Our solutions overcome the challenges of complex code and show that automatic decoupled execution significantly reduces the energy expenditure of irregular or memory-bound applications and even yields slight performance boosts. Overall, our technique achieves over 20% average energy-delay-product (EDP) improvement (energy over 15% and performance over 5%) across 14 benchmarks from the SPEC CPU 2006 and Parboil benchmark suites, with peak EDP improvements surpassing 70%.

    Place, publisher, year, edition, pages
    New York: ACM Press, 2016
    HSV category
    Identifiers
    urn:nbn:se:uu:diva-283200 (URN); 10.1145/2892208.2892209 (DOI); 000389808800012 (); 9781450342414 (ISBN)
    Conference
    CC 2016, March 17–18, Barcelona, Spain
    Projects
    UPMARC
    Available from: 2016-03-17 Created: 2016-04-11 Last updated: 2018-12-03. Bibliographically checked.
    4. Building Heterogeneous Unified Virtual Memories (UVMs) without the Overhead
    2016 (English). In: ACM Transactions on Architecture and Code Optimization (TACO), ISSN 1544-3566, E-ISSN 1544-3973, vol. 13, no. 1, article id 1. Journal article (Refereed), published.
    Abstract [en]

    This work proposes a novel scheme to facilitate heterogeneous systems with unified virtual memory. Research proposals implement coherence protocols for sequential consistency (SC) between central processing unit (CPU) cores and between devices. Such mechanisms introduce severe bottlenecks in the system; therefore, we adopt the heterogeneous-race-free (HRF) memory model. The use of HRF simplifies the coherency protocol and the graphics processing unit (GPU) memory management unit (MMU). Our protocol optimizes CPU and GPU demands separately, with the GPU part being simpler while the CPU is more elaborate and latency aware. We achieve an average 45% speedup and 45% energy-delay product reduction (20% energy) over the corresponding SC implementation.
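    As a sanity check on the figures quoted above (our own arithmetic, not taken from the article): EDP is energy multiplied by delay, so a 45% speedup shrinks delay to 1/1.45 of baseline, and combined with the stated 20% energy reduction the EDP falls by roughly 45%, matching the reported reduction.

    ```python
    # EDP (energy-delay product) arithmetic; illustrative check, not from the paper.
    energy_ratio = 0.80        # 20% energy reduction
    delay_ratio = 1 / 1.45     # 45% speedup shrinks delay to ~0.69x
    edp_ratio = energy_ratio * delay_ratio
    reduction = 1 - edp_ratio  # ~0.448, i.e. ~45% EDP reduction
    assert abs(reduction - 0.448) < 0.002
    ```
    
    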

    Keywords
    Multicore; heterogeneous coherence; GPU MMU design; virtual coherence protocol; directory-less protocol
    HSV category
    Identifiers
    urn:nbn:se:uu:diva-295765 (URN); 10.1145/2889488 (DOI); 000373904600001 ()
    Projects
    UPMARC
    Research funder
    EU, FP7, Seventh Framework Programme, FP7-ICT-288653; EU, European Research Council, TIN2012-38341-C04-03
    Available from: 2016-04-05 Created: 2016-06-09 Last updated: 2017-11-30. Bibliographically checked.
  • 167.
    Koukos, Konstantinos
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Black-Schaffer, David
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Towards more efficient execution: a decoupled access-execute approach (2013). In: Proc. 27th ACM International Conference on Supercomputing, New York: ACM Press, 2013, pp. 253-262. Conference paper (Refereed)
    Abstract [en]

    The end of Dennard scaling is expected to shrink the range of DVFS in future nodes, limiting the energy savings of this technique. This paper evaluates how much we can increase the effectiveness of DVFS by using a software decoupled access-execute approach. Decoupling the data access from execution allows us to apply optimal voltage-frequency selection for each phase and therefore improve energy efficiency over standard coupled execution.

    The underlying insight of our work is that by decoupling access and execute we can take advantage of the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency, while maintaining good performance. To demonstrate this we built a task-based parallel execution infrastructure consisting of: (1) a runtime system to orchestrate the execution, (2) power models to predict optimal voltage-frequency selection at runtime, (3) a modeling infrastructure based on hardware measurements to simulate zero-latency, per-core DVFS, and (4) a hardware measurement infrastructure to verify our model's accuracy.

    Based on real hardware measurements we project that the combination of decoupled access-execute and DVFS has the potential to improve EDP by 25% without hurting performance. On memory-bound applications we significantly improve performance due to increased MLP in the access phase and ILP in the execute phase. Furthermore, we demonstrate that our method can achieve high performance both in the presence and absence of a hardware prefetcher.
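The per-phase voltage-frequency selection described above can be illustrated with a toy energy model. This is a sketch under assumed, purely illustrative power and timing numbers, not the paper's models or measurements:

```python
# Toy model of decoupled access-execute with per-phase DVFS (illustrative
# numbers only; not the paper's infrastructure or data).

def edp(phases):
    """Energy-delay product of a schedule: (total energy) * (total time).
    Each phase is a (duration in s, power in W) pair."""
    time = sum(t for t, p in phases)
    energy = sum(t * p for t, p in phases)
    return energy * time

# Coupled execution: one phase mixing memory stalls and compute, forced
# to run at the high voltage/frequency point (assumed 10 W for 2 s).
coupled = [(2.0, 10.0)]

# Decoupled: the memory-bound access phase tolerates a low frequency
# (1.0 s at 4 W), and the execute phase, with data already in cache,
# finishes faster at full frequency (0.8 s at 10 W).
decoupled = [(1.0, 4.0), (0.8, 10.0)]

improvement = 1 - edp(decoupled) / edp(coupled)
```

Under these assumed numbers the decoupled schedule wins on EDP because the cheap access phase cuts energy while the prefetch-warmed execute phase recovers the time.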

  • 168.
    Koukos, Konstantinos
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Black-Schaffer, David
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Spiliopoulos, Vasileios
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Towards Power Efficiency on Task-Based, Decoupled Access-Execute Models (2013). In: PARMA 2013, 4th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures, 2013. Conference paper (Refereed)
    Abstract [en]

    This work demonstrates the potential of hardware and software optimization to improve the effectiveness of dynamic voltage and frequency scaling (DVFS). For software, we decouple data prefetch (access) and computation (execute) to enable optimal DVFS selection for each phase. For hardware, we use measurements from state-of-the-art multicore processors to accurately model the potential of per-core, zero-latency DVFS. We demonstrate that the combination of decoupled access-execute and precise DVFS has the potential to decrease EDP by 25-30% without reducing performance.

    The underlying insight in this work is that by decoupling access and execute we can take advantage of the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency. For the memory-bound access phase, where we prefetch data into the cache from main memory, we can run at a reduced frequency and voltage without hurting performance. Thereafter, the execute phase can run much faster, thanks to the prefetching of the access phase, and achieve higher performance. This decoupled program behavior allows us to achieve more effective use of DVFS than standard coupled executions which mix data access and compute.

    To understand the potential of this approach, we measure application performance and power consumption on a modern multicore system across a range of frequencies and voltages. From this data we build a model that allows us to analyze the effects of per-core, zero-latency DVFS. The results of this work demonstrate the significant potential for finer-grain DVFS in combination with DVFS-optimized software.

  • 169.
    Koukos, Konstantinos
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Ros, Alberto
    Hagersten, Erik
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Kaxiras, Stefanos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation.
    Building Heterogeneous Unified Virtual Memories (UVMs) without the Overhead (2016). In: ACM Transactions on Architecture and Code Optimization (TACO), ISSN 1544-3566, E-ISSN 1544-3973, Vol. 13, no. 1, article id 1. Journal article (Refereed)
    Abstract [en]

    This work proposes a novel scheme to facilitate heterogeneous systems with unified virtual memory. Research proposals implement coherence protocols for sequential consistency (SC) between central processing unit (CPU) cores and between devices. Such mechanisms introduce severe bottlenecks in the system; therefore, we adopt the heterogeneous-race-free (HRF) memory model. The use of HRF simplifies the coherency protocol and the graphics processing unit (GPU) memory management unit (MMU). Our protocol optimizes CPU and GPU demands separately, with the GPU part being simpler while the CPU is more elaborate and latency aware. We achieve an average 45% speedup and 45% energy-delay product reduction (20% energy) over the corresponding SC implementation.

  • 170.
    Kumar, Rakesh
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik. Univ Edinburgh, Edinburgh, Midlothian, Scotland.
    Grot, Boris
    Univ Edinburgh, Edinburgh, Midlothian, Scotland.
    Nagarajan, Vijay
    Univ Edinburgh, Edinburgh, Midlothian, Scotland.
    Blasting Through The Front-End Bottleneck With Shotgun (2018). In: ACM SIGPLAN Notices, 2018, Vol. 53, no. 2, pp. 30-42. Conference paper (Refereed)
    Abstract [en]

    The front-end bottleneck is a well-established problem in server workloads owing to their deep software stacks and large instruction working sets. Despite years of research into effective L1-I and BTB prefetching, state-of-the-art techniques force a trade-off between performance and metadata storage costs. This work introduces Shotgun, a BTB-directed front-end prefetcher powered by a new BTB organization that maintains a logical map of an application's instruction footprint, which enables high-efficacy prefetching at low storage cost. To map active code regions, Shotgun precisely tracks an application's global control flow (e.g., function and trap routine entry points) and summarizes local control flow within each code region. Because the local control flow enjoys high spatial locality, with most functions comprised of a handful of instruction cache blocks, it lends itself to a compact region-based encoding. Meanwhile, the global control flow is naturally captured by the application's unconditional branch working set (calls, returns, traps). Based on these insights, Shotgun devotes the bulk of its BTB capacity to branches responsible for the global control flow and a spatial encoding of their target regions. By effectively capturing a map of the application's instruction footprint in the BTB, Shotgun enables highly effective BTB-directed prefetching. Using a storage budget equivalent to a conventional BTB, Shotgun outperforms the state-of-the-art BTB-directed front-end prefetcher by up to 14% on a set of varied commercial workloads.
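The region-based footprint idea can be sketched in a few lines. This is an illustrative simplification: the block size, footprint width, and entry layout below are assumptions for the sketch, not Shotgun's actual BTB organization:

```python
# Sketch of a Shotgun-style BTB entry (simplified): entries for
# unconditional branches keep a compact bit-vector "footprint" of the
# instruction-cache blocks touched near the call target; on a predicted
# call, the front end prefetches exactly those blocks.

BLOCK = 64         # instruction cache block size in bytes (assumed)
REGION_BLOCKS = 8  # footprint width: blocks tracked per target region

class UBTBEntry:
    def __init__(self, target):
        self.target = target
        self.footprint = [False] * REGION_BLOCKS

    def record_fetch(self, addr):
        """Summarize local control flow: mark the fetched block's offset
        from the region base (the branch target)."""
        off = (addr - self.target) // BLOCK
        if 0 <= off < REGION_BLOCKS:
            self.footprint[off] = True

    def prefetch_blocks(self):
        """Block addresses to prefetch when this branch is predicted."""
        return [self.target + i * BLOCK
                for i, hit in enumerate(self.footprint) if hit]

entry = UBTBEntry(target=0x40000)
for pc in (0x40004, 0x40048, 0x40100):  # fetches observed in the callee
    entry.record_fetch(pc)
```

The point of the encoding is storage: one entry plus eight bits summarizes a whole code region instead of one BTB entry per branch inside it.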

  • 171.
    Kähkönen, Christian
    Högskolan på Gotland, Institutionen för speldesign, teknik och lärande.
    How to create a 3D character model for a pre-existing live action film, that matches the characteristics of the intellectual property and the visual style of the chosen film (2012). Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    My aim is to find out how to create a 3D character model for a pre-existing live action film, give this character characteristics that match the intellectual property, and follow the visual style of the chosen film. As my example in this degree project, I chose Disney's adaptation of John Carter of Mars.

    I used my own pipeline, a collection of work methods from different artists, for the creation of the example 3D character model, though I only bring the model through the first two steps, in keeping with the scope of this thesis.

    In order to create this model, I researched the universe of John Carter and the visual style of the film, and from that knowledge I designed a character to model in 3D.

    The finished 3D character model of this degree project was then compared to models from the production of John Carter of Mars, both by the author and through a survey, to evaluate the result.

  • 172.
    Lampa, Samuel
    et al.
    Uppsala universitet, Medicinska och farmaceutiska vetenskapsområdet, Farmaceutiska fakulteten, Institutionen för farmaceutisk biovetenskap.
    Alvarsson, Jonathan
    Uppsala universitet, Medicinska och farmaceutiska vetenskapsområdet, Farmaceutiska fakulteten, Institutionen för farmaceutisk biovetenskap.
    Spjuth, Ola
    Uppsala universitet, Medicinska och farmaceutiska vetenskapsområdet, Farmaceutiska fakulteten, Institutionen för farmaceutisk biovetenskap. Uppsala universitet, Science for Life Laboratory, SciLifeLab.
    Towards agile large-scale predictive modelling in drug discovery with flow-based programming design principles (2016). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 8, article id 67. Journal article (Refereed)
    Abstract [en]

    Predictive modelling in drug discovery is challenging to automate as it often contains multiple analysis steps and might involve cross-validation and parameter tuning that create complex dependencies between tasks. With large-scale data or when using computationally demanding modelling methods, e-infrastructures such as high-performance or cloud computing are required, adding to the existing challenges of fault-tolerant automation. Workflow management systems can aid in many of these challenges, but the currently available systems are lacking in the functionality needed to enable agile and flexible predictive modelling. We here present an approach inspired by elements of the flow-based programming paradigm, implemented as an extension of the Luigi system which we name SciLuigi. We also discuss the experiences from using the approach when modelling a large set of biochemical interactions using a shared computer cluster.
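The flow-based programming element borrowed here is that components expose named ports and the workflow is defined separately, by connecting ports. A minimal, dependency-free sketch of that wiring style; the actual SciLuigi API differs, and the task names below are made up:

```python
# Flow-based wiring sketch: tasks are components with named input ports;
# the network (who feeds whom) is declared separately from the task logic.

class Task:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
        self.in_ports = {}  # port name -> (upstream task, upstream out port)
        self.out = {}       # out port name -> value, filled by run()

    def connect(self, port, upstream, out_port):
        """Declare the network: wire an input port to an upstream output."""
        self.in_ports[port] = (upstream, out_port)

    def run(self):
        inputs = {p: up.out[op] for p, (up, op) in self.in_ports.items()}
        self.out = self.fn(**inputs)

# Toy components of a modelling pipeline (illustrative names).
load = Task("load", lambda: {"data": [3, 1, 2]})
train = Task("train", lambda data: {"model": sorted(data)})
train.connect("data", load, "data")

for task in (load, train):  # topological order, hard-coded for brevity
    task.run()
```

Because the connections live outside the components, the same tasks can be rewired into a different pipeline (e.g. for cross-validation folds) without touching their code, which is the agility the paper is after.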

  • 173.
    Lampka, Kai
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    With Real-time Performance Analysis and Monitoring to Timing Predictable Use of Multi-core Architectures (2013). In: Runtime Verification, Springer Berlin/Heidelberg, 2013, pp. 400-402. Conference paper (Refereed)
  • 174.
    Lampka, Kai
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik. Elektrobit Automot, Erlangen, Germany..
    Bondorf, Steffen
    Univ Kaiserslautern, Distributed Comp Syst DISCO Lab, Kaiserslautern, Germany..
    Schmitt, Jens B.
    Univ Kaiserslautern, Distributed Comp Syst DISCO Lab, Kaiserslautern, Germany..
    Guan, Nan
    Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China..
    Wang, Yi
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Generalized Finitary Real-Time Calculus (2017). In: IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, IEEE, 2017. Conference paper (Other academic)
    Abstract [en]

    Real-time Calculus (RTC) is a non-stochastic queuing theory to the worst-case performance analysis of distributed real-time systems. Workload as well as resources are modelled as piece-wise linear, pseudo-periodic curves and the system under investigation is modelled as a sequence of algebraic operations over these curves. The memory footprint of computed curves increases exponentially with the sequence of operations and RTC may become computationally infeasible fast. Recently, Finitary RTC has been proposed to counteract this problem. Finitary RTC restricts curves to finite input domains and thereby counteracts the memory demand explosion seen with pseudo periodic curves of common RTC implementations. However, the proof to the correctness of Finitary RTC specifically exploits the operational semantic of the greed processing component (GPC) model and is tied to the maximum busy window size. This is an inherent limitation, which prevents a straight-forward generalization. In this paper, we provide a generalized Finitary RTC that abstracts from the operational semantic of a specific component model and reduces the finite input domains of curves even further. The novel approach allows for faster computations and the extension of the Finitary RTC idea to a much wider range of RTC models.
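The algebraic operations over curves that the abstract refers to are (min,+) operators, and restricting curves to finite input domains, as Finitary RTC does, makes them directly computable. A minimal sketch with made-up curves, where a list index is a time step and the value is the curve at that step:

```python
# Core RTC operator: (min,+) convolution over curves with finite input
# domains. Curves are plain lists: index t holds the curve value at time t.

def min_plus_conv(a, b):
    """(a (x) b)(t) = min over 0 <= s <= t of a(s) + b(t - s)."""
    n = min(len(a), len(b))
    return [min(a[s] + b[t - s] for s in range(t + 1)) for t in range(n)]

# Illustrative curves on the finite domain {0..7}: a staircase arrival
# curve and a rate-latency service curve beta(t) = max(0, t - 2).
alpha = [2, 2, 4, 4, 6, 6, 8, 8]
beta = [max(0, t - 2) for t in range(8)]
out = min_plus_conv(alpha, beta)
```

On a finite domain the convolution is a double loop; the memory explosion the paper targets comes from representing the same operator exactly on unbounded pseudo-periodic curves.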

  • 175.
    Lampka, Kai
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Siegle, Markus
    University of the Federal Forces Germany.
    A Symbolic Approach to the Analysis of Multi-Formalism Markov Reward Models (2013). In: Theory and Application of Multi-Formalism Modeling, Pennsylvania: IGI Global, 2013, 1. Book chapter (Refereed)
    Abstract [en]

    With complex systems and complex requirements being a challenge that designers must face to reach quality results, multi-formalism modeling offers tools and methods that allow modelers to exploit the benefits of different techniques in a general framework intended to address these challenges.

    Theory and Application of Multi-Formalism Modeling boldly explores the importance of this topic by gathering experiences, theories, applications, and solutions from diverse perspectives of those involved with multi-formalism modeling. Professionals, researchers, academics, and students in this field will be able to critically evaluate the latest developments and future directions of multi-formalism research.

  • 176.
    Lantz, Olof
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Fysiska sektionen, Institutionen för fysik och astronomi, Tillämpad kärnfysik.
    Virtualiserad testmiljö: Utvärdering av virtualiseringsprogramvaror [Virtualized test environment: Evaluation of virtualization software] (2014). Independent thesis Basic level (professional degree), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    Virtualization has been increasingly adopted over the last decade, and virtualized environments are going to be an important part of how computers are used in the near future. There are many advantages to virtualization, and different methods have been developed to make it as efficient as possible.

    Forsmarks Kraftgrupp was interested in the possibility of taking advantage of virtualization in their testing environment.

    In this report, hypervisors of type 1 and type 2, as well as containers, have been evaluated to determine which method and which program is preferable on a server cluster of four HP ProLiant DL380 Generation 4 machines. Because of the hardware specifications of the DL380, the focus has been on virtualization programs that do not require hardware-assisted virtualization.

    The results show that it is possible to use some of the type 2 hypervisors on the HP ProLiant DL380 Generation 4. The suggested virtualization programs are VMware Workstation and Oracle VirtualBox.

  • 177.
    Lind, Simon
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Biologiska sektionen, Institutionen för biologisk grundutbildning.
    Distributed Ensemble Learning With Apache Spark (2016). Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
  • 178.
    Lindén, Jonatan
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Jonsson, Bengt
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention (2013). In: OPODIS 2013: 17th International Conference On Principles Of DIstributed Systems / [ed] Roberto Baldoni, Nicolas Nisse, Maarten van Steen, Berlin: Springer Berlin/Heidelberg, 2013, pp. 206-220. Conference paper (Refereed)
    Abstract [en]

    Priority queues are fundamental to many multiprocessor applications. Several priority queue algorithms based on skiplists have been proposed, as skiplists allow concurrent accesses to different parts of the data structure in a simple way. However, for priority queues on multiprocessors, an inherent bottleneck is the operation that deletes the minimal element. We present a linearizable, lock-free, concurrent priority queue algorithm, based on skiplists, which minimizes the contention for shared memory that is caused by the DeleteMin operation. The main idea is to minimize the number of global updates to shared memory that are performed in one DeleteMin. In comparison with other skiplist-based priority queue algorithms, our algorithm achieves a 30-80% improvement.
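The idea of minimizing global updates per DeleteMin can be illustrated sequentially: delete logically with one cheap flag write, and batch the physical restructuring of the shared structure. This is a sketch of the idea only; the paper's algorithm is lock-free and skiplist-based, whereas this stand-in uses a plain sorted list and runs single-threaded:

```python
# Sequential illustration of "logical delete now, physical cleanup later".
import bisect

class PriorityQueue:
    def __init__(self, cleanup_threshold=8):
        self.items = []  # [key, deleted-flag] pairs kept sorted by key
        self.threshold = cleanup_threshold

    def insert(self, key):
        bisect.insort(self.items, [key, False])

    def delete_min(self):
        """Mark the first unmarked node (one write); restructure only
        once a long prefix of marked nodes has built up."""
        for node in self.items:
            if not node[1]:
                node[1] = True
                self._maybe_cleanup()
                return node[0]
        return None  # queue empty

    def _maybe_cleanup(self):
        # Batched physical deletion: amortizing the structural updates is
        # what reduces shared-memory contention in the concurrent setting.
        prefix = 0
        while prefix < len(self.items) and self.items[prefix][1]:
            prefix += 1
        if prefix >= self.threshold:
            del self.items[:prefix]

pq = PriorityQueue()
for k in (5, 1, 3):
    pq.insert(k)
```

In the concurrent version the flag write is a single CAS on the minimal node, so most DeleteMin calls touch one cache line instead of rewriting skiplist pointers.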

  • 179.
    Ljungberg, Jens
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Tekniska sektionen, Institutionen för teknikvetenskaper, Elektricitetslära.
    Evaluation of a Centralized Substation Protection and Control System for HV/MV Substation (2018). Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Today, conventional substation protection and control systems are of a widely distributed character. One substation can easily have as many as 50 data processing points that all perform similar algorithms on voltage and current data. There is also only limited communication between protection devices, and each device is only aware of the bay in which it is installed. With the intent of implementing a substation protection system that is simpler, more efficient, and better suited for future challenges, Ellevio AB implemented a centralized system in a primary substation in 2015. It comprises five components that each handle one type of duty: data processing, communication, voltage measurements, current measurements, and breaker control. Since its implementation, the centralized system has been in parallel operation with the conventional one, meaning that it performs station-wide data acquisition, processing, and communication, but is unable to trip the station breakers. The only active functionality of the centralized system is the voltage regulation. This work is an evaluation of the centralized system and studies its protection functionality, voltage regulation, fault response, and output signal correlation with the conventional system. It was found that the centralized system required the implementation of a differential protection function and protection of the capacitor banks and busbar coupling to provide protection equivalent to that of the conventional system. The voltage regulation showed unsatisfactorily long regulation times, which could have been a result of low time resolution. The fault response and signal correlation were deemed satisfactory.

  • 180.
    Lundstedt, Magnus
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Teknisk-naturvetenskapliga fakulteten.
    Implementation and Evaluation of Image Retrieval Method Utilizing Geographic Location Metadata (2009). Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Multimedia retrieval systems are very important today, with millions of content creators all over the world generating huge multimedia archives. Recent developments allow for content-based image and video retrieval. These methods are often quite slow, especially if applied to a library of millions of media items.

    In this research a novel image retrieval method is proposed, which utilizes spatial metadata on images. By finding clusters of images based on their geographic location (the spatial metadata) and combining this information with existing content-based image retrieval algorithms, the proposed method enables efficient presentation of high-quality image retrieval results to system users.

    Clustering methods considered include Vector Quantization, Vector Quantization LBG, and DBSCAN. Clustering was performed on three different similarity measures: spatial metadata, histogram similarity, or texture similarity.

    For histogram similarity there are many different distance metrics to use when comparing histograms. Euclidean, Quadratic Form, and Earth Mover's Distance were studied, as well as three different color spaces: RGB, HSV, and CIE Lab.
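Of the distance metrics compared, the Euclidean distance between normalized histograms is the simplest. A small sketch with made-up 4-bin histograms (the bin layout and counts are illustrative, not from the thesis):

```python
# Euclidean (L2) distance between normalized color histograms.
import math

def normalize(h):
    """Scale a histogram so its bins sum to 1, making images of
    different sizes comparable."""
    total = sum(h)
    return [v / total for v in h]

def euclidean(h1, h2):
    """L2 distance between two equal-length histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

# Toy 4-bin histograms for two images (made-up counts).
img_a = normalize([12, 3, 0, 1])
img_b = normalize([10, 4, 1, 1])
d = euclidean(img_a, img_b)
```

Euclidean distance treats bins independently; the Quadratic Form and Earth Mover's distances mentioned above additionally account for cross-bin similarity, at higher computational cost.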

  • 181.
    Löscher, Andreas
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datalogi.
    Tsiftes, Nicolas
    Voigt, Thiemo
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Handziski, Vlado
    Efficient and Flexible Sensornet Checkpointing (2014). In: Wireless Sensor Networks, volume 8354, 2014, pp. -65. Conference paper (Refereed)
    Abstract [en]

    Developing sensornet software is difficult partly because of the limited visibility of the system state of deployed nodes. Sensornet checkpointing is a method that allows developers to save and restore the full system state of nodes. We present four extensions to sensornet checkpointing—compression, binary diffs, selective checkpointing, and checkpoint inspection—that reduce the time required for checkpointing operations considerably, and improve the granularity at which system state can be examined and manipulated down to the variable level. We show through an experimental evaluation that checkpoint sizes can be reduced by 70%-93%, and the time by at least 50%, because of these improvements. The reduced time and increased granularity benefit multiple checkpointing use cases, including automated testing, network visualization, and software debugging.
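Two of the four extensions, compression and binary diffs, can be sketched with standard tools. Here zlib and a naive byte-wise diff stand in for whatever encoding the authors' implementation uses; the checkpoint contents are made up:

```python
# Sketch: compress a checkpoint, and ship only a binary diff against the
# previous checkpoint instead of the full system state.
import zlib

def diff(old, new):
    """Offsets and bytes where the new checkpoint differs from the old
    (assumes equal-length checkpoints, as full-RAM snapshots would be)."""
    return [(i, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]

def apply_diff(old, patch):
    """Reconstruct the new checkpoint from the old one plus the patch."""
    data = bytearray(old)
    for i, b in patch:
        data[i] = b
    return bytes(data)

ckpt1 = bytes(512)                    # toy "full system state": all zeros
ckpt2 = bytearray(ckpt1)
ckpt2[100], ckpt2[200] = 7, 9         # two variables changed since ckpt1
ckpt2 = bytes(ckpt2)

patch = diff(ckpt1, ckpt2)            # 2 changed bytes instead of 512
compressed = zlib.compress(ckpt2)     # sparse state compresses well
restored = apply_diff(ckpt1, patch)
```

Both tricks attack the same bottleneck: the time to move checkpoint bytes over a slow sensornet link dominates, so sending less data shortens the operation.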

  • 182.
    Mann, I. R.
    et al.
    Univ Alberta, Dept Phys, Edmonton, AB, Canada.
    Di Pippo, S.
    United Nations Off Vienna, Off Outer Space Affairs, Vienna, Austria.
    Opgenoorth, Hermann Josef
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Fysiska sektionen, Institutet för rymdfysik, Uppsalaavdelningen. Univ Leicester, Dept Phys & Astron, Leicester, Leics, England.
    Kuznetsova, M.
    NASA, Goddard Spaceflight Ctr, Greenbelt, MD USA.
    Kendall, D. J.
    Canadian Space Agcy, St Hubert, PQ, Canada.
    International Collaboration Within the United Nations Committee on the Peaceful Uses of Outer Space: Framework for International Space Weather Services (2018-2030) (2018). In: Space Weather: The international journal of research and applications, ISSN 1542-7390, E-ISSN 1542-7390, Vol. 16, no. 5, pp. 428-433. Journal article (Other academic)
    Abstract [en]

    Severe space weather is a global threat that requires a coordinated global response. In this Commentary, we review some previous successful actions supporting international coordination between member states in the United Nations (UN) context and make recommendations for a future approach. Member states of the UN Committee on the Peaceful Uses of Outer Space (COPUOS) recently approved new guidelines related to space weather under actions for the long-term sustainability of outer space activities. This is to be followed by UN Conference on the Exploration and Peaceful Uses of Outer Space (UNISPACE)+50, which will take place in June 2018 on the occasion of the fiftieth anniversary of the first UNISPACE I held in Vienna in 1968. Expanded international coordination has been proposed within COPUOS under the UNISPACE+50 process, where priorities for 2018-2030 are to be defined under Thematic Priority 4: Framework for International Space Weather Services. The COPUOS expert group for space weather has proposed the creation of a new International Coordination Group for Space Weather be implemented as part of this thematic priority. This coordination group would lead international coordination between member states and across international stakeholders, monitor progress against implementation of guidelines and best practices, and promote coordinated global efforts in the space weather ecosystem spanning observations, research, modeling, and validation, with the goal of improved space weather services. We argue that such improved coordination at the international policy level is essential for increasing global resiliency against the threats arising from severe space weather.

  • 183.
    Mohammedsalih, Salah
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Samhällsvetenskapliga fakulteten, Institutionen för informatik och media, Människa-datorinteraktion.
    Mobile Journalism: Using smartphone in journalistic work (2017). Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Mobile phones have had a drastic influence on media production by providing ubiquitous connectivity. This revolution came about when the smartphone turned into a powerful tool that can do almost all the production-related work previously done by specialized equipment and computers. This has encouraged ordinary individuals to get involved in media work and given rise to the phenomenon of mobile journalism, where citizens can engage in journalistic work that for a long time was done only by journalists. Hundreds of thousands of prosumers and amateurs are making and covering news with their smartphones and contributing to journalism. This has become particularly apparent in reporting from remote and risky areas, where journalists cannot easily reach or may not arrive in time while important events occur. This was obvious during the Arab Spring, in the role smartphones played in feeding both social media and traditional media with instant photos and videos taken by the protesters themselves. This thesis focuses on the role of the smartphone in facilitating the work of journalists.

    As part of the literature review, the author went through many texts, watched videos, and listened to radio shows in which journalists and media workers talk about their own experience of practicing mobile journalism. From a phenomenological perspective and framework, the experience of technology and the user aspects of mobile journalism are then investigated. As the aim of this thesis is not to validate a hypothesis or a theory, a qualitative research method is used to arrive at an evaluation and explanation of the phenomenon of using mobile phones in journalism. For that purpose, several qualitative methods have been used to collect data, such as auto-ethnography, observation, interviews, and focus groups. The data were collected mainly in the Kurdistan region in northern Iraq, where journalists were covering news of war on dangerous battlefields.

    The findings show that the main factors that make smartphones powerful tools for journalists are: the low budget required to acquire a smartphone compared to the expensive equipment used in traditional media; the freedom and independence a mobile phone can give a journalist; and the design aspects, which provide a pocket-size, inconspicuous tool that can be carried and used even in areas where journalistic work is not allowed. The ubiquity of the mobile phone has helped to cover news in areas that traditional media cannot reach easily or at all. The ability of individuals to obtain a smartphone, on the one hand, and the universal design of the mobile phone, on the other, have allowed many people to use it for journalistic work without formal training. This situation has created a good opportunity for media institutions and TV stations to expand their correspondent networks across entire countries.

  • 184.
    Mohaqeqi, Morteza
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Abdullah, Jakaria
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Ekberg, Pontus
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Yi, Wang
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Refinement of workload models for engine controllers by state space partitioning (2017). In: 29th Euromicro Conference on Real-Time Systems: ECRTS 2017, Dagstuhl, Germany: Leibniz-Zentrum für Informatik, 2017, pp. 11:1-22. Conference paper (Refereed)
  • 185.
    Mottola, Luca
    et al.
    SICS and Politecnico di Milano.
    Voigt, Thiemo
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Picco, G. P.
    Univ of Trento.
    Electronically-switched Directional Antennas for Wireless Sensor Networks: A Full-stack Evaluation (2013). In: IEEE SECON, 2013. Conference paper (Refereed)
  • 186.
    Mustini, Jeton
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi.
    Development of a cloud service and a mobile client that visualizes business data stored in Microsoft Dynamics CRM (2015). Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In this master thesis a prototype application is developed to help decision makers analyze data and present it so that they can make business decisions more easily. The application consists of a client application, a cloud service, and a Microsoft Dynamics CRM system. The client application is developed as a Windows Store app, and the cloud service as a web application using ASP.NET Web API. From the client, users connect to the cloud service by providing a set of user credentials. These credentials are then used against the user's Microsoft Dynamics CRM server to retrieve business data. In a component of the cloud service, the data is transformed into useful information defined by key performance indicators. The user's hierarchical organization structure is also replicated in the cloud service, enabling users to drill down and up between organizational units and view their key performance indicators. These key performance indicators are finally returned to the client and presented on a dashboard using interactive charts.
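The drill-down over the replicated organization hierarchy amounts to aggregating KPIs over a tree: a unit's indicator combines its own value with everything below it. A sketch with a made-up structure and a single made-up revenue KPI:

```python
# Sketch of KPI drill-down over an organizational tree (names and the
# KPI rule are illustrative, not from the thesis).

class OrgUnit:
    def __init__(self, name, revenue=0):
        self.name, self.revenue, self.children = name, revenue, []

    def add(self, child):
        self.children.append(child)
        return child

    def kpi_revenue(self):
        """KPI of a unit = its own value plus its whole subtree."""
        return self.revenue + sum(c.kpi_revenue() for c in self.children)

company = OrgUnit("Company")
north = company.add(OrgUnit("Region North", revenue=100))
south = company.add(OrgUnit("Region South", revenue=80))
north.add(OrgUnit("Sales North", revenue=40))
```

Drilling down then just means moving the dashboard's focus from a node to one of its children and recomputing the same aggregate there.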

  • 187.
Nakajima, Masayuki
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
Current Topics in Computer Graphics: Report of SIGGRAPH20132013Inngår i: ITE Technical Report, ITE , 2013, s. 13-20Konferansepaper (Annet (populærvitenskap, debatt, mm))
    Abstract [en]

CG, human interface, multimedia, and virtual reality technologies are improving rapidly these days across many fields: in entertainment such as movies, TV, and games, and in visualization for engineering, science, art, etc. I report current topics from the 40th SIGGRAPH conference (SIGGRAPH 2013), held at the Anaheim Convention Center, California.

  • 188.
    Nakajima, Masayuki
    Högskolan på Gotland, Institutionen för speldesign, teknik och lärande.
    Intelligent CG Making Technology and Intelligent Media2013Inngår i: ITE Transactions on Media Technology and Applications, ISSN 2186-7364, Vol. 1, nr 1, s. 20-26Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

In this invited research paper, I will describe the Intelligent CG Making Technology (ICGMT) production methodology and Intelligent Media (IM). I will begin with an explanation of the key aspects of the ICGMT and a definition of IM. Thereafter I will explain the three approaches of the ICGMT. These approaches are the reuse of animation data, the generation of animation from text, and the generation of animation from natural spoken language. Finally, I will explain current approaches of the ICGMT under development by the Nakajima laboratory.

  • 189.
Nakajima, Masayuki
    et al.
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    Chang, Youngha
    Tokyo City University.
    Mukai, Nobuhiko
    Tokyo City University.
    Color Similarity Metric Based on Categorical Color Perception2013Inngår i: ITE journal, ISSN 1342-6893, Vol. 67, nr 3, s. 116-119Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

The calculation of color difference is one of the most basic techniques in image processing. For example, color clustering and edge detection are the first steps of most image-processing pipelines, and both are computed using a color difference formula. Although the CIELAB color difference formula is commonly used, the results obtained with it do not accord with human perception when the color difference becomes large. In this paper, we have performed psychophysical experiments on the similarity between colors that have large color differences. We have then analyzed the results and found that similarity is strongly constrained by the basic color categories. In accordance with this result, we propose a new color similarity metric based on the CIEDE2000 color difference formula and categorical color perception.
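As a rough illustration of the paper's idea that perceived similarity is constrained by basic color categories, the sketch below augments a color difference with a penalty for crossing a category boundary. It substitutes the simple CIE76 Euclidean Lab distance for the CIEDE2000 formula, and the category centroids and penalty value are invented placeholders, not the paper's experimental data.

```python
import math

# Hypothetical (L*, a*, b*) centroids for a few basic color categories;
# real categorical data would come from psychophysical experiments.
CATEGORY_CENTROIDS = {
    "red":   (53.0,  80.0,  67.0),
    "green": (88.0, -86.0,  83.0),
    "blue":  (32.0,  79.0, -108.0),
    "white": (100.0,  0.0,   0.0),
    "black": (0.0,    0.0,   0.0),
}

def delta_e76(c1, c2):
    """Plain Euclidean distance in CIELAB (CIE76) -- a simplified
    stand-in for the CIEDE2000 formula used in the paper."""
    return math.dist(c1, c2)

def category_of(lab):
    """Assign a Lab color to its nearest basic color category."""
    return min(CATEGORY_CENTROIDS,
               key=lambda name: delta_e76(lab, CATEGORY_CENTROIDS[name]))

def categorical_similarity(c1, c2, penalty=50.0):
    """Color difference augmented with a penalty when the two colors
    fall into different basic color categories."""
    d = delta_e76(c1, c2)
    if category_of(c1) != category_of(c2):
        d += penalty
    return d
```

The penalty term is what makes cross-category pairs rank as less similar than a raw distance alone would suggest.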

  • 190.
Nakajima, Masayuki
    et al.
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    Miyai, Ayumi
    Tokyo University.
    Yamaguchi, Yasushi
    Tokyo University.
How to Evaluate Learning Outcomes of Stereoscopic 3D Computer Graphics by Scene Rendering2013Inngår i: ITE Technical Report: Vol.37, No.45, ME2013-117, ITE , 2013, s. 21-24Konferansepaper (Annet (populærvitenskap, debatt, mm))
    Abstract [en]

Use of stereoscopic 3DCG (S3DCG) is increasing in movies, games, and animations. However, a method for objectively evaluating production capability has not been established. If production capability could be measured against defined criteria, a unified evaluation would be useful in schools; it would also be useful for human resource development and recruitment in industry. We therefore conducted practical tests in which subjects used 3DCG software to create an S3DCG scene. The practical tests were carried out before and after the subjects' learning. As a result, we were able to measure the improvement in each subject's capability after learning, as well as the differences in capability between subjects. In this paper, we report on the experimental method and results.

  • 191.
Nakajima, Masayuki
    et al.
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
Ono, Sumiaki
Andre, Alexis
    Chang, Youngha
    Tokyo City University.
    Automatic Generation of LEGO from the Polygonal data2013Inngår i: IWAIT2013 Nagoya, 2013, s. 262-267Konferansepaper (Fagfellevurdert)
    Abstract [en]

In this work, we propose a method that converts a 3D polygonal model into a corresponding LEGO brick assembly. For this, we first convert the polygonal model into a voxel model, and then convert it to the brick representation. The difficulty lies in guaranteeing the connections between bricks. To achieve this, we define a replacement priority, and the conversion from voxel to brick representation is done according to this priority. We show some experimental results, which show that our method can keep the connections and achieves a robust and optimized method for assembling LEGO building bricks.
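As a loose sketch of a priority-driven voxel-to-brick conversion for a single layer (the brick set, priority order, and greedy scan below are illustrative assumptions, not the paper's exact algorithm):

```python
# Brick footprints (width, depth) tried in priority order: larger bricks
# first, so the greedy pass prefers pieces that span more voxels. The
# priority list itself is an illustrative assumption, not the paper's.
BRICK_PRIORITY = [(2, 4), (4, 2), (2, 2), (1, 2), (2, 1), (1, 1)]

def bricks_for_layer(voxels):
    """Greedily cover one voxel layer (a set of (x, y) cells) with
    bricks, returning (x, y, w, d) placements."""
    remaining = set(voxels)
    placements = []
    while remaining:
        x, y = min(remaining)            # deterministic scan order
        for w, d in BRICK_PRIORITY:
            cells = {(x + i, y + j) for i in range(w) for j in range(d)}
            if cells <= remaining:       # brick fits entirely in layer
                placements.append((x, y, w, d))
                remaining -= cells
                break
    return placements
```

Because the 1×1 brick always fits, the loop is guaranteed to terminate with full coverage; the priority order decides how large the chosen pieces are.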

  • 192.
Nakajima, Masayuki
    et al.
    Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Historisk-filosofiska fakulteten, Institutionen för speldesign.
    Ono, Sumiaki
Chang, Youngha
    Tokyo City University.
    Andre, Alexis
    LEGO Builder: Automatic Generation of LEGO Assembly Manual from 3D Polygon Model2013Inngår i: ITE English Journal, ISSN 1342-6893, Vol. 1, nr 4, s. 354-360Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

The LEGO brick system is one of the most popular toys in the world. It can stimulate one's creativity while being lots of fun. It is however very hard for the naive user to assemble complex models without instructions. In this work, we propose a method that converts 3D polygonal models into LEGO brick building instructions automatically. The most important part of the conversion is that the connectivity between the bricks should be assured. For this, we introduce a graph structure named "legograph" that allows us to generate physically sound models that do not fall apart, by managing the connections between the bricks. We show some experimental results and evaluation results. These show that the 3D brick models generated following the instructions generated by our method do not fall apart and that one can learn how to efficiently build 3D structures from our instructions.
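A hypothetical connectivity check in the spirit of the legograph idea might use union-find over stud connections between bricks on adjacent layers. The brick encoding and overlap rule below are assumptions for illustration, not the paper's graph structure:

```python
class DSU:
    """Minimal union-find to track which bricks are connected."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def is_connected(bricks):
    """bricks: list of (layer, x, y, w, d). Two bricks connect when they
    sit on adjacent layers and their footprints overlap (share a stud)."""
    def cells(brick):
        _, x, y, w, d = brick
        return {(x + i, y + j) for i in range(w) for j in range(d)}
    dsu = DSU(len(bricks))
    for i, bi in enumerate(bricks):
        for j, bj in enumerate(bricks):
            if j <= i:
                continue
            if abs(bi[0] - bj[0]) == 1 and cells(bi) & cells(bj):
                dsu.union(i, j)
    return len({dsu.find(i) for i in range(len(bricks))}) <= 1
```

A model that fails this check would fall apart, which is exactly the condition the legograph is used to rule out.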

  • 193.
    Ngo, Tuan-Phong
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för datorteknik.
    Model Checking of Software Systems under Weak Memory Models2019Doktoravhandling, med artikler (Annet vitenskapelig)
    Abstract [en]

When a program is compiled and run on a modern architecture, different optimizations may be applied to gain efficiency. In particular, the access operations (e.g., reads and writes) to the shared memory may be performed in an out-of-order manner, i.e., in a different order than the one in which the operations were issued by the program. The reordering of memory access operations leads to efficient use of instruction pipelines and thus an improvement in program execution times. However, this gain in efficiency comes at a price: programs running on modern architectures may exhibit behaviors that are unexpected to programmers. Out-of-order execution has led to the invention of new program semantics, called weak memory models (WMMs). One crucial problem is to ensure the correctness of concurrent programs running under weak memory models.

The thesis proposes three techniques for reasoning about and analyzing concurrent programs running under WMMs. The first is a sound and complete analysis technique for finite-state programs running under the TSO semantics (Paper II). This technique is based on a novel and equivalent semantics for TSO, called the Dual TSO semantics, and on the use of the framework of well-structured transition systems. The second technique is an under-approximation technique that can be used to detect bugs under the POWER semantics (Paper III). This technique is based on bounding the number of contexts in an explored execution, where, in each context, there is only one active process. The third technique is also an under-approximation technique, based on systematic testing (a.k.a. stateless model checking). This approach has been used to develop an optimal and efficient systematic testing approach for concurrent programs running under the Release-Acquire semantics (Paper IV).

The thesis also considers the problem of effectively finding a minimal set of fences that guarantees the correctness of a concurrent program running under WMMs (Paper I). A fence (a.k.a. barrier) is an operation that can be inserted into the program to prohibit certain reorderings between operations issued before and after the fence. Since fences are expensive, it is crucial to automatically find a minimal set of fences that ensures program correctness. This thesis presents a method for automatic fence insertion in programs running under the TSO semantics that offers the best known trade-off between the efficiency and optimality of the algorithm. The technique is based on a novel notion of correctness, called persistence, that compares the behaviors of a program running under WMMs with those under the SC semantics.
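The out-of-order behavior described above can be made concrete with the classic store-buffering litmus test: under SC at least one load must observe the other thread's store, but TSO's per-thread store buffers permit both loads to read stale values. Below is a minimal simulation of one such TSO-legal schedule (an illustration of the phenomenon, not the thesis's formal semantics):

```python
# Classic store-buffering litmus test: T0 does x=1; r0=y and T1 does
# y=1; r1=x. Under SC at least one thread must read 1, but TSO store
# buffers allow both loads to read the stale initial 0.
def run_sb_under_tso():
    memory = {"x": 0, "y": 0}
    buffers = {0: [], 1: []}          # per-thread FIFO store buffers

    def store(tid, var, val):
        buffers[tid].append((var, val))   # buffered, not yet visible

    def load(tid, var):
        # a thread forwards from its own buffer first (newest write wins)
        for v, val in reversed(buffers[tid]):
            if v == var:
                return val
        return memory[var]

    # One TSO-legal schedule: both stores stay buffered past both loads.
    store(0, "x", 1)
    store(1, "y", 1)
    r0 = load(0, "y")                 # reads 0 from memory
    r1 = load(1, "x")                 # reads 0 from memory
    for tid in (0, 1):                # buffers eventually flush to memory
        for var, val in buffers[tid]:
            memory[var] = val
    return r0, r1
```

The outcome `(0, 0)` is exactly the kind of behavior that is unexpected under the interleaving (SC) intuition, and that fences are inserted to forbid.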

    Delarbeid
    1. The Best of Both Worlds: Trading efficiency and optimality in fence insertion for TSO
    Åpne denne publikasjonen i ny fane eller vindu >>The Best of Both Worlds: Trading efficiency and optimality in fence insertion for TSO
    2015 (engelsk)Inngår i: Programming Languages and Systems: ESOP 2015, Springer Berlin/Heidelberg, 2015, s. 308-332Konferansepaper, Publicerat paper (Fagfellevurdert)
    Abstract [en]

We present a method for automatic fence insertion in concurrent programs running under weak memory models that provides the best known trade-off between efficiency and optimality. On the one hand, the method can efficiently handle complex aspects of program behaviors such as unbounded buffers and large numbers of processes. On the other hand, it is able to find small sets of fences needed for ensuring correctness of the program. To this end, we propose a novel notion of correctness, called persistence, that compares the behavior of the program under the weak memory semantics with that under the classical interleaving (SC) semantics. We instantiate our framework for the Total Store Ordering (TSO) memory model, and give an algorithm that reduces the fence insertion problem under TSO to the reachability problem for programs running under SC. Furthermore, we provide an abstraction scheme that substantially increases scalability to large numbers of processes. Based on our method, we have implemented a tool and run it successfully on a wide range of benchmarks.

    sted, utgiver, år, opplag, sider
    Springer Berlin/Heidelberg, 2015
    Serie
    Lecture Notes in Computer Science, ISSN 0302-9743 ; 9032
    Emneord
    weak memory, correctness, verification, TSO, concurrent program
    HSV kategori
    Forskningsprogram
    Datavetenskap
    Identifikatorer
    urn:nbn:se:uu:diva-253645 (URN)10.1007/978-3-662-46669-8_13 (DOI)000361751400013 ()978-3-662-46668-1 (ISBN)
    Konferanse
    24th European Symposium on Programming, ESOP 2015, April 11–18, London, UK
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2015-05-29 Laget: 2015-05-29 Sist oppdatert: 2018-11-21
    2. A load-buffer semantics for total store ordering
    Åpne denne publikasjonen i ny fane eller vindu >>A load-buffer semantics for total store ordering
    2018 (engelsk)Inngår i: Logical Methods in Computer Science, ISSN 1860-5974, E-ISSN 1860-5974, Vol. 14, nr 1, artikkel-id 9Artikkel i tidsskrift (Fagfellevurdert) Published
    Abstract [en]

We address the problem of verifying safety properties of concurrent programs running over the Total Store Order (TSO) memory model. Known decision procedures for this model are based on complex encodings of store buffers as lossy channels. These procedures assume that the number of processes is fixed. In general, however, it is important to prove the correctness of a system or algorithm parametrically, for an arbitrarily large number of processes.

In this paper, we introduce an alternative (yet equivalent) semantics to the classical one for TSO that is more amenable to efficient algorithmic verification and to extension to parametric verification. For that, we adopt a dual view where load buffers are used instead of store buffers. The flow of information is now from the memory to the load buffers. We show that this new semantics (1) drastically simplifies safety analysis under TSO, (2) yields a spectacular gain in efficiency and scalability compared to existing procedures, and (3) extends easily to the parametric case, which allows us to obtain a new decidability result and, more importantly, a verification algorithm that is more general and more efficient in practice than the one for bounded instances.

    Emneord
    Verification, TSO, concurrent program, safety property, well-structured transition system
    HSV kategori
    Forskningsprogram
    Datavetenskap
    Identifikatorer
    urn:nbn:se:uu:diva-337278 (URN)000426512000008 ()
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2018-01-23 Laget: 2017-12-21 Sist oppdatert: 2018-11-21
    3. Context-bounded analysis for POWER
    Åpne denne publikasjonen i ny fane eller vindu >>Context-bounded analysis for POWER
    2017 (engelsk)Inngår i: Tools and Algorithms for the Construction and Analysis of Systems: Part II, Springer, 2017, s. 56-74Konferansepaper, Publicerat paper (Fagfellevurdert)
    Abstract [en]

    We propose an under-approximate reachability analysis algorithm for programs running under the POWER memory model, in the spirit of the work on context-bounded analysis initiated by Qadeer et al. in 2005 for detecting bugs in concurrent programs (supposed to be running under the classical SC model). To that end, we first introduce a new notion of context-bounding that is suitable for reasoning about computations under POWER, which generalizes the one defined by Atig et al. in 2011 for the TSO memory model. Then, we provide a polynomial size reduction of the context-bounded state reachability problem under POWER to the same problem under SC: Given an input concurrent program P, our method produces a concurrent program P' such that, for a fixed number of context switches, running P' under SC yields the same set of reachable states as running P under POWER. The generated program P' contains the same number of processes as P and operates on the same data domain. By leveraging the standard model checker CBMC, we have implemented a prototype tool and applied it on a set of benchmarks, showing the feasibility of our approach.

    sted, utgiver, år, opplag, sider
    Springer, 2017
    Serie
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10206
    Emneord
    POWER, weak memory model, under approximation, translation, concurrent program, testing
    HSV kategori
    Forskningsprogram
    Datavetenskap
    Identifikatorer
    urn:nbn:se:uu:diva-314901 (URN)10.1007/978-3-662-54580-5_4 (DOI)000440733400004 ()978-3-662-54579-9 (ISBN)
    Konferanse
    23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2017, April 22–29, Uppsala, Sweden
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2017-03-31 Laget: 2017-02-07 Sist oppdatert: 2018-11-21bibliografisk kontrollert
    4. Optimal Stateless Model Checking under the Release-Acquire Semantics
    Åpne denne publikasjonen i ny fane eller vindu >>Optimal Stateless Model Checking under the Release-Acquire Semantics
    2018 (engelsk)Inngår i: SPLASH OOPSLA 2018, Boston, Nov 4-9, 2018, ACM Digital Library, 2018Konferansepaper, Publicerat paper (Fagfellevurdert)
    Abstract [en]

    We present a framework for efficient application of stateless model checking (SMC) to concurrent programs running under the Release-Acquire (RA) fragment of the C/C++11 memory model. Our approach is based on exploring the possible program orders, which define the order in which instructions of a thread are executed, and read-from relations, which define how reads obtain their values from writes. This is in contrast to previous approaches, which in addition explore the possible coherence orders, i.e., orderings between conflicting writes. Since unexpected test results such as program crashes or assertion violations depend only on the read-from relation, we avoid a potentially large source of redundancy. Our framework is based on a novel technique for determining whether a particular read-from relation is feasible under the RA semantics. We define an SMC algorithm which is provably optimal in the sense that it explores each program order and read-from relation exactly once. This optimality result is strictly stronger than previous analogous optimality results, which also take coherence order into account. We have implemented our framework in the tool Tracer. Experiments show that Tracer can be significantly faster than state-of-the-art tools that can handle the RA semantics.

    sted, utgiver, år, opplag, sider
    ACM Digital Library, 2018
    Emneord
    Software model checking, C/C++11, Release-Acquire, Concurrent program
    HSV kategori
    Forskningsprogram
    Datavetenskap
    Identifikatorer
    urn:nbn:se:uu:diva-358241 (URN)
    Konferanse
    SPLASH OOPSLA 2018
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2018-08-26 Laget: 2018-08-26 Sist oppdatert: 2019-01-09bibliografisk kontrollert
  • 194.
    Ngo, Tuan-Phong
    et al.
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Abdulla, Parosh
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Jonsson, Bengt
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi.
    Atig, Mohamed Faouzi
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Optimal Stateless Model Checking under the Release-Acquire Semantics2018Inngår i: SPLASH OOPSLA 2018, Boston, Nov 4-9, 2018, ACM Digital Library, 2018Konferansepaper (Fagfellevurdert)
    Abstract [en]

    We present a framework for efficient application of stateless model checking (SMC) to concurrent programs running under the Release-Acquire (RA) fragment of the C/C++11 memory model. Our approach is based on exploring the possible program orders, which define the order in which instructions of a thread are executed, and read-from relations, which define how reads obtain their values from writes. This is in contrast to previous approaches, which in addition explore the possible coherence orders, i.e., orderings between conflicting writes. Since unexpected test results such as program crashes or assertion violations depend only on the read-from relation, we avoid a potentially large source of redundancy. Our framework is based on a novel technique for determining whether a particular read-from relation is feasible under the RA semantics. We define an SMC algorithm which is provably optimal in the sense that it explores each program order and read-from relation exactly once. This optimality result is strictly stronger than previous analogous optimality results, which also take coherence order into account. We have implemented our framework in the tool Tracer. Experiments show that Tracer can be significantly faster than state-of-the-art tools that can handle the RA semantics.
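The redundancy argument can be illustrated on the store-buffering litmus test: several interleavings induce the same read-from relation, so exploring once per read-from relation rather than once per interleaving saves work. The small enumeration below uses SC schedules for simplicity (the program encoding and event names are illustrative, not the tool's input format):

```python
from itertools import permutations

# Store-buffering litmus test, in program order per thread:
# T0: x = 1; r0 = y        T1: y = 1; r1 = x
T0 = [("w", "x", 1), ("r", "y", "r0")]
T1 = [("w", "y", 1), ("r", "x", "r1")]

def interleavings():
    """All SC schedules: permutations of events respecting program order."""
    events = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (thread, index)
    for perm in permutations(events):
        if perm.index((0, 0)) < perm.index((0, 1)) and \
           perm.index((1, 0)) < perm.index((1, 1)):
            yield perm

def execute(schedule):
    """Run one schedule; the final register values identify which write
    each read reads from, i.e., the read-from relation."""
    mem = {"x": 0, "y": 0}
    regs = {}
    progs = {0: T0, 1: T1}
    for tid, idx in schedule:
        op = progs[tid][idx]
        if op[0] == "w":
            mem[op[1]] = op[2]
        else:
            regs[op[2]] = mem[op[1]]
    return regs["r0"], regs["r1"]

schedules = list(interleavings())
outcomes = {execute(s) for s in schedules}
# 6 interleavings collapse to 3 read-from classes under SC -- the
# redundancy that an rf-based SMC exploration avoids.
```

For this tiny program the saving is 6 versus 3; for realistic programs the gap between interleavings and read-from relations grows combinatorially.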

  • 195.
    Nikoleris, Nikos
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för datorteknik. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Efficient Memory Modeling During Simulation and Native Execution2019Doktoravhandling, med artikler (Annet vitenskapelig)
    Abstract [en]

    Application performance on computer processors depends on a number of complex architectural and microarchitectural design decisions. Consequently, computer architects rely on performance modeling to improve future processors without building prototypes. This thesis focuses on performance modeling and proposes methods that quantify the impact of the memory system on application performance.

    Detailed architectural simulation, a common approach to performance modeling, can be five orders of magnitude slower than execution on the actual processor. At this rate, simulating realistic workloads requires years of CPU time. Prior research uses sampling to speed up simulation. Using sampled simulation, only a number of small but representative portions of the workload are evaluated in detail. To fully exploit the speed potential of sampled simulation, the simulation method has to efficiently reconstruct the architectural and microarchitectural state prior to the simulation samples. Practical approaches to sampled simulation use either functional simulation at the expense of performance or checkpoints at the expense of flexibility. This thesis proposes three approaches that use statistical cache modeling to efficiently address the problem of cache warm up and speed up sampled simulation, without compromising flexibility. The statistical cache model uses sparse memory reuse information obtained with native techniques to model the performance of the cache. The proposed sampled simulation framework evaluates workloads 150 times faster than approaches that use functional simulation to warm up the cache.

    Other approaches to performance modeling use analytical models based on data obtained from execution on native hardware. These native techniques allow for better understanding of the performance bottlenecks on existing hardware. Efficient resource utilization in modern multicore processors is necessary to exploit their peak performance. This thesis proposes native methods that characterize shared resource utilization in modern multicores. These methods quantify the impact of cache sharing and off-chip memory sharing on overall application performance. Additionally, they can quantify scalability bottlenecks for data-parallel, symmetric workloads.
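The statistical cache models referred to above build on reuse (stack) distance information. As a minimal illustration of the underlying principle (a full trace is processed here, whereas the thesis works from sparse, sampled reuse information, and the quadratic scan is for clarity, not efficiency):

```python
# A fully associative LRU cache misses exactly when a reference's stack
# (reuse) distance -- the number of distinct addresses touched since the
# previous access to the same address -- is at least the cache capacity.
def stack_distances(trace):
    """Yield the stack distance of each access (None for cold misses)."""
    stack = []                      # most recently used at the end
    for addr in trace:
        if addr in stack:
            dist = len(stack) - 1 - stack.index(addr)
            stack.remove(addr)
            yield dist
        else:
            yield None              # first touch: compulsory miss
        stack.append(addr)

def miss_ratio(trace, cache_lines):
    """Estimate the LRU miss ratio for a cache of `cache_lines` lines."""
    dists = list(stack_distances(trace))
    misses = sum(1 for d in dists if d is None or d >= cache_lines)
    return misses / len(dists)
```

Because the distances are hardware independent, the same profile can be evaluated against any cache size, which is what lets one profiling pass serve many simulated configurations.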

    Delarbeid
    1. Extending statistical cache models to support detailed pipeline simulators
    Åpne denne publikasjonen i ny fane eller vindu >>Extending statistical cache models to support detailed pipeline simulators
    2014 (engelsk)Inngår i: 2014 IEEE International Symposium On Performance Analysis Of Systems And Software (Ispass), IEEE Computer Society, 2014, s. 86-95Konferansepaper, Publicerat paper (Fagfellevurdert)
    Abstract [en]

Simulators are widely used in computer architecture research. While detailed cycle-accurate simulations provide useful insights, studies using modern workloads typically require days or weeks. Evaluating many design points only exacerbates the simulation overhead. Recent works propose methods with good accuracy that reduce the simulation overhead either by sampling the execution (e.g., SMARTS and SimPoint) or by using fast analytical models of the simulated designs (e.g., Interval Simulation). While these techniques significantly reduce the simulation overhead, modeling processor components with large state, such as the last-level cache, requires costly simulation to warm them up. Statistical simulation methods, such as SMARTS, report that the warm-up overhead accounts for 99% of the simulation overhead, while only 1% of the time is spent simulating the target design. This paper proposes WarmSim, a method that eliminates the need to warm up the cache. WarmSim builds on top of a statistical cache modeling technique and extends it to accurately model not only the miss ratio but also the outcome of every cache request. WarmSim uses as input an application's memory reuse information, which is hardware independent. Therefore, different cache configurations can be simulated using the same input data. We demonstrate that this approach can be used to estimate the CPI of the SPEC CPU2006 benchmarks with an average error of 1.77%, reducing the overhead compared to a simulation with a 10M-instruction warm-up by a factor of 50x.

    sted, utgiver, år, opplag, sider
    IEEE Computer Society, 2014
    Serie
    IEEE International Symposium on Performance Analysis of Systems and Software-ISPASS
    HSV kategori
    Identifikatorer
    urn:nbn:se:uu:diva-224221 (URN)10.1109/ISPASS.2014.6844464 (DOI)000364102000010 ()978-1-4799-3604-5 (ISBN)
    Konferanse
    ISPASS 2014, March 23-25, Monterey, CA
    Prosjekter
    UPMARC
    Tilgjengelig fra: 2014-05-06 Laget: 2014-05-06 Sist oppdatert: 2018-12-14bibliografisk kontrollert
    2. CoolSim: Statistical Techniques to Replace Cache Warming with Efficient, Virtualized Profiling
    Åpne denne publikasjonen i ny fane eller vindu >>CoolSim: Statistical Techniques to Replace Cache Warming with Efficient, Virtualized Profiling
    2016 (engelsk)Inngår i: Proceedings Of 2016 International Conference On Embedded Computer Systems: Architectures, Modeling And Simulation (Samos) / [ed] Najjar, W Gerstlauer, A, IEEE , 2016, s. 106-115Konferansepaper, Publicerat paper (Fagfellevurdert)
    Abstract [en]

Simulation is an important part of the evaluation of next-generation computing systems. Detailed, cycle-accurate simulation, however, can be very slow when evaluating realistic workloads on modern microarchitectures. Sampled simulation (e.g., SMARTS and SimPoint) improves simulation performance by an order of magnitude or more through the reduction of large workloads into a small but representative sample. Additionally, the execution state just prior to a simulation sample can be stored into checkpoints, allowing for fast restoration and evaluation. Unfortunately, changes in software, architecture or fundamental pieces of the microarchitecture (e.g., hardware-software co-design) require checkpoint regeneration. The end result for co-design degenerates to creating checkpoints for each modification, a task checkpointing was designed to eliminate. Therefore, a solution is needed that allows for fast and accurate simulation, without the need for checkpoints. Virtualized fast-forwarding (VFF), an alternative to using checkpoints, allows for execution at near-native speed between simulation points. Warming the micro-architectural state prior to each simulation point, however, requires functional simulation, a costly operation for large caches (e.g., 8 MB). Simulating future systems with caches of many MBs can require warming of billions of instructions, dominating simulation time. This paper proposes CoolSim, an efficient simulation framework that eliminates cache warming. CoolSim uses VFF to advance between simulation points while collecting sparse memory reuse information (MRI). The MRI is collected more than an order of magnitude faster than functional simulation. At the simulation point, detailed simulation with a statistical cache model is used to evaluate the design. The previously acquired MRI is used to estimate whether each memory request hits in the cache. The MRI is an architecturally independent metric, and a single profile can be used in simulations of any size cache. We describe a prototype implementation of CoolSim based on KVM and gem5 running 19x faster than the state-of-the-art sampled simulation, while it estimates the CPI of the SPEC CPU2006 benchmarks with 3.62% error on average, across a wide range of cache sizes.

    sted, utgiver, år, opplag, sider
    IEEE, 2016
    HSV kategori
    Identifikatorer
    urn:nbn:se:uu:diva-322061 (URN)000399143000015 ()9781509030767 (ISBN)
    Konferanse
    International Conference on Embedded Computer Systems - Architectures, Modeling and Simulation (SAMOS), JUL 17-21, 2016, Samos, GREECE
    Forskningsfinansiär
    Swedish Foundation for Strategic Research EU, FP7, Seventh Framework Programme, 610490
    Tilgjengelig fra: 2017-05-16 Laget: 2017-05-16 Sist oppdatert: 2018-12-14bibliografisk kontrollert
    3. Delorean: Virtualized Directed Profiling for Cache Modeling in Sampled Simulation
    Åpne denne publikasjonen i ny fane eller vindu >>Delorean: Virtualized Directed Profiling for Cache Modeling in Sampled Simulation
    2018 (engelsk)Rapport (Annet vitenskapelig)
    Abstract [en]

    Current practice for accurate and efficient simulation (e.g., SMARTS and Simpoint) makes use of sampling to significantly reduce the time needed to evaluate new research ideas. By evaluating a small but representative portion of the original application, sampling can allow for both fast and accurate performance analysis. However, as cache sizes of modern architectures grow, simulation time is dominated by warming microarchitectural state and not by detailed simulation, reducing overall simulation efficiency. While checkpoints can significantly reduce cache warming, improving efficiency, they limit the flexibility of the system under evaluation, requiring new checkpoints for software updates (such as changes to the compiler and compiler flags) and many types of hardware modifications. An ideal solution would allow for accurate cache modeling for each simulation run without the need to generate rigid checkpointing data a priori.

    Enabling this new direction for fast and flexible simulation requires a combination of (1) a methodology that allows for hardware and software flexibility and (2) the ability to quickly and accurately model arbitrarily-sized caches. Current approaches that rely on checkpointing or statistical cache modeling require rigid, up-front state to be collected which needs to be amortized over a large number of simulation runs. These earlier methodologies are insufficient for our goals for improved flexibility. In contrast, our proposed methodology, Delorean, outlines a unique solution to this problem. The Delorean simulation methodology enables both flexibility and accuracy by quickly generating a targeted cache model for the next detailed region on the fly without the need for up-front simulation or modeling. More specifically, we propose a new, more accurate statistical cache modeling method that takes advantage of hardware virtualization to precisely determine the memory regions accessed and to minimize the time needed for data collection while maintaining accuracy.

Delorean uses a multi-pass approach to understand the memory regions accessed by the next, upcoming detailed region. Our methodology collects the entire set of key memory accesses and, through fast virtualization techniques, progressively scans larger, earlier regions to learn more about these key accesses in an efficient way. Using these techniques, we demonstrate that Delorean allows for the fast evaluation of systems and their software through the generation of accurate cache models on the fly. Delorean outperforms previous proposals by an order of magnitude, with a simulation speed of 150 MIPS and a similar average CPI error (below 4%).

    Publisher
    p. 12
    Series
    Technical report / Department of Information Technology, Uppsala University, ISSN 1404-3203
    HSV category
    Research programme
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-369320 (URN)
    Available from: 2018-12-12 Created: 2018-12-12 Last updated: 2019-01-08 Bibliographically approved
    4. Cache Pirating: Measuring the Curse of the Shared Cache
    2011 (English). In: Proc. 40th International Conference on Parallel Processing, IEEE Computer Society, 2011, pp. 165-175. Conference paper, published paper (Refereed)
    Place, publisher, year, edition, pages
    IEEE Computer Society, 2011
    HSV category
    Identifiers
    urn:nbn:se:uu:diva-181254 (URN), 10.1109/ICPP.2011.15 (DOI), 978-1-4577-1336-1 (ISBN)
    Conference
    ICPP 2011
    Projects
    UPMARC, CoDeR-MP
    Available from: 2011-10-17 Created: 2012-09-20 Last updated: 2018-12-14 Bibliographically approved
    5. Bandwidth Bandit: Quantitative Characterization of Memory Contention
    2013 (English). In: Proc. 11th International Symposium on Code Generation and Optimization: CGO 2013, IEEE Computer Society, 2013, pp. 99-108. Conference paper, published paper (Refereed)
    Abstract [en]

    On multicore processors, co-executing applications compete for shared resources, such as cache capacity and memory bandwidth. This leads to suboptimal resource allocation and can cause substantial performance loss, which makes it important to effectively manage these shared resources. This, however, requires insights into how the applications are impacted by such resource sharing. While there are several methods to analyze the performance impact of cache contention, less attention has been paid to general, quantitative methods for analyzing the impact of contention for memory bandwidth. To this end, we introduce the Bandwidth Bandit, a general, quantitative profiling method for analyzing the performance impact of contention for memory bandwidth on multicore machines. The profiling data captured by the Bandwidth Bandit is presented in a bandwidth graph. This graph accurately captures the measured application's performance as a function of its available memory bandwidth, and enables us to determine how much the application suffers when its available bandwidth is reduced. To demonstrate the value of this data, we present a case study in which we use the bandwidth graph to analyze the performance impact of memory contention when co-running multiple instances of a single-threaded application.
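    The bandwidth graph described above lends itself to a small sketch. The following is our own toy reconstruction, not the paper's tooling: in the real method the (bandwidth, performance) samples come from co-running the target with bandit threads on actual hardware, whereas here they are simply passed in as data, and all function names are ours.

```python
def bandwidth_graph(samples):
    """samples: (available_bandwidth_gbps, performance) pairs measured
    while bandit threads steal increasing amounts of bandwidth."""
    return sorted(samples)

def performance_at(graph, bw):
    """Linearly interpolate the target's performance at bandwidth bw."""
    if bw <= graph[0][0]:
        return graph[0][1]
    if bw >= graph[-1][0]:
        return graph[-1][1]
    for (x0, y0), (x1, y1) in zip(graph, graph[1:]):
        if x0 <= bw <= x1:
            return y0 + (y1 - y0) * (bw - x0) / (x1 - x0)

def slowdown(graph, bw):
    """How much the application suffers when its available bandwidth
    drops from the uncontended maximum to bw."""
    best = graph[-1][1]  # performance with no bandit interference
    return best / performance_at(graph, bw)
```

    For example, an application whose performance halves when its bandwidth is squeezed to 2 GB/s has a slowdown of 2.0 at that point of its bandwidth graph.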

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2013
    Keywords
    bandwidth, memory, caches
    HSV category
    Research programme
    Computer Science
    Identifiers
    urn:nbn:se:uu:diva-194101 (URN), 10.1109/CGO.2013.6494987 (DOI), 000318700200010 (ISI), 978-1-4673-5524-7 (ISBN)
    Conference
    CGO 2013, 23-27 February, Shenzhen, China
    Projects
    UPMARC
    Research funder
    Swedish Research Council
    Available from: 2013-04-18 Created: 2013-02-08 Last updated: 2018-12-14 Bibliographically approved
    6. A software based profiling method for obtaining speedup stacks on commodity multi-cores
    2014 (English). In: 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS): ISPASS 2014, IEEE Computer Society, 2014, pp. 148-157. Conference paper, published paper (Refereed)
    Abstract [en]

    A key goodness metric of multi-threaded programs is how their execution times scale when increasing the number of threads. However, several bottlenecks can limit the scalability of a multi-threaded program, e.g., contention for shared cache capacity and off-chip memory bandwidth, as well as synchronization overheads. In order to improve the scalability of a multi-threaded program, it is vital to be able to quantify how the program is impacted by these scalability bottlenecks. We present a software profiling method for obtaining speedup stacks. A speedup stack reports how much each scalability bottleneck limits the scalability of a multi-threaded program. It thereby quantifies how much scalability can be improved by eliminating a given bottleneck. A software developer can use this information to determine what optimizations are most likely to improve scalability, while a computer architect can use it to analyze the resource demands of emerging workloads. The proposed method profiles the program on real commodity multi-cores (i.e., no simulations required) using existing performance counters. Consequently, the obtained speedup stacks accurately account for all idiosyncrasies of the machine on which the program is profiled. While the main contribution of this paper is the profiling method for obtaining speedup stacks, we present several examples of how speedup stacks can be used to analyze the resource requirements of multi-threaded programs. Furthermore, we discuss how their scalability can be improved by both software developers and computer architects.
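    The idea of a speedup stack can be illustrated with a short sketch. This is our own simplified reconstruction, not the paper's method: the per-bottleneck lost-cycle counts would come from hardware performance counters, the proportional attribution of the speedup gap is a deliberate simplification, and all names are ours.

```python
def speedup_stack(t1, tn, n, lost_cycles):
    """t1: single-threaded runtime; tn: runtime on n threads;
    lost_cycles: cycles attributed to each bottleneck (in the real
    method, read from performance counters). Returns the achieved
    speedup plus one component per bottleneck; the components sum to
    the gap between the achieved and the ideal (linear) speedup n."""
    achieved = t1 / tn
    gap = n - achieved
    total = sum(lost_cycles.values())
    stack = {"achieved": achieved}
    for cause, cycles in lost_cycles.items():
        # proportional attribution of the gap to each bottleneck
        stack[cause] = gap * cycles / total if total else 0.0
    return stack
```

    A program that achieves 4x speedup on 8 threads, with cache contention responsible for three quarters of the lost cycles, would see a stack of 4.0 achieved + 3.0 cache + 1.0 bandwidth, telling the developer where to optimize first.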

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2014
    Series
    IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
    HSV category
    Identifiers
    urn:nbn:se:uu:diva-224230 (URN), 10.1109/ISPASS.2014.6844479 (DOI), 000364102000025 (ISI), 978-1-4799-3604-5 (ISBN)
    Conference
    ISPASS 2014, March 23-25, Monterey, CA
    Projects
    UPMARC
    Available from: 2014-05-06 Created: 2014-05-06 Last updated: 2018-12-14 Bibliographically approved
  • 196.
    Nikoleris, Nikos
    et al.
    Arm Research, Cambridge UK.
    Hagersten, Erik
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorarkitektur och datorkommunikation. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    Carlson, Trevor E.
    Department of Computer Science, National University of Singapore.
    Delorean: Virtualized Directed Profiling for Cache Modeling in Sampled Simulation, 2018. Report (Other academic)
    Abstract [en]

    Current practice for accurate and efficient simulation (e.g., SMARTS and Simpoint) makes use of sampling to significantly reduce the time needed to evaluate new research ideas. By evaluating a small but representative portion of the original application, sampling can allow for both fast and accurate performance analysis. However, as cache sizes of modern architectures grow, simulation time is dominated by warming microarchitectural state and not by detailed simulation, reducing overall simulation efficiency. While checkpoints can significantly reduce cache warming, improving efficiency, they limit the flexibility of the system under evaluation, requiring new checkpoints for software updates (such as changes to the compiler and compiler flags) and many types of hardware modifications. An ideal solution would allow for accurate cache modeling for each simulation run without the need to generate rigid checkpointing data a priori.

    Enabling this new direction for fast and flexible simulation requires a combination of (1) a methodology that allows for hardware and software flexibility and (2) the ability to quickly and accurately model arbitrarily-sized caches. Current approaches that rely on checkpointing or statistical cache modeling require rigid, up-front state to be collected which needs to be amortized over a large number of simulation runs. These earlier methodologies are insufficient for our goals for improved flexibility. In contrast, our proposed methodology, Delorean, outlines a unique solution to this problem. The Delorean simulation methodology enables both flexibility and accuracy by quickly generating a targeted cache model for the next detailed region on the fly without the need for up-front simulation or modeling. More specifically, we propose a new, more accurate statistical cache modeling method that takes advantage of hardware virtualization to precisely determine the memory regions accessed and to minimize the time needed for data collection while maintaining accuracy.

    Delorean uses a multi-pass approach to determine the memory regions accessed by the upcoming detailed region. Our methodology collects the entire set of key memory accesses and, through fast virtualization techniques, progressively scans larger, earlier regions to learn more about these key accesses in an efficient way. Using these techniques, we demonstrate that Delorean allows for the fast evaluation of systems and their software through the generation of accurate cache models on the fly. Delorean outperforms previous proposals by an order of magnitude, with a simulation speed of 150 MIPS and a similar average CPI error (below 4%).

  • 197. Noda, Claro
    et al.
    Prabh, Shashi
    Alves, Mario
    Voigt, Thiemo
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datorteknik.
    On Packet Size and Error Correction Optimisations in Low-Power Wireless Networks, 2013. In: IEEE International Conference on Sensing, Communication and Networking (IEEE SECON), 2013. Conference paper (Refereed)
  • 198. Noda, Claro
    et al.
    Prabh, Shashi
    Boano, Carlo Alberto
    Voigt, Thiemo
    Alves, Mário
    Poster abstract: A channel quality metric for interference-aware wireless sensor networks, 2011. In: IPSN, 2011, pp. 167-168. Conference paper (Refereed)
  • 199.
    Norgren, Magnus
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Tekniska sektionen, Institutionen för teknikvetenskaper.
    Wishbone compliant smart Pulse-Width Modulation (PWM) IP: Uppsala Universitet - ÅAC Mictrotec AB, 2012. Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
  • 200.
    Olofsson, Simon
    Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för systemteknik.
    Probabilistic Feature Learning Using Gaussian Process Auto-Encoders, 2016. Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The focus of this report is the problem of probabilistic dimensionality reduction and feature learning from high-dimensional data (images). Extracting features and being able to learn from high-dimensional sensory data is an important ability in a general-purpose intelligent system. Dimensionality reduction and feature learning have in the past primarily been done using (convolutional) neural networks or linear mappings, e.g. in principal component analysis. However, these methods do not yield any error bars in the features or predictions. In this report, theory and a model for how dimensionality reduction and feature learning can be done using Gaussian process auto-encoders (GP-AEs) are presented. By using GP-AEs, the variance in the feature space is computed, thus yielding a measure of the uncertainty in the constructed model. This measure is useful for avoiding over-confident system predictions. Results show that GP-AEs are capable of dimensionality reduction and feature learning, but that they suffer from scalability issues and weak gradient signal propagation. Reconstruction quality is not as good as that achieved by state-of-the-art methods, and training the model takes a very long time. The model nevertheless has potential, since it can scale to large inputs.
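    The key property the thesis builds on (a Gaussian process prediction carries a variance, i.e. an error bar, unlike a plain neural-network mapping) can be shown in a minimal, self-contained sketch. This is a generic GP regression example of ours, not the thesis's GP-AE model: two training points and a hand-inverted 2x2 kernel matrix keep it dependency-free.

```python
import math

def rbf(a, b, ell=1.0):
    """Squared-exponential (RBF) kernel."""
    return math.exp(-0.5 * (a - b) ** 2 / ell ** 2)

def gp_predict(xs, ys, x_star, noise=1e-6):
    """GP posterior mean and variance at x_star for exactly two
    training points, with the 2x2 kernel matrix inverted by hand."""
    (x0, x1), (y0, y1) = xs, ys
    a = rbf(x0, x0) + noise
    b = rbf(x0, x1)
    d = rbf(x1, x1) + noise
    det = a * d - b * b
    # inverse of [[a, b], [b, d]]
    i00, i01, i11 = d / det, -b / det, a / det
    k0, k1 = rbf(x_star, x0), rbf(x_star, x1)
    # posterior mean: k^T K^-1 y
    mean = k0 * (i00 * y0 + i01 * y1) + k1 * (i01 * y0 + i11 * y1)
    # posterior variance: k(x*, x*) - k^T K^-1 k  (the "error bar")
    kKk = k0 * (i00 * k0 + i01 * k1) + k1 * (i01 * k0 + i11 * k1)
    return mean, rbf(x_star, x_star) - kKk
```

    Near a training point the variance collapses toward zero, while far away it approaches the prior variance of 1 -- exactly the uncertainty signal the thesis uses to flag unreliable features.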
