Publications from Uppsala University
Publications (10 of 193)
Huber, N. & Wang, Y. (2025). An Encoding of Interaction Nets in OCaml. In: Jörg Endrullis; Dominik Grzelak; Tobias Heindel; Jens Kosiol (Ed.), Proceedings of the Fourteenth and Fifteenth International Workshop on Graph Computation Models. Paper presented at 15th International Workshop on Graph Computation Models (GCM), July 9, 2024, Enschede, Netherlands (pp. 1-16). Open Publishing Association (417)
2025 (English) In: Proceedings of the Fourteenth and Fifteenth International Workshop on Graph Computation Models / [ed] Jörg Endrullis; Dominik Grzelak; Tobias Heindel; Jens Kosiol, Open Publishing Association, 2025, no 417, p. 1-16. Conference paper, Published paper (Refereed)
Abstract [en]

Interaction nets constitute a visual programming language grounded in graph transformation. Owing to their distinctive properties, they inherently facilitate parallelism in the rewriting step. This paper showcases a simple and concise approach to encoding interaction nets within the programming language OCaml, emphasising correctness guarantees. To achieve this objective, we encode not only the interaction net primitives, but also Lafont's original type system.

Place, publisher, year, edition, pages
Open Publishing Association, 2025
Series
Electronic proceedings in theoretical computer science (EPTCS), E-ISSN 2075-2180 ; 417
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-557888 (URN); 10.4204/EPTCS.417.1 (DOI); 001478206100001 (); 2-s2.0-105001918210 (Scopus ID)
Conference
15th International Workshop on Graph Computation Models (GCM), July 9, 2024, Enschede, Netherlands
Funder
EU, European Research Council
Available from: 2025-06-03 Created: 2025-06-03 Last updated: 2025-06-03 Bibliographically approved
Luo, X., Jiang, X., Tang, Y., Liang, O., Guan, N. & Wang, Y. (2025). Analysis and optimization of communication delay in multi-subscriber environments of ROS 2. Journal of systems architecture, 164, Article ID 103428.
2025 (English) In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 164, article id 103428. Article in journal (Refereed) Published
Abstract [en]

ROS 2 is one of the most popular robotic software development frameworks. ROS 2 employs a publish-subscribe paradigm for data exchange, which can effectively address the diverse real-time requirements on the publisher side through various policies provided in the underlying communication layer. However, it remains unclear whether it adequately meets the various real-time requirements of subscribers. In this paper, we formally analyze the communication delay when a single message is published to multiple subscribers. In particular, we have identified a problem in the transmission of a message to multiple subscribers, where a subscriber with the highest priority may receive its message as late as the subscriber with the lowest priority. We propose policies to solve this problem and optimize the communication delay. Comprehensive experiments are conducted to validate and evaluate the proposed methods.

Place, publisher, year, edition, pages
Elsevier, 2025
Keywords
ROS 2, DDS, Multi-subscriber, Communication delay
National Category
Computer Sciences; Communication Systems
Identifiers
urn:nbn:se:uu:diva-557101 (URN); 10.1016/j.sysarc.2025.103428 (DOI); 001486557100001 (); 2-s2.0-105004177763 (Scopus ID)
Available from: 2025-05-22 Created: 2025-05-22 Last updated: 2025-05-22 Bibliographically approved
Graf, S., Jonsson, B., Khodabandeloo, B., Huang, C., Huber, N., Rümmer, P. & Wang, Y. (2025). Timing is All You Need. In: Mike Hinchey; Bernhard Steffen (Ed.), The Combined Power of Research, Education, and Dissemination: Essays Dedicated to Tiziana Margaria on the Occasion of Her 60th Birthday (pp. 259-279). Cham: Springer
2025 (English) In: The Combined Power of Research, Education, and Dissemination: Essays Dedicated to Tiziana Margaria on the Occasion of Her 60th Birthday / [ed] Mike Hinchey; Bernhard Steffen, Cham: Springer, 2025, p. 259-279. Chapter in book (Refereed)
Abstract [en]

Deterministic models play a crucial role in computer system development, enabling the simulation and verification of system behaviors before Model-Driven Development (MDD) tools transform and compile these models into final implementations. Ensuring determinism is essential to guarantee that the behaviors of the implemented system maintain the properties analyzed in the models.

This paper investigates the semantics of deterministic models for data-flow networks, where systems consist of components that compute functions on streams. While Kahn Process Networks (KPN) serve as a well-established semantic theory for time-insensitive deterministic systems, they prove inadequate for systems with time-dependent components. To address this limitation, we use the concept of timed streams and develop a fixed-point theory tailored for time-sensitive systems in the style of KPN. This theory serves as the foundation for the MDD tool-chain, known as MIMOS, currently under development in Uppsala.

Place, publisher, year, edition, pages
Cham: Springer, 2025
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 15240
National Category
Computer Sciences; Control Engineering
Identifiers
urn:nbn:se:uu:diva-557532 (URN); 10.1007/978-3-031-73887-6_18 (DOI); 001400883300018 (); 2-s2.0-85208072715 (Scopus ID); 978-3-031-73886-9 (ISBN); 978-3-031-73887-6 (ISBN)
Available from: 2025-05-28 Created: 2025-05-28 Last updated: 2025-05-28 Bibliographically approved
Pang, W., Jiang, X., Liu, S., Qiao, L., Fu, K., Gao, L. & Wang, Y. (2024). Control Flow Divergence Optimization by Exploiting Tensor Cores. In: PROCEEDINGS OF THE 61ST ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2024. Paper presented at 61st Design Automation Conference, JUN 23-27, 2024, San Francisco, CA. ACM Digital Library
2024 (English) In: PROCEEDINGS OF THE 61ST ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2024, ACM Digital Library, 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Kernels are scheduled on Graphics Processing Units (GPUs) at the granularity of a warp, a group of threads that must be scheduled together. When executing kernels with conditional branches, the threads within a warp may execute different branches sequentially, resulting in considerable utilization loss and unpredictable execution time. This problem is known as control flow divergence. In this work, we propose a novel method that predicts threads' execution paths before the kernel is launched by deploying a branch prediction network on the GPU's tensor cores, which can run efficiently in parallel with the kernels on CUDA cores, so that the divergence problem can be eased to a large extent with minimal overhead. Combined with a well-designed thread data reorganization algorithm, this solution can better mitigate the control flow divergence problem on GPUs.

Place, publisher, year, edition, pages
ACM Digital Library, 2024
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-559230 (URN); 10.1145/3649329.3658462 (DOI); 001447271200216 (); 2-s2.0-85211162720 (Scopus ID); 979-8-4007-0601-1 (ISBN)
Conference
61st Design Automation Conference, JUN 23-27, 2024, San Francisco, CA
Available from: 2025-06-12 Created: 2025-06-12 Last updated: 2025-06-12 Bibliographically approved
Liu, S., Jiang, X., Guan, N., Wang, Z., Yu, M. & Wang, Y. (2024). RTeX: An Efficient and Timing-Predictable Multithreaded Executor for ROS 2. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 43(9), 2578-2591
2024 (English) In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 43, no 9, p. 2578-2591. Article in journal (Refereed) Published
Abstract [en]

Robot operating system (ROS) is a widely used robotic software development framework. In safety-critical applications that require timing guarantees, the first generation of ROS falls short. The introduction of ROS 2 has addressed some of these limitations, but its multithreaded executor still struggles to meet real-time requirements. To address this issue, we design a new multithreaded executor called RTeX for ROS 2. The goal of RTeX is to improve system performance in terms of both run-time efficiency and timing predictability. We have implemented RTeX in the latest version of ROS 2 and conducted experiments on a real platform. The experimental results demonstrate that RTeX outperforms both the default ROS 2 multithreaded executor and its state-of-the-art variant, achieving significant real-time performance improvements.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Instruction sets, Real-time systems, Message systems, Timing, Software, Operating systems, Job shop scheduling, Executor, lock free, real time, robot operating system (ROS) 2
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-538437 (URN); 10.1109/TCAD.2024.3380551 (DOI); 001297718600009 ()
Available from: 2024-09-16 Created: 2024-09-16 Last updated: 2024-09-16 Bibliographically approved
Tang, Y., Jiang, X., Guan, N., Luo, X., Yang, M. & Wang, Y. (2024). Timing analysis of processing chains with data refreshing in ROS 2. Journal of systems architecture, 155, Article ID 103259.
2024 (English) In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 155, article id 103259. Article in journal (Refereed) Published
Abstract [en]

Robot Operating System (ROS) 2 is currently the most popular framework for robotic software development. Safety-critical robotic software is subject to hard end-to-end timing constraints. A processing chain, composed of an ordered sequence of inter-communicating tasks, is used to describe the sequential steps to complete a certain functionality. Tasks in processing chains communicate via the buffers between them, and the data handling semantics greatly affect end-to-end timing performance. Data refreshing is one of the widely applied data handling semantics. However, limited research has been conducted on the timing performance associated with this type of semantics. This paper presents methods for analyzing the end-to-end timing performance with data refreshing semantics, and formally proves the buffer size configuration that optimizes end-to-end latency. Experiments with randomly generated workloads and a case study are conducted to evaluate the proposed methods.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
ROS 2, Timing analysis, Processing chains, Data refreshing
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-538275 (URN); 10.1016/j.sysarc.2024.103259 (DOI); 001297717600001 ()
Available from: 2024-09-24 Created: 2024-09-24 Last updated: 2024-09-24 Bibliographically approved
Chen, G., Zheng, Y., Zhou, Z., He, S. & Wang, Y. (2023). A GPU-accelerated real-time human voice separation framework for mobile phones. Journal of systems architecture, 145, Article ID 103005.
2023 (English) In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 145, article id 103005. Article in journal (Refereed) Published
Abstract [en]

Mobile speech communication can experience significant degradation in quality when users are in a noisy acoustic environment. With the rapid development of artificial intelligence in recent years, deep learning based monaural speech separation methods have shown remarkable progress in improving separation accuracy. However, the latency and computational cost of these methods remain far from sufficient for mobile devices. Performance and power constraints still make it challenging to deploy such methods on mobile devices due to their high computational complexity. In this paper, we present VoiceBit, an efficient and light-weight human voice separation framework for real-time speech separation on mobile devices. Specifically, we propose a light-weight speech separation network to segregate human voice and interfering noises directly from time-domain signals. We binarize the convolution blocks in down-sampling blocks to reduce computational complexity and memory footprint, and leverage scaler layers as well as learnable bias layers to enhance the representation ability of binary filters. In addition, we present a set of parallel optimizations to accelerate the operations in VoiceBit. Specifically, we adopt the KKC-minor format for weight matrices of convolution layers to coalesce memory accesses to global memory. Then, we explore different methods to implement the transposed convolution operation under the PhoneBit framework. Experimental results on the MUSDB18-HQ dataset and VCTK dataset show that VoiceBit achieves significant speedup and energy efficiency compared with state-of-the-art frameworks, while maintaining minimal compromise in accuracy.

Place, publisher, year, edition, pages
Elsevier, 2023
Keywords
Mobile Speech Communication, Deep Learning, Real-Time Speech Separation
National Category
Signal Processing; Computer Sciences
Identifiers
urn:nbn:se:uu:diva-522885 (URN); 10.1016/j.sysarc.2023.103005 (DOI); 001149600800001 ()
Available from: 2024-02-12 Created: 2024-02-12 Last updated: 2024-02-12 Bibliographically approved
Jiang, X., Luo, X., Guan, N., Dong, Z., Liu, S. & Wang, Y. (2023). Analysis and Optimization of Worst-Case Time Disparity in Cause-Effect Chains. In: 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE). Paper presented at Design, Automation and Test in Europe Conference and Exhibition (DATE), APR 17-19, 2023, Antwerp, BELGIUM (pp. 1-6). IEEE
2023 (English) In: 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), IEEE, 2023, p. 1-6. Conference paper, Published paper (Refereed)
Abstract [en]

In automotive systems, an important timing requirement is that the time disparity (the maximum difference among the timestamps of all raw sensor data that an output originates from) must be bounded within a certain range, so that information from different sensors can be correctly synchronized and fused. In this paper, we study the problem of analyzing the worst-case time disparity in cause-effect chains. In particular, we present two bounds: the first assumes that all chains are independent of each other, and the second takes fork-join structures into consideration to perform a more precise analysis. Moreover, we propose a solution to reduce the worst-case time disparity for a task by designing buffers with proper sizes. Experiments are conducted to show the correctness and effectiveness of both our analysis and optimization methods.

Place, publisher, year, edition, pages
IEEE, 2023
Series
Design Automation and Test in Europe Conference and Exhibition, ISSN 1530-1591, E-ISSN 1558-1101
Keywords
automotive systems, cause-effect chain, sensor, timestamps, disparity
National Category
Control Engineering
Identifiers
urn:nbn:se:uu:diva-510502 (URN); 10.23919/DATE56975.2023.10137138 (DOI); 001027444200182 (); 979-8-3503-9624-9 (ISBN); 978-3-9819263-7-8 (ISBN)
Conference
Design, Automation and Test in Europe Conference and Exhibition (DATE), APR 17-19, 2023, Antwerp, BELGIUM
Available from: 2023-08-31 Created: 2023-08-31 Last updated: 2023-08-31 Bibliographically approved
Ma, Y., Jiang, X., Guan, N. & Wang, Y. (2023). Anomaly detection based on multi-teacher knowledge distillation. Journal of systems architecture, 138, Article ID 102861.
2023 (English) In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 138, article id 102861. Article in journal (Refereed) Published
Abstract [en]

Anomaly detection on high-dimensional data is crucial for real-world industrial applications. Recent works adopt the Knowledge Distillation (KD) technique to improve the accuracy of anomaly detection Neural Networks (NN). Most KD-based solutions adopt only a single teacher NN and have not yet fully incorporated the distinct advantages of different NN structures. To fill this gap, this paper proposes a novel Multi-teacher Knowledge Distillation approach, which effectively integrates multiple teachers with importance weights to guide the student toward accurate anomaly detection. However, the importance weights are hard to obtain when training only with normal data. To overcome this challenge, we use an autoencoder-based reconstruction process to update the teacher importance weights. In the meantime, the student model parameters are optimized given a set of teacher importance weights. Anomalies are then detected based on the deviations between the outputs of teacher and student, as well as the reconstruction errors through the student network. Our proposed approach is evaluated on both the CIFAR10 and MVTec datasets. The results show good performance on both high-level semantic anomaly detection and low-level pixel anomaly detection.

Place, publisher, year, edition, pages
Elsevier, 2023
Keywords
Anomaly detection, Multi-teacher, Knowledge distillation, Semantic and pixel anomaly, Normal feature
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-502125 (URN); 10.1016/j.sysarc.2023.102861 (DOI); 000971046100001 ()
Available from: 2023-05-23 Created: 2023-05-23 Last updated: 2024-01-15 Bibliographically approved
Wang, Y., Li, Y., Peng, X., Ji, D., Guan, N. & Wang, Y. (2023). Design and Blocking Analysis of Locking Protocols for Real-Time DAG Tasks Under Federated Scheduling. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(11), 3720-3732
2023 (English) In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 42, no 11, p. 3720-3732. Article in journal (Refereed) Published
Abstract [en]

Real-time systems require locking protocols to coordinate access to shared resources. With the rapid adoption of parallel processing technology in real-time systems, there has been some work on extending classic locking protocols for sequential real-time tasks to parallel tasks. However, it may not be most effective to simply follow the progress mechanisms and queue orders designed for sequential tasks, since the intra-structure information within a parallel task is not taken into consideration. This article investigates the design of locking protocols for parallel tasks using a novel mechanism, longest normal section first (LNSF), to consider the impact of normal sections on blocking behavior in parallel tasks and further improve real-time performance. LNSF is then implemented in a locking protocol for parallel tasks named POMIP, and associated blocking analysis techniques are presented. Empirical evaluations show that our proposed analysis dominates other state-of-the-art analyses; in the best cases, the acceptance ratio of the task set can be improved by around 17%.

Place, publisher, year, edition, pages
IEEE, 2023
Keywords
Task analysis, Protocols, Real-time systems, Program processors, Timing, Directed acyclic graph, Behavioral sciences, Blocking analysis, directed acyclic graph (DAG) tasks, locking protocols, real-time scheduling
National Category
Computer Sciences; Computer Engineering
Identifiers
urn:nbn:se:uu:diva-517513 (URN); 10.1109/TCAD.2023.3264729 (DOI); 001098114300018 ()
Available from: 2023-12-11 Created: 2023-12-11 Last updated: 2023-12-11 Bibliographically approved
Projects
Timing Analysis for Future Embedded Systems [2011-06251_VR]; Uppsala University
Scalable Timing Analysis for complex embedded systems [2015-04595_VR]; Uppsala University
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-2994-6110