Robust and Efficient Federated Learning for IoT Security
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology. RISE Research Institutes of Sweden.
ORCID iD: 0000-0002-2772-4661
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Description
Abstract [en]

The widespread adoption of Internet of Things (IoT) devices has led to substantial progress across various industrial sectors, including healthcare, transportation, and manufacturing. However, these devices also introduce significant security vulnerabilities because they are often deployed without adequate security measures, making them susceptible to cyber threats. Meanwhile, the rapid evolution of Artificial Intelligence (AI), specifically in the fields of Machine Learning (ML) and Deep Learning (DL), brings new capabilities to IoT security. AI-driven solutions can process extensive data from IoT devices and networks, facilitating the identification of intricate and dynamic threats that may go unnoticed by conventional security methods. Nevertheless, typical ML models require large centralized datasets for training, which may conflict with the principles outlined in the GDPR. Recently, Federated Learning (FL) has emerged as a promising decentralized learning paradigm that enables participants to collaboratively train models without sharing private data. However, FL also brings new challenges.

The contributions of this dissertation are presented through six research papers, which address identified shortcomings and challenges of FL and ML. Initially, a comprehensive landscape study is conducted to understand the available ML technologies thoroughly. A novel FL-based approach to fingerprinting and identifying IoT devices is then proposed. Through this work, several limitations of FL and research challenges are identified. To begin with, the challenges of non-IID and imbalanced data are addressed by proposing adaptive data rebalancing techniques in a peer-to-peer FL setup. Subsequently, a communication-efficient and robust federated aggregation rule is proposed to secure the learning process in the FL setup. Furthermore, when the Intrusion Detection System (IDS) detects anomalous records, they are shared as vulnerability alerts with the Cyber Threat Intelligence platform, which is enhanced by the proposed ML-based functionalities to automate threat processing. Lastly, an in-vehicle IDS is analyzed in the context of the automotive use case for its resilience against adversarial attacks.

The overall contribution of this dissertation enhances the aggregation methodology within FL, emphasizes its adaptability in addressing diverse critical scenarios to tackle IoT security challenges, and reinforces ML models to confront adversarial AI challenges. Given that FL is still in its early stages, with numerous unresolved challenges in IoT security, these enhancements and contributions are timely in paving the way for future advancements and providing a clearer path forward.
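The federated learning paradigm described in this abstract can be illustrated with a minimal sketch of weighted model averaging, the basic aggregation step in many FL setups. This is a generic illustration and not the aggregation rule proposed in the thesis; the function name `federated_average` and the flat-list representation of model weights are assumptions made for the example.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client model weights, weighted by local dataset size.

    Hypothetical helper: each client sends only its model weights,
    never its raw training data.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one model shape")
        for i, value in enumerate(weights):
            # Each client contributes in proportion to its data volume.
            averaged[i] += value * size / total
    return averaged


if __name__ == "__main__":
    # Two clients; the second holds three times as much data.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

Only the weight vectors cross the network in this sketch, which is the property that makes the paradigm attractive when raw IoT data cannot be centralized.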

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2023, p. 59
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2306
Keywords [en]
Internet of Things, Federated Learning, Machine Learning, Intrusion Detection System, Communication Efficiency, Robustness, Adversarial AI, Device Fingerprinting, Device Identification, Cyber Threat Intelligence
National Category
Computer Systems
Research subject
Computer Science with specialization in Computer Communication
Identifiers
URN: urn:nbn:se:uu:diva-511774
ISBN: 978-91-513-1895-0 (print)
OAI: oai:DiVA.org:uu-511774
DiVA, id: diva2:1797504
Public defence
2023-11-02, 80127, Ångström, Lägerhyddsvägen 1, Uppsala, 13:00 (English)
Funder
EU, Horizon 2020, 101020259
EU, Horizon 2020, 830927
Available from: 2023-10-11 Created: 2023-09-15 Last updated: 2023-10-11
List of papers
1. Machine Learning for Security at the IoT Edge: A Feasibility Study
2019 (English). In: 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW), Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 7-12. Conference paper, Published paper (Refereed)
Abstract [en]

Benefits of edge computing include reduced latency and bandwidth savings, privacy-by-default and by-design in compliance with new privacy regulations that encourage sharing only the minimal amount of data. This creates a need for processing data locally rather than sending everything to a cloud environment and performing machine learning there. However, most IoT edge devices are resource-constrained in comparison and it is not evident whether current machine learning methods are directly employable on IoT edge devices. In this paper, we analyze the state-of-the-art machine learning (ML) algorithms for solving security problems (e.g. intrusion detection) at the edge. Starting from the characteristics and limitations of edge devices in IoT networks, we assess a selected set of commonly used ML algorithms based on four metrics: computation complexity, memory footprint, storage requirement and accuracy. We also compare the suitability of ML algorithms to different cybersecurity problems and discuss the possibility of utilizing these methods for use cases.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2019
Keywords
Machine Learning, AI, IoT, Security, Edge
National Category
Computer Systems
Research subject
Computer Science with specialization in Computer Communication
Identifiers
urn:nbn:se:uu:diva-511286 (URN)
10.1109/MASSW.2019.00009 (DOI)
000768255900002 ()
978-1-7281-4121-3 (ISBN)
978-1-7281-4122-0 (ISBN)
Conference
IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW), 4-7 November, 2019, Monterey, CA, USA
Funder
EU, Horizon 2020, 830927
Vinnova
Available from: 2023-09-11 Created: 2023-09-11 Last updated: 2023-09-15. Bibliographically approved
2. FL4IoT: IoT Device Fingerprinting and Identification Using Federated Learning
2023 (English). In: ACM Transactions on Internet of Things, ISSN 2691-1914, Vol. 4, no 3, p. 1-24, article id 17. Article in journal (Refereed) Published
Abstract [en]

Unidentified devices in a network can result in devastating consequences. It is, therefore, necessary to fingerprint and identify IoT devices connected to private or critical networks. With the proliferation of massive numbers of heterogeneous IoT devices, it is becoming increasingly challenging to detect vulnerable devices connected to networks. Current machine learning-based techniques for fingerprinting and identifying devices necessitate a significant amount of data gathered from IoT networks that must be transmitted to a central cloud. Nevertheless, private IoT data cannot be shared with the central cloud in numerous sensitive scenarios. Federated learning (FL) has been regarded as a promising paradigm for decentralized learning and has been applied in many different use cases. It enables machine learning models to be trained in a privacy-preserving way. In this article, we propose a privacy-preserving IoT device fingerprinting and identification mechanism using FL, which we call FL4IoT. FL4IoT is a two-phased system combining unsupervised-learning-based device fingerprinting and supervised-learning-based device identification. FL4IoT shows its practicality across different performance metrics in both federated and centralized setups. For instance, in the best cases, empirical results show that FL4IoT achieves ∼99% accuracy and F1-Score in identifying IoT devices using a federated setup without exposing any private data to a centralized cloud entity. In addition, FL4IoT can detect spoofed devices with over 99% accuracy.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Keywords
Internet of things, federated learning, identification, fingerprinting, machine learning
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-511288 (URN)
10.1145/3603257 (DOI)
Funder
EU, Horizon Europe, 10048312
EU, Horizon 2020, 101020259
EU, Horizon 2020, 957197
Available from: 2023-09-11 Created: 2023-09-11 Last updated: 2023-09-15. Bibliographically approved
3. Non-IID data re-balancing at IoT edge with peer-to-peer federated learning for anomaly detection
2021 (English). In: WiSec '21: Proceedings of the 14th ACM Conference on Security and Privacy in Wireless and Mobile Networks, Association for Computing Machinery (ACM), 2021, p. 153-163. Conference paper, Published paper (Refereed)
Abstract [en]

The increase in the computational power of edge devices has enabled the penetration of distributed machine learning technologies such as federated learning, which allows collaborative models to be built by performing the training locally on the edge devices. Because the data remains on the edge devices, this improves both the efficiency and the privacy of training machine learning models. However, in some IoT networks the connectivity between devices and system components can be limited, which prevents the use of federated learning, as it requires a central node to orchestrate the training of the model. To sidestep this, peer-to-peer learning appears as a promising solution, as it does not require such an orchestrator. On the other hand, the security challenges in IoT deployments have fostered the use of machine learning for attack and anomaly detection. In these problems, under supervised learning approaches, the training datasets are typically imbalanced, i.e. the number of anomalies is very small compared to the number of benign data points, which requires the use of re-balancing techniques to improve the algorithms' performance. In this paper, we propose a novel peer-to-peer algorithm, P2PK-SMOTE, to train supervised anomaly detection machine learning models in non-IID scenarios, including mechanisms to locally re-balance the training datasets via synthetic generation of data points from the minority class. To improve the performance in non-IID scenarios, we also include a mechanism for sharing a small fraction of synthetic data from the minority class across devices, aiming to reduce the risk of data de-identification. Our experimental evaluation on real datasets for IoT anomaly detection across a set of different scenarios validates the benefits of our proposed approach.
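The idea of re-balancing a local dataset by synthesizing minority-class points can be sketched with generic SMOTE-style interpolation. This is not the P2PK-SMOTE algorithm itself, whose details the abstract does not give; the function name `smote_like_oversample` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Sequence


def smote_like_oversample(
    minority: Sequence[Sequence[float]],
    n_new: int,
    k: int = 2,
    seed: int = 0,
) -> List[List[float]]:
    """Create n_new synthetic minority points by interpolating between
    a randomly chosen point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    anomalies = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
    for point in smote_like_oversample(anomalies, n_new=3):
        print(point)
```

Because every synthetic point lies on a segment between two real minority points, the generated data stays inside the region the minority class already occupies.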

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2021
Keywords
Federated Learning, Imbalanced Data, non-IID Data, Anomaly Detection
National Category
Computer Sciences; Computer Systems
Research subject
Computer Science with specialization in Computer Communication
Identifiers
urn:nbn:se:uu:diva-510418 (URN)
10.1145/3448300.3467827 (DOI)
978-1-4503-8349-3 (ISBN)
Conference
14th ACM Conference on Security and Privacy in Wireless and Mobile Networks (ACM WiSec 2021), Abu Dhabi, United Arab Emirates, 28 June-2 July 2021
Projects
EC H2020 Project CONCORDIA GA 830927
RISE Cybersecurity KP
EC H2020 nIoVe GA 833742
Funder
EU, Horizon 2020, 830927
EU, Horizon 2020, 833742
Available from: 2023-08-29 Created: 2023-08-29 Last updated: 2023-09-15. Bibliographically approved
4. SparSFA: Towards robust and communication-efficient peer-to-peer federated learning
2023 (English). In: Computers & security (Print), ISSN 0167-4048, E-ISSN 1872-6208, Vol. 129, article id 103182. Article in journal (Refereed) Published
Abstract [en]

Federated Learning (FL) has emerged as a powerful paradigm to train collaborative machine learning (ML) models, preserving the privacy of the participants’ datasets. However, standard FL approaches present some limitations that can hinder their applicability in some applications. In particular, relying on a server or aggregator to orchestrate the learning process may not be possible in scenarios with limited connectivity, as in some IoT applications, and offers less flexibility to personalize the ML models for the different participants. To sidestep these limitations, peer-to-peer FL (P2PFL) provides more flexibility, allowing participants to train their own models in collaboration with their neighbors. However, given the huge number of parameters of typical Deep Neural Network architectures, the communication burden can also be very high. In addition, it has been shown that standard aggregation schemes for FL are very brittle against data and model poisoning attacks. In this paper, we propose SparSFA, an algorithm for P2PFL capable of reducing the communication costs. We show that our method outperforms competing sparsification methods in P2P scenarios, speeding up convergence and enhancing stability during training. SparSFA also includes a mechanism to mitigate poisoning attacks for each participant in any random network topology. Our empirical evaluation on real datasets for intrusion detection in IoT, considering both balanced and imbalanced-dataset scenarios, shows that SparSFA is robust to different indiscriminate poisoning attacks launched by one or multiple adversaries, outperforming other robust aggregation methods whilst reducing the communication costs through sparsification.
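How sparsification reduces communication can be sketched with plain magnitude-based top-k selection: a participant sends only the k largest-magnitude entries of its model update, together with their indices. This is a common baseline shown for illustration only and is not claimed to be SparSFA's own selection rule; the helper names `top_k_sparsify` and `densify` are assumptions.

```python
from typing import Dict, List, Sequence


def top_k_sparsify(update: Sequence[float], k: int) -> Dict[int, float]:
    """Keep the k entries of a model update with the largest magnitude.

    Returns an index -> value mapping, so only k pairs need to be sent.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    k = min(k, len(update))
    keep = sorted(
        range(len(update)), key=lambda i: abs(update[i]), reverse=True
    )[:k]
    return {i: update[i] for i in sorted(keep)}


def densify(sparse: Dict[int, float], dim: int) -> List[float]:
    """Rebuild a full-length update on the receiver, zero-filling the rest."""
    dense = [0.0] * dim
    for index, value in sparse.items():
        dense[index] = value
    return dense


if __name__ == "__main__":
    update = [0.1, -2.0, 0.5, 3.0]
    message = top_k_sparsify(update, k=2)
    print(message)                         # {1: -2.0, 3: 3.0}
    print(densify(message, len(update)))   # [0.0, -2.0, 0.0, 3.0]
```

With k much smaller than the number of parameters, each exchange between neighbors shrinks roughly in proportion, at the cost of dropping the small-magnitude entries of the update.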

Place, publisher, year, edition, pages
Elsevier, 2023
Keywords
Peer-to-peer federated learning, Communication efficiency, Poisoning attack, Adversarial machine learning, IDS, IoT
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-511287 (URN)
10.1016/j.cose.2023.103182 (DOI)
000961304800001 ()
Funder
EU, Horizon 2020, 101020259
EU, Horizon 2020, 830927
Available from: 2023-09-11 Created: 2023-09-11 Last updated: 2023-09-15. Bibliographically approved
5. On the Resilience of Machine Learning-Based IDS for Automotive Networks
2023 (English). In: 2023 IEEE Vehicular Networking Conference (VNC), Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 239-246. Conference paper, Published paper (Refereed)
Abstract [en]

Modern automotive functions are controlled by a large number of small computers called electronic control units (ECUs). These functions span from safety-critical autonomous driving to comfort and infotainment. ECUs communicate with one another over multiple internal networks using different technologies. Some, such as Controller Area Network (CAN), are very simple and provide minimal or no security services. Machine learning techniques can be used to detect anomalous activities in such networks. However, it is necessary that these machine learning techniques are not prone to adversarial attacks. In this paper, we investigate adversarial sample vulnerabilities in four different machine learning-based intrusion detection systems for automotive networks. We show that adversarial samples negatively impact three of the four studied solutions. Furthermore, we analyze the transferability of adversarial samples between different systems. We also investigate detection performance and the attack success rate after using adversarial samples in the training. After analyzing these results, we discuss whether current solutions are mature enough for use in modern vehicles.
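What an adversarial sample is can be illustrated with a small sketch of the fast gradient sign method (FGSM) against a toy logistic-regression detector. The abstract does not state which attacks or models the paper uses, so FGSM, the logistic model, and all names and numbers below are assumptions chosen purely for illustration.

```python
import math
from typing import List, Sequence


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Probability that x is an attack, under a toy logistic model."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def fgsm_perturb(
    weights: Sequence[float],
    bias: float,
    x: Sequence[float],
    label: int,
    epsilon: float,
) -> List[float]:
    """Shift each feature by epsilon in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - label) * weights.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [v + epsilon * ((g > 0) - (g < 0)) for v, g in zip(x, grad)]


if __name__ == "__main__":
    w, b = [2.0, -1.0], 0.0
    attack_sample = [1.0, 0.5]          # true label 1 (attack)
    adversarial = fgsm_perturb(w, b, attack_sample, label=1, epsilon=1.0)
    print(predict(w, b, attack_sample) > 0.5)   # True: detected
    print(predict(w, b, adversarial) > 0.5)     # False: evades detection
```

The perturbation budget `epsilon` is deliberately large here so that the flip is easy to see; realistic attacks on network traffic also have to keep the perturbed sample valid for the protocol.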

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE Vehicular Networking Conference, ISSN 2157-9857, E-ISSN 2157-9865
Keywords
Vehicle Security, Machine Learning, Controller Area Network, Intrusion Detection System, Adversarial AI/ML
National Category
Computer Systems
Identifiers
urn:nbn:se:uu:diva-511291 (URN)
10.1109/VNC57357.2023.10136285 (DOI)
001011821500047 ()
979-8-3503-3549-1 (ISBN)
979-8-3503-3550-7 (ISBN)
Conference
2023 IEEE Vehicular Networking Conference (VNC), 26-28 April, Istanbul, Turkiye
Funder
Vinnova, 2019-03071
EU, Horizon 2020, 101020259
EU, Horizon 2020, 957197
Available from: 2023-09-11 Created: 2023-09-11 Last updated: 2023-09-15. Bibliographically approved
6. MAS-CTI: Machine Learning Assisted System for Cyber Threat Intelligence
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Cyber Threat Intelligence (CTI) is a critical component of modern cybersecurity, providing organizations with essential information to detect, prevent, and respond to cyber threats. However, CTI data is often non-uniform, incomplete, and inconsistent, making it challenging to analyze and manage effectively. Machine Learning (ML) models offer a powerful solution to overcome these challenges, providing advanced tools for data processing, sharing, and analysis. In this paper, we present MAS-CTI, an extended version of the popular CTI platform MISP, leveraging the power of ML for CTI processing. In particular, we address three key challenges in the CTI domain: event type identification, threat ranking, and IoC correlation. Additionally, to address concerns regarding IoC confidentiality, we explore the application of Federated Learning (FL) for event identification. We have conducted extensive testing of the models on three public CTI datasets, and the results obtained demonstrate the potential of ML models to enhance CTI processing and analysis, with only a few exceptions. 

Keywords
Machine Learning, Cyber Threat Intelligence, Federated Learning, Learning to Rank, MISP
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:uu:diva-511773 (URN)
Available from: 2023-09-15 Created: 2023-09-15 Last updated: 2023-09-15. Bibliographically approved

Open Access in DiVA

UUThesis_H-Wang-2023 (557 kB), 658 downloads
File information
File name: FULLTEXT01.pdf
File size: 557 kB
Checksum (SHA-512): 7ebc294781eb96fd4b718fa9e979def7c73bad6dd5c06302a74a48c267bc38332a922ec3be5868931282cf49e6bd74601fce7bf180e6f47a9cc6f14efdeb1852
Type: fulltext
Mimetype: application/pdf

Authority records

Wang, Han
