Publications from Uppsala University

Publications (10 of 11)
Li, S., Ngai, E.-H. C. H. & Voigt, T. (2024). An Experimental Study of Byzantine-Robust Aggregation Schemes in Federated Learning. IEEE Transactions on Big Data, 10(6), 975-988
2024 (English). In: IEEE Transactions on Big Data, E-ISSN 2332-7790, Vol. 10, no 6, p. 975-988. Article in journal (Refereed), Published
Abstract [en]

Byzantine-robust federated learning aims at mitigating Byzantine failures during the federated training process, where malicious participants (known as Byzantine clients) may upload arbitrary local updates to the central server in order to degrade the performance of the global model. In recent years, several robust aggregation schemes have been proposed to defend against malicious updates from Byzantine clients and improve the robustness of federated learning. These solutions were claimed to be Byzantine-robust under certain assumptions. Meanwhile, new attack strategies are emerging that strive to circumvent the defense schemes. However, a systematic comparison and empirical study of these schemes and attacks is lacking. In this paper, we conduct an experimental study of Byzantine-robust aggregation schemes under different attacks using two popular algorithms in federated learning, FedSGD and FedAvg. We first survey existing Byzantine attack strategies, as well as Byzantine-robust aggregation schemes that aim to defend against Byzantine attacks. We also propose a new scheme, ClippedClustering, to enhance the robustness of a clustering-based scheme by automatically clipping the updates. Then we provide an experimental evaluation of eight aggregation schemes under five different Byzantine attacks. Our experimental results show that these aggregation schemes sustain relatively high accuracy in some cases, but they are not effective in all cases. In particular, our proposed ClippedClustering successfully defends against most attacks under independent and identically distributed (IID) local datasets. However, when the local datasets are Non-IID, the performance of all the aggregation schemes significantly decreases. With Non-IID data, some of these aggregation schemes fail even in the complete absence of Byzantine clients. Based on our experimental study, we conclude that the robustness of all the aggregation schemes is limited, highlighting the need for new defense strategies, in particular for Non-IID datasets.
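
As a rough illustration of why robust aggregation matters, the sketch below compares a plain mean (the aggregation used in FedAvg-style averaging) with a simple clipped, coordinate-wise median. This is not the ClippedClustering scheme from the paper: the choice of the median update norm as the clipping threshold, the function names, and the toy updates are all assumptions made for this example.

```python
import math
from statistics import median


def l2_norm(update):
    """Euclidean norm of one client's update vector."""
    return math.sqrt(sum(x * x for x in update))


def clip_to_norm(update, max_norm):
    """Scale `update` down so that its L2 norm is at most `max_norm`."""
    norm = l2_norm(update)
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def mean_aggregate(updates):
    """Plain averaging of client updates (no defense)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def clipped_median_aggregate(updates):
    """Clip every update, then take the coordinate-wise median.

    Assumption for this sketch: the clipping threshold is the median
    of the update norms observed in the current round.
    """
    threshold = median(l2_norm(u) for u in updates)
    clipped = [clip_to_norm(u, threshold) for u in updates]
    return [median(column) for column in zip(*clipped)]


# Hypothetical round: four benign clients and one Byzantine client.
benign = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0]]
byzantine = [[-100.0, -100.0]]
updates = benign + byzantine

naive = mean_aggregate(updates)             # dragged far away by the outlier
robust = clipped_median_aggregate(updates)  # stays close to the benign updates
print("mean:", naive)
print("clipped median:", robust)
```

In this toy round a single arbitrary update is enough to flip the sign of the averaged result, while the clipped median stays at the benign value. The abstract's point is that such defenses hold only in some settings, and this example uses near-identical benign updates, which corresponds to the easy IID-like case.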

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Byzantine attacks, distributed learning, federated learning, neural networks, robustness
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:uu:diva-494317 (URN), 10.1109/tbdata.2023.3237397 (DOI), 001354646300016 (), 2-s2.0-85147301735 (Scopus ID)
Funder
Swedish Research Council, 2017-04543; EU, Horizon 2020, 101015922
Available from: 2023-01-17 Created: 2023-01-17 Last updated: 2025-02-19. Bibliographically approved
Li, S., Ngai, E. C. H., Ye, F., Ju, L., Zhang, T. & Voigt, T. (2024). Blades: A Unified Benchmark Suite for Byzantine Attacks and Defenses in Federated Learning. In: 2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI). Paper presented at 9th ACM/IEEE Conference on Internet of Things Design and Implementation (IoTDI), May 13-16, 2024, Hong Kong, Hong Kong (pp. 158-169). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI), Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 158-169. Conference paper, Published paper (Refereed)
Abstract [en]

Federated learning (FL) facilitates distributed training across different IoT and edge devices, safeguarding the privacy of their data. The inherent distributed structure of FL introduces vulnerabilities, especially from adversarial devices aiming to skew local updates to their advantage. Despite the plethora of research focusing on Byzantine-resilient FL, the academic community has yet to establish a comprehensive benchmark suite, pivotal for impartial assessment and comparison of different techniques. This paper presents Blades, a scalable, extensible, and easily configurable benchmark suite that supports researchers and developers in efficiently implementing and validating novel strategies against baseline algorithms in Byzantine-resilient FL. Blades contains built-in implementations of representative attack and defense strategies and offers a user-friendly interface that seamlessly integrates new ideas. Using Blades, we re-evaluate representative attacks and defenses on wide-ranging experimental configurations (approximately 1,500 trials in total). Through our extensive experiments, we gained new insights into FL robustness and highlighted previously overlooked limitations due to the absence of thorough evaluations and comparisons of baselines under various attack settings. We maintain the source code and documents at https://github.com/lishenghui/blades.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Byzantine attacks, distributed learning, federated learning, IoT, neural networks, robustness
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-537577 (URN), 10.1109/IoTDI61053.2024.00018 (DOI), 001261370500014 (), 2-s2.0-85196568437 (Scopus ID), 979-8-3503-7025-6 (ISBN), 979-8-3503-7026-3 (ISBN)
Conference
9th ACM/IEEE Conference on Internet of Things Design and Implementation (IoTDI), May 13-16, 2024, Hong Kong, Hong Kong
Funder
Swedish Research Council, 2017-04543
Available from: 2024-09-05 Created: 2024-09-05 Last updated: 2025-02-11. Bibliographically approved
Li, S., Ngai, E.-H. C., Ye, F., Ju, L., Zhang, T. & Voigt, T. (2024). Demo Abstract: Blades: A Unified Benchmark Suite for Byzantine-Resilient in Federated Learning. In: 9th ACM/IEEE Conference on Internet of Things Design and Implementation, IoTDI 2024. Paper presented at 9th ACM/IEEE Conference on Internet of Things Design and Implementation (IoTDI), May 13-16, 2024, Hong Kong, People's Republic of China (pp. 229-230). IEEE Computer Society
2024 (English). In: 9th ACM/IEEE Conference on Internet of Things Design and Implementation, IoTDI 2024, IEEE Computer Society, 2024, p. 229-230. Conference paper, Published paper (Refereed)
Abstract [en]

Federated learning (FL) facilitates distributed training across different IoT and edge devices, safeguarding the privacy of their data. The inherently distributed nature of FL introduces vulnerabilities, especially from adversarial devices aiming to skew local updates to their advantage. Despite the plethora of research focusing on Byzantine-resilient FL, the academic community has yet to establish a comprehensive benchmark suite, pivotal for the assessment and comparison of different techniques. This demonstration presents Blades, a scalable, extensible, and easily configurable benchmark suite that supports researchers and developers in efficiently implementing and validating strategies against baseline algorithms in Byzantine-resilient FL.

Place, publisher, year, edition, pages
IEEE Computer Society, 2024
Keywords
Byzantine attacks, distributed learning, federated learning, IoT, neural networks, robustness
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-537570 (URN), 10.1109/IoTDI61053.2024.00030 (DOI), 001261370500026 (), 979-8-3503-7025-6 (ISBN), 979-8-3503-7026-3 (ISBN)
Conference
9th ACM/IEEE Conference on Internet of Things Design and Implementation (IoTDI), May 13-16, 2024, Hong Kong, People's Republic of China
Available from: 2024-09-05 Created: 2024-09-05 Last updated: 2024-09-05. Bibliographically approved
Li, S. (2024). Robust Federated Learning: Defending Against Byzantine and Jailbreak Attacks. (Doctoral dissertation). Uppsala: Acta Universitatis Upsaliensis
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Federated Learning (FL) has emerged as a promising paradigm for training collaborative machine learning models across multiple participants while preserving data privacy. It is particularly valuable in privacy-sensitive domains like healthcare and finance. Recently, FL has been explored to harness the power of pre-trained Foundation Models (FMs) for downstream task adaptation, enabling customization and personalization while maintaining data locality and privacy. However, FL's distributed nature makes it inherently vulnerable to adversarial attacks. Notable threats include Byzantine attacks, which inject malicious updates to degrade model performance, and jailbreak attacks, which exploit the fine-tuning process to undermine safety alignments of FMs, leading to harmful outputs. This dissertation centers on robust FL, aiming to mitigate these threats and ensure global models remain accurate and safe even under adversarial conditions. To mitigate Byzantine attacks, we propose several Robust Aggregation Schemes (RASs) that decrease the influence of malicious updates. Additionally, we introduce Blades, an open-source benchmarking tool to systematically study the interplay between attacks and defenses in FL, offering insights into the effects of data heterogeneity, differential privacy, and momentum on RAS robustness. Exploring the synergy between FL and FMs, we present a taxonomy of research organized around adaptivity, efficiency, and trustworthiness. We uncover a novel attack, “PEFT-as-an-Attack” (PaaA), where malicious FL participants jailbreak FMs through Parameter-Efficient Fine-Tuning (PEFT) with harmful data. We evaluate defenses against PaaA and highlight critical gaps, emphasizing the need for advanced strategies balancing safety and utility in FL-FM systems. In summary, this dissertation advances FL robustness by proposing novel defenses, tools, and insights while exposing emerging attack vectors. These contributions pave the way for attack-resilient distributed machine learning systems capable of withstanding both current and emerging threats.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2024. p. 54
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2477
Keywords
Federated learning, Jailbreak attack, Parameter-Efficient Fine-Tuning, Pre-trained Language Model, Robustness
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-540441 (URN)978-91-513-2312-1 (ISBN)
Public defence
2025-01-16, 101121, Sonja Lyttkens, Ångström, Regementsvägen 1, Uppsala, 09:00 (English)
Available from: 2024-12-17 Created: 2024-11-20 Last updated: 2024-12-17
Li, S., Ngai, E. & Voigt, T. (2023). Byzantine-Robust Aggregation in Federated Learning Empowered Industrial IoT. IEEE Transactions on Industrial Informatics, 19(2), 1165-1175
2023 (English). In: IEEE Transactions on Industrial Informatics, ISSN 1551-3203, E-ISSN 1941-0050, Vol. 19, no 2, p. 1165-1175. Article in journal (Refereed), Published
Abstract [en]

Federated Learning (FL) is a promising paradigm to empower on-device intelligence in the Industrial Internet of Things (IIoT) due to its capability of training machine learning models across multiple IIoT devices while preserving the privacy of their local data. However, the distributed architecture of FL relies on aggregating the parameter lists from the remote devices, which poses potential security risks caused by malicious devices. In this paper, we propose a flexible and robust aggregation rule, called Auto-weighted Geometric Median (AutoGM), and analyze its robustness against outliers in the inputs. To obtain the value of AutoGM, we design an algorithm based on an alternating optimization strategy. Using AutoGM as the aggregation rule, we propose two robust FL solutions, AutoGM_FL and AutoGM_PFL. AutoGM_FL learns a shared global model using the standard FL paradigm, and AutoGM_PFL learns a personalized model for each device. We conduct extensive experiments on the FEMNIST and Bosch IIoT datasets. The experimental results show that our solutions are robust against both model poisoning and data poisoning attacks. In particular, our solutions sustain high performance even when 30% of the nodes perform model poisoning attacks or 50% of the nodes perform data poisoning attacks.
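
For readers unfamiliar with the geometric median that AutoGM builds on, the sketch below computes a plain (unweighted) geometric median with Weiszfeld's iteration. It is not the auto-weighted variant or the alternating optimization algorithm from the paper; the function name, iteration limits, and sample points are assumptions made for this example.

```python
import math


def geometric_median(points, iters=200, eps=1e-9):
    """Approximate the point minimizing the sum of Euclidean distances.

    Uses Weiszfeld's iteration: repeatedly re-average the points with
    weights inversely proportional to their distance from the estimate.
    """
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    estimate = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iters):
        weights = []
        for p in points:
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, estimate)))
            # Guard against division by zero when the estimate hits a point.
            weights.append(1.0 / max(dist, eps))
        total = sum(weights)
        updated = [
            sum(w * p[i] for w, p in zip(weights, points)) / total
            for i in range(dim)
        ]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(updated, estimate)))
        estimate = updated
        if shift < eps:
            break
    return estimate


# Hypothetical parameter vectors: four honest devices and one outlier.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [100.0, 100.0]]
mean = [sum(p[i] for p in points) / len(points) for i in range(2)]
gm = geometric_median(points)
print("mean:", mean)
print("geometric median:", gm)
```

The mean is pulled to roughly (20.4, 20.4) by the single outlier, whereas the geometric median stays inside the unit square spanned by the honest points. This bounded influence of outliers is the property that makes the geometric median a natural starting point for robust aggregation.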

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
Electrical and Electronic Engineering, Computer Science Applications, Information Systems, Control and Systems Engineering
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-458900 (URN), 10.1109/tii.2021.3128164 (DOI), 000926964700005 ()
Funder
Swedish Research Council, 2017-04543; EU, Horizon 2020, 101015922
Available from: 2021-11-17 Created: 2021-11-17 Last updated: 2024-11-20. Bibliographically approved
Ye, F., Fang, M., Li, S. & Yilmaz, E. (2023). Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting. In: Bouamor, H; Pino, J; Bali, K (Ed.), Findings of the Association for Computational Linguistics - EMNLP 2023. Paper presented at Conference on Empirical Methods in Natural Language Processing (EMNLP), Dec 06-10, 2023, Singapore, Singapore (pp. 5985-6006). Association for Computational Linguistics
2023 (English). In: Findings of the Association for Computational Linguistics - EMNLP 2023 / [ed] Bouamor, H; Pino, J; Bali, K, Association for Computational Linguistics, 2023, p. 5985-6006. Conference paper, Published paper (Refereed)
Abstract [en]

Query rewriting plays a vital role in enhancing conversational search by transforming context-dependent user queries into standalone forms. Existing approaches primarily leverage human-rewritten queries as labels to train query rewriting models. However, human rewrites may lack sufficient information for optimal retrieval performance. To overcome this limitation, we propose utilizing large language models (LLMs) as query rewriters, enabling the generation of informative query rewrites through well-designed instructions. We define four essential properties for well-formed rewrites and incorporate all of them into the instruction. In addition, we introduce the role of rewrite editors for LLMs when initial query rewrites are available, forming a "rewrite-then-edit" process. Furthermore, we propose distilling the rewriting capabilities of LLMs into smaller models to reduce rewriting latency. Our experimental evaluation on the QReCC dataset demonstrates that informative query rewrites can yield substantially improved retrieval performance compared to human rewrites, especially with sparse retrievers.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2023
National Category
Computer Sciences; Natural Language Processing
Identifiers
urn:nbn:se:uu:diva-557651 (URN), 10.18653/v1/2023.findings-emnlp.398 (DOI), 001279591706008 (), 979-8-89176-061-5 (ISBN)
Conference
Conference on Empirical Methods in Natural Language Processing (EMNLP), Dec 06-10, 2023, Singapore, Singapore
Available from: 2025-06-02 Created: 2025-06-02 Last updated: 2025-06-02. Bibliographically approved
Li, S., Ngai, E., Ye, F. & Voigt, T. (2022). Auto-weighted Robust Federated Learning with Corrupted Data Sources. ACM Transactions on Intelligent Systems and Technology, 13(5), 1-20
2022 (English). In: ACM Transactions on Intelligent Systems and Technology, ISSN 2157-6904, E-ISSN 2157-6912, Vol. 13, no 5, p. 1-20. Article in journal (Refereed), Published
Abstract [en]

Federated learning provides a communication-efficient and privacy-preserving training process by enabling statistical models to be learned across a massive number of participants without accessing their local data. Standard federated learning techniques that naively minimize an average loss function are vulnerable to data corruption from outliers, systematic mislabeling, or even adversaries. In this paper, we address this challenge by proposing Auto-weighted Robust Federated Learning (ARFL), a novel approach that jointly learns the global model and the weights of local updates to provide robustness against corrupted data sources. We prove a learning bound on the expected loss with respect to the predictor and the weights of clients, which guides the definition of the objective for robust federated learning. We present an objective that minimizes the weighted sum of empirical risk of clients with a regularization term, where the weights can be allocated by comparing the empirical risk of each client with the average empirical risk of the best p clients. This method can downweight the clients with significantly higher losses, thereby lowering their contributions to the global model. We show that this approach achieves robustness when the data of corrupted clients is distributed differently from the benign ones. To optimize the objective function, we propose a communication-efficient algorithm based on the blockwise minimization paradigm. We conduct extensive experiments on multiple benchmark datasets, including CIFAR-10, FEMNIST, and Shakespeare, considering different neural network models. The results show that our solution is robust against different scenarios including label shuffling, label flipping, and noisy features, and outperforms the state-of-the-art methods in most scenarios.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
Federated learning, robustness, auto-weighted, distributed learning, neural networks
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-468353 (URN), 10.1145/3517821 (DOI), 000877952100005 ()
Funder
Swedish Research Council, 2017-0454; EU, Horizon 2020, 101015922
Available from: 2022-02-24 Created: 2022-02-24 Last updated: 2024-11-20. Bibliographically approved
Ye, F., Wang, X., Huang, J., Li, S., Stern, S. & Yilmaz, E. (2022). MetaASSIST: Robust Dialogue State Tracking with Meta Learning. In: Goldberg, Y; Kozareva, Z; Zhang, Y (Ed.), 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022. Paper presented at 2022 Conference on Empirical Methods in Natural Language Processing, Dec 07-11, 2022, Abu Dhabi, United Arab Emirates (pp. 1157-1169). Association for Computational Linguistics
2022 (English). In: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 / [ed] Goldberg, Y; Kozareva, Z; Zhang, Y, Association for Computational Linguistics, 2022, p. 1157-1169. Conference paper, Published paper (Refereed)
Abstract [en]

Existing dialogue datasets contain lots of noise in their state annotations. Such noise can hurt model training and ultimately lead to poor generalization performance. A general framework named ASSIST has recently been proposed to train robust dialogue state tracking (DST) models. It introduces an auxiliary model to generate pseudo labels for the noisy training set. These pseudo labels are combined with vanilla labels by a common fixed weighting parameter to train the primary DST model. Notwithstanding the improvements of ASSIST on DST, tuning the weighting parameter is challenging. Moreover, a single parameter shared by all slots and all instances may be suboptimal. To overcome these limitations, we propose a meta learning-based framework MetaASSIST to adaptively learn the weighting parameter. Specifically, we propose three schemes with varying degrees of flexibility, ranging from slot-wise to both slot-wise and instance-wise, to convert the weighting parameter into learnable functions. These functions are trained in a meta-learning manner by taking the validation set as meta data. Experimental results demonstrate that all three schemes can achieve competitive performance. Most impressively, we achieve a state-of-the-art joint goal accuracy of 80.10% on MultiWOZ 2.4.
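
The label-combination step described above can be sketched as a convex mix of the annotated ("vanilla") label and the pseudo label. The formula, function names, slot names, and weight values below are assumptions made for illustration: they show a fixed weight and a slot-wise weight table, not the learned weighting functions of MetaASSIST.

```python
def combine_labels(vanilla, pseudo, alpha):
    """Mix two label distributions: alpha * pseudo + (1 - alpha) * vanilla."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(vanilla) != len(pseudo):
        raise ValueError("label distributions must have the same length")
    return [alpha * p + (1.0 - alpha) * v for v, p in zip(vanilla, pseudo)]


def combine_slot_wise(vanilla_by_slot, pseudo_by_slot, alpha_by_slot):
    """Apply a separate weight per slot instead of one shared weight."""
    return {
        slot: combine_labels(
            vanilla_by_slot[slot], pseudo_by_slot[slot], alpha_by_slot[slot]
        )
        for slot in vanilla_by_slot
    }


# Hypothetical example with two slots and three candidate values each.
vanilla = {"hotel-area": [0.0, 1.0, 0.0], "hotel-stars": [1.0, 0.0, 0.0]}
pseudo = {"hotel-area": [0.7, 0.2, 0.1], "hotel-stars": [0.9, 0.1, 0.0]}

# One weight shared by all slots.
shared = combine_slot_wise(vanilla, pseudo, {slot: 0.5 for slot in vanilla})

# Different weights per slot (values chosen by hand here, not learned).
slot_wise = combine_slot_wise(
    vanilla, pseudo, {"hotel-area": 0.8, "hotel-stars": 0.1}
)
print(shared)
print(slot_wise)
```

With a shared weight, a slot whose annotation is likely wrong and a slot whose annotation is reliable are treated identically. Allowing the weight to vary per slot, or per slot and instance, is the flexibility the abstract refers to.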

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2022
National Category
Signal Processing; Computer graphics and computer vision; Computer Sciences
Identifiers
urn:nbn:se:uu:diva-586581 (URN), 10.18653/v1/2022.emnlp-main.76 (DOI), 001456575700076 (), 978-1-959429-40-1 (ISBN)
Conference
2022 Conference on Empirical Methods in Natural Language Processing, Dec 07-11, 2022, Abu Dhabi, United Arab Emirates
Available from: 2026-05-20 Created: 2026-05-20 Last updated: 2026-05-20. Bibliographically approved
Ye, F., Manotumruksa, J., Zhang, Q., Li, S. & Yilmaz, E. (2021). Slot Self-Attentive Dialogue State Tracking. In: Proceedings of the World Wide Web Conference 2021 (WWW 2021). Paper presented at 30th World Wide Web Conference (WWW), Apr 12-23, 2021, Electronic Network (pp. 1598-1608). Association for Computing Machinery (ACM)
2021 (English). In: Proceedings of the World Wide Web Conference 2021 (WWW 2021), Association for Computing Machinery (ACM), 2021, p. 1598-1608. Conference paper, Published paper (Refereed)
Abstract [en]

An indispensable component in task-oriented dialogue systems is the dialogue state tracker, which keeps track of users' intentions in the course of conversation. The typical approach towards this goal is to fill in multiple pre-defined slots that are essential to complete the task. Although various dialogue state tracking methods have been proposed in recent years, most of them predict the value of each slot separately and fail to consider the correlations among slots. In this paper, we propose a slot self-attention mechanism that can learn the slot correlations automatically. Specifically, a slot-token attention is first utilized to obtain slot-specific features from the dialogue context. Then a stacked slot self-attention is applied on these features to learn the correlations among slots. We conduct comprehensive experiments on two multi-domain task-oriented dialogue datasets, including MultiWOZ 2.0 and MultiWOZ 2.1. The experimental results demonstrate that our approach achieves state-of-the-art performance on both datasets, verifying the necessity and effectiveness of taking slot correlations into consideration.
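
To make the idea of self-attention over slots concrete, the sketch below applies scaled dot-product attention to a small set of slot feature vectors. It is a bare-bones illustration only: it omits the learned query, key, and value projections, the stacking, and the slot-token attention described in the abstract, and the feature values are invented for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def slot_self_attention(slot_features):
    """Scaled dot-product self-attention over slot feature vectors.

    Each slot attends to every slot (including itself); the output for a
    slot is the attention-weighted average of all slot features.
    """
    dim = len(slot_features[0])
    outputs, attention = [], []
    for query in slot_features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in slot_features
        ]
        weights = softmax(scores)
        attention.append(weights)
        outputs.append(
            [sum(w * value[i] for w, value in zip(weights, slot_features))
             for i in range(dim)]
        )
    return outputs, attention


# Hypothetical slot features: the first two slots are similar, the third is not.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, attention = slot_self_attention(features)
for row in attention:
    print([round(w, 3) for w in row])
```

In this toy input the first slot places more attention on the second slot than on the third, because their features are closer. That is the sense in which attention weights can capture correlations among slots rather than predicting each slot in isolation.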

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2021
Keywords
dialogue state tracking, belief tracking, slot self-attention, task-oriented dialogue system
National Category
Communication Systems
Identifiers
urn:nbn:se:uu:diva-470152 (URN), 10.1145/3442381.3449939 (DOI), 000733621801053 (), 978-1-4503-8312-7 (ISBN)
Conference
30th World Wide Web Conference (WWW), Apr 12-23, 2021, Electronic Network
Available from: 2022-03-21 Created: 2022-03-21 Last updated: 2024-01-15. Bibliographically approved
Li, S., Ngai, E., Ye, F. & Voigt, T. PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning.
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Federated Parameter-Efficient Fine-Tuning (FedPEFT) has emerged as a promising paradigm for privacy-preserving and efficient adaptation of Pre-trained Language Models (PLMs) in Federated Learning (FL) settings. It preserves data privacy by keeping the data decentralized and training the model on local devices, ensuring that raw data never leaves the user's device. Moreover, the integration of PEFT methods such as LoRA significantly reduces the number of trainable parameters compared to fine-tuning the entire model, thereby minimizing communication costs and computational overhead. Despite its potential, the security implications of FedPEFT remain underexplored. This paper introduces a novel security threat to FedPEFT, termed PEFT-as-an-Attack (PaaA), which exposes how PEFT methods can be exploited as an attack vector to circumvent PLMs' safety alignment and generate harmful content in response to malicious prompts. Our evaluation of PaaA reveals that with less than 1% of the model's parameters set as trainable, and a small subset of clients acting maliciously, the attack achieves an approximate 80% attack success rate using representative PEFT methods such as LoRA. To mitigate this threat, we further investigate potential defense strategies, including Robust Aggregation Schemes (RASs) and Post-PEFT Safety Alignment (PPSA). However, our empirical analysis highlights the limitations of these defenses, i.e., even the most advanced RASs, such as DnC and ClippedClustering, struggle to defend against PaaA in scenarios with highly heterogeneous data distributions. Similarly, while PPSA can reduce attack success rates to below 10%, it severely degrades the model's accuracy on the target task. Our results underscore the urgent need for more effective defense mechanisms that simultaneously ensure security and maintain the performance advantages of the FedPEFT paradigm.
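
The claim that LoRA trains only a small fraction of the parameters can be checked with simple arithmetic. For a weight matrix of shape d_out x d_in, LoRA freezes the matrix and trains two low-rank factors of shapes d_out x r and r x d_in. The layer size and rank below are hypothetical values chosen for illustration, not figures from the paper.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Trainable parameters for the two LoRA factors.

    B has shape (d_out, rank) and A has shape (rank, d_in); the original
    weight matrix stays frozen.
    """
    return rank * (d_out + d_in)


# Hypothetical projection layer and rank.
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
fraction = lora / full

print(f"full fine-tuning: {full} parameters")
print(f"LoRA (rank {rank}): {lora} parameters")
print(f"trainable fraction: {fraction:.4%}")
```

For this layer the trainable fraction is well under 1%, which is consistent with the regime the abstract describes. It also means each client uploads only the small factors, which is why FedPEFT reduces communication cost, and why a malicious client needs to manipulate very few parameters.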

National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-543432 (URN)
Available from: 2024-11-20 Created: 2024-11-20 Last updated: 2024-11-28
Identifiers
ORCID iD: orcid.org/0000-0003-0145-3127