Robust machine learning methods
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Automatic control.
ORCID iD: 0000-0002-2294-004X
2022 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

We are surrounded by data in our daily lives. The rent of our houses, the number of electricity units consumed, the prices of different products at a supermarket, the daily temperature, our medicine prescriptions and our internet search history are all different forms of data. Data can be used in a wide range of applications. For example, one can use data to predict future product prices, to predict tomorrow's temperature, to recommend videos, or to suggest better prescriptions. However, to do any of this, one must first learn a model from data. A model is a mathematical description of how the phenomenon we are interested in behaves, e.g. how does the temperature vary? Is it periodic? What kinds of patterns does it have? Machine learning is about this process of learning models from data by building on disciplines such as statistics and optimization.

Learning models comes with many different challenges. Some are related to how flexible the model is, some to the size of the data, and some to computational efficiency. One challenge is that of data outliers. For instance, a war in a country could stop exports and cause a sudden spike in the prices of different products. This sudden jump in prices is an outlier, or corruption, relative to the normal situation and must be accounted for when learning the model. Another challenge arises when data is collected in one situation but the model is to be used in another. For example, one might have data from vaccine trials in which the participants were mostly old people, but want to decide whether or not to use the vaccine for the whole population, which contains people of all age groups. One must account for this difference when learning models, because the conclusions drawn may not be valid for the young people in the population. Yet another challenge arises when data is collected from different sources or contexts. For example, a shopkeeper might have data on sales of paracetamol both when there was flu and when there was no flu, and she might want to decide how much paracetamol to stock for the next month. It is difficult to know whether there will be flu next month or not, so deciding how much to stock is a challenge. This thesis tries to address these and other similar challenges.

In paper I, we address the challenge of data corruption, i.e., learning models in a robust way when some fraction of the data is corrupted. In paper II, we apply the methodology of paper I to the problem of localization in wireless networks. Paper III addresses the challenge of estimating the causal effect of an exposure on an outcome variable from spatially collected data (e.g. whether increasing the number of police personnel in an area reduces the number of crimes there). Paper IV addresses the challenge of learning improved decision policies, e.g. which treatment to assign to which patient, given past data on treatment assignments. In paper V, we look at the challenge of learning models when data is acquired from different contexts and the future context is unknown. In paper VI, we address the challenge of predicting count data across space, e.g. the number of crimes in an area, and of quantifying the uncertainty of the prediction. In paper VII, we address the challenge of learning models when data points arrive in a streaming fashion, i.e., point by point. The proposed method enables online training and also yields some robustness properties.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2022, p. 50
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2147
Keywords [en]
artificial intelligence, machine learning, risk minimization, data corruption, decision policy, conformal methods, data from contexts, online learning, spice, robust, causal inference, point process, localization, distribution uncertainty, treatment rules, quantile treatment, predicting count data
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Signal Processing; Probability Theory and Statistics
Research subject
Electrical Engineering with specialization in Signal Processing
Identifiers
URN: urn:nbn:se:uu:diva-472453
ISBN: 978-91-513-1492-1 (print)
OAI: oai:DiVA.org:uu-472453
DiVA, id: diva2:1651294
Public defence
2022-06-09, 101195, Ångström, Lägerhyddsvägen 1, Uppsala, 13:00 (English)
Opponent
Supervisors
Available from: 2022-05-12 Created: 2022-04-11 Last updated: 2022-06-15
List of papers
1. Robust Risk Minimization for Statistical Learning From Corrupted Data
2020 (English) In: IEEE Open Journal of Signal Processing, E-ISSN 2644-1322, Vol. 1, p. 287-294. Article in journal (Refereed) Published
Abstract [en]

We consider a general statistical learning problem where an unknown fraction of the training data is corrupted. We develop a robust learning method that only requires specifying an upper bound on the corrupted data fraction. The method minimizes a risk function defined by a non-parametric distribution with unknown probability weights. We derive and analyse the optimal weights and show how they provide robustness against corrupted data. Furthermore, we give a computationally efficient coordinate descent algorithm to solve the risk minimization problem. We demonstrate the wide applicability of the method, including regression, classification, unsupervised learning and classic parameter estimation, with state-of-the-art performance.
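
The idea of learning with only an upper bound on the corrupted fraction can be illustrated with a much simpler, swapped-in technique: trimmed least squares, which alternates between fitting a model and discarding the points with the largest losses. This is not the method of the paper (which optimizes probability weights by coordinate descent); the function name `trimmed_least_squares`, the parameter `eps_max` and the synthetic data below are all assumptions made for illustration.

```python
import numpy as np

def trimmed_least_squares(X, y, eps_max, n_iter=50):
    """Alternate between a least-squares fit and dropping the largest losses.

    eps_max is an assumed upper bound on the fraction of corrupted points.
    """
    n = len(y)
    n_keep = n - int(np.floor(eps_max * n))  # number of points treated as clean
    keep = np.arange(n)                      # start by trusting every point
    for _ in range(n_iter):
        # Fit the model on the currently trusted points only.
        theta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        # Re-select the n_keep points with the smallest squared loss.
        losses = (y - X @ theta) ** 2
        new_keep = np.sort(np.argsort(losses)[:n_keep])
        if np.array_equal(new_keep, keep):
            break                            # selection is stable
        keep = new_keep
    return theta, keep

# Hypothetical data: a line with intercept 1 and slope 2, 15% corrupted outcomes.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=n)
y[:30] += 20.0
theta_hat, kept = trimmed_least_squares(X, y, eps_max=0.2)
```

In this sketch the user only supplies `eps_max`; the true corrupted fraction (0.15 here) does not need to be known, as long as it stays below the bound.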

Keywords
Data corruption, Huber contamination model, risk minimization, robustness
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-429036 (URN); 10.1109/OJSP.2020.3039632 (DOI); 000722891600021 (ISI)
Funder
Swedish Research Council, 2017-04610; Swedish Research Council, 2018-05040
Available from: 2020-12-18 Created: 2020-12-18 Last updated: 2024-01-08. Bibliographically approved
2. Robust localization in wireless networks from corrupted signals
2021 (English) In: EURASIP Journal on Advances in Signal Processing, ISSN 1687-6172, E-ISSN 1687-6180, Vol. 2021, no 1, article id 79. Article in journal (Refereed) Published
Abstract [en]

We address the problem of timing-based localization in wireless networks, when an unknown fraction of data is corrupted by non-ideal propagation conditions. While timing-based techniques can enable accurate localization, they are sensitive to corrupted data. We develop a robust method that is applicable to a range of localization techniques, including time-of-arrival, time-difference-of-arrival and time-difference in schedule-based transmissions. The method is distribution-free, is computationally efficient and requires only an upper bound on the fraction of corrupted data, thus obviating distributional assumptions on the corrupting noise. The robustness of the method is demonstrated in numerical experiments.

Place, publisher, year, edition, pages
Springer, 2021
Keywords
Localization, Robustness, Wireless networks, Time-of-arrival, Time-difference-of-arrival
National Category
Signal Processing
Identifiers
urn:nbn:se:uu:diva-456481 (URN); 10.1186/s13634-021-00786-8 (DOI); 000695828100001 (ISI)
Funder
Swedish Research Council, 2016-06079; Swedish Research Council, 2017-04610; Swedish Research Council, 2018-05040
Available from: 2021-10-21 Created: 2021-10-21 Last updated: 2024-01-15. Bibliographically approved
3. Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding
2019 (English) In: Proceedings of the 36th International Conference on Machine Learning, 2019, p. 4942-4950. Conference paper, Published paper (Refereed)
Abstract [en]

We address the problem of inferring the causal effect of an exposure on an outcome across space, using observational data. The data is possibly subject to unmeasured confounding variables which, in a standard approach, must be adjusted for by estimating a nuisance function. Here we develop a method that eliminates the nuisance function, while mitigating the resulting errors-in-variables. The result is a robust and accurate inference method for spatially varying heterogeneous causal effects. The properties of the method are demonstrated on synthetic as well as real data from Germany and the US.

Series
Proceedings of Machine Learning Research, ISSN 2640-3498 ; 97
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-429033 (URN); 000684034305010 (ISI)
Conference
International Conference on Machine Learning (ICML), 9-15 June 2019, Long Beach, California, USA
Funder
Swedish Research Council, 2018-05040; Swedish Foundation for Strategic Research, RIT15-0012; Swedish Research Council, 621-2016-06079
Available from: 2020-12-18 Created: 2020-12-18 Last updated: 2022-06-16. Bibliographically approved
4. Learning Robust Decision Policies from Observational Data
2020 (English) In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020) / [ed] H. Larochelle; M. Ranzato; R. Hadsell; M.F. Balcan; H. Lin, Neural Information Processing Systems, 2020. Conference paper, Published paper (Refereed)
Abstract [en]

We address the problem of learning a decision policy from observational data of past decisions in contexts with features and associated outcomes. The past policy may be unknown and, in safety-critical applications such as medical decision support, it is of interest to learn robust policies that reduce the risk of outcomes with high costs. In this paper, we develop a method for learning policies that reduce tails of the cost distribution at a specified level and, moreover, provide a statistically valid bound on the cost of each decision. These properties are valid under finite samples (even in scenarios with uneven or no overlap between features for different decisions in the observed data) by building on recent results in conformal prediction. The performance and statistical properties of the proposed method are illustrated using both real and synthetic data.
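
The finite-sample bound mentioned above rests on conformal prediction. As a minimal sketch of the basic building block only, and not of the policy-learning method of the paper, the standard split-conformal quantile below returns a value that a new exchangeable cost does not exceed with probability at least 1 - alpha. The function name `conformal_upper_bound` and the toy costs are assumptions made for illustration.

```python
import math
import numpy as np

def conformal_upper_bound(cal_scores, alpha):
    """Split-conformal upper bound at miscoverage level alpha.

    Assumes the calibration scores and the future score are exchangeable.
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))  # rank of the order statistic to use
    if k > n:
        return math.inf                     # too few points for this level
    return float(scores[k - 1])

# Hypothetical past costs 1, 2, ..., 99 and a 25% tail level.
past_costs = np.arange(1, 100)
bound = conformal_upper_bound(past_costs, alpha=0.25)
```

The `(n + 1)` correction is what makes the guarantee hold for any sample size; with too few calibration points the only valid bound is infinite, which the sketch returns explicitly.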

Place, publisher, year, edition, pages
Neural Information Processing Systems, 2020
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-429039 (URN); 001207696401072 (ISI); 9781713829546 (ISBN)
Conference
34th Conference on Neural Information Processing Systems (NeurIPS 2020), 6-12 December, 2020, Online
Funder
Swedish Research Council, 2018-05040; Knut and Alice Wallenberg Foundation; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2020-12-18 Created: 2020-12-18 Last updated: 2024-12-12. Bibliographically approved
5. Robust learning in heterogeneous contexts
(English) Article in journal, Editorial material (Refereed) Submitted
Abstract [en]

We consider the problem of learning decision parameters from data obtained in different contexts. When future context information is inaccessible, we consider the resulting (i) worst-case and (ii) overall out-of-sample performance of the learned parameters. We propose a robust approach that trades off these two performance criteria based on the partial information obtained about the unknown context distribution. The proposed method overcomes the overly conservative nature of the minimax method, while robustifying the empirical risk minimization method in a statistically motivated manner. We illustrate the performance of the method in a classification task.

National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-472095 (URN)
Available from: 2022-04-05 Created: 2022-04-05 Last updated: 2022-07-22. Bibliographically approved
6. Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees
2019 (English) In: Advances in Neural Information Processing Systems 32 (NIPS 2019) / [ed] H. Wallach; H. Larochelle; A. Beygelzimer; F. d'Alche-Buc; E. Fox; R. Garnett, Neural Information Processing Systems, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

A spatial point process can be characterized by an intensity function which predicts the number of events that occur across space. In this paper, we develop a method to infer predictive intensity intervals by learning a spatial model using a regularized criterion. We prove that the proposed method exhibits out-of-sample prediction performance guarantees which, unlike standard estimators, are valid even when the spatial model is misspecified. The method is demonstrated using synthetic as well as real spatial data.

Place, publisher, year, edition, pages
Neural Information Processing Systems, 2019
Series
Advances in Neural Information Processing Systems, ISSN 1049-5258 ; 32
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-418895 (URN); 000535866903054 (ISI)
Conference
33rd Conference on Neural Information Processing Systems (NeurIPS), 8-14 December 2019, Vancouver, Canada
Funder
Swedish Research Council, 2017-04610; Swedish Research Council, 2018-05040
Available from: 2020-09-09 Created: 2020-09-09 Last updated: 2022-04-11. Bibliographically approved
7. Online Learning for Prediction via Covariance Fitting: Computation, Performance and Robustness
2023 (English) In: Transactions on Machine Learning Research, E-ISSN 2835-8856. Article in journal (Refereed) Published
Abstract [en]

We consider the online learning of linear smoother predictors based on a covariance model of the outcomes. To control its degrees of freedom in an appropriate manner, the covariance model parameters are often learned using cross-validation or maximum-likelihood techniques. However, neither technique is suitable when training data arrives in a streaming fashion. Here we consider a covariance-fitting method to learn the model parameters, initially used in spectral estimation. We show that this results in a computationally efficient online learning method in which the resulting predictor can be updated sequentially. We prove that, with high probability, its out-of-sample error approaches the minimum achievable level at root-$n$ rate. Moreover, we show that the resulting predictor enjoys two different robustness properties. First, it minimizes the out-of-sample error with respect to the least favourable distribution within a given Wasserstein distance from the empirical distribution. Second, it is robust against errors in the covariate training data. We illustrate the performance of the proposed method in a numerical experiment.

Place, publisher, year, edition, pages
Transactions on Machine Learning Research, 2023
National Category
Probability Theory and Statistics; Engineering and Technology
Identifiers
urn:nbn:se:uu:diva-472451 (URN)
Available from: 2022-04-11 Created: 2022-04-11 Last updated: 2024-01-08. Bibliographically approved

Open Access in DiVA

UUThesis_M-Osama-2022 (1194 kB), 852 downloads
File information
File name: FULLTEXT01.pdf
File size: 1194 kB
Checksum (SHA-512): 0d0707e4d987d7fe280bf3bcb496654002b76f645f6f276f1290948341a2110b01b95ba6b0e1cbf2e1ac6bb0abaf17c5f20d5522d8c2c7bead917735198a3ca4
Type: fulltext
Mimetype: application/pdf

Authority records

Osama, Muhammad

Total: 853 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
Total: 1174 hits