Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control. ORCID iD: 0000-0002-2294-004X
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Automatic control. ORCID iD: 0000-0002-6698-0166
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Automatic control. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control.
2019 (English). In: Advances in Neural Information Processing Systems 32 (NIPS 2019) / [ed] Wallach, H; Larochelle, H; Beygelzimer, A; d'Alche-Buc, F; Fox, E; Garnett, R. Neural Information Processing Systems, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

A spatial point process can be characterized by an intensity function which predicts the number of events that occur across space. In this paper, we develop a method to infer predictive intensity intervals by learning a spatial model using a regularized criterion. We prove that the proposed method exhibits out-of-sample prediction performance guarantees which, unlike standard estimators, are valid even when the spatial model is misspecified. The method is demonstrated using synthetic as well as real spatial data.

Place, publisher, year, edition, pages
Neural Information Processing Systems, 2019.
Series
Advances in Neural Information Processing Systems, ISSN 1049-5258 ; 32
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:uu:diva-418895
ISI: 000535866903054
OAI: oai:DiVA.org:uu-418895
DiVA, id: diva2:1465207
Conference
33rd Conference on Neural Information Processing Systems (NeurIPS), December 8-14, 2019, Vancouver, Canada
Funder
Swedish Research Council, 2017-04610
Swedish Research Council, 2018-05040
Available from: 2020-09-09 Created: 2020-09-09 Last updated: 2022-04-11
Bibliographically approved
In thesis
1. Machine learning for spatially varying data
2020 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Many physical quantities around us vary across space or space-time. An example of a spatial quantity is the temperature across Sweden on a given day, and an example of a spatio-temporal quantity is the count of coronavirus cases across the globe. Spatial and spatio-temporal data make it possible to answer many important questions. For example, what will the weather be like tomorrow, or where is the risk of disease occurrence highest in the next few days? Answering questions such as these requires formulating and learning statistical models.

One of the challenges with spatial and spatio-temporal data is that the size of the data can be extremely large, which makes learning a model computationally costly. There are several ways of overcoming this problem using matrix manipulations and approximations. In paper I, we propose a solution to this problem where the model is learned in a streaming fashion, i.e., as the data arrives point by point. This also allows for efficient updating of the learned model based on newly arriving data, which is very pertinent to spatio-temporal data.

Another interesting problem in the spatial context is to study the causal effect that an exposure variable has on a response variable. For instance, policy makers might be interested in knowing whether increasing the number of police in a district has the desired effect of reducing crimes there. The challenge here is that of spatial confounding. A spatial map of the number of police against the spatial map of the number of crimes in different districts might show a clear association between these two quantities. However, there might be a third unobserved confounding variable that makes both quantities small and large together. In paper II, we propose a solution for estimating causal effects in the presence of such a confounding variable.

Another common type of spatial data is point or event data, i.e., the occurrence of events across space. The event could, for example, be a reported disease or crime, and one may be interested in predicting the counts of the event in a given region. A fundamental challenge here is to quantify, in a robust manner, the uncertainty in the counts predicted by a model. In paper III, we propose a regularized criterion for learning a predictive model of counts of events across spatial regions. The regularization ensures tighter prediction intervals around the predicted counts, which have valid coverage irrespective of the degree of model misspecification.

Place, publisher, year, edition, pages
Uppsala: Uppsala University, 2020. p. 33
Series
Information technology licentiate theses: Licentiate theses from the Department of Information Technology, ISSN 1404-5117 ; 2020-004
Keywords
Machine learning, spatio-temporal, spatial
National Category
Probability Theory and Statistics; Signal Processing
Research subject
Electrical Engineering with specialization in Signal Processing
Identifiers
urn:nbn:se:uu:diva-429234 (URN)
Presentation
2020-04-22, Uppsala, 13:00 (English)
Available from: 2021-01-04 Created: 2020-12-21 Last updated: 2021-12-17
Bibliographically approved
2. Robust machine learning methods
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

We are surrounded by data in our daily lives. The rent of our houses, the number of electricity units consumed, the prices of different products at a supermarket, the daily temperature, our medicine prescriptions, and our internet search history are all different forms of data. Data can be used in a wide range of applications. For example, one can use data to predict product prices in the future, to predict tomorrow's temperature, to recommend videos, or to suggest better prescriptions. However, in order to do the above, one is required to learn a model from data. A model is a mathematical description of how the phenomenon we are interested in behaves, e.g., how does the temperature vary? Is it periodic? What kinds of patterns does it have? Machine learning is about this process of learning models from data by building on disciplines such as statistics and optimization.

Learning models comes with many different challenges. Some challenges are related to how flexible the model is, some to the size of the data, some to computational efficiency, and so on. One of the challenges is that of data outliers. For instance, due to war in a country, exports could stop and there could be a sudden spike in the prices of different products. This sudden jump in prices is an outlier or corruption relative to the normal situation and must be accounted for when learning the model. Another challenge could be that data is collected in one situation but the model is to be used in another. For example, one might have data on vaccine trials where the participants were mostly old people, but one might want to decide whether or not to use the vaccine for the whole population, which contains people of all age groups. One must account for this difference when learning models, because the conclusions drawn may not be valid for the young people in the population. Yet another challenge could arise when data is collected from different sources or contexts. For example, a shopkeeper might have data on sales of paracetamol when there was flu and when there was no flu, and she might want to decide how much paracetamol to stock for the next month. In this situation, it is difficult to know whether there will be flu next month or not, so deciding how much to stock is a challenge. This thesis tries to address these and other similar challenges.

In paper I, we address the challenge of data corruption, i.e., learning models in a robust way when some fraction of the data is corrupted. In paper II, we apply the methodology of paper I to the problem of localization in wireless networks. Paper III addresses the challenge of estimating the causal effect of an exposure on an outcome variable from spatially collected data (e.g., whether increasing the number of police personnel in an area reduces the number of crimes there). Paper IV addresses the challenge of learning improved decision policies, e.g., which treatment to assign to which patient given past data on treatment assignments. In paper V, we look at the challenge of learning models when data is acquired from different contexts and the future context is unknown. In paper VI, we address the challenge of predicting count data across space, e.g., the number of crimes in an area, and quantifying its uncertainty. In paper VII, we address the challenge of learning models when data points arrive in a streaming fashion, i.e., point by point. The proposed method enables online training and also yields some robustness properties.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2022. p. 50
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2147
Keywords
artificial intelligence, machine learning, risk minimization, data corruption, decision policy, conformal methods, data from contexts, online learning, spice, robust, causal inference, point process, localization, distribution uncertainty, treatment rules, quantile treatment, predicting count data
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Signal Processing; Probability Theory and Statistics
Research subject
Electrical Engineering with specialization in Signal Processing
Identifiers
urn:nbn:se:uu:diva-472453 (URN)
978-91-513-1492-1 (ISBN)
Public defence
2022-06-09, 101195, Ångström, Lägerhyddsvägen 1, Uppsala, 13:00 (English)
Available from: 2022-05-12 Created: 2022-04-11 Last updated: 2022-06-15

Open Access in DiVA

No full text in DiVA

Other links

https://papers.nips.cc/paper/9363-prediction-of-spatial-point-processes-regularized-method-with-out-of-sample-guarantees

Authority records

Osama, Muhammad; Zachariah, Dave; Stoica, Peter
