Uppsala University Publications (uu.se)
Gauraha, Niharika
Publications (6 of 6)
Spjuth, O., Brännström, R. C., Carlsson, L. & Gauraha, N. (2019). Combining Prediction Intervals on Multi-Source Non-Disclosed Regression Datasets. In: Proceedings of the Eighth Symposium on Conformal and Probabilistic Prediction and Applications. Paper presented at Conformal and Probabilistic Prediction and Applications (pp. 53-65). PMLR, 105
Combining Prediction Intervals on Multi-Source Non-Disclosed Regression Datasets
2019 (English). In: Proceedings of the Eighth Symposium on Conformal and Probabilistic Prediction and Applications, PMLR, 2019, Vol. 105, p. 53-65. Conference paper, Published paper (Refereed)
Abstract [en]

Conformal Prediction is a framework that produces prediction intervals based on the output from a machine learning algorithm. In this paper we explore the case when training data is made up of multiple parts available in different sources that cannot be pooled. We here consider the regression case and propose a method where a conformal predictor is trained on each data source independently, and where the prediction intervals are then combined into a single interval. We call the approach Non-Disclosed Conformal Prediction (NDCP), and we evaluate it on a regression dataset from the UCI machine learning repository using support vector regression as the underlying machine learning algorithm, with a varying number of data sources and sizes. The results show that the proposed method produces conservatively valid prediction intervals, and while we cannot retain the same efficiency as when all data is used, efficiency is improved through the proposed approach as compared to predicting using a single arbitrarily chosen source.
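The pipeline described in the abstract can be sketched in code: each source trains its own inductive conformal regressor, and the per-source intervals are merged into one. This is an illustrative sketch only; the median-of-bounds combination rule and the default SVR settings are assumptions for exposition, not the paper's exact procedure.

```python
import numpy as np
from sklearn.svm import SVR

def conformal_interval(X_train, y_train, X_cal, y_cal, x_new, alpha=0.1):
    """Inductive conformal regressor: absolute residuals as nonconformity."""
    model = SVR().fit(X_train, y_train)
    scores = np.abs(y_cal - model.predict(X_cal))   # calibration residuals
    n = len(scores)
    # conservative finite-sample quantile at level 1 - alpha
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    pred = model.predict(np.asarray(x_new).reshape(1, -1))[0]
    return pred - q, pred + q

def ndcp_combine(intervals):
    """Illustrative combination rule: median of the per-source bounds."""
    lows, highs = zip(*intervals)
    return float(np.median(lows)), float(np.median(highs))
```

Each source keeps its data private: only the interval endpoints leave the source, which is what makes the setting "non-disclosed".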

Place, publisher, year, edition, pages
PMLR, 2019
Series
Proceedings of Machine Learning Research, ISSN 2640-3498
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:uu:diva-400588 (URN)
Conference
Conformal and Probabilistic Prediction and Applications
Funder
Swedish Foundation for Strategic Research, HASTE
Available from: 2019-12-27 Created: 2019-12-27 Last updated: 2019-12-27
Gauraha, N., Söderdahl, F. & Spjuth, O. (2019). Split knowledge transfer in learning under privileged information framework. In: Proceedings of the Eighth Symposium on Conformal and Probabilistic Prediction and Applications. Paper presented at Conformal and Probabilistic Prediction and Applications (pp. 43-52). PMLR, 105
Split knowledge transfer in learning under privileged information framework
2019 (English). In: Proceedings of the Eighth Symposium on Conformal and Probabilistic Prediction and Applications, PMLR, 2019, Vol. 105, p. 43-52. Conference paper, Published paper (Refereed)
Abstract [en]

Learning Under Privileged Information (LUPI) enables the inclusion of additional (privileged) information when training machine learning models; data that is not available when making predictions. The methodology has been successfully applied to a diverse set of problems from various fields. SVM+ was the first realization of the LUPI paradigm, which showed fast convergence but did not scale well. To address the scalability issue, knowledge transfer approaches were proposed to estimate privileged information from standard features in order to construct improved decision rules. Most available knowledge transfer methods use regression techniques and the same data for approximating the privileged features as for learning the transfer function. Inspired by the cross-validation approach, we propose to partition the training data into $K$ folds and use each fold for learning a transfer function and the remaining folds for approximations of privileged features; we refer to this as split knowledge transfer. We evaluate the method using four different experimental setups comprising one synthetic and three real datasets. The results indicate that our approach leads to improved accuracy as compared to LUPI with standard knowledge transfer.
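The $K$-fold scheme described above can be sketched as follows. The choice of `Ridge` as the transfer function and a single one-dimensional privileged feature are illustrative assumptions, not the paper's setup: the point is only that each sample's privileged value is approximated exclusively by transfer functions trained on folds that did not contain it.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def split_knowledge_transfer(X, x_star, K=5):
    """Out-of-fold approximation of a privileged feature x_star.

    Each single fold learns a transfer function from standard features to
    the privileged feature; a sample's privileged value is then approximated
    by averaging the K-1 transfer functions that never saw that sample.
    """
    n = len(X)
    approx = np.zeros(n)
    counts = np.zeros(n)
    kf = KFold(n_splits=K, shuffle=True, random_state=0)
    # KFold yields (K-1 folds, 1 fold); here the single fold does the learning
    for rest_idx, fold_idx in kf.split(X):
        transfer = Ridge().fit(X[fold_idx], x_star[fold_idx])
        approx[rest_idx] += transfer.predict(X[rest_idx])
        counts[rest_idx] += 1
    return approx / counts
```

The approximated feature can then be fed into any LUPI-style learner in place of the true privileged information at training time.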

Place, publisher, year, edition, pages
PMLR, 2019
Series
Proceedings of Machine Learning Research, ISSN 2640-3498
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:uu:diva-400587 (URN)
Conference
Conformal and Probabilistic Prediction and Applications
Funder
Swedish Foundation for Strategic Research, HASTE
Available from: 2019-12-27 Created: 2019-12-27 Last updated: 2019-12-27
Gauraha, N. (2018). Introduction to the LASSO: A Convex Optimization Approach for High-dimensional Problems. Resonance, 23(4), 439-464
Introduction to the LASSO: A Convex Optimization Approach for High-dimensional Problems
2018 (English). In: Resonance, ISSN 0971-8044, Vol. 23, no. 4, p. 439-464. Article in journal (Refereed), Published
Abstract [en]

The term ‘high-dimensional’ refers to the case where the number of unknown parameters to be estimated, p, is of much larger order than the number of observations, n, that is, p ≫ n. Since traditional statistical methods assume many observations and few unknown variables, they cannot cope with situations where p ≫ n. In this article, we study a statistical method, the ‘Least Absolute Shrinkage and Selection Operator’ (LASSO), that has received much attention in solving high-dimensional problems. In particular, we consider the LASSO for high-dimensional linear regression models. We aim to introduce the LASSO method as a constrained quadratic programming problem, and we discuss the convex optimization based approach to solving the LASSO problem. We also illustrate applications of the LASSO method using simulated and real data examples.
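A small simulation in the spirit of the article's examples (all parameter choices here are hypothetical) shows the LASSO recovering a sparse model when p ≫ n, using the Lagrangian form of the constrained quadratic program.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 50, 200                       # p >> n: the high-dimensional regime
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 3.0                       # only 5 of 200 coefficients are nonzero
y = X @ beta + rng.normal(scale=0.5, size=n)

# Lagrangian form of the constrained problem min ||y - Xb||^2 s.t. ||b||_1 <= t:
#   min (1 / 2n) * ||y - Xb||_2^2 + alpha * ||b||_1
fit = Lasso(alpha=0.3, max_iter=10000).fit(X, y)
selected = np.flatnonzero(fit.coef_)  # indices the LASSO keeps
```

Ordinary least squares is not even well defined here (the design matrix has rank at most n < p); the L1 penalty both regularizes the fit and performs variable selection by driving most coefficients exactly to zero.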

Place, publisher, year, edition, pages
Indian Academy of Sciences, 2018
Keywords
LASSO, high-dimensional statistics, regularized regression, least squares regression, variable selection
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-357327 (URN)
10.1007/s12045-018-0635-x (DOI)
000430459600005 ()
Available from: 2018-08-16 Created: 2018-08-16 Last updated: 2018-08-16. Bibliographically approved
Gauraha, N., Söderdahl, F. & Spjuth, O. Robust Knowledge Transfer in Learning Under Privileged Information Framework.
Robust Knowledge Transfer in Learning Under Privileged Information Framework
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Learning Under Privileged Information (LUPI) enables the inclusion of additional (privileged) information when training machine learning models; data that is not available when making predictions. The methodology has been successfully applied to a diverse set of problems from various fields. SVM+ was the first realization of the LUPI paradigm, which showed fast convergence but did not scale well. To address the scalability issue, knowledge transfer approaches were proposed to estimate privileged information from standard features in order to construct improved decision rules. Most available knowledge transfer methods use regression techniques and the same data for approximating the privileged features as for learning the transfer function. Inspired by the cross-validation approach, we propose to partition the training data into K folds and use each fold for learning a transfer function and the remaining folds for approximations of privileged features; we refer to this as robust knowledge transfer. We conduct an empirical evaluation considering four different experimental setups using one synthetic and three real datasets. These experiments demonstrate that our approach yields improved accuracy as compared to LUPI with standard knowledge transfer.

Keywords
Knowledge Transfer, Machine Learning, LUPI, Privileged Information
National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
urn:nbn:se:uu:diva-383240 (URN)
Available from: 2019-05-10 Created: 2019-05-10 Last updated: 2019-05-15. Bibliographically approved
Gauraha, N. & Spjuth, O. Synergy Conformal Prediction.
Synergy Conformal Prediction
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Conformal Prediction is a machine learning methodology that produces valid prediction regions under mild conditions. Ensembles of conformal predictors have been proposed to improve the informational efficiency of inductive conformal predictors by combining p-values; however, the validity of such methods has been an open problem. We introduce Synergy Conformal Prediction, an ensemble method that combines monotonic conformity scores and is capable of producing valid prediction intervals. We study its applicability in two scenarios: one where data is partitioned in order to reduce the total model training time, and one where an ensemble of different machine learning methods is used to improve the overall efficiency of predictions. We evaluate the method on 10 data sets and show that the synergy conformal predictor produces valid predictions and improves informational efficiency as compared to inductive conformal prediction and existing ensemble methods. The results indicate that synergy conformal prediction has advantageous properties compared to contemporary approaches, and we envision that it will also have an impact in Big Data and federated environments.

Keywords
Conformal Prediction, Machine Learning, Synergy Conformal Prediction, Big Data, Federated Learning, Conformal Predictor Ensembles
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-360504 (URN)
Available from: 2018-09-13 Created: 2018-09-13 Last updated: 2018-09-14. Bibliographically approved
Gauraha, N. & Spjuth, O. Synergy Conformal Prediction for Regression.
Synergy Conformal Prediction for Regression
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Large and distributed data sets pose many challenges for machine learning, including requirements on computational resources and training time. One approach is to train multiple models in parallel on subsets of data and aggregate the resulting predictions. Large data sets can then be partitioned into smaller chunks, and for distributed data the need for pooling can be avoided. Combining results from conformal predictors using synergy rules has been shown to have advantageous properties for classification problems. In this paper we extend the methodology to regression problems, and we show that it produces valid and efficient predictors compared to inductive conformal predictors and cross-conformal predictors for 10 different data sets from the UCI machine learning repository using three different machine learning methods. The approach offers a straightforward and compelling alternative to pooling data, such as when working in distributed environments.
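The regression variant described above can be sketched as follows. This is an assumed, simplified illustration: one model is trained per data partition, conformity is scored against the combined prediction on a shared calibration set, and absolute residuals serve as the monotonic conformity score. The exact synergy rule of the manuscript may differ; `Ridge` is only a stand-in for the underlying learner.

```python
import numpy as np
from sklearn.linear_model import Ridge

def synergy_interval(models, X_cal, y_cal, x_new, alpha=0.1):
    """Prediction interval from partition-trained models.

    Conformity scores (absolute residuals against the averaged prediction,
    a monotonic score) are combined across models and calibrated once on a
    shared calibration set.
    """
    preds_cal = np.mean([m.predict(X_cal) for m in models], axis=0)
    cal_scores = np.abs(y_cal - preds_cal)
    n = len(cal_scores)
    # conservative finite-sample quantile at level 1 - alpha
    q = np.quantile(cal_scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    pred = np.mean([m.predict(np.asarray(x_new).reshape(1, -1))[0]
                    for m in models])
    return pred - q, pred + q
```

Because the partitions are trained independently, this fits naturally with distributed data: only model predictions, not the raw data, need to be pooled.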

Keywords
Conformal Prediction, Machine Learning, Regression, Synergy, Ensemble Methods
National Category
Engineering and Technology
Research subject
Computing Science
Identifiers
urn:nbn:se:uu:diva-377134 (URN)
Available from: 2019-02-14 Created: 2019-02-14 Last updated: 2019-02-14. Bibliographically approved