Publications from Uppsala University
Publications (10 of 67)
Lysenkova, M., Zachariah, D., Krali, O. & Nordlund, J. (2025). Error Reduction in Leukemia Machine Learning Classification With Conformal Prediction. JCO Clinical Cancer Informatics, 9, Article ID e2400324.
2025 (English). In: JCO Clinical Cancer Informatics, E-ISSN 2473-4276, Vol. 9, article id e2400324. Article in journal (Refereed). Published
Abstract [en]

PURPOSE

Recent advances in machine learning have led to the development of classifiers that predict molecular subtypes of acute lymphoblastic leukemia (ALL) using RNA-sequencing (RNA-seq) data. Although these models have shown promising results, they often lack robust performance guarantees. The aim of this study was three-fold: to quantify the uncertainty of these classifiers, to provide prediction sets that control the false-negative rate (FNR), and to perform implicit error reduction by transforming incorrect predictions into uncertain predictions.

METHODS

Conformal prediction (CP) is a distribution-agnostic framework for generating statistically calibrated prediction sets whose size reflects model uncertainty. In this study, we applied an extension called conformal risk control to three RNA-seq ALL subtype classifiers. Leveraging RNA-seq data from 1,227 patient samples taken at diagnosis, we developed a multiclass conformal predictor ALLCoP, which generates statistically guaranteed FNR-controlled prediction sets.

RESULTS

ALLCoP was able to create prediction sets with specified FNR tolerances ranging from 7.5% to 30%. In a validation cohort, ALLCoP successfully reduced the FNR of the ALLIUM RNA-seq ALL subtype classifier from 8.95% to 3.5%. For patients whose subtype was not previously known, the use of ALLCoP was able to reduce the occurrence of empty predictions from 37% to 17%. Notably, up to 34% of the multiple-class prediction sets included the PAX5alt subtype, suggesting that increased prediction set size may reflect secondary aberrations and biological complexity, contributing to classifier uncertainty. Finally, ALLCoP was validated on two additional RNA-seq ALL subtype classifiers, ALLSorts and ALLCatchR.

CONCLUSION

Our results highlight the potential of CP in enhancing the use of oncologic RNA-seq subtyping classifiers and also in uncovering additional molecular aberrations of potential clinical importance.
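As a rough illustration of the prediction-set idea described in the abstract above, the following sketch implements generic split conformal prediction for a multiclass classifier. It is not ALLCoP and it does not implement conformal risk control or FNR control; the function names `calibrate_threshold` and `prediction_sets`, the score `1 - p(true class)`, and the synthetic Dirichlet probabilities are all assumptions made for illustration only.

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Return a score threshold from held-out calibration data (hypothetical helper)."""
    n = len(cal_labels)
    # Nonconformity score: one minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected quantile level, capped at 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_sets(probs, qhat):
    """Keep every class whose score does not exceed the calibrated threshold."""
    return [np.flatnonzero(1.0 - p <= qhat) for p in probs]

# Synthetic stand-in for classifier outputs (assumption: 4 classes, 200 samples).
rng = np.random.default_rng(0)
n_cal, n_classes, alpha = 200, 4, 0.1
cal_probs = rng.dirichlet(np.ones(n_classes), size=n_cal)
cal_labels = np.array([rng.choice(n_classes, p=p) for p in cal_probs])

qhat = calibrate_threshold(cal_probs, cal_labels, alpha)
sets = prediction_sets(cal_probs, qhat)
# Fraction of calibration samples whose true label lies in its prediction set.
coverage = np.mean([y in s for y, s in zip(cal_labels, sets)])
```

Larger sets correspond to samples on which the classifier is less certain, which is the sense in which set size reflects model uncertainty.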

Place, publisher, year, edition, pages
Lippincott Williams & Wilkins, 2025
National Category
Bioinformatics (Computational Biology); Cancer and Oncology; Bioinformatics and Computational Biology
Identifiers
urn:nbn:se:uu:diva-558737 (URN); 10.1200/CCI-24-00324 (DOI); 001495051400001 (); 40435436 (PubMedID); 2-s2.0-105006706341 (Scopus ID)
Funder
Swedish Research Council, 2022-06725
Available from: 2025-06-12 Created: 2025-06-12 Last updated: 2025-06-12. Bibliographically approved
Karakulev, A., Zachariah, D. & Singh, P. (2024). Adaptive Robust Learning using Latent Bernoulli Variables. In: Ruslan Salakhutdinov; Zico Kolter; Katherine Heller; Adrian Weller; Nuria Oliver; Jonathan Scarlett; Felix Berkenkamp (Ed.), Proceedings of the 41st International Conference on Machine Learning. Paper presented at The 41st International Conference on Machine Learning, Vienna, Austria, 21-27 July, 2024 (pp. 23105-23122). PMLR - Proceedings of Machine Learning Research
2024 (English). In: Proceedings of the 41st International Conference on Machine Learning / [ed] Ruslan Salakhutdinov; Zico Kolter; Katherine Heller; Adrian Weller; Nuria Oliver; Jonathan Scarlett; Felix Berkenkamp, PMLR - Proceedings of Machine Learning Research, 2024, p. 23105-23122. Conference paper, Published paper (Refereed)
Abstract [en]

We present an adaptive approach for robust learning from corrupted training sets. We identify corrupted and non-corrupted samples with latent Bernoulli variables and thus formulate the learning problem as maximization of the likelihood where latent variables are marginalized. The resulting problem is solved via variational inference, using an efficient Expectation-Maximization based method. The proposed approach improves over the state-of-the-art by automatically inferring the corruption level, while adding minimal computational overhead. We demonstrate our robust learning method and its parameter-free nature on a wide variety of machine learning tasks including online learning and deep learning where it adapts to different levels of noise and maintains high prediction accuracy.

Place, publisher, year, edition, pages
PMLR - Proceedings of Machine Learning Research, 2024
Series
Proceedings of Machine Learning Research, ISSN 2640-3498 ; 235
Keywords
robustness, statistical learning, adaptive method, probabilistic method, latent variables, variational inference
National Category
Probability Theory and Statistics
Research subject
Machine learning
Identifiers
urn:nbn:se:uu:diva-535364 (URN)
Conference
The 41st International Conference on Machine Learning, Vienna, Austria, 21-27 July, 2024
Projects
eSSENCE - An eScience Collaboration
Funder
Swedish Research Council, 2018-05040; Swedish Research Council, 2023-05593
Available from: 2024-07-27 Created: 2024-07-27 Last updated: 2025-01-07. Bibliographically approved
Brunacci, V., De Angelis, A. & Zachariah, D. (2024). Experimental Characterization of a Robust Localization Method Based on UWB Ranging. In: 2024 IEEE International Instrumentation and Measurement Technology Conference, I2MTC 2024. Paper presented at IEEE International Instrumentation and Measurement Technology Conference (I2MTC), May 20-23, 2024, Glasgow, Scotland (pp. 1-5). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 IEEE International Instrumentation and Measurement Technology Conference, I2MTC 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 1-5. Conference paper, Published paper (Refereed)
Abstract [en]

Robust localization, which is the accurate measurement of the position of a user or device in the presence of obstructions or challenging environments, is a fundamental building block for numerous applications. The standard positioning methods and technologies do not provide satisfactory measurement accuracy in such challenging environments. Therefore, in this paper, a robust localization method is investigated and characterized experimentally. The robust method is capable of estimating the position of a mobile node based on distance measurements with respect to known-position anchors. The method is characterized by means of experiments using the ultra wide band ranging technology. Experimental results in a non line of sight (NLOS) scenario show that the robust localization method may reduce the error by a factor of 4 with respect to the standard method, i.e. the nonlinear least squares. In fact, the robust method results in a median error of approximately 5 cm, whereas the standard method results in a median positioning error of approximately 20 cm.
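The standard baseline named in the abstract above, nonlinear least squares on anchor distances, can be sketched with a few Gauss-Newton iterations. This is only an illustration of that baseline, not the robust method characterized in the paper; the anchor layout, the noise-free ranges, and the function name `nls_position` are assumptions.

```python
import numpy as np

def nls_position(anchors, ranges, x0, iters=50, tol=1e-10):
    """Gauss-Newton estimate of a position from ranges to known anchors (hypothetical helper)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        diff = x - anchors                    # vectors from each anchor to the estimate
        dist = np.linalg.norm(diff, axis=1)   # predicted ranges at the current estimate
        resid = dist - ranges                 # range residuals
        jac = diff / dist[:, None]            # Jacobian of predicted ranges w.r.t. position
        step, *_ = np.linalg.lstsq(jac, -resid, rcond=None)
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x

# Assumed setup: four anchors at the corners of a 10 m square, noise-free ranges.
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
true_pos = np.array([3.0, 4.0])
ranges = np.linalg.norm(anchors - true_pos, axis=1)

est = nls_position(anchors, ranges, x0=anchors.mean(axis=0))
```

With NLOS measurements some ranges are biased upward, and because every residual enters the squared-error criterion, a few such outliers can pull the estimate away from the true position; this is the failure mode a robust method is meant to address.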

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
IEEE Instrumentation and Measurement Technology Conference, ISSN 2642-2069
Keywords
UWB ranging, UWB positioning, Robust localization, Non line of Sight (NLOS), Time of arrival (TOA)
National Category
Signal Processing; Communication Systems
Identifiers
urn:nbn:se:uu:diva-544040 (URN); 10.1109/I2MTC60896.2024.10560560 (DOI); 001261521400021 (); 2-s2.0-85197763228 (Scopus ID); 979-8-3503-8090-3 (ISBN); 979-8-3503-8091-0 (ISBN)
Conference
IEEE International Instrumentation and Measurement Technology Conference (I2MTC), May 20-23, 2024, Glasgow, Scotland
Available from: 2024-11-28 Created: 2024-11-28 Last updated: 2024-11-28. Bibliographically approved
Ek, S. & Zachariah, D. (2024). Externally Valid Policy Evaluation from Randomized Trials Using Additional Observational Data. Paper presented at The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS).
2024 (English). Conference paper, Published paper (Other academic)
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-565528 (URN)
Conference
The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)
Available from: 2025-08-22 Created: 2025-08-22 Last updated: 2025-09-15. Bibliographically approved
Galos, P., Hult, L., Zachariah, D., Lewén, A., Hånell, A., Howells, T., . . . Enblad, P. (2024). Machine Learning Based Prediction of Imminent ICP Insults During Neurocritical Care of Traumatic Brain Injury. Neurocritical Care, 42(2), 387-397
2024 (English). In: Neurocritical Care, ISSN 1541-6933, E-ISSN 1556-0961, Vol. 42, no 2, p. 387-397. Article in journal (Refereed). Published
Abstract [en]

Background

In neurointensive care, increased intracranial pressure (ICP) is a feared secondary brain insult in traumatic brain injury (TBI). A system that predicts ICP insults before they emerge may facilitate early optimization of the physiology, which may in turn lead to stopping the predicted ICP insult from occurring. The aim of this study was to evaluate the performance of different artificial intelligence models in predicting the risk of ICP insults.

Methods

The models were trained to predict risk of ICP insults starting within 30 min, using the Uppsala high frequency TBI dataset. Two versions of the dataset were used: a restricted dataset consisting of only monitoring data, and an unrestricted dataset combining monitoring data with clinical data, demographic data, and radiological evaluations. Four different model classes were compared: Gaussian process regression, logistic regression, random forest classifier, and Extreme Gradient Boosted decision trees (XGBoost).

Results

Six hundred and two patients with TBI were included (total monitoring 138,411 h). On the task of predicting upcoming ICP insults, the Gaussian process regression model performed similarly on the Uppsala high frequency TBI dataset (sensitivity 93.2%, specificity 93.9%, area under the receiver operating characteristic curve [AUROC] 98.3%), as in earlier smaller studies. Using a more flexible model (XGBoost) resulted in a comparable performance (sensitivity 93.8%, specificity 94.6%, AUROC 98.7%). Adding more clinical variables and features further improved the performance of the models slightly (XGBoost: sensitivity 94.1%, specificity of 94.6%, AUROC 98.8%).

Conclusions

Artificial intelligence models have potential to become valuable tools for predicting ICP insults in advance during neurointensive care. The fact that common off-the-shelf models, such as XGBoost, performed well in predicting ICP insults opens new possibilities that can lead to faster advances in the field and earlier clinical implementations.

Place, publisher, year, edition, pages
Springer, 2024
Keywords
TBI, AI, Machine learning, Intracranial hypertension, Critical care
National Category
Signal Processing; Neurology
Identifiers
urn:nbn:se:uu:diva-533622 (URN); 10.1007/s12028-024-02119-7 (DOI); 001320211100001 (); 39322847 (PubMedID); 2-s2.0-85205051256 (Scopus ID)
Funder
Swedish Research Council, 2022-06725; Swedish Research Council, 2018-05973; Kjell and Marta Beijer Foundation; Swedish National Infrastructure for Computing (SNIC); National Academic Infrastructure for Supercomputing in Sweden (NAISS); Uppsala University; Region Uppsala
Note

The first two authors share first authorship.

Available from: 2024-06-27 Created: 2024-06-27 Last updated: 2025-06-25. Bibliographically approved
Zhang, R., Mattsson, P. & Zachariah, D. (2024). Safe Output Feedback Improvement with Baselines. In: 2024 IEEE 63rd Conference on Decision and Control (CDC). Paper presented at The 63rd IEEE Conference on Decision and Control, 16-19 December, 2024, Milan, Italy (pp. 1899-1904). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 IEEE 63rd Conference on Decision and Control (CDC), Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 1899-1904. Conference paper, Published paper (Refereed)
Abstract [en]

In data-driven control design, an important problem is to deal with uncertainty due to limited and noisy data. One way to do this is to use a min-max approach, which aims to minimize some design criteria for the worst-case scenario. However, a strategy based on this approach can lead to overly conservative controllers. To overcome this issue, we apply the idea of baseline regret, and it is seen that minimizing the baseline regret under model uncertainty can guarantee safe controller improvement with less conservatism and variance in the resulting controllers. To exemplify the use of baseline controllers, we focus on the output feedback setting and propose a two-step control design method; first, an uncertainty set is constructed by a data-driven system identification approach based on finite impulse response models; then a control design criterion based on model reference control is used. To solve the baseline regret optimization problem efficiently, we use a convex approximation of the criterion and apply the scenario approach in optimization. The numerical examples show that the inclusion of baseline regret indeed improves the performance and reduces the variance of the resulting controller.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Control Engineering
Identifiers
urn:nbn:se:uu:diva-565075 (URN); 10.1109/CDC56724.2024.10886290 (DOI); 001445827201100 (); 2-s2.0-86000617184 (Scopus ID); 979-8-3503-1633-9 (ISBN); 979-8-3503-1632-2 (ISBN); 979-8-3503-1634-6 (ISBN)
Conference
The 63rd IEEE Conference on Decision and Control, 16-19 December, 2024, Milan, Italy
Available from: 2025-08-14 Created: 2025-08-14 Last updated: 2025-10-14. Bibliographically approved
Mattsson, P., Zachariah, D. & Stoica, P. (2023). Analysis of the Minimum-Norm Least-Squares Estimator and Its Double-Descent Behavior [Lecture Notes]. IEEE signal processing magazine (Print), 40(3), 39-75
2023 (English). In: IEEE signal processing magazine (Print), ISSN 1053-5888, E-ISSN 1558-0792, Vol. 40, no 3, p. 39-75. Article in journal (Refereed). Published
Abstract [en]

Linear regression models have a wide range of applications in statistics, signal processing, and machine learning. In this Lecture Notes column we will examine the performance of the least-squares (LS) estimator with a focus on the case when there are more parameters than training samples, which is often overlooked in textbooks on estimation.
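For readers unfamiliar with the overparameterized case mentioned in the abstract above, the following sketch computes a minimum-norm least-squares solution with the Moore-Penrose pseudoinverse. The problem sizes and random data are assumptions chosen for illustration, and the sketch is not taken from the article itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_params = 20, 50           # more parameters than training samples
X = rng.standard_normal((n_samples, n_params))
y = rng.standard_normal(n_samples)

# Minimum-norm LS estimate: among all parameter vectors that fit the data
# equally well, the pseudoinverse picks the one with the smallest Euclidean norm.
theta_mn = np.linalg.pinv(X) @ y

# Build another interpolating solution by adding a null-space component of X.
z = rng.standard_normal(n_params)
null_component = z - np.linalg.pinv(X) @ (X @ z)
theta_other = theta_mn + null_component

fit_error_mn = np.linalg.norm(X @ theta_mn - y)
fit_error_other = np.linalg.norm(X @ theta_other - y)
```

Both vectors fit the training data essentially exactly, since the system is underdetermined, but `theta_mn` has the smaller norm; it is the behavior of this particular estimator as the number of parameters grows that the double-descent analysis concerns.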

Place, publisher, year, edition, pages
IEEE, 2023
Keywords
Least squares methods, Linear regression, Estimation, Machine learning, Signal processing, Behavioral sciences
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-504033 (URN); 10.1109/MSP.2023.3242083 (DOI); 000981974000005 ()
Available from: 2023-06-28 Created: 2023-06-28 Last updated: 2023-06-28. Bibliographically approved
Hult, L., Zachariah, D. & Stoica, P. (2023). Diagnostic Tool for Out-of-Sample Model Evaluation. Transactions on Machine Learning Research (10)
2023 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, no 10. Article in journal (Refereed). Published
Abstract [en]

Assessment of model fitness is a key part of machine learning. The standard paradigm of model evaluation is analysis of the average loss over future data. This is often explicit in model fitting, where we select models that minimize the average loss over training data as a surrogate, but comes with limited theoretical guarantees. In this paper, we consider the problem of characterizing a batch of out-of-sample losses of a model using a calibration dataset. We provide finite-sample limits on the out-of-sample losses that are statistically valid under quite general conditions and propose a diagnostic tool that is simple to compute and interpret. Several numerical experiments are presented to show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyperparameter tuning.

Place, publisher, year, edition, pages
OpenReview, 2023
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-533345 (URN)
Funder
Swedish Research Council
Available from: 2024-06-25 Created: 2024-06-25 Last updated: 2024-06-27. Bibliographically approved
Wang, Z., Stoica, P., Zachariah, D., Babu, P. & Yang, Z. (2023). Min-Max Probe Placement and Extended Relaxation Estimation Method for Processing Blade Tip Timing Signals. IEEE Transactions on Instrumentation and Measurement, 72, Article ID 3535509.
2023 (English). In: IEEE Transactions on Instrumentation and Measurement, ISSN 0018-9456, E-ISSN 1557-9662, Vol. 72, article id 3535509. Article in journal (Refereed). Published
Abstract [en]

Measuring blade displacement using blade tip timing (BTT) enables nonintrusive monitoring of rotating blades and their vibration frequencies. The average sampling frequency of BTT is the product of the number of measurement probes and rotational frequency, which is usually far less than the blade natural frequency due to the limited number of probes. The pattern of the aliasing that arises from under-sampling is rather complex under uneven probe placement. In this article, we consider a probe placement design that is based on minimizing the maximum sidelobe level of the spectral window to suppress the aliasing frequencies in the spectrum. Based on a signal model containing both asynchronous and synchronous sinusoids, we then develop an extended version of the RELAX method (ERELAX) to estimate their parameters simultaneously. Model order selection rules are also used to determine the number of asynchronous sinusoids. The frequency ambiguity that arises from periodic nonuniform sampling (PNS) is also discussed based on the convolution in the frequency domain. Numerical simulations and results of a curved-blade experiment show that the proposed method has a mean squared estimation error less than 25% of that of two state-of-the-art methods (Block-OMP and MUSIC), requires 40% of the data length needed by the latter methods to achieve the same estimation accuracy, and has the smallest standard deviation of the reconstruction errors. Simulation codes are available at https://github.com/superjdg/RELAX_BTT.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
Probes, Blades, Vibrations, Vibration measurement, Frequency synchronization, Time-frequency analysis, Frequency estimation, Blade tip timing (BTT), frequency ambiguity, min-max placement, model order selection, relax
National Category
Signal Processing
Identifiers
urn:nbn:se:uu:diva-516652 (URN); 10.1109/TIM.2023.3324671 (DOI); 001093394400004 ()
Funder
Swedish Research Council, 2017-04610; Swedish Research Council, 2016-06079; Swedish Research Council, 2021-05022
Available from: 2023-11-28 Created: 2023-11-28 Last updated: 2023-11-28. Bibliographically approved
Ek, S., Zachariah, D., Johansson, F. D. & Stoica, P. (2023). Off-Policy Evaluation with Out-of-Sample Guarantees. Transactions on Machine Learning Research (06/2023)
2023 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, no 06/2023. Article in journal (Refereed). Published
Abstract [en]

We consider the problem of evaluating the performance of a decision policy using past observational data. The outcome of a policy is measured in terms of a loss (a.k.a. disutility or negative reward) and the main problem is making valid inferences about its out-of-sample loss when the past data was observed under a different and possibly unknown policy. Using a sample-splitting method, we show that it is possible to draw such inferences with finite-sample coverage guarantees about the entire loss distribution, rather than just its mean. Importantly, the method takes into account model misspecifications of the past policy, including unmeasured confounding. The evaluation method can be used to certify the performance of a policy using observational data under a specified range of credible model assumptions.

National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:uu:diva-519244 (URN)
Available from: 2024-01-04 Created: 2024-01-04 Last updated: 2025-08-22. Bibliographically approved
Projects
Counterfactual Prediction Methods for Heterogeneous Populations [2018-05040_VR]; Uppsala University; Publications
Osama, M., Zachariah, D. & Stoica, P. (2020). Learning Robust Decision Policies from Observational Data. In: H. Larochelle; M. Ranzato; R. Hadsell; M.F. Balcan; H. Lin (Ed.), Advances in Neural Information Processing Systems 33 (NeurIPS 2020). Paper presented at 34th Conference on Neural Information Processing Systems (NeurIPS 2020), 6-12 December, 2020, Online. Neural Information Processing Systems
Robust learning methods for out-of-distribution tasks [2021-05022_VR]; Uppsala University; Publications
Tang, B., Li, D., Wu, W., Saini, A., Babu, P. & Stoica, P. (2025). Dual-Function Beamforming Design for Multi-Target Localization and Reliable Communications. IEEE Transactions on Signal Processing, 73, 559-573
Saini, A., Stoica, P. & Babu, P. (2025). Maximum Likelihood Method for Received Signal Strength-Based Source Localization. IEEE Transactions on Aerospace and Electronic Systems, 61(4), 10889-10895
Trustworthy Bandit Algorithms for Precision Medicine [2024-03903_VR]; Uppsala University
Identifiers
ORCID iD: orcid.org/0000-0002-6698-0166