Uppsala University Publications (uu.se)
Improving Bayesian credibility intervals for classifier error rates using maximum entropy empirical priors
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology. (Cancer Pharmacology and Informatics/Rolf Larsson)
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences. (Cancer Pharmacology and Informatics)
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences. (Cancer Pharmacology and Informatics/Rolf Larsson)
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
2010 (English) In: Artificial Intelligence in Medicine, ISSN 0933-3657, E-ISSN 1873-2860, Vol. 49, no 2, 93-104 p. Article in journal (Refereed) Published
Abstract [en]

Objective:

Successful use of classifiers that learn to make decisions from a set of patient examples requires robust methods for performance estimation. Many promising approaches for determining an upper bound for the error rate of a single classifier have recently been reported, but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI becomes unacceptably large in real-world applications where the test set sizes are less than a few hundred. The source of this problem is the fact that the CI is determined exclusively by the result on the test examples. In other words, no information at all is provided by the uniform prior density distribution employed, which reflects a complete lack of prior knowledge about the unknown error rate. Therefore, the aim of the study reported here was to investigate a maximum entropy (ME) based approach to improved prior knowledge and Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice.
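
To make the dependence on test set size concrete, here is a minimal Python sketch of a conventional holdout CI of the kind described above. It assumes the standard Beta-binomial setup: with a uniform prior and k errors on n test examples, the posterior for the error rate is Beta(k + 1, n - k + 1). The function names, the equal-tailed 95% interval, and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def posterior_cdf(x, errors, n):
    """CDF at x of Beta(errors + 1, n - errors + 1), the posterior of the
    error rate under a uniform prior.

    For integer parameters this equals P(Binomial(n + 1, x) >= errors + 1),
    so it can be evaluated with the standard library only.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    m = n + 1
    log_x, log_1mx = math.log(x), math.log1p(-x)
    lower_tail = 0.0
    for j in range(errors + 1):
        # log of C(m, j) * x^j * (1 - x)^(m - j), computed in log space
        log_term = (math.lgamma(m + 1) - math.lgamma(j + 1)
                    - math.lgamma(m - j + 1)
                    + j * log_x + (m - j) * log_1mx)
        lower_tail += math.exp(log_term)
    return max(0.0, 1.0 - lower_tail)


def quantile(p, errors, n, tol=1e-10):
    """Invert the posterior CDF by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, errors, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def credibility_interval(errors, n, level=0.95):
    """Equal-tailed Bayesian credibility interval for the error rate."""
    tail = (1.0 - level) / 2.0
    return quantile(tail, errors, n), quantile(1.0 - tail, errors, n)


# Hypothetical holdout results, all with a 10% observed error rate.
for errors, n in [(5, 50), (20, 200), (50, 500)]:
    low, high = credibility_interval(errors, n)
    print(f"n={n:4d}  errors={errors:3d}  95% CI = [{low:.3f}, {high:.3f}]")
```

Running the loop shows the interval narrowing as n grows, which is the small-test-set problem the abstract refers to.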

Method and material:

It is demonstrated how a refined non-uniform prior density distribution can be obtained by means of the ME principle using empirical results from a few designs and tests using non-overlapping sets of examples.
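
The abstract does not state which constraints the authors impose when deriving the ME prior, so the following Python sketch is only an illustration under a stated assumption: the prior is the maximum entropy distribution on a grid over [0, 1] subject to a single mean constraint, with the mean taken from a few auxiliary error estimates. Under that constraint the ME weights are proportional to exp(-lam * e). All names and numbers below are hypothetical.

```python
import math


def me_prior_on_grid(target_mean, grid):
    """Discrete maximum entropy prior on `grid` with a given mean.

    With only a mean constraint, the ME weights are proportional to
    exp(-lam * e); lam is found by bisection (the mean decreases in lam).
    Assumes target_mean lies well inside (0, 1).
    """
    def weights(lam):
        w = [math.exp(-lam * e) for e in grid]
        total = sum(w)
        return [v / total for v in w]

    lo, hi = -200.0, 200.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        mean = sum(e * w for e, w in zip(grid, weights(mid)))
        if mean > target_mean:
            lo = mid
        else:
            hi = mid
    return weights(0.5 * (lo + hi))


def posterior_interval(prior, grid, errors, n, level=0.95):
    """Equal-tailed credibility interval from a grid posterior
    (prior times binomial likelihood)."""
    log_post = [math.log(p) + errors * math.log(e)
                + (n - errors) * math.log1p(-e)
                for p, e in zip(prior, grid)]
    shift = max(log_post)  # avoid underflow before normalising
    post = [math.exp(v - shift) for v in log_post]
    total = sum(post)
    tail = (1.0 - level) / 2.0
    cum, lower, upper = 0.0, None, None
    for e, w in zip(grid, post):
        cum += w / total
        if lower is None and cum >= tail:
            lower = e
        if upper is None and cum >= 1.0 - tail:
            upper = e
            break
    return lower, upper


# Midpoint grid over (0, 1); avoids log(0) at the endpoints.
N = 2000
grid = [(i + 0.5) / N for i in range(N)]

# Hypothetical error rates from a few design/test runs on
# non-overlapping example sets (illustrative numbers only).
pilot_error_rates = [0.12, 0.08, 0.10]
prior_mean = sum(pilot_error_rates) / len(pilot_error_rates)

uniform_prior = [1.0 / N] * N
me_prior = me_prior_on_grid(prior_mean, grid)

# Hypothetical final holdout test: 3 errors on 30 examples.
ci_uniform = posterior_interval(uniform_prior, grid, errors=3, n=30)
ci_me = posterior_interval(me_prior, grid, errors=3, n=30)
print("uniform prior:", ci_uniform)
print("ME prior:     ", ci_me)
```

In this toy setting the informative prior concentrates the posterior and yields a narrower interval than the uniform prior; whether that holds in practice depends on how well the auxiliary estimates reflect the true error rate.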

Results:

Experimental results show that ME based priors improve the CIs when applied to four quite different simulated and two real-world data sets.

Conclusions:

An empirically derived ME prior seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier.

Place, publisher, year, edition, pages
2010. Vol. 49, no 2, 93-104 p.
National Category
Medical and Health Sciences; Computer and Information Science
Identifiers
URN: urn:nbn:se:uu:diva-96787; DOI: 10.1016/j.artmed.2010.02.004; ISI: 000279172200003; OAI: oai:DiVA.org:uu-96787; DiVA: diva2:171475
Available from: 2008-02-20 Created: 2008-02-20 Last updated: 2015-02-11
In thesis
1. Fusing Domain Knowledge with Data: Applications in Bioinformatics
2008 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Massively parallel measurement techniques can be used for generating hypotheses about the molecular underpinnings of biological systems. This thesis investigates how domain knowledge can be fused with data from different sources in order to generate more sophisticated hypotheses and improved analyses. We find our applications in the related fields of cell cycle regulation and cancer chemotherapy. In our cell cycle studies we design a detector of periodic expression and use it to generate hypotheses about transcriptional regulation during the course of the cell cycle in synchronized yeast cultures, and we investigate whether domain knowledge about gene function can explain whether a gene is periodically expressed or not. We then generate hypotheses that suggest how periodic expression that depends on how the cells were perturbed into synchrony is regulated. The hypotheses suggest where and which transcription factors bind upstream of genes that are regulated by the cell cycle. In our cancer chemotherapy investigations we first study how a method for identifying co-regulated genes associated with chemoresponse to drugs in cell lines is affected by domain knowledge about the genetic relationships between the cell lines. We then turn our attention to problems that arise in microarray-based predictive medicine, where there are typically few samples available for learning the predictor, and study two different means of alleviating the inherent trade-off between allocation of design and test samples. First we investigate whether independent tests on the design data can be used for improving estimates of a predictor's performance without introducing a bias in the estimate. Then, motivated by recent developments in microarray-based predictive medicine, we propose an algorithm that can use unlabeled data for selecting features and consequently improve predictor performance without wasting valuable labeled data.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2008. 55 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 401
Keyword
Bioinformatics, cell cycle, cancer chemotherapy, predictive tests, performance estimation, bioinformatics, Bioinformatik
Identifiers
urn:nbn:se:uu:diva-8477 (URN); 978-91-554-7094-4 (ISBN)
Public defence
2008-03-13, Fåhraeussalen, Rudbecklaboratoriet, hus C:5, Dag Hammarskjölds väg 20, Uppsala, 09:00 (English)
Available from: 2008-02-20 Created: 2008-02-20 Last updated: 2009-05-12 Bibliographically approved

Authority records

Göransson, Hanna; Fryknäs, Mårten; Isaksson, Anders
