Uppsala University Publications
Feature Selection using Classification of Unlabeled Data
Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
Manuscript (Other academic)
URN: urn:nbn:se:uu:diva-96788
OAI: oai:DiVA.org:uu-96788
DiVA: diva2:171476
Available from: 2008-02-20 Created: 2008-02-20 Last updated: 2010-01-13. Bibliographically approved
In thesis
1. Fusing Domain Knowledge with Data: Applications in Bioinformatics
2008 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Massively parallel measurement techniques can be used for generating hypotheses about the molecular underpinnings of a biological system. This thesis investigates how domain knowledge can be fused with data from different sources in order to generate more sophisticated hypotheses and improved analyses. We find our applications in the related fields of cell cycle regulation and cancer chemotherapy. In our cell cycle studies we design a detector of periodic expression and use it to generate hypotheses about transcriptional regulation during the course of the cell cycle in synchronized yeast cultures. We also investigate whether domain knowledge about gene function can explain whether a gene is periodically expressed or not. We then generate hypotheses about the regulation of periodic expression that depends on how the cells were perturbed into synchrony. The hypotheses suggest where, and which, transcription factors bind upstream of genes that are regulated by the cell cycle. In our cancer chemotherapy investigations we first study how a method for identifying co-regulated genes associated with chemoresponse to drugs in cell lines is affected by domain knowledge about the genetic relationships between the cell lines. We then turn our attention to problems that arise in microarray-based predictive medicine, where there typically are few samples available for learning the predictor, and study two different means of alleviating the inherent trade-off between allocation of design and test samples. First we investigate whether independent tests on the design data can be used for improving estimates of a predictor's performance without introducing a bias in the estimate. Then, motivated by recent developments in microarray-based predictive medicine, we propose an algorithm that can use unlabeled data for selecting features and consequently improve predictor performance without wasting valuable labeled data.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2008. 55 p.
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 401
Bioinformatics, cell cycle, cancer chemotherapy, predictive tests, performance estimation
urn:nbn:se:uu:diva-8477 (URN)
978-91-554-7094-4 (ISBN)
Public defence
2008-03-13, Fåhraeussalen, Rudbecklaboratoriet, hus C:5, Dag Hammarskjölds väg 20, Uppsala, 09:00 (English)
Available from: 2008-02-20 Created: 2008-02-20 Last updated: 2009-05-12. Bibliographically approved
