Avoiding pitfalls in L-1-regularised inference of gene networks
2015 (English). In: Molecular Biosystems, ISSN 1742-206X, Vol. 11, no 1, p. 287-296. Article in journal (Refereed), Published.
Statistical regularisation methods such as LASSO and related L-1 regularised regression methods are commonly used to construct models of gene regulatory networks. Although they can theoretically infer the correct network structure, they have been shown in practice to make errors, i.e. to leave out existing links and include non-existing links. We show that L-1 regularisation methods typically produce a poor network model when the analysed data are ill-conditioned, i.e. when the gene expression data matrix has a high condition number, even if the data contain enough information for correct network inference. However, when these methods fail, the correct structure of network models can still be obtained for informative data (data with a signal-to-noise ratio high enough that existing links can be proven to exist), either by using least-squares regression and setting small parameters to zero, or by using robust network inference, a recent method that takes the intersection of all non-rejectable models. Since available experimental data sets are generally ill-conditioned, we recommend checking the condition number of the data matrix to avoid this pitfall of L-1 regularised inference, and also considering alternative methods.
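The two practical steps named in the abstract, checking the condition number of the data matrix and fitting by least squares before setting small parameters to zero, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The linear model `A @ X ≈ B`, the function names, the toy network, the noise level and the cutoff of 0.1 are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def condition_number(X):
    """2-norm condition number: largest singular value over smallest."""
    return float(np.linalg.cond(X))


def thresholded_least_squares(X, B, cutoff):
    """Estimate A in the assumed model A @ X ≈ B, then zero small entries.

    X: (n_genes, n_experiments) data matrix (hypothetical layout)
    B: (n_genes, n_experiments) right-hand side (hypothetical)
    cutoff: magnitude below which a parameter is set to zero (assumed value)
    """
    # A @ X = B is equivalent to X.T @ A.T = B.T, which lstsq solves directly.
    A_T, *_ = np.linalg.lstsq(X.T, B.T, rcond=None)
    A = A_T.T.copy()
    A[np.abs(A) < cutoff] = 0.0
    return A


# Toy example with a made-up 3-gene network and low noise.
rng = np.random.default_rng(0)
A_true = np.array([[-1.0, 0.0, 0.5],
                   [0.8, -1.0, 0.0],
                   [0.0, 0.6, -1.0]])
X = rng.normal(size=(3, 6))
B = A_true @ X + 1e-3 * rng.normal(size=(3, 6))

kappa = condition_number(X)
A_hat = thresholded_least_squares(X, B, cutoff=0.1)

print("condition number of X:", kappa)
print("recovered link pattern:\n", (A_hat != 0).astype(int))
```

A large value of `kappa` would be the warning sign the abstract describes. In that case the link pattern from an L-1 regularised fit is worth comparing against an alternative such as the thresholded least-squares estimate above.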
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Identifiers: URN: urn:nbn:se:uu:diva-241396; DOI: 10.1039/c4mb00419a; ISI: 000345897600028; PubMedID: 25377664; OAI: oai:DiVA.org:uu-241396; DiVA: diva2:782857