Uppsala University Publications (uu.se)
  • 1.
    Aguileta, Gabriela
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Bielawski, Joseph P.
    Yang, Ziheng
    Proposed standard nomenclature for the alpha- and beta-globin gene families. 2006. In: Genes & Genetic Systems, ISSN 1341-7568, E-ISSN 1880-5779, Vol. 81, no 5, p. 367-371. Article in journal (Refereed)
    Abstract [en]

    The globin family of genes and proteins has been a recurrent object of study for many decades. This interest has generated a vast amount of knowledge. However it has also created an inconsistent and confusing nomenclature, due to the lack of a systematic approach to naming genes and failure to reflect the phylogenetic relationships among genes of the gene family. To alleviate the problems with the existing system, here we propose a standardized nomenclature for the alpha and beta globin family of genes, based on a phylogenetic analysis of vertebrate alpha and beta globins, and following the Guidelines for Human Gene Nomenclature.

  • 2.
    Alvarez-Castro, Jose M.
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Le Rouzic, Arnaud
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Carlborg, Örjan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    How to perform meaningful estimates of genetic effects. 2008. In: PLoS Genetics, ISSN 1553-7390, Vol. 4, no 5, p. e1000062. Article in journal (Refereed)
    Abstract [en]

    Although the genotype-phenotype map plays a central role both in Quantitative and Evolutionary Genetics, the formalization of a completely general and satisfactory model of genetic effects, particularly accounting for epistasis, remains a theoretical challenge. Here, we use a two-locus genetic system in simulated populations with epistasis to show the convenience of using a recently developed model, NOIA, to perform estimates of genetic effects and the decomposition of the genetic variance that are orthogonal even under deviations from the Hardy-Weinberg proportions. We develop the theory for how to use this model in interval mapping of quantitative trait loci using Haley-Knott regressions, and we analyze a real data set to illustrate the advantage of using this approach in practice. In this example, we show that departures from the Hardy-Weinberg proportions that are expected by sampling alone substantially alter the orthogonal estimates of genetic effects when other statistical models, like F-2 or G2A, are used instead of NOIA. Finally, for the first time from real data, we provide estimates of functional genetic effects as sets of effects of natural allele substitutions in a particular genotype, which enriches the debate on the interpretation of genetic effects as implemented both in functional and in statistical models. We also discuss further implementations leading to a completely general genotype-phenotype map.
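
    The abstract above turns on departures from Hardy-Weinberg proportions. As a reading aid only, here is a minimal Python sketch of what such a departure means for a single biallelic locus. It is not the NOIA model and is not taken from the paper; the function names and the example genotype counts are hypothetical.

    ```python
    def hardy_weinberg_expected(n_aa, n_ab, n_bb):
        """Expected genotype counts (AA, AB, BB) under Hardy-Weinberg proportions.

        The allele frequency p of allele A is estimated from the observed counts.
        """
        n = n_aa + n_ab + n_bb
        p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
        q = 1 - p                        # frequency of allele B
        return (p * p * n, 2 * p * q * n, q * q * n)


    def chi_square_deviation(observed, expected):
        """Pearson chi-square statistic measuring departure from the expected counts."""
        return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


    # Hypothetical sample of 100 individuals with an excess of homozygotes.
    observed = (30, 40, 30)
    expected = hardy_weinberg_expected(*observed)
    deviation = chi_square_deviation(observed, expected)
    ```

    Even a sample drawn from a population in exact Hardy-Weinberg equilibrium will rarely match the expected counts exactly, which is the sampling effect the abstract refers to.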

  • 3.
    Ameur, Adam
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    A Bioinformatics Study of Human Transcriptional Regulation. 2008. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Regulation of transcription is a central mechanism in all living cells that can now be investigated with high-throughput technologies. Data produced from such experiments give new insights into how transcription factors (TFs) coordinate gene transcription and thereby regulate the amounts of proteins produced. These studies are also important from a medical perspective since TF proteins are often involved in disease. To learn more about transcriptional regulation, we have developed strategies for analysis of data from microarray and massively parallel sequencing (MPS) experiments.

    Our computational results consist of methods to handle the steadily increasing amount of data from high-throughput technologies. Microarray data analysis tools have been assembled in the LCB-Data Warehouse (LCB-DWH) (paper I), and other analysis strategies have been developed for MPS data (paper V). We have also developed a de novo motif search algorithm called BCRANK (paper IV).

    The analysis has led to interesting biological findings in human liver cells (papers II-V). The investigated TFs appeared to bind at several thousand sites in the genome, which we have identified at base pair resolution. The investigated histone modifications are mainly found downstream of transcription start sites, and are correlated with transcriptional activity. These histone marks are frequently found for pairs of genes in a bidirectional conformation. Our results suggest that a TF can bind in the shared promoter of two genes and regulate both of them.

    From a medical perspective, the genes bound by the investigated TFs are candidates to be involved in metabolic disorders. Moreover, we have developed a new strategy to detect single nucleotide polymorphisms (SNPs) that disrupt the binding of a TF (paper IV). We further demonstrated that SNPs can affect transcription in the immediate vicinity. Ultimately, our method may prove helpful to find disease-causing regulatory SNPs.

    List of papers
    1. The LCB Data Warehouse
    2006 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 22, no 8, p. 1024-1026. Article in journal (Refereed). Published
    Abstract [en]

    The Linnaeus Centre for Bioinformatics Data Warehouse (LCB-DWH) is a web-based infrastructure for reliable and secure microarray gene expression data management and analysis that provides an online service for the scientific community. The LCB-DWH is an effort towards a complete system for storage (using the BASE system), analysis and publication of microarray data. Important features of the system include: access to established methods within R/Bioconductor for data analysis, built-in connection to the Gene Ontology database and a scripting facility for automatic recording and re-play of all the steps of the analysis. The service is up and running on a high performance server. At present there are more than 150 registered users.

    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:uu:diva-97704 (URN), 10.1093/bioinformatics/btl036 (DOI), 16455749 (PubMedID)
    Available from: 2008-11-06. Created: 2008-11-06. Last updated: 2017-12-14. Bibliographically approved
    2. Binding sites for metabolic disease related transcription factors inferred at base pair resolution by chromatin immunoprecipitation and genomic microarrays
    2005 (English). In: Human Molecular Genetics, ISSN 0964-6906, E-ISSN 1460-2083, Vol. 14, no 22, p. 3435-3447. Article in journal (Refereed). Published
    Abstract [en]

    We present a detailed in vivo characterization of hepatocyte transcriptional regulation in HepG2 cells, using chromatin immunoprecipitation and detection on PCR fragment-based genomic tiling path arrays covering the Encyclopedia of DNA Elements (ENCODE) regions. Our data suggest that HNF-4α and HNF-3β, which were commonly bound to distal regulatory elements, may cooperate in the regulation of a large fraction of the liver transcriptome and that both HNF-4α and USF1 may promote H3 acetylation at many of their targets. Importantly, bioinformatic analysis of the sequences bound by each transcription factor (TF) shows an over-representation of motifs highly similar to the in vitro established consensus sequences. On the basis of these data, we have inferred tentative binding sites at base pair resolution. Some of these sites have been previously found by in vitro analysis and some were verified in vitro in this study. Our data suggest that a similar approach could be used for the in vivo characterization of all predicted/uncharacterized TFs and that the analysis could be scaled to the whole genome.

    Keyword
    Base Pairing/*genetics, Binding Sites/genetics, Cell Line; Tumor, Chromatin/*metabolism, Chromatin Immunoprecipitation/methods, Consensus Sequence, Genome; Human, Hepatocyte Nuclear Factor 3-beta/physiology, Hepatocyte Nuclear Factor 4/physiology, Hepatocytes/metabolism, Histones/metabolism, Humans, Metabolic Diseases/*metabolism, Oligonucleotide Array Sequence Analysis/methods, Promoter Regions (Genetics), Research Support; N.I.H.; Extramural, Research Support; Non-U.S. Gov't, Sequence Analysis; DNA, Transcription Factors/genetics/*metabolism, Upstream Stimulatory Factors/metabolism
    National Category
    Medical and Health Sciences
    Identifiers
    urn:nbn:se:uu:diva-80603 (URN), 10.1093/hmg/ddi378 (DOI), 16221759 (PubMedID)
    Available from: 2006-05-19. Created: 2006-05-19. Last updated: 2017-12-14. Bibliographically approved
    3. Whole-genome maps of USF1 and USF2 binding and histone H3 acetylation reveal new aspects of promoter structure and candidate genes for common human disorders
    2008 (English). In: Genome Research, ISSN 1088-9051, E-ISSN 1549-5469, Vol. 18, no 3, p. 380-392. Article in journal (Refereed). Published
    Abstract [en]

    Transcription factors and histone modifications are crucial regulators of gene expression that mutually influence each other. We present the DNA binding profiles of upstream stimulatory factors 1 and 2 (USF1, USF2) and acetylated histone H3 (H3ac) in a liver cell line for the whole human genome using ChIP-chip at a resolution of 35 base pairs. We determined that these three proteins bind mostly in proximity of protein coding genes transcription start sites (TSSs), and their bindings are positively correlated with gene expression levels. Based on the spatial and functional relationship between USFs and H3ac at protein coding gene promoters, we found similar promoter architecture for known genes and the novel and less-characterized transcripts human mRNAs and spliced ESTs. Furthermore, our analysis revealed a previously underestimated abundance of genes in a bidirectional conformation, where USFs are bound in between TSSs. After taking into account this promoter conformation, the results indicate that H3ac is mainly located downstream of TSS, and it is at this genomic location where it positively correlates with gene expression. Finally, USF1, which is associated to familial combined hyperlipidemia, was found to bind and potentially regulate nuclear mitochondrial genes as well as genes for lipid and cholesterol metabolism, frequently in collaboration with GA binding protein transcription factor alpha (GABPA, nuclear respiratory factor 2 [NRF-2]). This expands our understanding about the transcriptional control of metabolic processes and its alteration in metabolic disorders.

    National Category
    Bioinformatics and Systems Biology
    Identifiers
    urn:nbn:se:uu:diva-97706 (URN), 10.1101/gr.6880908 (DOI), 000253766700004 (), 18230803 (PubMedID)
    Available from: 2008-11-06. Created: 2008-11-06. Last updated: 2017-12-14. Bibliographically approved
    4. New algorithm and ChIP-analysis identifies candidate functional SNPs
    In: PNAS. Article in journal (Refereed). Submitted
    Identifiers
    urn:nbn:se:uu:diva-97707 (URN)
    Available from: 2008-11-06. Created: 2008-11-06. Bibliographically approved
    5. Differential binding and co-binding pattern of FOXA1 and FOXA3 and their relation to H3K4me3 in HepG2 cells revealed by ChIP-seq
    2009 (English). In: Genome Biology, ISSN 1465-6906, E-ISSN 1474-760X, Vol. 10, no 11, p. R129. Article in journal (Refereed). Published
    Abstract [en]

    BACKGROUND: The forkhead box/winged helix family members FOXA1, FOXA2, and FOXA3 are of high importance in development and specification of the hepatic lineage and the continued expression of liver-specific genes. RESULTS: Here, we present a genome-wide location analysis of FOXA1 and FOXA3 binding sites in HepG2 cells through chromatin immunoprecipitation with detection by sequencing (ChIP-seq) studies and compare these with our previous results on FOXA2. We found that these factors often bind close to each other in different combinations and consecutive immunoprecipitation of chromatin for one and then a second factor (ChIP-reChIP) shows that this occurs in the same cell and on the same DNA molecule, suggestive of molecular interactions. Using co-immunoprecipitation, we further show that FOXA2 interacts with both FOXA1 and FOXA3 in vivo, while FOXA1 and FOXA3 do not appear to interact. Additionally, we detected diverse patterns of trimethylation of lysine 4 on histone H3 (H3K4me3) at transcriptional start sites and directionality of this modification at FOXA binding sites. Using the sequence reads at polymorphic positions, we were able to predict allele specific binding for FOXA1, FOXA3, and H3K4me3. Finally, several SNPs associated with diseases and quantitative traits were located in the enriched regions. CONCLUSIONS: We find that ChIP-seq can be used not only to create gene regulatory maps but also to predict molecular interactions and to inform on the mechanisms for common quantitative variation.

    National Category
    Medical and Health Sciences Biological Sciences
    Identifiers
    urn:nbn:se:uu:diva-119751 (URN), 10.1186/gb-2009-10-11-r129 (DOI), 000273344600016 (), 19919681 (PubMedID)
    Note

    The first two (2) authors share first authorship.

    Available from: 2010-03-01. Created: 2010-03-01. Last updated: 2017-12-12. Bibliographically approved
  • 4.
    Ameur, Adam
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Rada-Iglesias, Alvaro
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Identification of candidate regulatory SNPs by combination of transcription-factor-binding site prediction, SNP genotyping and haploChIP. 2009. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 37, no 12, p. e85. Article in journal (Refereed)
    Abstract [en]

    Disease-associated SNPs detected in large-scale association studies are frequently located in non-coding genomic regions, suggesting that they may be involved in transcriptional regulation. Here we describe a new strategy for detecting regulatory SNPs (rSNPs), by combining computational and experimental approaches. Whole genome ChIP-chip data for USF1 was analyzed using a novel motif finding algorithm called BCRANK. 1754 binding sites were identified and 140 candidate rSNPs were found in the predicted sites. For validating their regulatory function, seven SNPs found to be heterozygous in at least one of four human cell samples were investigated by ChIP and sequence analysis (haploChIP). In four of five cases where the SNP was predicted to affect binding, USF1 was preferentially bound to the allele containing the consensus motif. Allelic differences in binding for other proteins and histone marks further reinforced the SNPs' regulatory potential. Moreover, for one of these SNPs, H3K36me3 and POLR2A levels at neighboring heterozygous SNPs indicated effects on transcription. Our strategy, which is entirely based on in vivo data for both the prediction and validation steps, can identify individual binding sites at base pair resolution and predict rSNPs. Overall, this approach can help to pinpoint the causative SNPs in complex disorders where the associated haplotypes are located in regulatory regions. Availability: BCRANK is available from Bioconductor (http://www.bioconductor.org).

  • 5.
    Ameur, Adam
    et al.
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Rada-Iglesias, Alvaro
    Komorowski, Jan
    Wadelius, Claes
    New algorithm and ChIP-analysis identifies candidate functional SNPs. In: PNAS. Article in journal (Refereed)
  • 6.
    Ameur, Adam
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Yankovski, Vladimir
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Spjuth, Ola
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    The LCB Data Warehouse. 2006. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 22, no 8, p. 1024-1026. Article in journal (Refereed)
    Abstract [en]

    The Linnaeus Centre for Bioinformatics Data Warehouse (LCB-DWH) is a web-based infrastructure for reliable and secure microarray gene expression data management and analysis that provides an online service for the scientific community. The LCB-DWH is an effort towards a complete system for storage (using the BASE system), analysis and publication of microarray data. Important features of the system include: access to established methods within R/Bioconductor for data analysis, built-in connection to the Gene Ontology database and a scripting facility for automatic recording and re-play of all the steps of the analysis. The service is up and running on a high performance server. At present there are more than 150 registered users.

  • 7. Anderson, Frank E.
    et al.
    Córdoba, Alonso J.
    Thollesson, Mikael
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Molecular Evolution. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Bilaterian phylogeny based on analyses of a region of the sodium-potassium ATPase alpha-subunit gene. 2004. In: Journal of Molecular Evolution, ISSN 0022-2844, E-ISSN 1432-1432, Vol. 58, no 3, p. 252-268. Article in journal (Refereed)
    Abstract [en]

    Molecular investigations of deep-level relationships within and among the animal phyla have been hampered by a lack of slowly evolving genes that are amenable to study by molecular systematists. To provide new data for use in deep-level metazoan phylogenetic studies, primers were developed to amplify a 1.3-kb region of the α-subunit of the nuclear-encoded sodium–potassium ATPase gene from 31 bilaterians representing several phyla. Maximum parsimony, maximum likelihood, and Bayesian analyses of these sequences (combined with ATPase sequences for 23 taxa downloaded from GenBank) yield congruent trees that corroborate recent findings based on analyses of other data sets (e.g., the 18S ribosomal RNA gene). The ATPase-based trees support monophyly for several clades (including Lophotrochozoa, a form of Ecdysozoa, Vertebrata, Mollusca, Bivalvia, Gastropoda, Arachnida, Hexapoda, Coleoptera, and Diptera) but do not support monophyly for Deuterostomia, Arthropoda, or Nemertea. Parametric bootstrapping tests reject monophyly for Arthropoda and Nemertea but are unable to reject deuterostome monophyly. Overall, the sodium–potassium ATPase α-subunit gene appears to be useful for deep-level studies of metazoan phylogeny.

  • 8.
    Andersson, Claes
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Fusing Domain Knowledge with Data: Applications in Bioinformatics. 2008. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Massively parallel measurement techniques can be used for generating hypotheses about the molecular underpinnings of a biological system. This thesis investigates how domain knowledge can be fused with data from different sources in order to generate more sophisticated hypotheses and improved analyses. We find our applications in the related fields of cell cycle regulation and cancer chemotherapy. In our cell cycle studies we design a detector of periodic expression and use it to generate hypotheses about transcriptional regulation during the course of the cell cycle in synchronized yeast cultures, as well as to investigate whether domain knowledge about gene function can explain whether a gene is periodically expressed or not. We then generate hypotheses that suggest how periodic expression that depends on how the cells were perturbed into synchrony is regulated. The hypotheses suggest where and which transcription factors bind upstream of genes that are regulated by the cell cycle. In our cancer chemotherapy investigations we first study how a method for identifying co-regulated genes associated with chemoresponse to drugs in cell lines is affected by domain knowledge about the genetic relationships between the cell lines. We then turn our attention to problems that arise in microarray based predictive medicine, where there typically are few samples available for learning the predictor, and study two different means of alleviating the inherent trade-off between allocation of design and test samples. First we investigate whether independent tests on the design data can be used for improving estimates of a predictor's performance without introducing a bias in the estimate. Then, motivated by recent developments in microarray based predictive medicine, we propose an algorithm that can use unlabeled data for selecting features and consequently improve predictor performance without wasting valuable labeled data.

    List of papers
    1. In vitro drug sensitivity-gene expression correlations involve a tissue of origin dependency
    2007 (English). In: Journal of chemical information and modeling, ISSN 1549-9596, Vol. 47, no 1, p. 239-248. Article in journal (Refereed). Published
    Abstract [en]

    A major concern of chemogenomics is to associate drug activity with biological variables. Several reports have clustered cell line drug activity profiles as well as drug activity-gene expression correlation profiles and noted that the resulting groupings differ but still reflect mechanism of action. The present paper shows that these discrepancies can be viewed as a weighting of drug-drug distances, the weights depending on which cell lines the two drugs differ in.

    Keyword
    computers in chemistry, computer program
    National Category
    Medical and Health Sciences Signal Processing
    Research subject
    Electrical Engineering with specialization in Signal Processing
    Identifiers
    urn:nbn:se:uu:diva-20891 (URN), 10.1021/ci060073n (DOI), 000243577400029 (), 17238270 (PubMedID)
    Available from: 2007-10-28. Created: 2007-10-28. Last updated: 2016-09-25. Bibliographically approved
    2. Bayesian detection of periodic mRNA time profiles without use of training examples
    2006 (English). In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 7, p. 63. Article in journal (Refereed). Published
    Abstract [en]

    BACKGROUND: Detection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at a particular frequency that characterizes the process under study but this frequency is seldom exactly known. Previously proposed detector designs require access to labelled training examples and do not allow systematic incorporation of diffuse prior knowledge available about the period time. RESULTS: A learning-free Bayesian detector that does not rely on labelled training examples and allows incorporation of prior knowledge about the period time is introduced. It is shown to outperform two recently proposed alternative learning-free detectors on simulated data generated with models that are different from the one used for detector design. Results from applying the detector to mRNA expression time profiles from S. cerevisiae show that the genes detected as periodically expressed only contain a small fraction of the cell-cycle genes inferred from mutant phenotype. For example, when the probability of false alarm was equal to 7%, only 12% of the cell-cycle genes were detected. The genes detected as periodically expressed were found to have a statistically significant overrepresentation of known cell-cycle regulated sequence motifs. One known sequence motif and 18 putative motifs, previously not associated with periodic expression, were also overrepresented. CONCLUSION: In comparison with recently proposed alternative learning-free detectors for periodic gene expression, Bayesian inference allows systematic incorporation of diffuse a priori knowledge about, e.g. the period time. This results in relative performance improvements due to increased robustness against errors in the underlying assumptions. Results from applying the detector to mRNA expression time profiles from S. cerevisiae include several new findings that deserve further experimental studies.

    National Category
    Medical and Health Sciences Engineering and Technology
    Identifiers
    urn:nbn:se:uu:diva-96785 (URN), 10.1186/1471-2105-7-63 (DOI), 16469110 (PubMedID)
    Available from: 2008-02-20. Created: 2008-02-20. Last updated: 2017-12-14. Bibliographically approved
    3. Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors
    2007 (English). In: BMC Systems Biology, ISSN 1752-0509, Vol. 1, p. 45. Article in journal (Refereed). Published
    Abstract [en]

    Background: We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process. Results: We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expressed in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach. Conclusion: The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes.

    National Category
    Biological Sciences Signal Processing
    Research subject
    Electrical Engineering with specialization in Signal Processing
    Identifiers
    urn:nbn:se:uu:diva-96786 (URN), 10.1186/1752-0509-1-45 (DOI), 000252363100001 (), 17939860 (PubMedID)
    Available from: 2008-02-20. Created: 2008-02-20. Last updated: 2016-09-25. Bibliographically approved
    4. Improving Bayesian credibility intervals for classifier error rates using maximum entropy empirical priors
    2010 (English). In: Artificial Intelligence in Medicine, ISSN 0933-3657, E-ISSN 1873-2860, Vol. 49, no 2, p. 93-104. Article in journal (Refereed) Published
    Abstract [en]

    Objective:

    Successful use of classifiers that learn to make decisions from a set of patient examples requires robust methods for performance estimation. Recently many promising approaches for determination of an upper bound for the error rate of a single classifier have been reported, but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI becomes unacceptably large in real world applications where the test set sizes are less than a few hundred. The source of this problem is the fact that the CI is determined exclusively by the result on the test examples. In other words, there is no information at all provided by the uniform prior density distribution employed, which reflects complete lack of prior knowledge about the unknown error rate. Therefore, the aim of the study reported here was to study a maximum entropy (ME) based approach to improved prior knowledge and Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice.

    Method and material:

    It is demonstrated how a refined non-uniform prior density distribution can be obtained by means of the ME principle using empirical results from a few designs and tests using non-overlapping sets of examples.

    Results:

    Experimental results show that ME based priors improve the CIs when employed to four quite different simulated and two real world data sets.

    Conclusions:

    An empirically derived ME prior seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier.
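
The conventional uniform-prior interval that the abstract above starts from can be illustrated with a small sketch. It is not the maximum entropy method of the paper: it only assumes a holdout test with `errors` misclassifications out of `n` examples and a uniform prior, in which case the posterior of the error rate is Beta(errors + 1, n - errors + 1). The function names `posterior_cdf` and `upper_bound` are hypothetical, chosen for this illustration.

```python
from math import comb


def posterior_cdf(x: float, errors: int, n: int) -> float:
    """P(error rate <= x | data) under a uniform prior.

    The posterior is Beta(errors + 1, n - errors + 1). For integer
    parameters a, b the Beta CDF equals the binomial tail
    P(Bin(a + b - 1, x) >= a), which is evaluated here as
    1 - P(Bin(n + 1, x) <= errors).
    """
    m = n + 1
    head = sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(errors + 1))
    return 1.0 - head


def upper_bound(errors: int, n: int, level: float = 0.95) -> float:
    """One-sided upper credibility bound for the error rate, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):  # 60 halvings are far below double precision
        mid = (lo + hi) / 2.0
        if posterior_cdf(mid, errors, n) < level:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Same observed error rate (10%), different test set sizes.
    for errors, n in [(5, 50), (50, 500)]:
        print(n, round(upper_bound(errors, n), 4))
```

With 0 errors on 9 test examples the posterior is Beta(1, 10), so the 95% bound has the closed form 1 - 0.05^(1/10), roughly 0.259. This shows how wide the uniform-prior bound stays for small test sets, which is the problem the abstract describes.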

    National Category
    Medical and Health Sciences Computer and Information Sciences
    Identifiers
    urn:nbn:se:uu:diva-96787 (URN); 10.1016/j.artmed.2010.02.004 (DOI); 000279172200003 ()
    Available from: 2008-02-20 Created: 2008-02-20 Last updated: 2018-01-13
    5. Feature Selection using Classification of Unlabeled Data
    Manuscript (Other academic)
    Identifiers
    urn:nbn:se:uu:diva-96788 (URN)
    Available from: 2008-02-20 Created: 2008-02-20 Last updated: 2010-01-13 Bibliographically approved
  • 9.
    Andersson, Claes R.
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology.
    Fryknäs, Mårten
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Rickardson, Linda
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology.
    Larsson, Rolf
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology.
    Isaksson, Anders
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology.
    Gustafsson, Mats G.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology. Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences, Signals and Systems Group.
    In vitro drug sensitivity-gene expression correlations involve a tissue of origin dependency. 2007. In: Journal of chemical information and modeling, ISSN 1549-9596, Vol. 47, no 1, p. 239-248. Article in journal (Refereed)
    Abstract [en]

    A major concern of chemogenomics is to associate drug activity with biological variables. Several reports have clustered cell line drug activity profiles as well as drug activity-gene expression correlation profiles and noted that the resulting groupings differ but still reflect mechanism of action. The present paper shows that these discrepancies can be viewed as a weighting of drug-drug distances, the weights depending on which cell lines the two drugs differ in.

  • 10.
    Andersson, Claes R.
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Hvidsten, Torgeir R.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Isaksson, Anders
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
    Gustafsson, Mats G.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology. Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences, Signals and Systems Group.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors. 2007. In: BMC Systems Biology, ISSN 1752-0509, Vol. 1, p. 45-. Article in journal (Refereed)
    Abstract [en]

    Background: We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process. Results: We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expressed in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach. Conclusion: The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes.

  • 11.
    Andersson, Claes R.
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Isaksson, Anders
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Gustafsson, Mats G.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology. Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences, Signal Processing.
    Bayesian detection of periodic mRNA time profiles without use of training examples. 2006. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 7, p. 63-. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: Detection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at a particular frequency that characterizes the process under study but this frequency is seldom exactly known. Previously proposed detector designs require access to labelled training examples and do not allow systematic incorporation of diffuse prior knowledge available about the period time. RESULTS: A learning-free Bayesian detector that does not rely on labelled training examples and allows incorporation of prior knowledge about the period time is introduced. It is shown to outperform two recently proposed alternative learning-free detectors on simulated data generated with models that are different from the one used for detector design. Results from applying the detector to mRNA expression time profiles from S. cerevisiae show that the genes detected as periodically expressed only contain a small fraction of the cell-cycle genes inferred from mutant phenotype. For example, when the probability of false alarm was equal to 7%, only 12% of the cell-cycle genes were detected. The genes detected as periodically expressed were found to have a statistically significant overrepresentation of known cell-cycle regulated sequence motifs. One known sequence motif and 18 putative motifs, previously not associated with periodic expression, were also overrepresented. CONCLUSION: In comparison with recently proposed alternative learning-free detectors for periodic gene expression, Bayesian inference allows systematic incorporation of diffuse a priori knowledge about, e.g. the period time. This results in relative performance improvements due to increased robustness against errors in the underlying assumptions. Results from applying the detector to mRNA expression time profiles from S. cerevisiae include several new findings that deserve further experimental studies.

  • 12.
    Andersson, Claes R.
    et al.
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Larsson, Rolf
    Isaksson, Anders
    Gustafsson, Mats G.
    Feature Selection using Classification of Unlabeled Data. Manuscript (Other academic)
  • 13.
    Andersson, Robin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Decoding the Structural Layer of Transcriptional Regulation: Computational Analyses of Chromatin and Chromosomal Aberrations. 2010. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Gene activity is regulated at two separate layers. Through structural and chemical properties of DNA – the primary layer of encoding – local signatures may enable, or disable, the binding of proteins or complexes of them with regulatory potential to the DNA. At a higher level – the structural layer of encoding – gene activity is regulated through the properties of higher order DNA structure, chromatin, and chromosome organization. Cells with abnormal chromosome compaction or organization, e.g. cancer cells, may thus have perturbed regulatory activities resulting in abnormal gene activity.

    Hence, there is a great need to decode the transcriptional regulation encoded in both layers to further our understanding of the factors that control activity and life of a cell and, ultimately, an organism. Modern genome-wide studies with those aims rely on data-intense experiments requiring sophisticated computational and statistical methods for data handling and analyses. This thesis describes recent advances of analyzing experimental data from quantitative biological studies to decipher the structural layer of encoding in human cells.

    Adopting an integrative approach when possible, combining multiple sources of data, allowed us to study the influences of chromatin (Papers I and II) and chromosomal aberrations (Paper IV) on transcription. Combining chromatin data with chromosomal aberration data allowed us to identify putative driver oncogenes and tumor-suppressor genes in cancer (Paper IV).

    Bayesian approaches enabling the incorporation of background information in the models and the adaptability of such models to data have been very useful. Their usages yielded accurate and narrow detection of chromosomal breakpoints in cancer (Papers III and IV) and reliable positioning of nucleosomes and their dynamics during transcriptional regulation at functionally relevant regulatory elements (Paper II).

    Using massively parallel sequencing data, we explored the chromatin landscapes of human cells (Papers I and II) and concluded that there is a preferential and evolutionarily conserved positioning at internal exons nearly unaffected by the transcriptional level. We also observed a strong association between certain histone modifications and the inclusion or exclusion of an exon in the mature gene transcript, suggesting a functional role in splicing.

    List of papers
    1. Nucleosomes are well positioned in exons and carry characteristic histone modifications
    2009 (English). In: Genome Research, ISSN 1088-9051, E-ISSN 1549-5469, Vol. 19, no 10, p. 1732-1741. Article in journal (Refereed) Published
    Abstract [en]

    The genomes of higher organisms are packaged in nucleosomes with functional histone modifications. Until now, genome-wide nucleosome and histone modification studies have focused on transcription start sites (TSSs) where nucleosomes in RNA polymerase II (RNAPII) occupied genes are well positioned and have histone modifications that are characteristic of expression status. Using public data, we here show that there is a higher nucleosome-positioning signal in internal human exons and that this positioning is independent of expression. We observed a similarly strong nucleosome-positioning signal in internal exons of C. elegans. Among the 38 histone modifications analyzed in man, H3K36me3, H3K79me1, H2BK5me1, H3K27me1, H3K27me2 and H3K27me3 had evidently higher signal in internal exons than in the following introns and were clearly related to exon expression. These observations are suggestive of roles in splicing. Thus, exons are not only characterized by their coding capacity but also by their nucleosome organization, which seems evolutionarily conserved since it is present in both primates and nematodes.

    National Category
    Medical and Health Sciences
    Identifiers
    urn:nbn:se:uu:diva-107609 (URN); 10.1101/gr.092353.109 (DOI); 000270389700005 (); 19687145 (PubMedID)
    Note

    The first three authors share first authorship.

    Available from: 2009-08-19 Created: 2009-08-19 Last updated: 2017-12-13 Bibliographically approved
    2. Strand-based mixture modeling of nucleosome positioning in HepG2 cells and their regulatory dynamics in response to TGF-beta treatment
    (English). Manuscript (preprint) (Other academic)
    Identifiers
    urn:nbn:se:uu:diva-130998 (URN)
    Available from: 2010-09-20 Created: 2010-09-20 Last updated: 2010-11-11
    3. A Segmental Maximum A Posteriori Approach to Genome-wide Copy Number Profiling
    2008 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 6, p. 751-758. Article in journal (Other academic) Published
    Abstract [en]

    MOTIVATION: Copy number profiling methods aim at assigning DNA copy numbers to chromosomal regions using measurements from microarray-based comparative genomic hybridizations. Among the proposed methods to this end, Hidden Markov Model (HMM)-based approaches seem promising since DNA copy number transitions are naturally captured in the model. Current discrete-index HMM-based approaches do not, however, take into account heterogeneous information regarding the genomic overlap between clones. Moreover, the majority of existing methods are restricted to chromosome-wise analysis. RESULTS: We introduce a novel Segmental Maximum A Posteriori approach, SMAP, for DNA copy number profiling. Our method is based on discrete-index Hidden Markov Modeling and incorporates genomic distance and overlap between clones. We exploit a priori information through user-controllable parameterization that enables the identification of copy number deviations of various lengths and amplitudes. The model parameters may be inferred at a genome-wide scale to avoid overfitting of model parameters often resulting from chromosome-wise model inference. We report superior performances of SMAP on synthetic data when compared with two recent methods. When applied on our new experimental data, SMAP readily recognizes already known genetic aberrations including both large-scale regions with aberrant DNA copy number and changes affecting only single features on the array. We highlight the differences between the prediction of SMAP and the compared methods and show that SMAP accurately determines copy number changes and benefits from overlap consideration.

    National Category
    Medical and Health Sciences
    Identifiers
    urn:nbn:se:uu:diva-13616 (URN); 10.1093/bioinformatics/btn003 (DOI); 000254010400003 (); 18204059 (PubMedID)
    Available from: 2008-08-21 Created: 2008-08-21 Last updated: 2017-12-11 Bibliographically approved
    4. Integrative epigenomic and genomic analysis of malignant pheochromocytoma
    2010 (English). In: Experimental and Molecular Medicine, ISSN 1226-3613, E-ISSN 2092-6413, Vol. 42, no 7, p. 484-502. Article in journal (Refereed) Published
    Abstract [en]

    Epigenomic and genomic changes affect gene expression and contribute to tumor development. The histone modifications trimethylated histone H3 lysine 4 (H3K4me3) and lysine 27 (H3K27me3) are epigenetic regulators associated with active and silenced genes, respectively, and alterations of these modifications have been observed in cancer. Furthermore, genomic aberrations such as DNA copy number changes are common events in tumors. Pheochromocytoma is a rare endocrine tumor of the adrenal gland that mostly occurs sporadically with unknown epigenetic/genetic cause. The majority of cases are benign. Here we aimed to combine the genome-wide profiling of H3K4me3 and H3K27me3, obtained by the ChIP-chip methodology, and DNA copy number data with global gene expression examination in a malignant pheochromocytoma sample. The integrated analysis of the tumor expression levels, in relation to normal adrenal medulla, indicated that either histone modifications or chromosomal alterations, or both, have great impact on the expression of a substantial fraction of the genes in the investigated sample. Candidate tumor suppressor genes identified with decreased expression, a H3K27me3 mark and/or in regions of deletion were for instance TGIF1, DSC3, TNFRSF10B, RASSF2, HOXA9, PTPRE and CDH11. More genes were found with increased expression, a H3K4me3 mark, and/or in regions of gain. Potential oncogenes detected among those were GNAS, INSM1, DOK5, ETV1, RET, NTRK1, IGF2, and the H3K27 trimethylase gene EZH2. Our approach to associate histone methylations and DNA copy number changes with gene expression revealed apparent impact on global gene transcription, and enabled the identification of candidate tumor genes for further exploration.

    Keyword
    histone code, DNA copy number changes, gene expression, oncogenes, pheochromocytoma, tumor suppressor genes
    National Category
    Medical and Health Sciences
    Identifiers
    urn:nbn:se:uu:diva-129532 (URN); 10.3858/emm.2010.42.7.050 (DOI); 000280558100002 (); 20534969 (PubMedID)
    Available from: 2010-08-18 Created: 2010-08-18 Last updated: 2017-12-12 Bibliographically approved
  • 14.
    Andersson, Robin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Bruder, Carl E G
    Piotrowski, Arkadiusz
    Menzel, Uwe
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Nord, Helena
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Sandgren, Johanna
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Surgical Sciences.
    Hvidsten, Torgeir R
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    de Ståhl, Teresita Diaz
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Dumanski, Jan P
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    A Segmental Maximum A Posteriori Approach to Genome-wide Copy Number Profiling. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 6, p. 751-758. Article in journal (Other academic)
    Abstract [en]

    MOTIVATION: Copy number profiling methods aim at assigning DNA copy numbers to chromosomal regions using measurements from microarray-based comparative genomic hybridizations. Among the proposed methods to this end, Hidden Markov Model (HMM)-based approaches seem promising since DNA copy number transitions are naturally captured in the model. Current discrete-index HMM-based approaches do not, however, take into account heterogeneous information regarding the genomic overlap between clones. Moreover, the majority of existing methods are restricted to chromosome-wise analysis. RESULTS: We introduce a novel Segmental Maximum A Posteriori approach, SMAP, for DNA copy number profiling. Our method is based on discrete-index Hidden Markov Modeling and incorporates genomic distance and overlap between clones. We exploit a priori information through user-controllable parameterization that enables the identification of copy number deviations of various lengths and amplitudes. The model parameters may be inferred at a genome-wide scale to avoid overfitting of model parameters often resulting from chromosome-wise model inference. We report superior performances of SMAP on synthetic data when compared with two recent methods. When applied on our new experimental data, SMAP readily recognizes already known genetic aberrations including both large-scale regions with aberrant DNA copy number and changes affecting only single features on the array. We highlight the differences between the prediction of SMAP and the compared methods and show that SMAP accurately determines copy number changes and benefits from overlap consideration.
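
As a rough illustration of why HMMs capture copy number transitions naturally, the sketch below runs plain Viterbi decoding (the maximum a posteriori state path of a standard discrete-index HMM) over a sequence of log-ratio measurements. It is not SMAP: it ignores genomic distance and clone overlap, and every number in it (the state means, `SIGMA`, `STAY`) is a hypothetical value chosen for the example.

```python
import math

STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.5}  # hypothetical log-ratio means
SIGMA = 0.2  # hypothetical measurement noise (standard deviation)
STAY = 0.9   # hypothetical probability of keeping the same copy number state


def log_emission(state: str, x: float) -> float:
    """Log density of observation x under the Gaussian emission of `state`."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def log_transition(prev: str, cur: str) -> float:
    """Log transition probability; off-diagonal mass is shared equally."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1.0 - STAY) / (len(STATES) - 1))


def viterbi(observations: list[float]) -> list[str]:
    """Most probable state path for a non-empty list of observations."""
    score = {
        s: math.log(1.0 / len(STATES)) + log_emission(s, observations[0])
        for s in STATES
    }
    back = []  # back[t][cur] = best predecessor of `cur` at step t + 1
    for x in observations[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            pointers[cur] = best_prev
            new_score[cur] = (
                score[best_prev] + log_transition(best_prev, cur) + log_emission(cur, x)
            )
        score = new_score
        back.append(pointers)
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    print(viterbi([0.0, 0.05, 0.55, 0.5, 0.45, 0.0, -0.05]))
```

The transition term is what penalizes switching state between neighbouring measurements; a method that accounts for genomic distance and overlap, as the abstract describes, would make that term depend on the positions of the clones rather than keeping it constant.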

  • 15.
    Andersson, Robin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Barbacioru, Catalin
    Reddy Bysani, Madhu Sudhan
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Wallerman, Ola
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Tuch, Brian
    Lee, Clarence
    Peckham, Heather
    McKernan, Kevin
    de la Vega, Francisco
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Strand-based mixture modeling of nucleosome positioning in HepG2 cells and their regulatory dynamics in response to TGF-beta treatment. Manuscript (preprint) (Other academic)
  • 16.
    Andersson, Robin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Rada-Iglesias, Alvaro
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Nucleosomes are well positioned in exons and carry characteristic histone modifications. 2009. In: Genome Research, ISSN 1088-9051, E-ISSN 1549-5469, Vol. 19, no 10, p. 1732-1741. Article in journal (Refereed)
    Abstract [en]

    The genomes of higher organisms are packaged in nucleosomes with functional histone modifications. Until now, genome-wide nucleosome and histone modification studies have focused on transcription start sites (TSSs) where nucleosomes in RNA polymerase II (RNAPII) occupied genes are well positioned and have histone modifications that are characteristic of expression status. Using public data, we here show that there is a higher nucleosome-positioning signal in internal human exons and that this positioning is independent of expression. We observed a similarly strong nucleosome-positioning signal in internal exons of C. elegans. Among the 38 histone modifications analyzed in man, H3K36me3, H3K79me1, H2BK5me1, H3K27me1, H3K27me2 and H3K27me3 had evidently higher signal in internal exons than in the following introns and were clearly related to exon expression. These observations are suggestive of roles in splicing. Thus, exons are not only characterized by their coding capacity but also by their nucleosome organization, which seems evolutionarily conserved since it is present in both primates and nematodes.

  • 17.
    Andersson, Robin
    et al.
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Vitoria, Aida
    Maluszynski, Jan
    Komorowski, Jan
    RoSy: A Rough Knowledge Base System. 2005. In: Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing: 10th International Conference, RSFDGrC 2005, Regina, Canada, August 31 - September 3, 2005, Proceedings, Part II, 2005, p. 48-58. Conference paper (Refereed)
    Abstract [en]

    This paper presents a user-oriented view of RoSy, a Rough Knowledge Base System. The system tackles two problems not fully answered by previous research: the ability to define rough sets in terms of other rough sets and incorporation of domain or expert knowledge. We describe two main components of RoSy: knowledge base creation and query answering. The former allows the user to create a knowledge base of rough concepts and checks that the definitions do not cause what we will call a model failure. The latter gives the user a possibility to query rough concepts defined in the knowledge base. The features of RoSy are described using examples. The system is currently available on a web site for online interactions.

  • 18.
    Ardell, David H
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    SCANMS: adjusting for multiple comparisons in sliding window neutrality tests. 2004. In: Bioinformatics, ISSN 1367-4803, Vol. 20, no 12, p. 1986-8. Article in journal (Other scientific)
  • 19.
    Ardell, David H.
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Molecular Evolution.
    Andersson, Siv G. E.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Molecular Evolution.
    TFAM detects co-evolution of tRNA identity rules with lateral transfer of histidyl-tRNA synthetase. 2006. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 34, no 3, p. 893-904. Article in journal (Refereed)
    Abstract [en]

    We present TFAM, an automated, statistical method to classify the identity of tRNAs. TFAM, currently optimized for bacteria, classifies initiator tRNAs and predicts the charging identity of both typical and atypical tRNAs such as suppressors with high confidence. We show statistical evidence for extensive variation in tRNA identity determinants among bacterial genomes due to variation in overall tDNA base content. With TFAM we have detected the first case of eukaryotic-like tRNA identity rules in bacteria. An alpha-proteobacterial clade encompassing Rhizobiales, Caulobacter crescentus and Silicibacter pomeroyi, unlike a sister clade containing the Rickettsiales, Zymomonas mobilis and Gluconobacter oxydans, uses the eukaryotic identity element A73 instead of the highly conserved prokaryotic element C73. We confirm divergence of bacterial histidylation rules by demonstrating perfect covariation of alpha-proteobacterial tRNA(His) acceptor stems and residues in the motif IIb tRNA-binding pocket of their histidyl-tRNA synthetases (HisRS). Phylogenomic analysis supports lateral transfer of a eukaryotic-like HisRS into the alpha-proteobacteria followed by in situ adaptation of the bacterial tDNA(His) and identity rule divergence. Our results demonstrate that TFAM is an effective tool for the bioinformatics, comparative genomics and evolutionary study of tRNA identity.

  • 20.
    Ardell, David Herman
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Informatic Approaches to Molecular Translation. 2005. In: Intelligent Information Processing and Web Mining: Advances in Soft Computing, Proceedings of the IIS'2005 Symposium, 2005, p. 684-. Conference paper (Other scientific)
  • 21. Bandelt, HJ
    et al.
    Huber, KT
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Moulton, V
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Quasi-median graphs from sets of partitions. 2002. In: Discrete Applied Mathematics, ISSN 0166-218X, Vol. 122, no 23-35, p. 23-35. Article in journal (Other (popular scientific, debate etc.))
    Abstract [en]

    In studies of molecular evolution, one is typically confronted with the task of inferring a phylogenetic tree from a set X of sequences of length n over a finite alphabet Λ. For studies that invoke parsimony, it has been found helpful to consider the quasi-median graph generated by X in the Hamming graph Λ^n. Although a great deal is already known about quasi-median graphs (and their algebraic counterparts), little is known about the quasi-median generation in Λ^n starting from a set X of vertices. We describe the vertices of the quasi-median graph generated by X in terms of the coordinatewise partitions of X. In particular, we clarify when the generated quasi-median graph is the so-called relation graph associated with X. This immediately characterizes the instances where either a block graph or the total Hamming graph is generated.

  • 22.
    Bang S, Koolen JH, Moulton V
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    A bound for the number of columns l((c,a,b)) in the intersection array of a distance-regular graph. 2003. In: European Journal of Combinatorics, ISSN 0195-6698, Vol. 24, no 7, p. 785-795. Article in journal (Refereed)
  • 23.
    Barrio, Alvaro Martinez
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Eriksson, Oskar
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology, Medical Genetics. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
    Badhai, Jitendra
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology, Medical Genetics.
    Fröjmark, Anne-Sophie
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology, Medical Genetics.
    Bongcam-Rudloff, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Dahl, Niklas
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology, Medical Genetics.
    Schuster, Jens
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology, Medical Genetics.
    Targeted Resequencing and Analysis of the Diamond-Blackfan Anemia Disease Locus RPS19. 2009. In: PLoS ONE, ISSN 1932-6203, Vol. 4, no 7, p. e6172. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: The Ribosomal protein S19 gene locus (RPS19) has been linked to two kinds of red cell aplasia, Diamond-Blackfan Anemia (DBA) and Transient Erythroblastopenia in Childhood (TEC). Mutations in RPS19 coding sequences have been found in 25% of DBA patients, but not in TEC patients. It has been suggested that non-coding RPS19 sequence variants contribute to the considerable clinical variability in red cell aplasia. We therefore aimed at identifying non-coding variations associated with DBA or TEC phenotypes. METHODOLOGY/PRINCIPAL FINDINGS: We targeted a region of 19'980 bp encompassing the RPS19 gene in a cohort of 89 DBA and TEC patients for resequencing. We provide here a catalog of the considerable, previously unrecognized degree of variation in this region. We identified 73 variations (65 SNPs, 8 indels) that all are located outside of the RPS19 open reading frame, and of which 67.1% are classified as novel. We hypothesize that specific alleles in non-coding regions of RPS19 could alter the binding of regulatory proteins or transcription factors. Therefore, we carried out an extensive analysis to identify transcription factor binding sites (TFBS). A series of putative interaction sites coincide with detected variants. Sixteen of the corresponding transcription factors are of particular interest, as they are housekeeping genes or show a direct link to hematopoiesis, tumorigenesis or leukemia (e.g. GATA-1/2, PU.1, MZF-1). CONCLUSIONS: Specific alleles at predicted TFBSs may alter the expression of RPS19, modify an important interaction between transcription factors with overlapping TFBS or remove an important stimulus for hematopoiesis. We suggest that the detected interactions are of importance for hematopoiesis and could provide new insights into individual response to treatment.

  • 24.
    Barrio, Alvaro Martínez
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Lagercrantz, Erik
    Sperber, Göran O.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience, Physiology.
    Blomberg, Jonas
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Virology.
    Bongcam-Rudloff, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Annotation and visualization of endogenous retroviral sequences using the Distributed Annotation System (DAS) and eBioX. 2009. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 10 Suppl. 6, p. S18. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: The Distributed Annotation System (DAS) is a widely used network protocol for sharing biological information. The distributed aspects of the protocol enable the use of various reference and annotation servers for connecting biological sequence data to pertinent annotations in order to depict an integrated view of the data for the final user. RESULTS: An annotation server has been devised to provide information about the endogenous retroviruses detected and annotated by a specialized in silico tool called RetroTector. We describe the procedure to implement the DAS 1.5 protocol commands necessary for constructing the DAS annotation server. We use our server to exemplify those steps. Data distribution is kept separate from visualization, which is carried out by eBioX, an easy-to-use open-source program incorporating multiple bioinformatics utilities. Some well-characterized endogenous retroviruses are shown in two different DAS clients. A rapid analysis of areas free from retroviral insertions could be facilitated by our annotations. CONCLUSION: The DAS protocol has been shown to be advantageous in the distribution of endogenous retrovirus data. The distributed nature of the protocol is also found to aid in combining annotation and visualization along a genome in order to enhance the understanding of ERV contribution to its evolution. Reference and annotation servers are conjointly used by eBioX to provide visualization of ERV annotations as well as other data sources. Our DAS data source can be found in the central public DAS service repository, http://www.dasregistry.org, or at http://loka.bmc.uu.se/das/sources.

  • 25. Beisvåg, Vidar
    et al.
    Kauffmann, Audrey
    Malone, James
    Foy, Carole
    Salit, Marc
    Schimmel, Heinz
    Bongcam-Rudloff, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Landegren, Ulf
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Molecular tools. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Parkinson, Helen
    Huber, Wolfgang
    Brazma, Alvis
    Sandvik, Arne K.
    Kuiper, Martin
    Contributions of the EMERALD project to assessing and improving microarray data quality. 2011. In: BioTechniques, ISSN 0736-6205, E-ISSN 1940-9818, Vol. 50, no 1, p. 27-31. Article in journal (Refereed)
    Abstract [en]

    While minimum information about a microarray experiment (MIAME) standards have helped to increase the value of the microarray data deposited into public databases like ArrayExpress and Gene Expression Omnibus (GEO), limited means have been available to assess the quality of this data or to identify the procedures used to normalize and transform raw data. The EMERALD FP6 Coordination Action was designed to deliver approaches to assess and enhance the overall quality of microarray data and to disseminate these approaches to the microarray community through an extensive series of workshops, tutorials, and symposia. Tools were developed for assessing data quality and used to demonstrate how the removal of poor-quality data could improve the power of statistical analyses and facilitate analysis of multiple joint microarray data sets. These quality metrics tools have been disseminated through publications and through the software package arrayQualityMetrics. Within the framework provided by the Ontology of Biomedical Investigations, ontology was developed to describe data transformations, and software ontology was developed for gene expression analysis software. In addition, the consortium has advocated for the development and use of external reference standards in microarray hybridizations and created the Molecular Methods (MolMeth) database, which provides a central source for methods and protocols focusing on microarray-based technologies.

  • 26. Beldiceanu, Nicolas
    et al.
    Flener, Pierre
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
    Lorca, Xavier
    Combining tree partitioning, precedence, and incomparability constraints. 2008. In: Constraints, ISSN 1383-7133, E-ISSN 1572-9354, Vol. 13, no 4, p. 459-489. Article in journal (Refereed)
  • 27. Beldiceanu, Nicolas
    et al.
    Flener, Pierre
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Mathematics and Computer Science, Department of Information Technology. Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics. Department of Ecology and Evolution, Computing Science. Datalogi.
    Lorca, Xavier
    The tree constraint. 2005. In: Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems: Second International Conference, CPAIOR 2005, Proceedings, 2005, p. 64-78. Conference paper (Refereed)
  • 28.
    Berglund, Ann-Charlotte
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Sjölund, Erik
    Östlund, Gabriel
    Sonnhammer, Erik L. L.
    InParanoid 6: eukaryotic ortholog clusters with inparalogs. 2008. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 36, p. D263-D266. Article in journal (Refereed)
    Abstract [en]

    The InParanoid eukaryotic ortholog database (http://InParanoid.sbc.su.se/) has been updated to version 6 and is now based on 35 species. We collected all available complete eukaryotic proteomes and Escherichia coli, and calculated ortholog groups for all 595 species pairs using the InParanoid program. This resulted in 2 642 187 pairwise ortholog groups in total. The orthology-based species relations are presented in an orthophylogram. InParanoid clusters contain one or more orthologs from each of the two species. Multiple orthologs in the same species, i.e. inparalogs, result from gene duplications after the species divergence. A new InParanoid website has been developed which is optimized for speed both for users and for updating the system. The XML output format has been improved for efficient processing of the InParanoid ortholog clusters.

  • 29.
    Berglund-Sonnhammer, Ann-Charlotte
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Steffansson, Pär
    Betts, Matthew J.
    Liberles, David A.
    Optimal gene trees from sequences and species trees using a soft interpretation of parsimony. 2006. In: Journal of Molecular Evolution, ISSN 0022-2844, E-ISSN 1432-1432, Vol. 63, no 2, p. 240-250. Article in journal (Refereed)
    Abstract [en]

    Gene duplication and gene loss as well as other biological events can result in multiple copies of genes in a given species. Because of these gene duplication and loss dynamics, in addition to variation in sequence evolution and other sources of uncertainty, different gene trees ultimately present different evolutionary histories. All of this together results in gene trees that give different topologies from each other, making consensus species trees ambiguous in places. Other sources of data to generate species trees are also unable to provide completely resolved binary species trees. However, in addition to gene duplication events, speciation events have provided some underlying phylogenetic signal, enabling development of algorithms to characterize these processes. Therefore, a soft parsimony algorithm has been developed that enables the mapping of gene trees onto species trees and modification of uncertain or weakly supported branches based on minimizing the number of gene duplication and loss events implied by the tree. The algorithm also allows for rooting of unrooted trees and for removal of in-paralogues (lineage-specific duplicates and redundant sequences masquerading as such). The algorithm has also been made available for download as a software package, Softparsmap.

  • 30.
    Besnier, Francois
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Development of Variance Component Methods for Genetic Dissection of Complex Traits. 2009. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    This thesis presents several developments of the variance component (VC) approach for Quantitative Trait Locus (QTL) mapping.

    The first part consists of methodological improvements: a new, fast and efficient method for estimating IBD matrices has been developed. The new method makes better use of computer resources in terms of computational power and storage memory, facilitating further improvements by resolving methodological bottlenecks in algorithms to scan for multiple QTL. A new VC model has also been developed in order to consider and evaluate the correlation of the allelic effects within parental lines of origin in experimental outbred crosses. The method was tested on simulated and experimental data and revealed a higher or similar power to detect QTL than linear regression based QTL mapping.

    The second part focuses on the prospect of analyzing multi-generational pedigrees by the VC approach. The IBD estimation algorithm was extended to include haplotype information in addition to genotype and pedigree to improve the accuracy of the IBD estimates, and a new haplotyping algorithm was developed for limiting the risk of haplotyping errors in multigenerational pedigrees. Those newly developed methods were subsequently applied to the analysis of a nine-generation AIL pedigree obtained after crossing two chicken lines divergently selected for body weight. Nine QTL described in an F2 population were replicated in the AIL pedigree, and our strategy to use both genotype and phenotype information from all individuals in the entire pedigree clearly made efficient use of the available genotype information provided in AIL.

    List of papers
    1. An Improved Method for Quantitative Trait Loci Detection and Identification of Within-Line Segregation in F2 Intercross Designs
    2008 (English). In: Genetics, ISSN 0016-6731, E-ISSN 1943-2631, Vol. 178, no 4, p. 2315-2326. Article in journal (Refereed). Published
    Abstract [en]

    We present a new flexible, simple, and powerful genome-scan method (flexible intercross analysis, FIA) for detecting quantitative trait loci (QTL) in experimental line crosses. The method is based on a pure random-effects model that simultaneously models between- and within-line QTL variation for single as well as epistatic QTL. It utilizes the score statistic and thereby facilitates computationally efficient significance testing based on empirical significance thresholds obtained by means of permutations. The properties of the method are explored using simulations and analyses of experimental data. The simulations showed that the power of FIA was as good as, or better than, Haley–Knott regression and that FIA was rather insensitive to the level of allelic fixation in the founders, especially for pedigrees with few founders. A chromosome scan was conducted for a meat quality trait in an F2 intercross in pigs where a mutation in the halothane (Ryanodine receptor, RYR1) gene with a large effect on meat quality was known to segregate in one founder line. FIA obtained significant support for the halothane-associated QTL and identified the base generation allele with the mutated allele. A genome scan was also performed in a previously analyzed chicken F2 intercross. In the chicken intercross analysis, four previously detected QTL were confirmed at a 5% genomewide significance level, and FIA gave strong evidence (P < 0.01) for two of these QTL to be segregating within the founder lines. FIA was also extended to account for epistasis and using simulations we show that the method provides good estimates of epistatic QTL variance even for segregating QTL. Extensions of FIA and its applications on other intercross populations including backcrosses, advanced intercross lines, and heterogeneous stocks are also discussed.

    National Category
    Genetics
    Research subject
    Genetics
    Identifiers
    urn:nbn:se:uu:diva-101358 (URN); 10.1534/genetics.107.083162 (DOI); 000255239600039 (ISI); 18430952 (PubMedID)
    Available from: 2009-05-06 Created: 2009-04-23 Last updated: 2017-12-13. Bibliographically approved
    2. Fine mapping and replication of QTL in outbred chicken advanced intercross lines
    2011 (English). In: Genetics Selection Evolution, ISSN 0999-193X, E-ISSN 1297-9686, Vol. 43, p. 3. Article in journal (Refereed). Published
    Abstract [en]

    BACKGROUND: Linkage mapping is used to identify genomic regions affecting the expression of complex traits. However, when experimental crosses such as F2 populations or backcrosses are used to map regions containing a Quantitative Trait Locus (QTL), the size of the regions identified remains quite large, i.e. 10 or more Mb. Thus, other experimental strategies are needed to refine the QTL locations. Advanced Intercross Lines (AIL) are produced by repeated intercrossing of F2 animals and successive generations, which decrease linkage disequilibrium in a controlled manner. Although this approach is seen as promising, both to replicate QTL analyses and fine-map QTL, only a few AIL datasets, all originating from inbred founders, have been reported in the literature.

    METHODS: We have produced a nine-generation AIL pedigree (n = 1529) from two outbred chicken lines divergently selected for body weight at eight weeks of age. All animals were weighed at eight weeks of age and genotyped for SNP located in nine genomic regions where significant or suggestive QTL had previously been detected in the F2 population. In parallel, we have developed a novel strategy to analyse the data that uses both genotype and pedigree information of all AIL individuals to replicate the detection of and fine-map QTL affecting juvenile body weight.

    RESULTS: Five of the nine QTL detected with the original F2 population were confirmed and fine-mapped with the AIL, while for the remaining four, only suggestive evidence of their existence was obtained. All original QTL were confirmed as a single locus, except for one, which split into two linked QTL.

    CONCLUSIONS: Our results indicate that many of the QTL, which are genome-wide significant or suggestive in the analyses of large intercross populations, are true effects that can be replicated and fine-mapped using AIL. Key factors for success are the use of large populations and powerful statistical tools. Moreover, we believe that the statistical methods we have developed to efficiently study outbred AIL populations will increase the number of organisms for which in-depth complex traits can be analyzed.

     

    National Category
    Genetics
    Research subject
    Genetics
    Identifiers
    urn:nbn:se:uu:diva-101398 (URN); 10.1186/1297-9686-43-3 (DOI); 000287133300001 (ISI); 21241486 (PubMedID)
    Available from: 2009-04-24 Created: 2009-04-24 Last updated: 2017-12-13
    3. [Record not available]
    4. A genetic algorithm based method for stringent haplotyping of family data
    2009 (English). In: BMC Genetics, ISSN 1471-2156, E-ISSN 1471-2156, Vol. 10, article id 57. Article in journal (Refereed). Published
    Abstract [en]

    Background: The linkage phase, or haplotype, is an extra level of information that in addition to genotype and pedigree can be useful for reconstructing the inheritance pattern of the alleles in a pedigree, and computing for example Identity By Descent probabilities. If a haplotype is provided, the precision of estimated IBD probabilities increases, as long as the haplotype is estimated without errors. It is therefore important to only use haplotypes that are strongly supported by the available data for IBD estimation, to avoid introducing new errors due to erroneous linkage phases.

    Results: We propose a genetic algorithm based method for haplotype estimation in family data that includes a stringency parameter. This allows the user to decide the error tolerance level when inferring parental origin of the alleles. This is a novel feature compared to existing methods for haplotype estimation. We show that using a high stringency produces haplotype data with few errors, whereas a low stringency provides haplotype estimates in most situations, but with an increased number of errors.

    Conclusion: By including a stringency criterion in our haplotyping method, the user is able to maintain the error rate at a suitable level for the particular study; one can select anything from haplotyped data with very small proportion of errors and a higher proportion of non-inferred haplotypes, to data with phase estimates for every marker, when haplotype errors are tolerable. Giving this choice makes the method more flexible and useful in a wide range of applications as it is able to fulfil different requirements regarding the tolerance for haplotype errors, or uncertain marker-phases.

    National Category
    Bioinformatics and Systems Biology
    Research subject
    Genetics
    Identifiers
    urn:nbn:se:uu:diva-101397 (URN); 10.1186/1471-2156-10-57 (DOI); 000270360900001 (ISI); 19761594 (PubMedID)
    Note

    Manuscript title in list of papers in thesis: A genetic algorithm based haplotyping method provides better control on haplotype error rate

    Available from: 2009-04-24 Created: 2009-04-24 Last updated: 2017-12-13. Bibliographically approved
  • 31.
    Besnier, Francois
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Carlborg, Örjan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Swedish University of Agricultural Sciences, Uppsala, Sweden.
    A general and efficient method for estimating continuous IBD functions for use in genome scans for QTL. 2007. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 8, article id 440. Article in journal (Refereed)
    Abstract [en]

    Background: Identity by descent (IBD) matrix estimation is a central component in mapping of Quantitative Trait Loci (QTL) using variance component models. A large number of algorithms have been developed for estimation of IBD between individuals in populations at discrete locations in the genome for use in genome scans to detect QTL affecting various traits of interest in experimental animal, human and agricultural pedigrees. Here, we propose a new approach to estimate IBD as continuous functions rather than as discrete values.

    Results: Estimation of IBD functions improved the computational efficiency and memory usage in genome scanning for QTL. We have explored two approaches to obtain continuous marker-bracket IBD functions. By re-implementing an existing and fast deterministic IBD-estimation method, we show that this approach results in IBD functions that produce the exact same IBD as the original algorithm, but with a greater than 2-fold improvement of the computational efficiency and a considerably lower memory requirement for storing the resulting genome-wide IBD. By developing a general IBD function approximation algorithm, we show that it is possible to estimate marker-bracket IBD functions from IBD matrices estimated at marker locations by any existing IBD estimation algorithm. The general algorithm provides approximations that lead to QTL variance component estimates that even in worst-case scenarios are very similar to the true values. The approach of storing IBD as polynomial IBD functions was also shown to reduce the amount of memory required in genome scans for QTL.

    Conclusion: In addition to direct improvements in computational and memory efficiency, estimation of IBD functions is a fundamental step needed to develop and implement new efficient optimization algorithms for high precision localization of QTL. Here, we discuss and test two approaches for estimating IBD functions based on existing IBD estimation algorithms. Our approaches provide immediately useful techniques for use in single QTL analyses in the variance component QTL mapping framework. They will, however, be particularly useful in genome scans for multiple interacting QTL, where the improvements in both computational and memory efficiency are the key for successful development of efficient optimization algorithms to allow widespread use of this methodology.

  • 32.
    Besnier, Francois
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Carlborg, Örjan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    A genetic algorithm based method for stringent haplotyping of family data. 2009. In: BMC Genetics, ISSN 1471-2156, E-ISSN 1471-2156, Vol. 10, article id 57. Article in journal (Refereed)
    Abstract [en]

    Background: The linkage phase, or haplotype, is an extra level of information that in addition to genotype and pedigree can be useful for reconstructing the inheritance pattern of the alleles in a pedigree, and computing for example Identity By Descent probabilities. If a haplotype is provided, the precision of estimated IBD probabilities increases, as long as the haplotype is estimated without errors. It is therefore important to only use haplotypes that are strongly supported by the available data for IBD estimation, to avoid introducing new errors due to erroneous linkage phases.

    Results: We propose a genetic algorithm based method for haplotype estimation in family data that includes a stringency parameter. This allows the user to decide the error tolerance level when inferring parental origin of the alleles. This is a novel feature compared to existing methods for haplotype estimation. We show that using a high stringency produces haplotype data with few errors, whereas a low stringency provides haplotype estimates in most situations, but with an increased number of errors.

    Conclusion: By including a stringency criterion in our haplotyping method, the user is able to maintain the error rate at a suitable level for the particular study; one can select anything from haplotyped data with very small proportion of errors and a higher proportion of non-inferred haplotypes, to data with phase estimates for every marker, when haplotype errors are tolerable. Giving this choice makes the method more flexible and useful in a wide range of applications as it is able to fulfil different requirements regarding the tolerance for haplotype errors, or uncertain marker-phases.

  • 33.
    Besnier, Francois
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wahlberg, Per
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    Rönnegård, Lars
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Ek, Weronica
    Swedish University of Agricultural Sciences.
    Andersson, Leif
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Siegel, Paul
    Virginia Polytechnic Institute and State University.
    Carlborg, Örjan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Fine mapping and replication of QTL in outbred chicken advanced intercross lines. 2011. In: Genetics Selection Evolution, ISSN 0999-193X, E-ISSN 1297-9686, Vol. 43, p. 3. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: Linkage mapping is used to identify genomic regions affecting the expression of complex traits. However, when experimental crosses such as F2 populations or backcrosses are used to map regions containing a Quantitative Trait Locus (QTL), the size of the regions identified remains quite large, i.e. 10 or more Mb. Thus, other experimental strategies are needed to refine the QTL locations. Advanced Intercross Lines (AIL) are produced by repeated intercrossing of F2 animals and successive generations, which decrease linkage disequilibrium in a controlled manner. Although this approach is seen as promising, both to replicate QTL analyses and fine-map QTL, only a few AIL datasets, all originating from inbred founders, have been reported in the literature.

    METHODS: We have produced a nine-generation AIL pedigree (n = 1529) from two outbred chicken lines divergently selected for body weight at eight weeks of age. All animals were weighed at eight weeks of age and genotyped for SNP located in nine genomic regions where significant or suggestive QTL had previously been detected in the F2 population. In parallel, we have developed a novel strategy to analyse the data that uses both genotype and pedigree information of all AIL individuals to replicate the detection of and fine-map QTL affecting juvenile body weight.

    RESULTS: Five of the nine QTL detected with the original F2 population were confirmed and fine-mapped with the AIL, while for the remaining four, only suggestive evidence of their existence was obtained. All original QTL were confirmed as a single locus, except for one, which split into two linked QTL.

    CONCLUSIONS: Our results indicate that many of the QTL, which are genome-wide significant or suggestive in the analyses of large intercross populations, are true effects that can be replicated and fine-mapped using AIL. Key factors for success are the use of large populations and powerful statistical tools. Moreover, we believe that the statistical methods we have developed to efficiently study outbred AIL populations will increase the number of organisms for which in-depth complex traits can be analyzed.

     

  • 34. Birney, Ewan
    et al.
    Stamatoyannopoulos, John A.
    Dutta, Anindya
    Guigó, Roderic
    Gingeras, Thomas R.
    Margulies, Elliott H.
    Weng, Zhiping
    Snyder, Michael
    Dermitzakis, Emmanouil T.
    Thurman, Robert E.
    Kuehn, Michael S.
    Taylor, Christopher M.
    Neph, Shane
    Koch, Christoph M.
    Asthana, Saurabh
    Malhotra, Ankit
    Adzhubei, Ivan
    Greenbaum, Jason A.
    Andrews, Robert M.
    Flicek, Paul
    Boyle, Patrick J.
    Cao, Hua
    Carter, Nigel P.
    Clelland, Gayle K.
    Davis, Sean
    Day, Nathan
    Dhami, Pawandeep
    Dillon, Shane C.
    Dorschner, Michael O.
    Fiegler, Heike
    Giresi, Paul G.
    Goldy, Jeff
    Hawrylycz, Michael
    Haydock, Andrew
    Humbert, Richard
    James, Keith D.
    Johnson, Brett E.
    Johnson, Ericka M.
    Frum, Tristan T.
    Rosenzweig, Elizabeth R.
    Karnani, Neerja
    Lee, Kirsten
    Lefebvre, Gregory C.
    Navas, Patrick A.
    Neri, Fidencio
    Parker, Stephen C.
    Sabo, Peter J.
    Sandstrom, Richard
    Shafer, Anthony
    Vetrie, David
    Weaver, Molly
    Wilcox, Sarah
    Yu, Man
    Collins, Francis S.
    Dekker, Job
    Lieb, Jason D.
    Tullius, Thomas D.
    Crawford, Gregory E.
    Sunyaev, Shamil
    Noble, William S.
    Dunham, Ian
    Denoeud, France
    Reymond, Alexandre
    Kapranov, Philipp
    Rozowsky, Joel
    Zheng, Deyou
    Castelo, Robert
    Frankish, Adam
    Harrow, Jennifer
    Ghosh, Srinka
    Sandelin, Albin
    Hofacker, Ivo L.
    Baertsch, Robert
    Keefe, Damian
    Dike, Sujit
    Cheng, Jill
    Hirsch, Heather A.
    Sekinger, Edward A.
    Lagarde, Julien
    Abril, Josep F.
    Shahab, Atif
    Flamm, Christoph
    Fried, Claudia
    Hackermüller, Jörg
    Hertel, Jana
    Lindemeyer, Manja
    Missal, Kristin
    Tanzer, Andrea
    Washietl, Stefan
    Korbel, Jan
    Emanuelsson, Olof
    Pedersen, Jakob S.
    Holroyd, Nancy
    Taylor, Ruth
    Swarbreck, David
    Matthews, Nicholas
    Dickson, Mark C.
    Thomas, Daryl J.
    Weirauch, Matthew T.
    Gilbert, James
    Drenkow, Jorg
    Bell, Ian
    Zhao, XiaoDong
    Srinivasan, K. G.
    Sung, Wing-Kin
    Ooi, Hong Sain
    Chiu, Kuo Ping
    Foissac, Sylvain
    Alioto, Tyler
    Brent, Michael
    Pachter, Lior
    Tress, Michael L.
    Valencia, Alfonso
    Choo, Siew Woh
    Choo, Chiou Yu
    Ucla, Catherine
    Manzano, Caroline
    Wyss, Carine
    Cheung, Evelyn
    Clark, Taane G.
    Brown, James B.
    Ganesh, Madhavan
    Patel, Sandeep
    Tammana, Hari
    Chrast, Jacqueline
    Henrichsen, Charlotte N.
    Kai, Chikatoshi
    Kawai, Jun
    Nagalakshmi, Ugrappa
    Wu, Jiaqian
    Lian, Zheng
    Lian, Jin
    Newburger, Peter
    Zhang, Xueqing
    Bickel, Peter
    Mattick, John S.
    Carninci, Piero
    Hayashizaki, Yoshihide
    Weissman, Sherman
    Hubbard, Tim
    Myers, Richard M.
    Rogers, Jane
    Stadler, Peter F.
    Lowe, Todd M.
    Wei, Chia-Lin
    Ruan, Yijun
    Struhl, Kevin
    Gerstein, Mark
    Antonarakis, Stylianos E.
    Fu, Yutao
    Green, Eric D.
    Karaöz, U.
    Siepel, Adam
    Taylor, James
    Liefer, Laura A
    Wetterstrand, Kris A.
    Good, Peter J.
    Feingold, Elise A.
    Guyer, Mark S.
    Cooper, Gregory M.
    Asimenos, George
    Dewey, Colin N.
    Hou, Minmei
    Nikolaev, Sergey
    Montoya-Burgos, Juan I.
    Löytynoja, Ari
    Whelan, Simon
    Pardi, Fabio
    Massingham, Tim
    Huang, Haiyan
    Zhang, Nancy R.
    Holmes, Ian
    Mullikin, James C.
    Ureta-Vidal, Abel
    Paten, Benedict
    Seringhaus, Michael
    Church, Deanna
    Rosenbloom, Kate
    Kent, W. James
    Stone, Eric A.
    Batzoglou, Serafim
    Goldman, Nick
    Hardison, Ross C.
    Haussler, David
    Miller, Webb
    Sidow, Arend
    Trinklein, Nathan D.
    Zhang, Zhengdong D.
    Barrera, Leah
    Stuart, Rhona
    King, David C.
    Ameur, Adam
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Bieda, Mark C.
    Kim, Jonghwan
    Bhinge, Akshay A.
    Jiang, Nan
    Liu, Jun
    Yao, Fei
    Vega, Vinsensius B.
    Lee, Charlie W.
    Ng, Patrick
    Shahab, Atif
    Yang, Annie
    Moqtaderi, Zarmik
    Zhu, Zhou
    Xu, Xiaoqin
    Squazzo, Sharon
    Oberley, Matthew J.
    Inman, David
    Singer, Michael A.
    Richmond, Todd A.
    Munn, Kyle J.
    Rada-Iglesias, Alvaro
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Wallerman, Ola
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Fowler, Joanna C.
    Couttet, Phillippe
    Bruce, Alexander W.
    Dovey, Oliver M.
    Ellis, Peter D.
    Langford, Cordelia F.
    Nix, David A.
    Euskirchen, Ghia
    Hartman, Stephen
    Urban, Alexander E.
    Kraus, Peter
    Van Calcar, Sara
    Heintzman, Nate
    Kim, Tae Hoon
    Wang, Kun
    Qu, Chunxu
    Hon, Gary
    Luna, Rosa
    Glass, Christopher K.
    Rosenfeld, M. Geoff
    Aldred, Shelley Force
    Cooper, Sara J.
    Halees, Anason
    Lin, Jane M.
    Shulha, Hennady P.
    Zhang, Xiaoling
    Xu, Mousheng
    Haidar, Jaafar N.
    Yu, Yong
    Ruan, Yijun
    Iyer, Vishwanath R.
    Green, Roland D.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Farnham, Peggy J.
    Ren, Bing
    Harte, Rachel A.
    Hinrichs, Angie S.
    Trumbower, Heather
    Clawson, Hiram
    Hillman-Jackson, Jennifer
    Zweig, Ann S.
    Smith, Kayla
    Thakkapallayil, Archana
    Barber, Galt
    Kuhn, Robert M.
    Karolchik, Donna
    Armengol, Lluis
    Bird, Christine P.
    de Bakker, Paul I.
    Kern, Andrew D.
    Lopez-Bigas, Nuria
    Martin, Joel D.
    Stranger, Barbara E.
    Woodroffe, Abigail
    Davydov, Eugene
    Dimas, Antigone
    Eyras, Eduardo
    Hallgrímsdóttir, Ingileif B.
    Huppert, Julian
    Zody, Michael C.
    Abecasis, G. R.
    Estivill, Xavier
    Bouffard, Gerard G.
    Guan, Xiaobin
    Hansen, Nancy F.
    Idol, Jacquelyn R.
    Maduro, Valerie V.
    Maskeri, Baishali
    McDowell, Jennifer C.
    Park, Morgan
    Thomas, Pamela J.
    Young, Alice C.
    Blakesley, Robert W.
    Muzny, Donna M.
    Sodergren, Erica
    Wheeler, David A.
    Worley, Kim C.
    Jiang, Huaiyang
    Weinstock, George M.
    Gibbs, Richard A.
    Graves, Tina
    Fulton, Robert
    Mardis, Elaine R.
    Wilson, Richard K.
    Clamp, Michele
    Cuff, James
    Gnerre, Sante
    Jaffe, David B.
    Chang, Jean L.
    Lindblad-Toh, Kerstin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Lander, Eric S.
    Koriabine, Maxim
    Nefedov, Mikhail
    Osoegawa, Kazutoyo
    Yoshinaga, Yuko
    Zhu, Baoli
    de Jong, Pieter J.
    Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project2007In: Nature, ISSN 0028-0836, E-ISSN 1476-4687, Vol. 447, no 7146, p. 799-816Article in journal (Refereed)
    Abstract [en]

    We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

  • 35.
    Björkholm, Patrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Daniluk, Pawel
    Kryshtafovych, Andriy
    Fidelis, Krzysztof
    Andersson, Robin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Hvidsten, Torgeir R.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts2009In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 10, p. 1264-1270Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Correct prediction of residue-residue contacts in proteins that lack good templates with known structure would take ab initio protein structure prediction a large step forward. The lack of correct contacts, and in particular long-range contacts, is considered the main reason why these methods often fail. RESULTS: We propose a novel hidden Markov model based method for predicting residue-residue contacts from protein sequences using as training data homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structure). The library consists of recurring structural entities incorporating short-, medium- and long-range interactions and is general enough to reassemble the cores of nearly all proteins in the PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set as well as 151 domains with SCOP folds not present in the training set. Considering the top 0.2 L predictions (L = sequence length), our hidden Markov models obtained an accuracy of 22.8% for long-range interactions in new fold targets, and an average accuracy of 28.6% for long-, medium- and short-range contacts. This is a significant performance increase over currently available methods when comparing against results published in the literature.

  • 36.
    Borovicanin B, Grunewald S, Gutman I, Petrovic M
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Harmonic graphs with small number of cycles2003In: Discrete mathematics, ISSN 0012-365X, Vol. 265, no 1-3, p. 31-44Article in journal (Refereed)
  • 37. Bruder, Carl E G
    et al.
    Piotrowski, Arkadiusz
    Gijsbers, Antoinet A C J
    Andersson, Robin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Erickson, Stephen
    de Ståhl, Teresita Diaz
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Menzel, Uwe
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Sandgren, Johanna
    von Tell, Desiree
    Poplawski, Andrzej
    Crowley, Michael
    Crasto, Chiquito
    Partridge, E Christopher
    Tiwari, Hemant
    Allison, David B
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    van Ommen, Gert-Jan B
    Boomsma, Dorret I
    Pedersen, Nancy L
    den Dunnen, Johan T
    Wirdefeldt, Karin
    Dumanski, Jan P
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Phenotypically concordant and discordant monozygotic twins display different DNA copy-number-variation profiles2008In: American Journal of Human Genetics, ISSN 0002-9297, E-ISSN 1537-6605, Vol. 82, no 3, p. 763-71Article in journal (Refereed)
    Abstract [en]

    The exploration of copy-number variation (CNV), notably of somatic cells, is an understudied aspect of genome biology. Any differences in the genetic makeup between twins derived from the same zygote represent an irrefutable example of somatic mosaicism. We studied 19 pairs of monozygotic twins with either concordant or discordant phenotype by using two platforms for genome-wide CNV analyses and showed that CNVs exist within pairs in both groups. These findings have an impact on our views of genotypic and phenotypic diversity in monozygotic twins and suggest that CNV analysis in phenotypically discordant monozygotic twins may provide a powerful tool for identifying disease-predisposition loci. Our results also imply that caution should be exercised when interpreting disease causality of de novo CNVs found in patients based on analysis of a single tissue in routine disease-related DNA diagnostics.

  • 38. Bryant, D
    et al.
    Moulton, V
    Uppsala University, Teknisk-naturvetenskapliga vetenskapsområdet, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Neighbor-Net: An agglomerative method for the construction of phylogenetic networks2004In: Molecular Biology and Evolution, ISSN 0737-4038, Vol. 21, no 2, p. 255-265Article in journal (Refereed)
    Abstract [en]

    We present Neighbor-Net, a distance-based method for constructing phylogenetic networks that is based on the Neighbor-Joining (NJ) algorithm of Saitou and Nei. Neighbor-Net provides a snapshot of the data that can guide more detailed analysis. Unlike split decomposition, Neighbor-Net scales well and can quickly produce detailed and informative networks for several hundred taxa. We illustrate the method by reanalyzing three published data sets: a collection of 110 highly recombinant Salmonella multi-locus sequence typing sequences, the 135 "African Eve" human mitochondrial sequences published by Vigilant et al., and a collection of 12 Archaeal chaperonin sequences demonstrating strong evidence for gene conversion. Neighbor-Net is available as part of the SplitsTree4 software package.

  • 39.
    Carlborg, Örjan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Haley, Chris S
    Epistasis: too often neglected in complex trait studies?2004In: Nat Rev Genet, ISSN 1471-0056, Vol. 5, no 8, p. 618-25Article in journal (Other academic)
  • 40.
    Carlborg, Örjan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Jacobsson, Lina
    Ahgren, Per
    Siegel, Paul
    Andersson, Leif
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Epistasis and the release of genetic variation during long-term selection.2006In: Nat Genet, ISSN 1061-4036, Vol. 38, no 4, p. 418-20Article in journal (Refereed)
  • 41.
    Cerenius, Lage
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Comparative Physiology.
    Andersson, M. Gunnar
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Söderhäll, Kenneth
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Comparative Physiology.
    Aphanomyces astaci and crustaceans.2009In: Oomycete Genetics and Genomics: Diversity, Interactions, and Research Tools / [ed] Kurt Lamour and Sophien Kamoun, Hoboken, New Jersey: John Wiley & Sons, Inc., 2009, p. 425-433Chapter in book (Other academic)
  • 42.
    Cerenius, Lage
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Physiology and Developmental Biology, Comparative Physiology. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology.
    Haipeng, Liu
    State Key Laboratory of Marine Environmental Science, College of Oceanography and Environmental Science, Xiamen University, Xiamen, 361005 Fujian, China.
    Zhang, Yanjiao
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Physiology and Developmental Biology, Comparative Physiology.
    Rimphanitchayakit, Vichien
    Center of Excellence for Molecular Biology and Genomics of Shrimp, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand.
    Tassanakajon, Anchalee
    Center of Excellence for Molecular Biology and Genomics of Shrimp, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand.
    Andersson, M. Gunnar
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Söderhäll, Kenneth
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Physiology and Developmental Biology, Comparative Physiology. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology.
    Söderhäll, Irene
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Physiology and Developmental Biology, Comparative Physiology. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology.
    High sequence variability among hemocyte-specific Kazal-type proteinase inhibitors in decapod crustaceans2010In: Developmental and Comparative Immunology, ISSN 0145-305X, E-ISSN 1879-0089, Vol. 34, no 1, p. 69-75Article in journal (Refereed)
    Abstract [en]

    Crustacean hemocytes were found to produce a large number of transcripts coding for Kazal-type proteinase inhibitors (KPIs). A detailed study performed with the crayfish Pacifastacus leniusculus and the shrimp Penaeus monodon revealed the presence of at least 26 and 20 different Kazal domains from the hemocyte KPIs, respectively. Comparisons with KPIs from other taxa indicate that the sequences of these domains evolve rapidly. A few conserved positions, e.g. six invariant cysteines, were present in all domain sequences, whereas the position of the P1 amino acid, a determinant for substrate specificity, varied highly. A study with a single crayfish animal suggested that even at the individual level considerable sequence variability exists among the hemocyte KPIs produced. Expression analysis of four crayfish KPI transcripts in hematopoietic tissue cells and different hemocyte types suggests that some of these KPIs are likely to be involved in hematopoiesis or hemocyte release, as they were produced in particular hemocyte types or maturation stages only.

  • 43. Chang, W.-J.
    et al.
    Addis, V.M.
    Li, A.J.
    Axelsson, Elin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Ardell, David H.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Landweber, L.F.
    Intron Evolution and Information processing in the DNA polymerase alpha gene in spirotrichous ciliates: A hypothesis for interconversion between DNA and RNA deletion2007In: Biology Direct, ISSN 1745-6150, E-ISSN 1745-6150, Vol. 2, p. 6-Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: The somatic DNA molecules of spirotrichous ciliates are present as linear chromosomes containing mostly single-gene coding sequences with short 5' and 3' flanking regions. Only a few conserved motifs have been found in the flanking DNA. Motifs that may play roles in promoting and/or regulating transcription have not been consistently detected. Moreover, comparing subtelomeric regions of 1,356 end-sequenced somatic chromosomes failed to identify more putatively conserved motifs. RESULTS: We sequenced and compared DNA and RNA versions of the DNA polymerase alpha (pol alpha) gene from nine diverged spirotrichous ciliates. We identified a G-C rich motif aaTACCGC(G/C/T) upstream from transcription start sites in all nine pol alpha orthologs. Furthermore, we consistently found likely polyadenylation signals, similar to the eukaryotic consensus AAUAAA, within 35 nt upstream of the polyadenylation sites. Numbers of introns differed among orthologs, suggesting independent gain or loss of some introns during the evolution of this gene. Finally, we discuss the occurrence of short direct repeats flanking some introns in the DNA pol alpha genes. These introns flanked by direct repeats resemble a class of DNA sequences called internal eliminated sequences (IES) that are deleted from ciliate chromosomes during development. CONCLUSIONS: Our results suggest that conserved motifs are present at both 5' and 3' untranscribed regions of the DNA pol alpha genes in nine spirotrichous ciliates. We also show that several independent gains and losses of introns in the DNA pol alpha genes have occurred in the spirotrichous ciliate lineage. Finally, our statistical results suggest that proven introns might also function in an IES removal pathway. This could strengthen a recent hypothesis that introns evolve into IESs, explaining the scarcity of introns in spirotrichs. 
    Alternatively, the analysis suggests that ciliates might occasionally use intron splicing to correct, at the RNA level, failures in IES excision during developmental DNA elimination.

  • 44. Clark, Andrew G.
    et al.
    Eisen, Michael B.
    Smith, Douglas R.
    Bergman, Casey M.
    Oliver, Brian
    Markow, Therese A.
    Kaufman, Thomas C.
    Kellis, Manolis
    Gelbart, William
    Iyer, Venky N.
    Pollard, Daniel A.
    Sackton, Timothy B.
    Larracuente, Amanda M.
    Singh, Nadia D.
    Abad, Jose P.
    Abt, Dawn N.
    Adryan, Boris
    Aguade, Montserrat
    Akashi, Hiroshi
    Anderson, Wyatt W.
    Aquadro, Charles F.
    Ardell, David H.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Arguello, Roman
    Artieri, Carlo G.
    Barbash, Daniel A.
    Barker, Daniel
    Barsanti, Paolo
    Batterham, Phil
    Batzoglou, Serafim
    Begun, Dave
    Bhutkar, Arjun
    Blanco, Enrico
    Bosak, Stephanie A.
    Bradley, Robert K.
    Brand, Adrianne D.
    Brent, Michael R.
    Brooks, Angela N.
    Brown, Randall H.
    Butlin, Roger K.
    Caggese, Corrado
    Calvi, Brian R.
    de Carvalho, A. Bernardo
    Caspi, Anat
    Castrezana, Sergio
    Celniker, Susan E.
    Chang, Jean L.
    Chapple, Charles
    Chatterji, Sourav
    Chinwalla, Asif
    Civetta, Alberto
    Clifton, Sandra W.
    Comeron, Josep M.
    Costello, James C.
    Coyne, Jerry A.
    Daub, Jennifer
    David, Robert G.
    Delcher, Arthur L.
    Delehaunty, Kim
    Do, Chuong B.
    Ebling, Heather
    Edwards, Kevin
    Eickbush, Thomas
    Evans, Jay D.
    Filipski, Alan
    Findeiss, Sven
    Freyhult, Eva
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Fulton, Lucinda
    Fulton, Robert
    Garcia, Ana C. L.
    Gardiner, Anastasia
    Garfield, David A.
    Garvin, Barry E.
    Gibson, Greg
    Gilbert, Don
    Gnerre, Sante
    Godfrey, Jennifer
    Good, Robert
    Gotea, Valer
    Gravely, Brenton
    Greenberg, Anthony J.
    Griffiths-Jones, Sam
    Gross, Samuel
    Guigo, Roderic
    Gustafson, Erik A.
    Haerty, Wilfried
    Hahn, Matthew W.
    Halligan, Daniel L.
    Halpern, Aaron L.
    Halter, Gillian M.
    Han, Mira V.
    Heger, Andreas
    Hillier, LaDeana
    Hinrichs, Angie S.
    Holmes, Ian
    Hoskins, Roger A.
    Hubisz, Melissa J.
    Hultmark, Dan
    Huntley, Melanie A.
    Jaffe, David B.
    Jagadeeshan, Santosh
    Jeck, William R.
    Johnson, Justin
    Jones, Corbin D.
    Jordan, William C.
    Karpen, Gary H.
    Kataoka, Eiko
    Keightley, Peter D.
    Kheradpour, Pouya
    Kirkness, Ewen F.
    Koerich, Leonardo B.
    Kristiansen, Karsten
    Kudrna, Dave
    Kulathinal, Rob J.
    Kumar, Sudhir
    Kwok, Roberta
    Lander, Eric
    Langley, Charles H.
    Lapoint, Richard
    Lazzaro, Brian P.
    Lee, So-Jeong
    Levesque, Lisa
    Li, Ruiqiang
    Lin, Chiao-Feng
    Lin, Michael F.
    Lindblad-Toh, Kerstin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Llopart, Ana
    Long, Manyuan
    Low, Lloyd
    Lozovsky, Elena
    Lu, Jian
    Luo, Meizhong
    Machado, Carlos A.
    Makalowski, Wojciech
    Marzo, Mar
    Matsuda, Muneo
    Matzkin, Luciano
    McAllister, Bryant
    McBride, Carolyn S.
    McKernan, Brendan
    McKernan, Kevin
    Mendez-Lago, Maria
    Minx, Patrick
    Mollenhauer, Michael U.
    Montooth, Kristi
    Mount, Stephen M.
    Mu, Xu
    Myers, Eugene
    Negre, Barbara
    Newfeld, Stuart
    Nielsen, Rasmus
    Noor, Mohamed A. F.
    O'Grady, Patrick
    Pachter, Lior
    Papaceit, Montserrat
    Parisi, Matthew J.
    Parisi, Michael
    Parts, Leopold
    Pedersen, Jakob S.
    Pesole, Graziano
    Phillippy, Adam M.
    Ponting, Chris P.
    Pop, Mihai
    Porcelli, Damiano
    Powell, Jeffrey R.
    Prohaska, Sonja
    Pruitt, Kim
    Puig, Marta
    Quesneville, Hadi
    Ram, Kristipati Ravi
    Rand, David
    Rasmussen, Matthew D.
    Reed, Laura K.
    Reenan, Robert
    Reily, Amy
    Remington, Karin A.
    Rieger, Tania T.
    Ritchie, Michael G.
    Robin, Charles
    Rogers, Yu-Hui
    Rohde, Claudia
    Rozas, Julio
    Rubenfield, Marc J.
    Ruiz, Alfredo
    Russo, Susan
    Salzberg, Steven L.
    Sanchez-Gracia, Alejandro
    Saranga, David J.
    Sato, Hajime
    Schaeffer, Stephen W.
    Schatz, Michael C.
    Schlenke, Todd
    Schwartz, Russell
    Segarra, Carmen
    Singh, Rama S.
    Sirot, Laura
    Sirota, Marina
    Sisneros, Nicholas B.
    Smith, Chris D.
    Smith, Temple F.
    Spieth, John
    Stage, Deborah E.
    Stark, Alexander
    Stephan, Wolfgang
    Strausberg, Robert L.
    Strempel, Sebastian
    Sturgill, David
    Sutton, Granger
    Sutton, Granger G.
    Tao, Wei
    Teichmann, Sarah
    Tobari, Yoshiko N.
    Tomimura, Yoshihiko
    Tsolas, Jason M.
    Valente, Vera L. S.
    Venter, Eli
    Venter, J. Craig
    Vicario, Saverio
    Vieira, Filipe G.
    Vilella, Albert J.
    Villasante, Alfredo
    Walenz, Brian
    Wang, Jun
    Wasserman, Marvin
    Watts, Thomas
    Wilson, Derek
    Wilson, Richard K.
    Wing, Rod A.
    Wolfner, Mariana F.
    Wong, Alex
    Wong, Gane Ka-Shu
    Wu, Chung-I
    Wu, Gabriel
    Yamamoto, Daisuke
    Yang, Hsiao-Pei
    Yang, Shiaw-Pyng
    Yorke, James A.
    Yoshida, Kiyohito
    Zdobnov, Evgeny
    Zhang, Peili
    Zhang, Yu
    Zimin, Aleksey V.
    Baldwin, Jennifer
    Abdouelleil, Amr
    Abdulkadir, Jamal
    Abebe, Adal
    Abera, Brikti
    Abreu, Justin
    Acer, St Christophe
    Aftuck, Lynne
    Alexander, Allen
    An, Peter
    Anderson, Erica
    Anderson, Scott
    Arachi, Harindra
    Azer, Marc
    Bachantsang, Pasang
    Barry, Andrew
    Bayul, Tashi
    Berlin, Aaron
    Bessette, Daniel
    Bloom, Toby
    Blye, Jason
    Boguslavskiy, Leonid
    Bonnet, Claude
    Boukhgalter, Boris
    Bourzgui, Imane
    Brown, Adam
    Cahill, Patrick
    Channer, Sheridon
    Cheshatsang, Yama
    Chuda, Lisa
    Citroen, Mieke
    Collymore, Alville
    Cooke, Patrick
    Costello, Maura
    D'Aco, Katie
    Daza, Riza
    De Haan, Georgius
    DeGray, Stuart
    DeMaso, Christina
    Dhargay, Norbu
    Dooley, Kimberly
    Dooley, Erin
    Doricent, Missole
    Dorje, Passang
    Dorjee, Kunsang
    Dupes, Alan
    Elong, Richard
    Falk, Jill
    Farina, Abderrahim
    Faro, Susan
    Ferguson, Diallo
    Fisher, Sheila
    Foley, Chelsea D.
    Franke, Alicia
    Friedrich, Dennis
    Gadbois, Loryn
    Gearin, Gary
    Gearin, Christina R.
    Giannoukos, Georgia
    Goode, Tina
    Graham, Joseph
    Grandbois, Edward
    Grewal, Sharleen
    Gyaltsen, Kunsang
    Hafez, Nabil
    Hagos, Birhane
    Hall, Jennifer
    Henson, Charlotte
    Hollinger, Andrew
    Honan, Tracey
    Huard, Monika D.
    Hughes, Leanne
    Hurhula, Brian
    Husby, M. Erii
    Kamat, Asha
    Kanga, Ben
    Kashin, Seva
    Khazanovich, Dmitry
    Kisner, Peter
    Lance, Krista
    Lara, Marcia
    Lee, William
    Lennon, Niall
    Letendre, Frances
    LeVine, Rosie
    Lipovsky, Alex
    Liu, Xiaohong
    Liu, Jinlei
    Liu, Shangtao
    Lokyitsang, Tashi
    Lokyitsang, Yeshi
    Lubonja, Rakela
    Lui, Annie
    MacDonald, Pen
    Magnisalis, Vasilia
    Maru, Kebede
    Matthews, Charles
    McCusker, William
    McDonough, Susan
    Mehta, Teena
    Meldrim, James
    Meneus, Louis
    Mihai, Oana
    Mihalev, Atanas
    Mihova, Tanya
    Mittelman, Rachel
    Mlenga, Valentine
    Montmayeur, Anna
    Mulrain, Leonidas
    Navidi, Adam
    Naylor, Jerome
    Negash, Tamrat
    Nguyen, Thu
    Nguyen, Nga
    Nicol, Robert
    Norbu, Choe
    Norbu, Nyima
    Novod, Nathaniel
    O'Neill, Barry
    Osman, Sahal
    Markiewicz, Eva
    Oyono, Otero L.
    Patti, Christopher
    Phunkhang, Pema
    Pierre, Fritz
    Priest, Margaret
    Raghuraman, Sujaa
    Rege, Filip
    Reyes, Rebecca
    Rise, Cecil
    Rogov, Peter
    Ross, Keenan
    Ryan, Elizabeth
    Settipalli, Sampath
    Shea, Terry
    Sherpa, Ngawang
    Shi, Lu
    Shih, Diana
    Sparrow, Todd
    Spaulding, Jessica
    Stalker, John
    Stange-Thomann, Nicole
    Stavropoulos, Sharon
    Stone, Catherine
    Strader, Christopher
    Tesfaye, Senait
    Thomson, Talene
    Thoulutsang, Yama
    Thoulutsang, Dawa
    Topham, Kerri
    Topping, Ira
    Tsamla, Tsamla
    Vassiliev, Helen
    Vo, Andy
    Wangchuk, Tsering
    Wangdi, Tsering
    Weiand, Michael
    Wilkinson, Jane
    Wilson, Adam
    Yadav, Shailendra
    Young, Geneva
    Yu, Qing
    Zembek, Lisa
    Zhong, Danni
    Zimmer, Andrew
    Zwirko, Zac
    Alvarez, Pablo
    Brockman, Will
    Butler, Jonathan
    Chin, CheeWhye
    Grabherr, Manfred
    Kleber, Michael
    Mauceli, Evan
    MacCallum, Iain
    Evolution of genes and genomes on the Drosophila phylogeny. 2007. In: Nature, ISSN 0028-0836, E-ISSN 1476-4687, Vol. 450, no 7167, p. 203-218. Article in journal (Refereed)
    Abstract [en]

    Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.

  • 45.
    de Ståhl, Teresita Díaz
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Sandgren, Johanna
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Surgical Sciences.
    Piotrowski, Arkadiusz
    Nord, Helena
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Medical Genetics.
    Andersson, Robin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Menzel, Uwe
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Bogdan, Adam
    Thuresson, Ann-Charlotte
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Medical Genetics.
    Poplawski, Andrzej
    von Tell, Desiree
    Hansson, Caisa M.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Elshafie, Amir I.
    Elghazali, Gehad
    Imreh, Stephan
    Nordenskjöld, Magnus
    Upadhyaya, Meena
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Bruder, Carl E. G.
    Dumanski, Jan P.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Medical Genetics.
    Profiling of copy number variations (CNVs) in healthy individuals from three ethnic groups using a human genome 32 K BAC-clone-based array. 2008. In: Human Mutation, ISSN 1059-7794, E-ISSN 1098-1004, Vol. 29, no 3, p. 398-408. Article in journal (Refereed)
    Abstract [en]

    To further explore the extent of structural large-scale variation in the human genome, we assessed copy number variations (CNVs) in a series of 71 healthy subjects from three ethnic groups. CNVs were analyzed using comparative genomic hybridization (CGH) to a BAC array covering the human genome, using DNA extracted from peripheral blood, thus avoiding any culture-induced rearrangements. By applying a newly developed computational algorithm based on Hidden Markov modeling, we identified 1,078 autosomal CNVs, including at least two neighboring/overlapping BACs, which represent 315 distinct regions. The average size of the sequence polymorphisms was approximately 350 kb and involved in total approximately 117 Mb or approximately 3.5% of the genome. Gains were about four times more common than deletions, and segmental duplications (SDs) were overrepresented, especially in larger deletion variants. This strengthens the notion that SDs often define hotspots of chromosomal rearrangements. Over 60% of the identified autosomal rearrangements match previously reported CNVs, recognized with various platforms. However, results from chromosome X do not agree well with the previously annotated CNVs. Furthermore, data from single BACs deviating in copy number suggest that our above estimate of total variation is conservative. This report contributes to the establishment of the common baseline for CNV, which is an important resource in human genetics.

  • 46.
    D'Elia, Domenica
    et al.
    Institute for Biomedical Technologies, CNR, Via Amendola 122/D, 70126 Bari, Italy.
    Gisel, Andreas
    Institute for Biomedical Technologies, CNR, Via Amendola 122/D, 70126 Bari, Italy.
    Eriksson, Nils-Einar
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kossida, Sophia
    Bioinformatics & Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, 11527 Athens, Greece.
    Mattila, Kimmo
    CSC – IT Center for Science Ltd., Keilaranta 14, 02100 Espoo, Finland.
    Klucar, Lubos
    Institute of Molecular Biology, Slovak Academy of Sciences, Dubravska cesta 21, 84551 Bratislava, Slovakia.
    Bongcam-Rudloff, Erik
    Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, 75024 Uppsala, Sweden.
    The 20th anniversary of EMBnet: 20 years of bioinformatics for the Life Sciences community. 2009. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 10, no Suppl. 6, p. S1. Article in journal (Refereed)
    Abstract [en]

    The EMBnet Conference 2008, focusing on 'Leading Applications and Technologies in Bioinformatics', was organized by the European Molecular Biology network (EMBnet) to celebrate its 20th anniversary. Since its foundation in 1988, EMBnet has been working to promote collaborative development of bioinformatics services and tools to serve the European community of molecular biology laboratories. This conference was the first meeting organized by the network that was open to the international scientific community outside EMBnet. The conference covered a broad range of research topics in bioinformatics with a main focus on new achievements and trends in emerging technologies supporting genomics, transcriptomics and proteomics analyses such as high-throughput sequencing and data managing, text and data-mining, ontologies and Grid technologies. Papers selected for publication, in this supplement to BMC Bioinformatics, cover a broad range of the topics treated, providing also an overview of the main bioinformatics research fields that the EMBnet community is involved in.

  • 47.
    Dennis, Jayne L
    et al.
    Hvidsten, Torgeir R
    Uppsala University, Disciplinary Domain of Science and Technology, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Wit, Ernst C
    Komorowski, Jan
    Bell, Alexandra K
    Downie, Ian
    Mooney, Jacqueline
    Verbeke, Caroline
    Bellamy, Christopher
    Keith, W Nicol
    Oien, Karin A
    Markers of adenocarcinoma characteristic of the site of origin: development. 2005. In: Clin Cancer Res, ISSN 1078-0432, Vol. 11, no 10, p. 3766-72. Article in journal (Refereed)

  • 48.
    Draminski, Michal
    et al.
    Rada-Iglesias, Alvaro
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Koronacki, Jacek
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Monte Carlo feature selection for supervised classification. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 1, p. 110-117. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Pre-selection of informative features for supervised classification is a crucial, albeit delicate, task. It is desirable that feature selection provides the features that contribute most to the classification task per se and which should therefore be used by any classifier later used to produce classification rules. In this article, a conceptually simple but computer-intensive approach to this task is proposed. The reliability of the approach rests on multiple construction of a tree classifier for many training sets randomly chosen from the original sample set, where samples in each training set consist of only a fraction of all of the observed features. RESULTS: The resulting ranking of features may then be used to advantage for classification via a classifier of any type. The approach was validated using Golub et al. leukemia data and the Alizadeh et al. lymphoma data. Not surprisingly, we obtained a significantly different list of genes. Biological interpretation of the genes selected by our method showed that several of them are involved in precursors to different types of leukemia and lymphoma rather than being genes that are common to several forms of cancers, which is the case for the other methods.

  • 49.
    Dramiński, Michał
    et al.
    Institute of Computer Science, Polish Academy of Sciences.
    Kierczak, Marcin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Koronacki, Jacek
    Institute of Computer Science, Polish Academy of Sciences.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Monte Carlo feature selection and interdependency discovery in supervised classification. 2010. In: Advances in Machine Learning: Dedicated to the memory of Professor Ryszard S. Michalski. Heidelberg: Springer, 2010. Chapter in book (Other academic)
    Abstract [en]

    Applications of machine learning techniques in Life Sciences are the main applications forcing a paradigm shift in the way these techniques are used. Rather than obtaining the best possible supervised classifier, the Life Scientist needs to know which features contribute best to classifying distinct classes and what are the interdependencies between the features. To this end we significantly extend our earlier work [Dramiński et al. (2008)] that introduced an effective and reliable method for ranking features according to their importance for classification. We begin with adding a method for finding a cut-off between informative and non-informative features and then continue with a development of a methodology and an implementation of a procedure for determining interdependencies between informative features. The reliability of our approach rests on multiple construction of tree classifiers. Essentially, each classifier is trained on a randomly chosen subset of the original data using only a fraction of all of the observed features. This approach is conceptually simple yet computer-intensive. The methodology is validated on a large and difficult task of modelling HIV-1 reverse transcriptase resistance to drugs which is a good example of the aforementioned paradigm shift. We construct a classifier but of the main interest is the identification of mutation points (i.e. features) and their combinations that model drug resistance.

  • 50.
    Dress A, Grunewald S
    Uppsala University, Disciplinary Domain of Science and Technology, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Semiharmonic trees and monocyclic graphs. 2003. In: Applied Mathematics Letters, ISSN 0893-9659, Vol. 16, no 8, p. 1329-1332. Article in journal (Refereed)