Uppsala University Publications
  • 201.
    Sinapah, Sylvie
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    Wu, Shiying
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Microbiology.
    Chen, Yu
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    Pettersson, Fredrik
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Microbiology.
    Gopalan, Venkat
    Kirsebom, Leif A.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Microbiology. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Cleavage of model substrates by archaeal RNase P: role of protein cofactors in cleavage-site selection. 2011. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 39, no 3, p. 1105-1116. Article in journal (Refereed)
    Abstract [en]

    RNase P is a catalytic ribonucleoprotein primarily involved in tRNA biogenesis. Archaeal RNase P comprises a catalytic RNase P RNA (RPR) and at least four protein cofactors (RPPs), which function as two binary complexes (POP5•RPP30 and RPP21•RPP29). Exploiting the ability to assemble a functional Pyrococcus furiosus (Pfu) RNase P in vitro, we examined the role of RPPs in influencing substrate recognition by the RPR. We first demonstrate that Pfu RPR, like its bacterial and eukaryal counterparts, cleaves model hairpin loop substrates, albeit at rates 90- to 200-fold lower when compared with cleavage by bacterial RPR, highlighting the functionally comparable catalytic cores in bacterial and archaeal RPRs. By investigating cleavage-site selection exhibited by Pfu RPR (±RPPs) with various model substrates missing consensus-recognition elements, we determined substrate features whose recognition is facilitated by either POP5•RPP30 or RPP21•RPP29 (directly or indirectly via the RPR). Our results also revealed that Pfu RPR + RPP21•RPP29 displays substrate-recognition properties coinciding with those of the bacterial RPR-alone reaction rather than the Pfu RPR, and that this behaviour is attributable to structural differences in the substrate-specificity domains of bacterial and archaeal RPRs. Moreover, our data reveal a hierarchy in recognition elements that dictates cleavage-site selection by archaeal RNase P.

  • 202.
    Singh, Umashankar
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Bongcam-Rudloff, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Westermark, Bengt
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    A DNA sequence directed mutual transcription regulation of HSF1 and NFIX involves novel heat sensitive protein interactions. 2009. In: PLoS ONE, ISSN 1932-6203, Vol. 4, no 4, p. e5050. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: Though the Nuclear factor 1 family member NFIX has been strongly implicated in PDGFB-induced glioblastoma, its molecular mechanisms of action remain unknown. HSF1, a heat shock-related transcription factor, is also a powerful modifier of carcinogenesis by several factors, including PDGFB. How HSF1 transcription is controlled has remained largely elusive. METHODOLOGY/PRINCIPAL FINDINGS: By combining microarray expression profiling and a yeast-two-hybrid screen, we identified that NFIX and its interactions with CGGBP1 and HMGN1 regulate expression of HSF1. We found that CGGBP1 organizes a bifunctional transcriptional complex at small CGG repeats in the HSF1 promoter. Under chronic heat shock, NFIX uses CGGBP1 and HMGN1 to get recruited to this promoter and in turn affects their binding to DNA. Results show that the interactions of NFIX with CGGBP1 and HMGN1 in the soluble fraction are heat shock sensitive due to preferential localization of CGGBP1 to heterochromatin after heat shock. HSF1 in turn was found to bind to the NFIX promoter and repress its expression in a heat shock sensitive manner. CONCLUSIONS/SIGNIFICANCE: NFIX and HSF1 exert a mutual transcriptional repressive effect on each other, which requires the CGG repeat in the HSF1 promoter and the HSF1 binding site in the NFIX promoter. We unravel a unique mechanism of heat shock sensitive DNA sequence-directed reciprocal transcriptional regulation between NFIX and HSF1. Our findings provide new insights into mechanisms of transcription regulation under stress.

  • 203.
    Sjöberg, Paul
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Lötstedt, Per
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Elf, Johan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Fokker-Planck approximation of the master equation in molecular biology. 2009. In: Computing and Visualization in Science, ISSN 1432-9360, E-ISSN 1433-0369, Vol. 12, p. 37-50. Article in journal (Refereed)
  • 204.
    Sperber, Göran
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience, Physiology.
    Lövgren, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Eriksson, Nils-Einar
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Benachenhou, Farid
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Virology.
    Blomberg, Jonas
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Virology.
    RetroTector online, a rational tool for analysis of retroviral elements in small and medium size vertebrate genomic sequences. 2009. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 10 Suppl 6, p. S4. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: The rapid accumulation of genomic information in databases necessitates rapid and specific algorithms for extracting biologically meaningful information. More or less complete retroviral sequences, also called proviral or endogenous retroviral sequences (ERVs), constitute at least 5% of vertebrate genomes. After infecting the host, these retroviruses have integrated in germ line cells, and have then been carried in genomes for at least several hundred million years. A better understanding of the structure and function of these sequences can have profound biological and medical consequences. METHODS: RetroTector (ReTe) is a platform-independent Java program for identification and characterization of proviral sequences in vertebrate genomes. The full ReTe requires a local installation with a MySQL database. Although not overly complicated, the installation may take some time. A "light" version of ReTe (RetroTector online; ROL), which does not require specific installation procedures, is provided via the World Wide Web. RESULT: ROL (http://www.fysiologi.neuro.uu.se/jbgs/) was implemented under the Batchelor web interface (A Lövgren et al). It accepts sequences (5 to 10,000 kilobases) by GenBank accession number, file upload, or FASTA cut-and-paste. Up to ten submissions can be made simultaneously, allowing batch analysis of ≤ 100 Megabases. Jobs are shown in an IP-number specific list. Results are text files, and can be viewed with the program RetroTectorViewer.jar (at the same site), which has the full graphical capabilities of the basic ReTe program. A detailed analysis of any retroviral sequences found in the submitted sequence is graphically presented, exportable in standard formats. With the current server, a complete analysis of a 1 Megabase sequence is finished in 10 minutes. It is possible to mask nonretroviral repetitive sequences in the submitted sequence, using host genome specific "brooms", which increase specificity. DISCUSSION: Proviral sequences can be hard to recognize, especially if the integration occurred many million years ago. Precise delineation of LTR, gag, pro, pol and env can be difficult, requiring manual work. ROL is a way of simplifying these tasks. CONCLUSION: ROL provides (1) annotation and presentation of known retroviral sequences, and (2) detection of proviral chains in unknown genomic sequences, with up to 100 Mbase per submission.

  • 205.
    Strimmer, K.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Mathematics. Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Forslund, K.
    Holland, B.
    Moulton, V.
    New exploratory methods for visual recombination detection. 2003. In: Genome Biology, Vol. 4. Article in journal (Refereed)
  • 206.
    Strömbergsson, Helena
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Chemogenomics: Models of Protein-Ligand Interaction Space. 2009. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    The large majority of the currently used drugs are small molecules that interact with proteins. Understanding protein-ligand recognition is thus central to drug discovery and design. Improved experimental techniques have resulted in an immense growth of drug target information. This has stimulated the development of chemogenomics and proteochemometrics (PCM) that take target information as well as ligand information into account to study the genomic effect of potential drugs.

    This thesis is concerned with modeling protein-ligand recognition, and the aim is to develop models that generalize to the entire protein-ligand space. To this end, protein-ligand interaction data has been extracted and manually curated from public databases, protein and ligand descriptors have been computed, and predictive models have been induced with machine-learning methods.

    An introduction to chemogenomics, machine learning, and PCM modeling is given in the thesis summary, which is followed by five research papers. Paper I shows that it is possible to induce interpretable models with a non-linear rule-based method, and paper II demonstrates that local descriptors of protein structure may be used to induce PCM models that cover proteins differing in sequence and fold. In paper III, such local descriptors are used to induce a PCM model on a large dataset that includes all major enzyme classes. This demonstrates that the local descriptors may be used to induce generalized models that span the entire known structural enzyme-ligand space. Paper IV describes a step towards proteome-wide PCM models, and shows that it is possible to predict high- and low-affinity complexes using a set of protein and ligand descriptors that do not require knowledge of 3D structure. Finally, paper V presents a method to visualize and compare protein-ligand chemogenomic subspaces, which may be used to predict unwanted cross-interactions of drugs with other proteins in the proteome.

    List of papers
    1. (Record not found.)
    2. A chemogenomics view on protein-ligand spaces
    2009 (English). In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 10, no Suppl.6, p. S13. Article in journal (Refereed). Published
    Abstract [en]

    BACKGROUND: Chemogenomics is an emerging inter-disciplinary approach to drug discovery that combines traditional ligand-based approaches with biological information on drug targets and lies at the interface of chemistry, biology and informatics. The ultimate goal in chemogenomics is to understand molecular recognition between all possible ligands and all possible drug targets. Protein and ligand space have previously been studied as separate entities, but chemogenomics studies deal with large datasets that cover parts of the joint protein-ligand space. Since drug discovery has traditionally focused on ligand optimization, the chemical space has been studied extensively. The protein space has been studied to some extent, typically for the purpose of classification of proteins into functional and structural classes. Since chemogenomics deals not only with ligands but also with the macromolecules the ligands interact with, it is of interest to find means to explore, compare and visualize protein-ligand subspaces. RESULTS: Two chemogenomics protein-ligand interaction datasets were prepared for this study. The first dataset covers the known structural protein-ligand space, and includes all non-redundant protein-ligand interactions found in the worldwide Protein Data Bank (PDB). The second dataset contains all approved drugs and drug targets stored in the DrugBank database, and represents the approved drug-drug target space. To capture biological and physicochemical features of the chemogenomics datasets, sequence-based descriptors were computed for the proteins, and 0, 1 and 2 dimensional descriptors for the ligands. Principal component analysis (PCA) was used to analyze the multidimensional data and to create global models of protein-ligand space. The nearest neighbour method, computed using the principal components, was used to obtain a measure of overlap between the datasets. 
    CONCLUSION: In this study, we present an approach to visualize protein-ligand spaces from a chemogenomics perspective, where both ligand and protein features are taken into account. The method can be applied to any protein-ligand interaction dataset. Here, the approach is applied to analyze the structural protein-ligand space and the protein-ligand space of all approved drugs and their targets. We show that this approach can be used to visualize and compare chemogenomics datasets, and possibly to identify cross-interaction complexes in protein-ligand space.

    National Category
    Natural Sciences
    Identifiers
    urn:nbn:se:uu:diva-89297 (URN); 10.1186/1471-2105-10-S6-S13 (DOI); 000267522200013 (); 19534738 (PubMedID)
    Available from: 2009-02-10. Created: 2009-02-10. Last updated: 2017-12-14. Bibliographically approved
    3. Generalized modeling of enzyme-ligand interactions using proteochemometrics and local protein substructures
    2006 (English). In: Proteins: Structure, Function, and Bioinformatics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 65, no 3, p. 568-579. Article in journal (Refereed). Published
    Abstract [en]

    Modeling and understanding protein-ligand interactions is one of the most important goals in computational drug discovery. To this end, proteochemometrics uses structural and chemical descriptors from several proteins and several ligands to induce interaction-models. Here, we present a new and generalized approach in which proteins varying greatly in terms of sequence and structure are represented by a library of local substructures. Using linear regression and rule-based learning, we combine such local substructures with chemical descriptors from the ligands to model binding affinity for a training set of hydrolase and lyase enzymes. We evaluate the predictive performance of these models using cross validation and sets of unseen ligands with unknown three-dimensional structure. The models are shown to generalize by outperforming models using descriptors from only proteins or only ligands, or models using global structure similarities rather than local similarities. Thus, we demonstrate that this approach is capable of describing dependencies between local structural properties and ligands in otherwise dissimilar protein structures. These dependencies are often, but not always, associated with local substructures that are in contact with the ligands. Finally, we show that strongly bound enzyme-ligand complexes require the presence of particular local substructures, while weakly bound complexes may be described by the absence of certain properties. The results demonstrate that the alignment-independent approach using local substructures is capable of describing protein-ligand interaction for largely different proteins and hence opens up for proteochemometrics-analysis of the interaction-space of entire proteomes. Current approaches are limited to families of closely related proteins.

    Keywords
    QSAR, partial least squares, rule-based learning, drug design, local descriptors of protein structure
    National Category
    Pharmaceutical Sciences
    Identifiers
    urn:nbn:se:uu:diva-23995 (URN); 10.1002/prot.21163 (DOI); 000241247100005 (); 16948162 (PubMedID)
    Available from: 2007-02-02. Created: 2007-02-02. Last updated: 2018-01-26. Bibliographically approved
    4. Towards proteome-wide interaction models using the proteochemometrics approach
    2010 (English). In: Molecular Informatics, ISSN 1868-1743, Vol. 29, no 6-7, p. 499-508. Article in journal (Refereed). Published
    Abstract [en]

    A proteochemometrics model was induced from all interaction data in the BindingDB database, comprising in all 7078 protein-ligand complexes with representatives from all major drug target categories. Proteins were represented by alignment-independent sequence descriptors holding information on properties such as hydrophobicity, charge, and secondary structure. Ligands were represented by commonly used QSAR descriptors. The inhibition constant (pK(i)) values of protein-ligand complexes were discretized into "high" and "low" interaction activity. Different machine-learning techniques were used to induce models relating protein and ligand properties to the interaction activity. The best was decision trees, which gave an accuracy of 80% and an area under the ROC curve of 0.81. The tree pointed to the protein and ligand properties that are relevant for the interaction. As the approach requires neither alignments nor knowledge of protein 3D structures, virtually all available protein-ligand interaction data could be utilized, thus opening a way to completely general interaction models that may span entire proteomes.

    Keywords
    Bioinformatics, Chemogenomics, Drug design, Protein-Ligand interactions, Proteochemometrics
    National Category
    Pharmaceutical Sciences Biological Sciences
    Identifiers
    urn:nbn:se:uu:diva-89298 (URN); 10.1002/minf.201000052 (DOI); 000280908200004 ()
    Available from: 2009-02-10. Created: 2009-02-10. Last updated: 2018-01-13. Bibliographically approved
    5. Rough set-based proteochemometrics modeling of G-protein-coupled receptor-ligand
    2006 (English). In: Proteins: Structure, Function, and Bioinformatics, ISSN 1097-0134, Vol. 63, no 1, p. 24-34. Article in journal (Refereed). Published
    Abstract [en]

    G-Protein-coupled receptors (GPCRs) are among the most important drug targets. Because of a shortage of 3D crystal structures, most of the drug design for GPCRs has been ligand-based. We propose a novel, rough set-based proteochemometric approach to the study of receptor and ligand recognition. The approach is validated on three datasets containing GPCRs. In proteochemometrics, properties of receptors and ligands are used in conjunction and modeled to predict binding affinity. The rough set (RS) rule-based models presented herein consist of minimal decision rules that associate properties of receptors and ligands with high or low binding affinity. The information provided by the rules is then used to develop a mechanistic interpretation of interactions between the ligands and receptors included in the datasets. The first two datasets contained descriptors of melanocortin receptors and peptide ligands. The third set contained descriptors of adrenergic receptors and ligands. All the rule models induced from these datasets have a high predictive quality. An example of a decision rule is If R1_ligand(Ethyl) and TM helix 2 position 27(Methionine) then Binding(High). The easily interpretable rule sets are able to identify determinative receptor and ligand parts. For instance, all three models suggest that transmembrane helix 2 is determinative for high and low binding affinity. RS models show that it is possible to use rule-based models to predict ligand-binding affinities. The models may be used to gain a deeper biological understanding of the combinatorial nature of receptor-ligand interactions.

    Keywords
    drug design, QSAR, GPCRs, machine learning, rough sets, partial least squares
    Identifiers
    urn:nbn:se:uu:diva-79854 (URN); 10.1002/prot.2077 (DOI); 16435365 (PubMedID)
    Available from: 2006-04-14. Created: 2006-04-14. Last updated: 2009-10-13. Bibliographically approved
  • 207.
    Strömbergsson, Helena
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kleywegt, Gerard J
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    A chemogenomics view on protein-ligand spaces. 2009. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 10, no Suppl.6, p. S13. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND: Chemogenomics is an emerging inter-disciplinary approach to drug discovery that combines traditional ligand-based approaches with biological information on drug targets and lies at the interface of chemistry, biology and informatics. The ultimate goal in chemogenomics is to understand molecular recognition between all possible ligands and all possible drug targets. Protein and ligand space have previously been studied as separate entities, but chemogenomics studies deal with large datasets that cover parts of the joint protein-ligand space. Since drug discovery has traditionally focused on ligand optimization, the chemical space has been studied extensively. The protein space has been studied to some extent, typically for the purpose of classification of proteins into functional and structural classes. Since chemogenomics deals not only with ligands but also with the macromolecules the ligands interact with, it is of interest to find means to explore, compare and visualize protein-ligand subspaces. RESULTS: Two chemogenomics protein-ligand interaction datasets were prepared for this study. The first dataset covers the known structural protein-ligand space, and includes all non-redundant protein-ligand interactions found in the worldwide Protein Data Bank (PDB). The second dataset contains all approved drugs and drug targets stored in the DrugBank database, and represents the approved drug-drug target space. To capture biological and physicochemical features of the chemogenomics datasets, sequence-based descriptors were computed for the proteins, and 0, 1 and 2 dimensional descriptors for the ligands. Principal component analysis (PCA) was used to analyze the multidimensional data and to create global models of protein-ligand space. The nearest neighbour method, computed using the principal components, was used to obtain a measure of overlap between the datasets. 
    CONCLUSION: In this study, we present an approach to visualize protein-ligand spaces from a chemogenomics perspective, where both ligand and protein features are taken into account. The method can be applied to any protein-ligand interaction dataset. Here, the approach is applied to analyze the structural protein-ligand space and the protein-ligand space of all approved drugs and their targets. We show that this approach can be used to visualize and compare chemogenomics datasets, and possibly to identify cross-interaction complexes in protein-ligand space.

  • 208.
    Strömbergsson, Helena
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kryshtafovych, Andriy
    Prusis, Peteris
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Fidelis, Krzysztof
    Wikberg, Jarl E. S.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Hvidsten, Torgeir R.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Generalized modeling of enzyme-ligand interactions using proteochemometrics and local protein substructures. 2006. In: Proteins: Structure, Function, and Bioinformatics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 65, no 3, p. 568-579. Article in journal (Refereed)
    Abstract [en]

    Modeling and understanding protein-ligand interactions is one of the most important goals in computational drug discovery. To this end, proteochemometrics uses structural and chemical descriptors from several proteins and several ligands to induce interaction-models. Here, we present a new and generalized approach in which proteins varying greatly in terms of sequence and structure are represented by a library of local substructures. Using linear regression and rule-based learning, we combine such local substructures with chemical descriptors from the ligands to model binding affinity for a training set of hydrolase and lyase enzymes. We evaluate the predictive performance of these models using cross validation and sets of unseen ligands with unknown three-dimensional structure. The models are shown to generalize by outperforming models using descriptors from only proteins or only ligands, or models using global structure similarities rather than local similarities. Thus, we demonstrate that this approach is capable of describing dependencies between local structural properties and ligands in otherwise dissimilar protein structures. These dependencies are often, but not always, associated with local substructures that are in contact with the ligands. Finally, we show that strongly bound enzyme-ligand complexes require the presence of particular local substructures, while weakly bound complexes may be described by the absence of certain properties. The results demonstrate that the alignment-independent approach using local substructures is capable of describing protein-ligand interaction for largely different proteins and hence opens up for proteochemometrics-analysis of the interaction-space of entire proteomes. Current approaches are limited to families of closely related proteins.

  • 209.
    Strömbergsson, Helena
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Lapins, Maris
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Kleywegt, Gerard J.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Structural Molecular Biology.
    Wikberg, Jarl E. S.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Towards proteome-wide interaction models using the proteochemometrics approach. 2010. In: Molecular Informatics, ISSN 1868-1743, Vol. 29, no 6-7, p. 499-508. Article in journal (Refereed)
    Abstract [en]

    A proteochemometrics model was induced from all interaction data in the BindingDB database, comprising in all 7078 protein-ligand complexes with representatives from all major drug target categories. Proteins were represented by alignment-independent sequence descriptors holding information on properties such as hydrophobicity, charge, and secondary structure. Ligands were represented by commonly used QSAR descriptors. The inhibition constant (pK(i)) values of protein-ligand complexes were discretized into "high" and "low" interaction activity. Different machine-learning techniques were used to induce models relating protein and ligand properties to the interaction activity. The best was decision trees, which gave an accuracy of 80% and an area under the ROC curve of 0.81. The tree pointed to the protein and ligand properties that are relevant for the interaction. As the approach requires neither alignments nor knowledge of protein 3D structures, virtually all available protein-ligand interaction data could be utilized, thus opening a way to completely general interaction models that may span entire proteomes.

  • 210.
    Strömbergsson, Helena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Prusis, Peteris
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Midelfart, Herman
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Lapinsh, Maris
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Wikberg, Jarl E S
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Rough set-based proteochemometrics modeling of G-protein-coupled receptor-ligand. 2006. In: Proteins: Structure, Function, and Bioinformatics, ISSN 1097-0134, Vol. 63, no 1, p. 24-34. Article in journal (Refereed)
    Abstract [en]

    G-Protein-coupled receptors (GPCRs) are among the most important drug targets. Because of a shortage of 3D crystal structures, most of the drug design for GPCRs has been ligand-based. We propose a novel, rough set-based proteochemometric approach to the study of receptor and ligand recognition. The approach is validated on three datasets containing GPCRs. In proteochemometrics, properties of receptors and ligands are used in conjunction and modeled to predict binding affinity. The rough set (RS) rule-based models presented herein consist of minimal decision rules that associate properties of receptors and ligands with high or low binding affinity. The information provided by the rules is then used to develop a mechanistic interpretation of interactions between the ligands and receptors included in the datasets. The first two datasets contained descriptors of melanocortin receptors and peptide ligands. The third set contained descriptors of adrenergic receptors and ligands. All the rule models induced from these datasets have a high predictive quality. An example of a decision rule is If R1_ligand(Ethyl) and TM helix 2 position 27(Methionine) then Binding(High). The easily interpretable rule sets are able to identify determinative receptor and ligand parts. For instance, all three models suggest that transmembrane helix 2 is determinative for high and low binding affinity. RS models show that it is possible to use rule-based models to predict ligand-binding affinities. The models may be used to gain a deeper biological understanding of the combinatorial nature of receptor-ligand interactions.
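    The quoted decision rule can be read as a conjunction of attribute tests that yields a decision class. A minimal sketch of applying such a rule to one receptor-ligand pair follows; the dictionary layout, the attribute key `TM2_pos27`, and the helper names are illustrative assumptions, not the authors' implementation or the format used by their software.

```python
# Hypothetical sketch: attribute names loosely mirror the rule quoted in the
# abstract; the data structures are assumptions made for illustration only.

def make_rule(conditions, decision):
    """Build a rule as (conditions, decision); conditions maps attribute -> required value."""
    return (dict(conditions), decision)

def rule_fires(rule, example):
    """Return True when every condition of the rule matches the example."""
    conditions, _ = rule
    return all(example.get(attr) == value for attr, value in conditions.items())

def classify(rules, example, default="Unknown"):
    """Return the decision of the first rule that fires, else the default."""
    for rule in rules:
        if rule_fires(rule, example):
            return rule[1]
    return default

# If R1_ligand(Ethyl) and TM helix 2 position 27(Methionine) then Binding(High)
rules = [
    make_rule({"R1_ligand": "Ethyl", "TM2_pos27": "Methionine"}, "High"),
]

# A made-up receptor-ligand pair described by attribute values.
pair = {"R1_ligand": "Ethyl", "TM2_pos27": "Methionine", "TM3_pos5": "Leucine"}
print(classify(rules, pair))
```

    Returning the first matching rule is the simplest possible policy; rule-based classifiers commonly let several matching rules vote instead, which this sketch does not attempt.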

  • 211.
    Strömbergsson, Helena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences, Pharmaceutical Pharmacology.
    Prusis, Peteris
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences, Pharmaceutical Pharmacology.
    Midelfart, Herman
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wikberg, Jarl E S
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences, Pharmaceutical Pharmacology.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Proteochemometrics modelling of receptor ligand interactions using rough sets. 2004. In: Proceedings of the German conference on Bioinformatics / [ed] Robert Giegerich, Jens Stoye, 2004, p. 85-94. Conference paper (Refereed)
    Abstract [en]

    We report on a model for the interaction of chimeric melanocortin G-protein coupled receptors with peptide ligands using the rough set approach. Rough sets generate If-Then rule models using Boolean reasoning. Two separate datasets have been analyzed, for which the binding affinities have previously been measured experimentally. The receptors and ligands are described by vectors of strings. Different partitions of each dataset were evaluated in order to find an optimal partition into rough set decision classes. To obtain a measurement of the accuracy of the rough set classifier generated from each dataset, a 10-fold cross validation (CV) was performed. The Area Under Curve (AUC) was calculated for each iteration during CV. This resulted in an AUC mean of 0.94 (SD 0.12) and 0.93 (SD 0.16) for the first and second dataset respectively. The CV results show that the rough set models exhibit a high classification quality. The decision rules generated from the rough set model inductions are easy to interpret. We apply this information to develop models of the interaction between ligands and receptors.
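    The evaluation described above (one AUC per cross-validation iteration, then a mean and SD) can be sketched as follows. This is an illustration under stated assumptions: the fold data are invented, only three folds are shown instead of ten, and the sample standard deviation is used because the abstract does not say which SD estimator was applied.

```python
from statistics import mean, stdev

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def summarize_folds(folds):
    """Return (mean AUC, sample SD of AUC) over a list of (labels, scores) folds."""
    aucs = [auc(labels, scores) for labels, scores in folds]
    return mean(aucs), stdev(aucs)

# Invented held-out predictions for three folds (label 1 = high affinity).
folds = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    ([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]),
    ([1, 1, 0, 0], [0.6, 0.6, 0.6, 0.2]),
]

auc_mean, auc_sd = summarize_folds(folds)
print(f"AUC mean {auc_mean:.2f} (SD {auc_sd:.2f})")
```

    The pairwise definition is quadratic in fold size, which is acceptable for small test folds; a rank-based formulation would scale better.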

  • 212.
    Tåquist, Helena
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Cui, Yuanyuan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Ardell, David H.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    TFAM 1.0: an online tRNA function classifier. 2007. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 35, no Suppl. S: Web Server issue, p. W350-W353. Article in journal (Other academic)
    Abstract [en]

    We have earlier published an automated statistical classifier of tRNA function called TFAM. Unlike tRNA gene-finders, TFAM uses information from the total sequences of tRNAs and not just their anticodons to predict their function. Therefore TFAM has an advantage in predicting initiator tRNAs, the amino acid charging identity of nonstandard tRNAs such as suppressors, and the former identity of pseudo-tRNAs. In addition, TFAM predictions are robust to sequencing errors and useful for the statistical analysis of tRNA sequence, function and evolution. Earlier versions of TFAM required a complicated installation and running procedure, and only bacterial tRNA identity models were provided. Here we describe a new version of TFAM with both a Web Server interface and simplified standalone installation. New TFAM models are available, including a proteobacterial model for the bacterial lysylated isoleucine tRNAs, making it now possible for TFAM to correctly classify all tRNA genes for some bacterial taxa. First-draft eukaryotic and archaeal models are also provided, making initiator tRNA prediction easily accessible to any researcher or genome sequencing effort. The TFAM Web Server is available at http://tfam.lcb.uu.se

  • 213.
    von Salomé, Jenny
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Gyllensten, Ulf
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Bergström, Tomas F.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Full-length sequence analysis of the HLA-DRB1 locus suggests a recent origin of alleles. 2007. In: Immunogenetics, ISSN 0093-7711, E-ISSN 1432-1211, Vol. 59, no 4, p. 261-271. Article in journal (Refereed)
    Abstract [en]

    The HLA region harbors some of the most polymorphic loci in the human genome. Among them is the class II locus HLA-DRB1, with more than 400 known alleles. The age of the polymorphism and the rate at which new alleles are generated at HLA loci have caused much controversy over the years. Previous studies have mostly been restricted to the 270 base pairs that constitute the second exon and represent the most variable part of the gene. Here, we investigate the evolutionary history of the HLA-DRB1 locus on the basis of an analysis of 15 genomic full-length alleles (10–15 kb). In addition, the variation in 49 complete coding sequences and 322 exon 2 sequences was analyzed. When excluding exon 2 from the analysis, the diversity at the synonymous sites was found to be similar to the intron diversity. The overall diversity in noncoding regions was also similar to the genome average. The DRB1*03 lineage has been found in human, chimpanzee, bonobo, gorilla, and orangutan. An ancestral “proto HLA-DRB1*03 lineage” appeared to have diverged in the last 5 million years into the human-specific lineages *08, *11, *13, and *14. With the exception of exon 2, both the coding and the noncoding diversity suggest a recent origin (<1 million years ago) for most of the alleles at the HLA-DRB1 locus. Sites encoding amino acids involved in antigen binding [antigen recognizing sites (ARS)] appear to have a more ancient origin. Taken together, the recent origin of most alleles, the high diversity between allelic lineages, and the ancient origin of sequence motifs in exon 2 are consistent with a relatively rapid generation of novel alleles by gene conversion-like events.

  • 214.
    Wabnik, Krzysztof
    et al.
    Hvidsten, Torgeir R
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kedzierska, Anna
    Van Leene, Jelle
    De Jaeger, Geert
    Beemster, Gerrit T S
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kuiper, Martin T R
    Gene expression trends and protein features effectively complement each other in gene function prediction. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 3, p. 322-330. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Genome-scale 'omics' data constitute a potentially rich source of information about biological systems and their function. There is a plethora of tools and methods available to mine omics data. However, the diversity and complexity of different omics data types is a stumbling block for multi-data integration; hence there is a dire need for additional methods to exploit potential synergy from integrated orthogonal data. Rough Sets provide an efficient means to use complex information in classification approaches. Here, we set out to explore the possibilities of Rough Sets to incorporate diverse information sources in a functional classification of unknown genes. RESULTS: We explored the use of Rough Sets for a novel data integration strategy where gene expression data, protein features and Gene Ontology (GO) annotations were combined to describe general and biologically relevant patterns represented by If-Then rules. The descriptive rules were used to predict the function of unknown genes in Arabidopsis thaliana and Schizosaccharomyces pombe. The If-Then rule models showed success rates of up to 0.89 (discriminative and predictive power for both modeled organisms), whereas models built solely from one data type (protein features or gene expression data) yielded success rates varying from 0.68 to 0.78. Our models were applied to generate classifications for many unknown genes, of which a sizeable number were confirmed either by PubMed literature reports or electronically inferred annotations. Finally, we studied cell cycle protein-protein interactions derived from both tandem affinity purification experiments and in silico experiments in the BioGRID interactome database and found strong experimental evidence for the predictions generated by our models. The results show that our approach can be used to build very robust models that create synergy from integrating gene expression data and protein features.
    AVAILABILITY: The Rough Set-based method is implemented in the Rosetta toolkit kernel version 1.0.1 available at: http://rosetta.lcb.uu.se/

  • 215.
    Wallerman, Ola
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Motallebipour, Mehdi
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Patra, Kalicharan
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Bysani, Madhu Sudhan Reddy
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Molecular interactions between HNF4a, FOXA2 and GABP identified at regulatory DNA elements through ChIP-sequencing. 2009. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 37, no 22, p. 7498-7508. Article in journal (Refereed)
    Abstract [en]

    Gene expression is regulated by combinations of transcription factors, which can be mapped to regulatory elements on a genome-wide scale using ChIP experiments. In a previous ChIP-chip study of USF1 and USF2 we also found evidence of binding of GABP, FOXA2 and HNF4a within the enriched regions. Here, we have applied ChIP-seq for these transcription factors and identified 3064 peaks of enrichment for GABP, 7266 for FOXA2 and 18783 for HNF4a. Distal elements with USF2 signal were frequently also bound by HNF4a and FOXA2. GABP peaks were found at transcription start sites, whereas 94% of FOXA2 and 90% of HNF4a peaks were located at other positions. We developed a method to accurately define TFBS within peaks, and found the predicted sites to have an elevated conservation level compared to peak centers; however, the majority of bindings were not evolutionarily conserved. An interaction between HNF4a and GABP was seen at TSS, with one-third of the HNF4a-positive promoters also being bound by GABP, and this interaction was verified by co-immunoprecipitations.

  • 216.
    Wallerman, Ola
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Medical Genetics.
    Motallebipour, Mehdi
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Patra, Kalicharan
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Bysani, Madhusudhan Reddy
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Medical Genetics.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Medical Genetics.
    Molecular interactions between HNF4a, FOXA2 and GABP identified at regulatory DNA elements through ChIP-sequencing. 2009. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 37, no 22, p. 7498-7508. Article in journal (Refereed)
    Abstract [en]

    Gene expression is regulated by combinations of transcription factors, which can be mapped to regulatory elements on a genome-wide scale using ChIP experiments. In a previous ChIP-chip study of USF1 and USF2 we also found evidence of binding of GABP, FOXA2 and HNF4a within the enriched regions. Here, we have applied ChIP-seq for these transcription factors and identified 3064 peaks of enrichment for GABP, 7266 for FOXA2 and 18783 for HNF4a. Distal elements with USF2 signal were frequently also bound by HNF4a and FOXA2. GABP peaks were found at transcription start sites, whereas 94% of FOXA2 and 90% of HNF4a peaks were located at other positions. We developed a method to accurately define TFBS within peaks, and found the predicted sites to have an elevated conservation level compared to peak centers; however, the majority of bindings were not evolutionarily conserved. An interaction between HNF4a and GABP was seen at TSS, with one-third of the HNF4a-positive promoters also being bound by GABP, and this interaction was verified by co-immunoprecipitations.

  • 217.
    Westholm, Jakub Orzechowski
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Xu, Feifei
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Ronne, Hans
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Genome-scale study of the importance of binding site context for transcription factor binding and gene regulation. 2008. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 9, p. 484. Article in journal (Refereed)
    Abstract [en]

    BACKGROUND

    The rate of mRNA transcription is controlled by transcription factors that bind to specific DNA motifs in promoter regions upstream of protein coding genes. Recent results indicate that not only the presence of a motif but also motif context (for example the orientation of a motif or its location relative to the coding sequence) is important for gene regulation.

    RESULTS

    In this study we present ContextFinder, a tool that is specifically aimed at identifying cases where motif context is likely to affect gene regulation. We used ContextFinder to examine the role of motif context in S. cerevisiae both for DNA binding by transcription factors and for effects on gene expression. For DNA binding we found significant patterns of motif location bias, whereas motif orientations did not seem to matter. Motif context appears to affect gene expression even more than it affects DNA binding, as biases in both motif location and orientation were more frequent in promoters of co-expressed genes. We validated our results against data on nucleosome positioning, and found a negative correlation between preferred motif locations and nucleosome occupancy.

    CONCLUSION

    We conclude that the requirement for stable binding of transcription factors to DNA and their subsequent function in gene regulation can impose constraints on motif context.

  • 218.
    Wetterbom, Anna
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Sevov, Marie
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Cavelier, Lucia
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Bergström, Tomas
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology. Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Comparative genomic analysis of human and chimpanzee indicates a key role for indels in primate evolution. 2006. In: Journal of Molecular Evolution, ISSN 0022-2844, E-ISSN 1432-1432, Vol. 63, no 5, p. 682-690. Article in journal (Refereed)
    Abstract [en]

    Sequence comparison of humans and chimpanzees is of interest to understand the mechanisms behind primate evolution. Here we present an independent analysis of human chromosome 21 and the high-quality BAC clone sequences of the homologous chimpanzee chromosome 22. In contrast to previous studies, we have used global alignment methods and Ensembl predictions of protein coding genes (n = 224) for the analysis. Divergence due to insertions and deletions (indels) along with substitutions was examined separately for different genomic features (coding, noncoding genic, and intergenic sequence). The major part of the genomic divergence could be attributed to indels (5.07%), while the nucleotide divergence was estimated as 1.52%. Thus the total divergence was estimated as 6.58%. When excluding repeats and low-complexity DNA the total divergence decreased to 2.37%. The chromosomal distribution of nucleotide substitutions and indel events was significantly correlated. To further examine the role of indels in primate evolution we focused on coding sequences. Indels were found within the coding sequence of 13% of the genes and approximately half of the indels have not been reported previously. In 5% of the chimpanzee genes, indels or substitutions caused premature stop codons that rendered the affected transcripts nonfunctional. Taken together, our findings demonstrate that indels comprise the majority of the genomic divergence. Furthermore, indels occur frequently in coding sequences. Our results thereby support the hypothesis that indels may have a key role in primate evolution.

  • 219.
    Wilczynski, B., Hvidsten, T., Kryshtafovych, A., Stubbs, L., Komorowski, J., Fidelis, K.
    Uppsala University, Disciplinary Domain of Science and Technology, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    A rule-based framework for gene regulation pathways discovery. 2003. In: IEEE Computer Society Bioinformatics Conference (CSB2003) Stanford, CA, USA, August 11-14, 2003, p. 435-436. Conference paper (Other scientific)
  • 220.
    Wilczynski, Bartek
    et al.
    Hvidsten, Torgeir R.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kryshtafovych, Andriy
    Tiuryn, Jerzy
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Fidelis, Krzysztof
    Using local gene expression similarities to discover regulatory binding site modules. 2006. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 7, p. 505. Article in journal (Refereed)
    Abstract [en]

    Background: We present an approach designed to identify gene regulation patterns using sequence and expression data collected for Saccharomyces cerevisiae. Our main goal is to relate the combinations of transcription factor binding sites (also referred to as binding site modules) identified in gene promoters to the expression of these genes. The novel aspects include local expression similarity clustering and an exact IF-THEN rule inference algorithm. We also provide a method of rule generalization to include genes with unknown expression profiles. Results: We have implemented the proposed framework and tested it on publicly available datasets from the yeast S. cerevisiae. The testing procedure consists of thorough statistical analyses of the groups of genes matching the rules we infer from expression data against known sets of coregulated genes. For this purpose we have used published ChIP-Chip data and Gene Ontology annotations. In order to make these tests more objective, we compare our results with recently published similar studies. Conclusion: The results we obtain show that local expression similarity clustering greatly enhances the overall quality of the derived rules, both in terms of enrichment of Gene Ontology functional annotation and coherence with ChIP-Chip binding data. Our approach thus provides reliable hypotheses on co-regulation that can be experimentally verified. An important feature of the method is its reliance only on widely accessible sequence and expression data. The same procedure can be easily applied to other microbial organisms.

  • 221.
    Wright, Dominic
    et al.
    Butlin, Roger K.
    Carlborg, Örjan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Epistatic regulation of behavioural and morphological traits in the zebrafish (Danio rerio). 2006. In: Behavior Genetics, ISSN 0001-8244, E-ISSN 1573-3297, Vol. 36, no 6, p. 914-922. Article in journal (Refereed)
    Abstract [en]

    There is currently a lack of studies examining epistasis in general, and specifically for behavioural traits of evolutionary significance. The advent of more efficient analytical methods for exploring epistasis in QTL studies removes the computational restraint on this type of analysis and suggests that performing further analyses of existing datasets may reveal a more complete picture of the genetic architecture of the traits. Here we report the results from an epistatic QTL analysis of an F2 cross between a wild population and a standard laboratory strain of zebrafish. This further analysis was performed using a simultaneous search to identify epistatically interacting QTL affecting behavioural and morphological traits and uncovered several novel epistatic interactions that reached either genome-wide or suggestive significance levels as determined by a randomisation testing approach. These results provide novel insight into the genetic architecture of the regulation of behavioural as well as morphological phenotypes and call for more studies of epistasis for this group of traits.

  • 222.
    Wright, Dominic
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Rubin, Carl-Johan
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Martinez Barrio, Alvaro
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Schütz, K.
    Kerje, Susanne
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
    Brändström, Helena
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
    Kindmark, Andreas
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
    Jensen, P.
    Andersson, Leif
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    The genetic architecture of domestication in the chicken: effects of pleiotropy and linkage. 2010. In: Molecular Ecology, ISSN 0962-1083, E-ISSN 1365-294X, Vol. 19, no 23, p. 5140-5156. Article in journal (Refereed)
    Abstract [en]

    The extent of pleiotropy and epistasis in quantitative traits remains equivocal. In the case of pleiotropy, multiple quantitative trait loci are often taken to be pleiotropic if their confidence intervals overlap, without formal statistical tests being used to ascertain if these overlapping loci are statistically significantly pleiotropic. Additionally, the degree to which the genetic correlations between phenotypic traits are reflected in these pleiotropic quantitative trait loci is often variable, especially in the case of antagonistic pleiotropy. Similarly, the extent of epistasis in various morphological, behavioural and life-history traits is also debated, with a general problem being the sample sizes required to detect such effects. Domestication involves a large number of trade-offs, which are reflected in numerous behavioural, morphological and life-history traits which have evolved as a consequence of adaptation to selective pressures exerted by humans and captivity. The comparison between wild and domestic animals allows the genetic analysis of the traits that differ between these population types, as well as being a general model of evolution. Using a large F(2) intercross between wild and domesticated chickens, in combination with a dense SNP and microsatellite marker map, both pleiotropy and epistasis were analysed. The majority of traits were found to segregate in 11 tight 'blocks' and reflected the trade-offs associated with domestication. These blocks were shown to have a pleiotropic 'core' surrounded by more loosely linked loci. In contrast, epistatic interactions were almost entirely absent, with only six pairs identified over all traits analysed. These results give insights both into the extent of such blocks in evolution and the development of domestication itself.

  • 223.
    Yadetie, Fekadu
    et al.
    Laegreid, Astrid
    Bakke, Ingunn
    Kusnierczyk, Waclaw
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Faculty of Science and Technology, Biology, The Linnaeus Centre for Bioinformatics.
    Waldum, Helge L
    Sandvik, Arne K
    Liver gene expression in rats in response to the peroxisome proliferator-activated. 2003. In: Physiol Genomics, ISSN 1531-2267, Vol. 15, no 1, p. 9-19. Article in journal (Refereed)
  • 224.
    Álvarez-Castro, José M.
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Carlborg, Örjan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    A unified model for functional and statistical epistasis and its application in quantitative trait loci analysis. 2007. In: Genetics, ISSN 0016-6731, E-ISSN 1943-2631, Vol. 176, no 2, p. 1151-1167. Article in journal (Refereed)
    Abstract [en]

    Interaction between genes, or epistasis, is found to be common and it is a key concept for understanding adaptation and evolution of natural populations, response to selection in breeding programs, and determination of complex disease. Currently, two independent classes of models are used to study epistasis. Statistical models focus on maintaining desired statistical properties for detection and estimation of genetic effects and for the decomposition of genetic variance using average effects of allele substitutions in populations as parameters. Functional models focus on the evolutionary consequences of the attributes of the genotype-phenotype map using natural effects of allele substitutions as parameters. Here we provide a new, general and unified model framework: the natural and orthogonal interactions (NOIA) model. NOIA implements tools for transforming genetic effects measured in one population to the ones of other populations (e.g., between two experimental designs for QTL) and parameters of statistical and functional epistasis into each other (thus enabling us to obtain functional estimates of QTL), as demonstrated numerically. We develop graphical interpretations of functional and statistical models as regressions of the genotypic values on the gene content, which illustrate the difference between the models (the constraint on the slope of the functional regression) and when the models are equivalent. Furthermore, we use our theoretical foundations to conceptually clarify functional and statistical epistasis, discuss the advantages of NOIA over previous theory, and stress the importance of linking functional and statistical models.
