Uppsala University Publications
  • 1.
    Ameur, Adam
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Yankovski, Vladimir
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Spjuth, Ola
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    The LCB Data Warehouse. 2006. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 22, no 8, p. 1024-1026. Article in journal (Refereed)
    Abstract [en]

    The Linnaeus Centre for Bioinformatics Data Warehouse (LCB-DWH) is a web-based infrastructure for reliable and secure microarray gene expression data management and analysis that provides an online service for the scientific community. The LCB-DWH is an effort towards a complete system for storage (using the BASE system), analysis and publication of microarray data. Important features of the system include: access to established methods within R/Bioconductor for data analysis, built-in connection to the Gene Ontology database and a scripting facility for automatic recording and re-play of all the steps of the analysis. The service is up and running on a high performance server. At present there are more than 150 registered users.

  • 2.
    Andersson, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Molecular Evolution.
    Bernander, Rolf
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Molecular Evolution.
    Nilsson, Peter
    Dual-genome primer design for construction of DNA microarrays. 2005. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 21, no 3, p. 325-332. Article in journal (Refereed)
    Abstract [en]

    Motivation: Microarray experiments using probes covering a whole transcriptome are expensive to initiate, and a major part of the costs derives from synthesizing gene-specific PCR primers or hybridization probes. The high costs may force researchers to limit their studies to a single organism, although comparing gene expression in different species would yield valuable information.

    Results: We have developed a method, implemented in the software DualPrime, that reduces the number of primers required to amplify the genes of two different genomes. The software identifies regions of high sequence similarity, and from these regions selects PCR primers shared between the genomes, such that either one or, preferentially, both primers in a given PCR can be used for amplification from both genomes. To assure high microarray probe specificity, the software selects primer pairs that generate products of low sequence similarity to other genes within the same genome. We used the software to design PCR primers for 2182 and 1960 genes from the hyperthermophilic archaea Sulfolobus solfataricus and Sulfolobus acidocaldarius, respectively. Primer pairs were shared among 705 pairs of genes, and single primers were shared among 1184 pairs of genes, resulting in a saving of 31% compared to using only unique primers. We also present an alternative primer design method, in which each gene shares primers with two different genes of the other genome, enabling further savings.
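
The reported 31% saving can be checked with simple arithmetic. The sketch below assumes (the abstract does not say so explicitly) that every gene needs one primer pair, that a shared primer pair saves two primers per pair of genes, and that a shared single primer saves one. The function name is hypothetical.

```python
def primer_saving(genes_a, genes_b, shared_pairs, shared_singles):
    """Fraction of primers saved relative to two unique primers per gene."""
    unique_primers = 2 * (genes_a + genes_b)      # baseline: no sharing at all
    saved = 2 * shared_pairs + shared_singles     # primers that need not be made twice
    return saved / unique_primers

# Figures taken from the abstract: 2182 and 1960 genes, 705 and 1184 sharing gene pairs.
saving = primer_saving(2182, 1960, 705, 1184)
print(f"{saving:.1%}")  # prints 31.3%
```

Under these assumptions 2594 of 8284 primers are saved, which agrees with the stated figure.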

  • 3.
    Andersson, Robin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Bruder, Carl E G
    Piotrowski, Arkadiusz
    Menzel, Uwe
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Nord, Helena
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Sandgren, Johanna
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Surgical Sciences.
    Hvidsten, Torgeir R
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    de Ståhl, Teresita Diaz
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Dumanski, Jan P
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    A Segmental Maximum A Posteriori Approach to Genome-wide Copy Number Profiling. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 6, p. 751-758. Article in journal (Other academic)
    Abstract [en]

    MOTIVATION: Copy number profiling methods aim at assigning DNA copy numbers to chromosomal regions using measurements from microarray-based comparative genomic hybridizations. Among the proposed methods to this end, Hidden Markov Model (HMM)-based approaches seem promising since DNA copy number transitions are naturally captured in the model. Current discrete-index HMM-based approaches do not, however, take into account heterogeneous information regarding the genomic overlap between clones. Moreover, the majority of existing methods are restricted to chromosome-wise analysis. RESULTS: We introduce a novel Segmental Maximum A Posteriori approach, SMAP, for DNA copy number profiling. Our method is based on discrete-index Hidden Markov Modeling and incorporates genomic distance and overlap between clones. We exploit a priori information through user-controllable parameterization that enables the identification of copy number deviations of various lengths and amplitudes. The model parameters may be inferred at a genome-wide scale to avoid overfitting of model parameters often resulting from chromosome-wise model inference. We report superior performances of SMAP on synthetic data when compared with two recent methods. When applied on our new experimental data, SMAP readily recognizes already known genetic aberrations including both large-scale regions with aberrant DNA copy number and changes affecting only single features on the array. We highlight the differences between the prediction of SMAP and the compared methods and show that SMAP accurately determines copy number changes and benefits from overlap consideration.
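
To illustrate why copy number transitions fit naturally into an HMM, here is a generic log-space Viterbi decoder for a three-state discrete-index model. It is not SMAP: it ignores genomic distance and clone overlap, and the state means, noise level and transition probabilities are hypothetical values chosen only for the example.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Most probable state path for a discrete-index HMM (log-space Viterbi)."""
    score = {s: log_start[s] + log_emit(s, observations[0]) for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this position
            best_prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[best_prev] + log_trans[best_prev][s] + log_emit(s, obs)
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state
    path = [max(states, key=score.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]

STATES = ["loss", "normal", "gain"]
MEANS = {"loss": -1.0, "normal": 0.0, "gain": 1.0}  # hypothetical log-ratio means
SIGMA = 0.3                                         # hypothetical noise level

def log_emit(state, x):
    # Gaussian log-density up to a constant shared by all states
    return -((x - MEANS[state]) ** 2) / (2 * SIGMA ** 2)

log_start = {s: math.log(1 / 3) for s in STATES}
log_trans = {p: {s: math.log(0.9 if p == s else 0.05) for s in STATES} for p in STATES}

ratios = [0.1, -0.05, 0.9, 1.1, 0.95, 0.0]  # made-up measurements along a chromosome
path = viterbi(ratios, STATES, log_start, log_trans, log_emit)
print(path)
```

The high self-transition probability makes the decoder prefer contiguous segments, so a change of state is called only when the measurements support it strongly enough to pay for the transition.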

  • 4. Andersson, Siv G E
    Alsmark, Cecilia
    Canbäck, Björn
    Davids, Wagied
    Frank, Carolin
    Karlberg, Olof
    Klasson, Lisa
    Antoine-Legault, Boris
    Mira, Alex
    Tamas, Ivica
    Comparative genomics of microbial pathogens and symbionts. 2002. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 18 Suppl 2, p. S17-. Article in journal (Refereed)
    Abstract [en]

    We are interested in quantifying the contribution of gene acquisition, loss, expansion and rearrangements to the evolution of microbial genomes. Here, we discuss factors influencing microbial genome divergence based on pair-wise genome comparisons of closely related strains and species with different lifestyles. A particular focus is on intracellular pathogens and symbionts of the genera Rickettsia, Bartonella and Buchnera. Extensive gene loss and restricted access to phage and plasmid pools may provide an explanation for why single host pathogens are normally less successful than multihost pathogens. We note that species-specific genes tend to be shorter than orthologous genes, suggesting that a fraction of these may represent fossil-orfs, as also supported by multiple sequence alignments among species. The results of our genome comparisons are placed in the context of phylogenomic analyses of alpha and gamma proteobacteria. We highlight artefacts caused by different rates and patterns of mutations, suggesting that atypical phylogenetic placements cannot a priori be taken as evidence for horizontal gene transfer events. The flexibility in genome structure among free-living microbes contrasts with the extreme stability observed for the small genomes of aphid endosymbionts, in which no rearrangements or inflow of genetic material have occurred during the past 50 million years (1). Taken together, the results suggest that genomic stability correlates with the content of repeated sequences and mobile genetic elements, and thereby indirectly with bacterial lifestyles.

  • 5.
    Björkholm, Patrik
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Daniluk, Pawel
    Kryshtafovych, Andriy
    Fidelis, Krzysztof
    Andersson, Robin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Hvidsten, Torgeir R.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 10, p. 1264-1270. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Correct prediction of residue-residue contacts in proteins that lack good templates with known structure would take ab initio protein structure prediction a large step forward. The lack of correct contacts, and in particular long-range contacts, is considered the main reason why these methods often fail. RESULTS: We propose a novel hidden Markov model based method for predicting residue-residue contacts from protein sequences using as training data homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structure). The library consists of recurring structural entities incorporating short-, medium- and long-range interactions and is general enough to reassemble the cores of nearly all proteins in the PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set as well as 151 domains with SCOP folds not present in the training set. Considering the top 0.2 L predictions (L = sequence length), our hidden Markov models obtained an accuracy of 22.8% for long-range interactions in new fold targets, and an average accuracy of 28.6% for long-, medium- and short-range contacts. This is a significant performance increase over currently available methods when comparing against results published in the literature.
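
The "top 0.2 L" evaluation can be sketched as follows. This is only an illustration of the metric, assuming predictions arrive as scored residue pairs and that accuracy means the fraction of the selected pairs found in the known structure; the function name and data layout are hypothetical.

```python
def top_fraction_accuracy(scored_pairs, true_contacts, seq_len, fraction=0.2):
    """Accuracy of the highest-scoring `fraction * seq_len` predicted contacts.

    scored_pairs:  dict mapping residue pairs (i, j) to a prediction score
    true_contacts: set of residue pairs (i, j) observed in the known structure
    """
    n_top = max(1, int(fraction * seq_len))
    ranked = sorted(scored_pairs, key=scored_pairs.get, reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)

# Toy example: sequence length 10, so the top 2 predictions are evaluated.
scores = {(1, 8): 0.9, (2, 9): 0.7, (3, 7): 0.4}
contacts = {(1, 8), (3, 7)}
print(top_fraction_accuracy(scores, contacts, seq_len=10))  # prints 0.5
```

Restricting the evaluation to long-range contacts would additionally require filtering pairs by sequence separation before ranking, with a threshold the abstract does not specify.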

  • 6.
    Bystry, Vojtech
    Masaryk Univ, CEITEC Cent European Inst Technol, Brno, Czech Republic..
    Agathangelidis, Andreas
    IRCCS San Raffaele Sci Inst, Div Mol Oncol, Milan, Italy.;IRCCS San Raffaele Sci Inst, Dept Oncohematol, Milan, Italy.;Univ Vita Salute San Raffaele, Milan, Italy..
    Bikos, Vasilis
    Masaryk Univ, CEITEC Cent European Inst Technol, Brno, Czech Republic..
    Sutton, Lesley Ann
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology.
    Baliakas, Panagiotis
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Experimental and Clinical Oncology.
    Hadzidimitriou, Anastasia
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology. Ctr Res & Technol Hellas, Inst Appl Biosci, Thessaloniki, Greece..
    Stamatopoulos, Kostas
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology. Ctr Res & Technol Hellas, Inst Appl Biosci, Thessaloniki, Greece..
    Darzentas, Nikos
    Masaryk Univ, CEITEC Cent European Inst Technol, Brno, Czech Republic..
    ARResT/AssignSubsets: a novel application for robust subclassification of chronic lymphocytic leukemia based on B cell receptor IG stereotypy. 2015. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 31, no 23, p. 3844-3846. Article in journal (Refereed)
    Abstract [en]

    Motivation: An ever-increasing body of evidence supports the importance of B cell receptor immunoglobulin (BcR IG) sequence restriction, alias stereotypy, in chronic lymphocytic leukemia (CLL). This phenomenon accounts for approximately 30% of studied cases, one in eight of which belong to major subsets, and extends beyond restricted sequence patterns to shared biologic and clinical characteristics and, generally, outcome. Thus, the robust assignment of new cases to major CLL subsets is a critical, and yet unmet, requirement. Results: We introduce a novel application, ARResT/AssignSubsets, which enables the robust assignment of BcR IG sequences from CLL patients to major stereotyped subsets. ARResT/AssignSubsets uniquely combines expert immunogenetic sequence annotation from IMGT/V-QUEST with curation to safeguard quality, statistical modeling of sequence features from more than 7500 CLL patients, and results from multiple perspectives to allow for both objective and subjective assessment. We validated our approach on the learning set, and evaluated its real-world applicability on a new representative dataset comprising 459 sequences from a single institution.

  • 7.
    Carlborg, Örjan
    Roslin Institute.
    De Koning, D J
    Manly, K F
    Chesler, E
    Williams, R W
    Haley, C S
    Methodological aspects of the genetic dissection of gene expression. 2005. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 21, no 10. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Dissection of the genetics underlying gene expression utilizes techniques from microarray analyses as well as quantitative trait loci (QTL) mapping. Available QTL mapping methods are not tailored for the highly automated analyses required to deal with the thousands of gene transcripts encountered in the mapping of QTL affecting gene expression (sometimes referred to as eQTL). This report focuses on the adaptation of QTL mapping methodology to perform automated mapping of QTL affecting gene expression.

    RESULTS: The analyses of expression data on > 12,000 gene transcripts in BXD recombinant inbred mice found, on average, 629 QTL exceeding the genome-wide 5% threshold. Using additional information on trait repeatabilities and QTL location, 168 of these were classified as 'high confidence' QTL. Current sample sizes of genetical genomics studies make it possible to detect a reasonable number of QTL using simple genetic models, but considerably larger studies are needed to evaluate more complex genetic models. After extensive analyses of real data and additional simulated data (altogether > 300,000 genome scans) we make the following recommendations for detection of QTL for gene expression: (1) For populations with an unbalanced number of replicates on each genotype, weighted least squares should be preferred above ordinary least squares. Weights can be based on repeatability of the trait and the number of replicates. (2) A genome scan based on multiple marker information but analysing only at marker locations is a good approximation to a full interval mapping procedure. (3) Significance testing should be based on empirical genome-wide significance thresholds that are derived for each trait separately. (4) The significant QTL can be separated into high and low confidence QTL using a false discovery rate that incorporates prior information such as transcript repeatabilities and co-localization of gene-transcripts and QTL. (5) Including observations on the founder lines in the QTL analysis should be avoided as it inflates the test statistic and increases the Type I error. (6) To increase the computational efficiency of the study, use of parallel computing is advised. These recommendations are summarized in a possible strategy for mapping of QTL in a least squares framework.

    AVAILABILITY: The software used for this study is available on request from the authors.
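
Recommendation (1) above can be illustrated with a closed-form weighted least squares fit of a trait on a single marker. This is a minimal sketch, not the authors' software: it assumes one genotype code and one mean expression value per line, and uses the number of replicates as the weight (the abstract also mentions trait repeatability, which is left out here). All names are hypothetical.

```python
def weighted_least_squares(x, y, w):
    """Fit y = intercept + slope * x by minimizing sum(w * residual**2)."""
    total = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_mean - slope * x_mean, slope

genotype = [0, 1, 1]          # marker genotype code per line
expression = [0.0, 1.0, 3.0]  # mean expression per line
replicates = [1, 3, 1]        # number of replicates behind each mean

print(weighted_least_squares(genotype, expression, [1, 1, 1]))  # ordinary least squares
print(weighted_least_squares(genotype, expression, replicates)) # replicate-weighted fit
```

With equal weights the estimated marker effect is 2.0; weighting by replicates pulls it towards the better-replicated line and gives 1.5.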

  • 8. Caulfield, Emmet
    Hellander, Andreas
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    CellMC: a multiplatform model compiler for the Cell Broadband Engine and x86. 2010. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 26, p. 426-428. Article in journal (Refereed)
  • 9. Das, Sarbashis
    Vishnoi, Anchal
    Bhattacharya, Alok
    ABWGAT: anchor-based whole genome analysis tool. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 24, p. 3319-20. Article in journal (Refereed)
    Abstract [en]

    SUMMARY: Large numbers of genomes are being sequenced regularly and the rate will go up in future due to availability of new genome sequencing techniques. In order to understand genotype to phenotype relationships, it is necessary to identify sequence variations at the genomic level. Alignment of a pair of genomes and parsing the alignment data is an accepted approach for identification of variations. Though there are a number of tools available for whole-genome alignment, none of these allows automatic parsing of the alignment and identification of different kinds of genomic variants with high degree of sensitivity. Here we present a simple web-based interface for whole genome comparison named ABWGAT (Anchor-Based Whole Genome Analysis Tool) that is simple to use. The output is a list of variations such as SNVs, indels, repeat expansion and inversion.

    AVAILABILITY: The web server is freely available to non-commercial users at the following address http://abwgc.jnu.ac.in/_sarba. Supplementary data are available at http://abwgc.jnu.ac.in/_sarba/cgi-bin/abwgc_retrival.cgi using job id 524, 526 and 528.

    CONTACT: dsarbashis@gmail.com; alok.bhattacharya@gmail.com

  • 10. Draminski, Michal
    Rada-Iglesias, Alvaro
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Enroth, Stefan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Wadelius, Claes
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Koronacki, Jacek
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Monte Carlo feature selection for supervised classification. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 1, p. 110-117. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Pre-selection of informative features for supervised classification is a crucial, albeit delicate, task. It is desirable that feature selection provides the features that contribute most to the classification task per se and which should therefore be used by any classifier later used to produce classification rules. In this article, a conceptually simple but computer-intensive approach to this task is proposed. The reliability of the approach rests on multiple construction of a tree classifier for many training sets randomly chosen from the original sample set, where samples in each training set consist of only a fraction of all of the observed features. RESULTS: The resulting ranking of features may then be used to advantage for classification via a classifier of any type. The approach was validated using Golub et al. leukemia data and the Alizadeh et al. lymphoma data. Not surprisingly, we obtained a significantly different list of genes. Biological interpretation of the genes selected by our method showed that several of them are involved in precursors to different types of leukemia and lymphoma rather than being genes that are common to several forms of cancers, which is the case for the other methods.

  • 11. Emami Khoonsari, Payam
    Moreno, Pablo
    Bergmann, Sven
    Burman, Joachim
    Capuccini, Marco
    Carone, Matteo
    Cascante, Marta
    de Atauri, Pedro
    Foguet, Carles
    Gonzalez-Beltran, Alejandra N
    Hankemeier, Thomas
    Haug, Kenneth
    He, Sijin
    Herman, Stephanie
    Johnson, David
    Kale, Namrata
    Larsson, Anders
    Neumann, Steffen
    Peters, Kristian
    Pireddu, Luca
    Rocca-Serra, Philippe
    Roger, Pierrick
    Rueedi, Rico
    Ruttkies, Christoph
    Sadawi, Noureddin
    Salek, Reza M
    Sansone, Susanna-Assunta
    Schober, Daniel
    Selivanov, Vitaly
    Thévenot, Etienne A
    van Vliet, Michael
    Zanetti, Gianluigi
    Steinbeck, Christoph
    Kultima, Kim
    Spjuth, Ola
    Interoperable and scalable data analysis with microservices: applications in metabolomics. 2019. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811. Article in journal (Refereed)
    Abstract [en]

    Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and developing scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry, one nuclear magnetic resonance spectroscopy and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability using integration of the major software suites resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens up for new types of large-scale integrative science. The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects. Supplementary data are available at Bioinformatics online.

  • 12.
    Fange, David
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    Fange, David
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    Elf, Johan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    MesoRD 1.0: Stochastic reaction-diffusion simulations in the microscopic limit. 2012. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811. Article in journal (Refereed)
  • 13.
    Fange, David
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Mahmutovic, Anel
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Elf, Johan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    MesoRD 1.0: Stochastic reaction-diffusion simulations in the microscopic limit. 2012. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 28, no 23, p. 3155-3157. Article in journal (Refereed)
    Abstract [en]

    MesoRD is a tool for simulating stochastic reaction-diffusion systems as modeled by the reaction diffusion master equation. The simulated systems are defined in the Systems Biology Markup Language with additions to define compartment geometries. MesoRD 1.0 supports scale-dependent reaction rate constants and reactions between reactants in neighbouring subvolumes. These new features make it possible to construct physically consistent models of diffusion-controlled reactions also at fine spatial discretization.

  • 14.
    Freyhult, Eva
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Moulton, Vincent
    Clote, Peter
    Boltzmann probability of RNA structural neighbors and riboswitch detection. 2007. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 23, no 16, p. 2054-2062. Article in journal (Refereed)
    Abstract [en]

    Motivation: We describe algorithms implemented in a new software package, RNAbor, to investigate structures in a neighborhood of an input secondary structure of an RNA sequence s. The input structure could be the minimum free energy structure, the secondary structure obtained by analysis of the X-ray structure or by comparative sequence analysis, or an arbitrary intermediate structure.

    Results: A secondary structure T of s is called a δ-neighbor of S if T and S differ by exactly δ base pairs. RNAbor computes the number (N^δ), the Boltzmann partition function (Z^δ) and the minimum free energy (MFE^δ) and corresponding structure over the collection of all δ-neighbors of S. This computation is done simultaneously for all δ ≤ m, in run time O(mn^3) and memory O(mn^2), where n is the sequence length. We apply RNAbor for the detection of possible RNA conformational switches, and compare RNAbor with the switch detection method paRNAss. We also provide examples of how RNAbor can at times improve the accuracy of secondary structure prediction.
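
The neighborhood notion above rests on counting the base pairs by which two secondary structures differ. A minimal sketch of that count, assuming structures are given in dot-bracket notation without pseudoknots and that "differ by" means the symmetric difference of the two base-pair sets; this is not the RNAbor implementation, and the function names are hypothetical.

```python
def base_pairs(dot_bracket):
    """Return the set of base pairs (i, j) encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs

def base_pair_distance(s, t):
    """Number of base pairs present in exactly one of the two structures."""
    if len(s) != len(t):
        raise ValueError("structures must belong to the same sequence")
    return len(base_pairs(s) ^ base_pairs(t))

print(base_pair_distance("((..))", "(....)"))  # prints 1
print(base_pair_distance("((..))", "......"))  # prints 2
```

Under this definition the second structure in each call is a 1-neighbor and a 2-neighbor of "((..))", respectively.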

  • 15. Garber, Manuel
    Guttman, Mitchell
    Clamp, Michele
    Zody, Michael C.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Friedman, Nir
    Xie, Xiaohui
    Identifying novel constrained elements by exploiting biased substitution patterns. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 12, p. I54-I62. Article in journal (Refereed)
    Abstract [en]

    Motivation: Comparing the genomes from closely related species provides a powerful tool to identify functional elements in a reference genome. Many methods have been developed to identify conserved sequences across species; however, existing methods only model conservation as a decrease in the rate of mutation and have ignored selection acting on the pattern of mutations. Results: We present a new approach that takes advantage of deeply sequenced clades to identify evolutionary selection by uncovering not only signatures of rate-based conservation but also substitution patterns characteristic of sequence undergoing natural selection. We describe a new statistical method for modeling biased nucleotide substitutions, a learning algorithm for inferring site-specific substitution biases directly from sequence alignments and a hidden Markov model for detecting constrained elements characterized by biased substitutions. We show that the new approach can identify significantly more degenerate constrained sequences than rate-based methods. Applying it to the ENCODE regions, we find that as much as 10.2% of these regions are under selection.

  • 16.
    Gennemark, Peter
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Mathematics.
    Wedelin, Dag
    Benchmarks for identification of ordinary differential equations from time series data. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 6, p. 780-786. Article in journal (Refereed)
    Abstract [en]

    Motivation: In recent years, the biological literature has seen a significant increase of reported methods for identifying both structure and parameters of ordinary differential equations (ODEs) from time series data. A natural way to evaluate the performance of such methods is to try them on a sufficient number of realistic test cases. However, weak practices in specifying identification problems and lack of commonly accepted benchmark problems make it difficult to evaluate and compare different methods. Results: To enable better evaluation and comparisons between different methods, we propose how to specify identification problems as optimization problems with a model space of allowed reactions (e.g. reaction kinetics like Michaelis-Menten or S-systems), ranges for the parameters, time series data and an error function. We also define a file format for such problems. We then present a collection of more than 40 benchmark problems for ODE model identification of cellular systems. The collection includes realistic problems of different levels of difficulty w.r.t. size and quality of data. We consider both problems with simulated data from known systems, and problems with real data. Finally, we present results based on our identification algorithm for all benchmark problems. In comparison with publications on which we have based some of the benchmark problems, our approach allows all problems to be solved without the use of supercomputing.
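
A toy instance shows what "identification as an optimization problem" looks like in practice. The sketch assumes a model space with a single allowed reaction (first-order decay, dx/dt = -k x), a parameter range for k, simulated time series data and a sum-of-squares error function. The model, the Euler integrator and the grid search are illustrative choices, not the authors' benchmark format or algorithm.

```python
def simulate_decay(k, x0, dt, n_steps):
    """Explicit Euler integration of dx/dt = -k * x."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * (-k * xs[-1]))
    return xs

def squared_error(k, data, x0, dt):
    """Error function: sum of squared deviations between model and data."""
    simulated = simulate_decay(k, x0, dt, len(data) - 1)
    return sum((s - d) ** 2 for s, d in zip(simulated, data))

def identify(data, x0, dt, k_range, n_grid=10):
    """Pick the parameter value within k_range that minimizes the error."""
    low, high = k_range
    candidates = [low + (high - low) * i / n_grid for i in range(n_grid + 1)]
    return min(candidates, key=lambda k: squared_error(k, data, x0, dt))

# Simulated data from a known system (k = 0.5), then identification.
data = simulate_decay(0.5, x0=1.0, dt=0.1, n_steps=20)
best_k = identify(data, x0=1.0, dt=0.1, k_range=(0.0, 1.0))
print(best_k)  # prints 0.5
```

Real benchmark problems add what this sketch leaves out: several candidate reaction types per equation, noisy or sparse data, and a search that scales beyond a one-dimensional grid.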

  • 17. Ghahremanpour, Mohammad Mehdi
    et al.
    Arab, Seyed Shahriar
    Aghazadeh, Saman Biook
    Zhang, Jin
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology.
    van der Spoel, David
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology.
    MemBuilder: a web-based graphical interface to build heterogeneously mixed membrane bilayers for the GROMACS biomolecular simulation program. 2014. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 30, no 3, p. 439-441. Article in journal (Refereed)
    Abstract [en]

    Motivation: Molecular dynamics (MD) simulations have had a profound impact on studies of membrane proteins during the past two decades, but the accuracy of MD simulations of membranes is limited by the quality of membrane models and the applied force fields. Membrane models used in MD simulations mostly contain one kind of lipid molecule. This is far from reality, for biological membranes always contain more than one kind of lipid molecule. Moreover, the lipid composition and distribution are functionally important. As a result, there is a need to prepare more realistic lipid membranes containing different types of lipids at physiological concentrations. Results: To automate and simplify the building of heterogeneous lipid bilayers, and to provide molecular topologies for the included lipids based on both united and all-atom force fields, we provide MemBuilder as a web-based graphical user interface.

  • 18. Grabherr, Manfred G
    et al.
    Russell, Pamela
    Meyer, Miriah
    Mauceli, Evan
    Alföldi, Jessica
    Di Palma, Federica
    Lindblad-Toh, Kerstin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Genome-wide synteny through highly sensitive sequence alignment: Satsuma. 2010. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 26, no 9, p. 1145-1151. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Comparative genomics heavily relies on alignments of large and often complex DNA sequences. From an engineering perspective, the problem here is to provide maximum sensitivity (to find all there is to find), specificity (to only find real homology) and speed (to accommodate the billions of base pairs of vertebrate genomes). RESULTS: Satsuma addresses all three issues through novel strategies: (i) cross-correlation, implemented via fast Fourier transform; (ii) a match scoring scheme that eliminates almost all false hits; and (iii) an asynchronous 'battleship'-like search that allows for aligning two entire fish genomes (470 and 217 Mb) in 120 CPU hours using 15 processors on a single machine. AVAILABILITY: Satsuma is part of the Spines software package, implemented in C++ on Linux. The latest version of Spines can be freely downloaded under the LGPL license from http://www.broadinstitute.org/science/programs/genome-biology/spines/.
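
    Strategy (i), cross-correlation via fast Fourier transform, can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not Satsuma's implementation: the function names are hypothetical, sequences are assumed to contain only A, C, G and T, and a simple recursive radix-2 FFT stands in for an optimized library. For each base, an indicator vector is built for both sequences; the summed cross-correlations give the number of matching positions at every relative shift.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.
    With invert=True the result still has to be divided by len(values)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(seq_a, seq_b):
    """counts[s] = number of i with seq_a[i] == seq_b[i + s].
    Negative shifts s are stored at index len(counts) + s."""
    size = 1
    while size < len(seq_a) + len(seq_b):  # zero-pad to avoid wrap-around
        size *= 2
    totals = [0.0] * size
    for base in "ACGT":
        ind_a = [1.0 if c == base else 0.0 for c in seq_a]
        ind_b = [1.0 if c == base else 0.0 for c in seq_b]
        ind_a += [0.0] * (size - len(ind_a))
        ind_b += [0.0] * (size - len(ind_b))
        spectrum = [x.conjugate() * y for x, y in zip(fft(ind_a), fft(ind_b))]
        correlation = fft(spectrum, invert=True)
        for s in range(size):
            totals[s] += correlation[s].real / size
    return [round(t) for t in totals]


def best_shift(seq_a, seq_b):
    """Return (shift, matches) for the shift with the most matching bases."""
    counts = match_counts(seq_a, seq_b)
    index = max(range(len(counts)), key=counts.__getitem__)
    shift = index if index < len(seq_b) else index - len(counts)
    return shift, counts[index]


result = best_shift("GATTACA", "CCCGATTACACC")
```

    The FFT evaluates all shifts at once in O(n log n) time instead of comparing the sequences shift by shift, which is the reason cross-correlation is attractive for long sequences.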

  • 19.
    Guy, Lionel
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    phyloSkeleton: taxon selection, data retrieval and marker identification for phylogenomics. 2017. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 33, no 8, p. 1230-1232. Article in journal (Refereed)
    Abstract [en]

    With the wealth of available genome sequences, a difficult and tedious part of inferring phylogenomic trees is now to select genomes with an appropriate taxon density in the different parts of the tree. The package described here offers tools to easily select the most representative organisms, following a set of simple rules based on taxonomy and assembly quality, to retrieve the genomes from public databases (NCBI, JGI), to annotate them if necessary, to identify given markers in these, and to prepare files for multiple sequence alignment.

    AVAILABILITY AND IMPLEMENTATION: phyloSkeleton is a Perl module and is freely available under GPLv3 at https://bitbucket.org/lionelguy/phyloskeleton/ CONTACT: lionel.guy@imbim.uu.se.

  • 20.
    Guy, Lionel
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organism Biology, Molecular Evolution.
    Roat Kultima, Jens
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organism Biology, Molecular Evolution.
    Andersson, Siv G.E.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organism Biology, Molecular Evolution.
    genoPlotR: comparative gene and genome visualization in R. 2010. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 26, no 18, p. 2334-2335. Article in journal (Refereed)
    Abstract [en]

    The amount of gene and genome data obtained by next-generation sequencing technologies generates a need for comparative visualization tools. Complementing existing software for comparison and exploration of genomics data, genoPlotR creates publication-grade linear maps of genes and genomes in a highly automatic, flexible and reproducible way.

    Availability: genoPlotR is a platform-independent R package, available with full source code under a GPL2 license at R-Forge: http://genoplotr.r-forge.r-project.org/

    Contact: lionel.guy@ebc.uu.se

  • 21.
    Irrgang, M Eric
    et al.
    University of Virginia.
    Hays, Jennifer M
    University of Virginia.
    Kasson, Peter M.
    University of Virginia.
    gmxapi: a high-level interface for advanced control and extension of molecular dynamics simulations. 2018. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811. Article in journal (Refereed)
    Abstract [en]

    Summary: Molecular dynamics simulations have found use in a wide variety of biomolecular applications, from protein folding kinetics to computational drug design to refinement of molecular structures. Two areas where users and developers frequently need to extend the built-in capabilities of most software packages are implementing custom interactions, for instance biases derived from experimental data, and running ensembles of simulations. We present a Python high-level interface for the popular simulation package GROMACS that 1) allows custom potential functions without modifying the simulation package code, 2) maintains the optimized performance of GROMACS, and 3) presents an abstract interface to building and executing computational graphs that allows transparent low-level optimization of data flow and task placement. Minimal dependencies make this integrated API for the GROMACS simulation engine simple, portable, and maintainable. We demonstrate this API for experimentally-driven refinement of protein conformational ensembles.

    Availability: LGPLv2.1 source and instructions are available at https://github.com/kassonlab/gmxapi.

    Supplementary information: Supplementary data are available at Bioinformatics online.

  • 22.
    Jakobsson, Mattias
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Evolutionary Biology.
    COMPASS: a program for generating serial samples under an infinite sites model. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 21, p. 2845-2847. Article in journal (Refereed)
    Abstract [en]

    The program COMPASS can generate samples that have been collected at various points in time from a population that is evolving according to a Wright-Fisher model. The samples are generated using coalescence simulations permitting various demographic scenarios, and the program uses an infinite sites model to generate polymorphism data for the samples. By generating serially sampled population-genetic data, COMPASS allows investigating properties of polymorphism data that have been collected at different time points, and aids in making inferences from ancient polymorphism data.

  • 23. Jeszenoi, Norbert
    et al.
    Horvath, Istvan
    Balint, Monika
    van der Spoel, David
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology.
    Hetenyi, Csaba
    Mobility-based prediction of hydration structures of protein surfaces. 2015. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 31, no 12, p. 1959-1965. Article in journal (Refereed)
    Abstract [en]

    Motivation: Hydration largely determines the solubility and aggregation of proteins and influences interactions between proteins and drug molecules. Despite the importance of hydration, structural determination of the hydration structure of protein surfaces is still challenging from both experimental and theoretical viewpoints. The precision of experimental measurements is often affected by fluctuations and mobility of water molecules, resulting in uncertain assignment of water positions. Results: Our method can utilize mobility as an information source for the prediction of hydration structure. The necessary information can be produced by molecular dynamics simulations accounting for all atomic interactions including water-water contacts. The predictions were validated and tested by comparison to more than 1500 crystallographic water positions in 20 hydrated protein molecules including enzymes of biomedical importance such as cyclin-dependent kinase 2. The agreement with experimental water positions was larger than 80% on average. The predictions can be particularly useful in situations where no or limited experimental knowledge is available on hydration structures of molecular surfaces.

  • 24. Keane, Michael
    et al.
    Craig, Thomas
    Alfoldi, Jessica
    Berlin, Aaron M
    Johnson, Jeremy
    Seluanov, Andrei
    Gorbunova, Vera
    Di Palma, Federica
    Lindblad-Toh, Kerstin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Church, George M
    de Magalhaes, Joao Pedro
    The Naked Mole Rat Genome Resource: facilitating analyses of cancer and longevity-related adaptations. 2014. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 30, no 24, p. 3558-3560. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: The naked mole rat (Heterocephalus glaber) is an exceptionally long-lived and cancer-resistant rodent native to East Africa. Although its genome was previously sequenced, here we report a new assembly sequenced by us with substantially higher N50 values for scaffolds and contigs. RESULTS: We analyzed the annotation of this new improved assembly and identified candidate genomic adaptations which may have contributed to the evolution of the naked mole rat's extraordinary traits, including in regions of p53, and the hyaluronan receptors CD44 and HMMR (RHAMM). Furthermore, we developed a freely available web portal, the Naked Mole Rat Genome Resource (http://www.naked-mole-rat.org), featuring the data and results of our analysis, to assist researchers interested in the genome and genes of the naked mole rat, and also to facilitate further studies on this fascinating species. AVAILABILITY AND IMPLEMENTATION: The Naked Mole Rat Genome Resource is freely available online at http://www.naked-mole-rat.org. This resource is open source and the source code is available at https://github.com/maglab/naked-mole-rat-portal. CONTACT: jp@senescence.info.

  • 25.
    Kierczak, Marcin
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Jablonska, Jagoda
    Uppsala University, Science for Life Laboratory, SciLifeLab. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Forsberg, S. K.
    Bianchi, Matteo
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Tengvall, Katarina
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Pettersson, Mats
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Scholz, Veronica
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Meadows, Jennifer R.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Jern, Patric
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Carlborg, Örjan
    Swedish University of Agricultural Sciences, Uppsala, Sweden.
    Lindblad-Toh, Kerstin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    cgmisc: Enhanced Genome-wide Association Analyses and Visualisation. 2015. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 31, no 23, p. 3830-3831. Article in journal (Refereed)
    Abstract [en]

    SUMMARY:

    High-throughput genotyping and sequencing technologies facilitate studies of complex genetic traits and provide new research opportunities. The increasing popularity of genome-wide association studies (GWAS) leads to the discovery of new associated loci and a better understanding of the genetic architecture underlying not only diseases, but also other monogenic and complex phenotypes. Several software packages are available for performing GWAS analyses, the R environment being one of them.

    RESULTS: We present cgmisc, an R package that enables enhanced data analysis and visualisation of results from GWAS. The package contains several utilities and modules that complement and enhance the functionality of the existing software. It also provides several tools for advanced visualisation of genomic data and utilises the power of the R language to aid in preparation of publication-quality figures. Some of the package functions are specific to domestic dog (Canis familiaris) data.

    AVAILABILITY: The package is operating system-independent and is available from: https://github.com/cgmisc-team/cgmisc CONTACT: cgmisc@imbim.uu.se.

  • 26.
    Lapinsh, Maris
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences, Pharmaceutical Pharmacology.
    Prusis, Peteris
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences, Pharmaceutical Pharmacology.
    Uhlén, Staffan
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences, Pharmaceutical Pharmacology.
    Wikberg, Jarl E S
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences, Pharmaceutical Pharmacology.
    Improved approach for proteochemometrics modeling: application to organic compound–amine G protein-coupled receptor interactions. 2005. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 21, no 23, p. 4289-4296. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Proteochemometrics is a novel technology for the analysis of interactions of series of proteins with series of ligands. We have here customized it for analysis of large datasets and evaluated it for the modeling of the interaction of psychoactive organic amines with all the five known families of amine G protein-coupled receptors (GPCRs). RESULTS: The model exploited data for the binding of 22 compounds to 31 amine GPCRs, correlating chemical descriptions and cross-descriptions of compounds and receptors to binding affinity using a novel strategy. A highly valid model (q2 = 0.76) was obtained which was further validated by external predictions using data for 10 other entirely independent compounds, yielding the high q2ext = 0.67. Interpretation of the model reveals molecular interactions that govern the overall affinity of psychoactive organic amines for amine GPCRs, as well as their selectivity for particular amine GPCRs. The new modeling procedure allows us to obtain fully interpretable proteochemometrics models using an essentially unlimited number of ligand and protein descriptors.

  • 27.
    Larsson, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology.
    AliView: a fast and lightweight alignment viewer and editor for large data sets. 2014. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 30, no 22, p. 3276-3278. Article in journal (Refereed)
    Abstract [en]

    Summary: AliView is an alignment viewer and editor designed to meet the requirements of next generation sequencing era phylogenetic datasets. AliView handles alignments of unlimited size in the formats most commonly used, i.e. Fasta, Phylip, Nexus, Clustal and MSF. The intuitive graphical interface makes it easy to inspect, sort, delete, merge and realign sequences as part of the manual filtering process of large data sets. AliView also works as an easy to use alignment editor for small as well as large data sets.

    Availability: AliView is released as open-source software under the GNU General Public License, version 3.0 (GPLv3), and is available at GitHub (www.github.com/AliView). The program is cross-platform and extensively tested on Linux, Mac OS X and Windows systems. Downloads and help are available at http://ormbunkar.se/aliview

    Contact: anders.larsson@ebc.uu.se

  • 28.
    Lindén, Martin
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Molecular Systems Biology.
    Ćurić, Vladimir
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Molecular Systems Biology.
    Boucharin, Alexis
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Molecular Systems Biology.
    Fange, David
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Molecular Systems Biology.
    Elf, Johan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Molecular Systems Biology.
    Simulated single molecule microscopy with SMeagol. 2016. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 32, no 15, p. 2394-2395. Article in journal (Refereed)
    Abstract [en]

    Summary: SMeagol is a software tool to simulate highly realistic microscopy data based on spatial systems biology models, in order to facilitate development, validation and optimization of advanced analysis methods for live cell single molecule microscopy data.

    Availability and implementation: SMeagol runs on Matlab R2014 and later, and uses compiled binaries in C for reaction–diffusion simulations. Documentation, source code and binaries for Mac OS, Windows and Ubuntu Linux can be downloaded from http://smeagol.sourceforge.net.

  • 29.
    Ljungberg, Kajsa
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Carlborg, Örjan
    Simultaneous search for multiple QTL using the global optimization algorithm DIRECT. 2004. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 20, p. 1887-1895. Article in journal (Refereed)
  • 30.
    Mahjani, Behrang
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Toor, Salman
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Nettelblad, Carl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    QTL as a service: PruneDIRECT for multi-dimensional QTL scans in cloud settings. 2016. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811. Article in journal (Other academic)
  • 31.
    Marzouka, Nour-al-dain
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Molecular Medicine.
    Nordlund, Jessica
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Molecular Medicine.
    Bäcklin, Christofer L.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Molecular Medicine. Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Cancer Pharmacology and Computational Medicine.
    Lönnerholm, Gudmar
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Women's and Children's Health, Pediatrics.
    Syvänen, Ann-Christine
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Molecular Medicine.
    Almlöf, Jonas Carlsson
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Molecular Medicine.
    CopyNumber450kCancer: baseline correction for accurate copy number calling from the 450k methylation array. 2016. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 32, no 7, p. 1080-1082. Article in journal (Refereed)
    Abstract [en]

    The Illumina Infinium HumanMethylation450 BeadChip (450k) is widely used for the evaluation of DNA methylation levels in large-scale datasets, particularly in cancer. The 450k design allows copy number variant (CNV) calling using existing bioinformatics tools. However, in cancer samples, numerous large-scale aberrations cause shifting in the probe intensities and thereby may result in erroneous CNV calling. Therefore, a baseline correction process is needed. We suggest the maximum peak of probe segment density to correct the shift in the intensities in cancer samples.
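
    The idea of using the maximum peak of the segment density as the baseline can be sketched in Python as follows. This is a rough illustration, not the package's code: the function names, the `(mean_log_ratio, probe_count)` segment representation, the probe-count weighting and the 0.05 bin width are all assumptions, and a plain histogram stands in for whatever density estimate the authors use.

```python
def baseline_shift(segments, bin_width=0.05):
    """Estimate the baseline as the centre of the most populated bin.

    segments: list of (mean_log_ratio, probe_count) tuples (assumed layout).
    Each segment contributes its probe count to a histogram of segment means.
    """
    weights = {}
    for mean, probes in segments:
        bin_index = round(mean / bin_width)
        weights[bin_index] = weights.get(bin_index, 0) + probes
    peak_bin = max(weights, key=weights.get)
    return peak_bin * bin_width


def correct_baseline(segments, bin_width=0.05):
    """Subtract the estimated shift so the density peak sits at zero."""
    shift = baseline_shift(segments, bin_width)
    return [(mean - shift, probes) for mean, probes in segments]


# Toy sample in which most probes sit near +0.3 because of a global shift.
SEGMENTS = [(0.31, 500), (0.29, 400), (0.80, 200), (-0.20, 100)]
shift = baseline_shift(SEGMENTS)
corrected = correct_baseline(SEGMENTS)
```

    After correction the dominant group of segments lies close to zero, so gains and losses are called relative to the most common copy-number state rather than to the shifted raw intensities.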

  • 32. Money, Daniel
    et al.
    Whelan, Simon
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology.
    GeLL: a generalized likelihood library for phylogenetic models. 2015. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 31, no 14, p. 2391-2393. Article in journal (Refereed)
    Abstract [en]

    Phylogenetic models are an important tool in molecular evolution allowing us to study the pattern and rate of sequence change. The recent influx of new sequence data in the biosciences means that to address evolutionary questions, we need a means for rapid and easy model development and implementation. Here we present GeLL, a Java library that lets users use text to quickly and efficiently define novel forms of discrete data and create new substitution models that describe how those data change on a phylogeny. GeLL allows users to define general substitution models and data structures in a way that is not possible in other existing libraries, including mixture models and non-reversible models. Classes are provided for calculating likelihoods, optimizing model parameters and branch lengths, ancestral reconstruction and sequence simulation.

  • 33.
    Novella, Jon Ander
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Emami Khoonsari, Payam
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Chemistry.
    Herman, Stephanie
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Chemistry. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Whitenack, Daniel
    Capuccini, Marco
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Burman, Joachim
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience, Neurology.
    Kultima, Kim
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Chemistry.
    Spjuth, Ola
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Container-based bioinformatics with Pachyderm. 2019. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 35, p. 839-846. Article in journal (Refereed)
  • 34. Pronk, Sander
    et al.
    Pall, Szilard
    Schulz, Roland
    Larsson, Per
    Bjelkmar, Par
    Apostolov, Rossen
    Shirts, Michael R.
    Smith, Jeremy C.
    Kasson, Peter M.
    van der Spoel, David
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology.
    Hess, Berk
    Lindahl, Erik
    GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. 2013. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 29, no 7, p. 845-854. Article in journal (Refereed)
    Abstract [en]

    Motivation: Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed on a massive scale in clusters, web servers, distributed computing or cloud resources. Results: Here, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built-in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including Windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations.

  • 35.
    Rostkowski, Michal
    et al.
    University of Copenhagen.
    Spjuth, Ola
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences. Uppsala University, Science for Life Laboratory, SciLifeLab.
    Rydberg, Patrik
    University of Copenhagen.
    WhichCyp: Prediction of Cytochromes P450 Inhibition. 2013. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 29, no 16, p. 2051-2052. Article in journal (Refereed)
    Abstract [en]

    SUMMARY: In this work we present WhichCyp, a tool for predicting which cytochrome P450 isoforms (among 1A2, 2C9, 2C19, 2D6 and 3A4) a given molecule is likely to inhibit. The models are built from experimental high-throughput data using support vector machines and molecular signatures.

    AVAILABILITY: The WhichCyp server is freely available for use on the web at http://drug.ku.dk/whichcyp, where the WhichCyp Java program and source code are also available for download.

    CONTACT: pry@sund.ku.dk

    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  • 36. Ryberg, Martin
    et al.
    Nilsson, R Henrik
    Matheny, P Brandon
    DivBayes and SubT: exploring species diversification using Bayesian statistics. 2011. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 27, no 17, p. 2439-40. Article in journal (Refereed)
    Abstract [en]

    SUMMARY: DivBayes is a program to estimate diversification rates from species richness and ages of a set of clades. SubT estimates diversification rates from node heights within a clade. Both programs implement Bayesian statistics and provide the ability to account for uncertainty in the ages of taxa in the underlying data, an improvement over more commonly used maximum likelihood methods.

    AVAILABILITY: DivBayes and SubT are released as C++ source code under the GNU GPL v. 3 software license in Supplementary information 1 and 2, respectively, and at http://web.utk.edu/~kryberg/. They have been successfully compiled on various Linux, MacOS X and Windows systems.

    CONTACT: kryberg@utk.edu

    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  • 37. Rögnvaldsson, Thorsteinn
    et al.
    You, Liwen
    Garwicz, Daniel
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Biochemical structure and function.
    State of the art prediction of HIV-1 protease cleavage sites. 2015. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 31, no 8, p. 1204-1210. Article in journal (Refereed)
    Abstract [en]

    Motivation: Understanding the substrate specificity of human immunodeficiency virus (HIV)-1 protease is important when designing effective HIV-1 protease inhibitors. Furthermore, characterizing and predicting the cleavage profile of HIV-1 protease is essential to generate and test hypotheses of how HIV-1 affects proteins of the human host. Currently available tools for predicting cleavage by HIV-1 protease can be improved.

    Results: The linear support vector machine with orthogonal encoding is shown to be the best predictor for HIV-1 protease cleavage. It is considerably better than current publicly available predictor services. It is also found that schemes using physicochemical properties do not improve over the standard orthogonal encoding scheme. Some issues with the currently available data are discussed.
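
    The orthogonal encoding named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a fixed-length peptide window over the 20 standard amino acids, and the function name `orthogonal_encode`, the alphabet ordering and the example octamer are all hypothetical choices made here. The resulting 0/1 vector is the kind of input a linear support vector machine would be trained on.

```python
# Orthogonal (one-hot) encoding of a peptide window.
# Assumption: 20 standard residues in alphabetical one-letter order;
# the ordering is arbitrary as long as it is used consistently.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def orthogonal_encode(peptide):
    """Return a flat 0/1 list with one block of 20 entries per position."""
    vec = [0] * (len(peptide) * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        # Exactly one entry per position is set to 1.
        vec[pos * len(AMINO_ACIDS) + INDEX[aa]] = 1
    return vec


# Illustrative octamer only; the window length is an assumption.
features = orthogonal_encode("SQNYPIVQ")
```

    Each position contributes exactly one non-zero entry, so an eight-residue window yields a 160-dimensional vector with eight ones.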

  • 38.
    Schaal, Wesley
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Hammerling, Ulf
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Cancer Pharmacology and Computational Medicine.
    Gustafsson, Mats G
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Cancer Pharmacology and Computational Medicine.
    Spjuth, Ola
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Automated QuantMap for rapid quantitative molecular network topology analysis. 2013. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 29, no 18, p. 2369-2370. Article in journal (Refereed)
    Abstract [en]

    SUMMARY:

    The previously disclosed QuantMap method for grouping chemicals by biological activity used online services for much of the data gathering and some of the numerical analysis. The present work attempts to streamline this process by using local copies of the databases and in-house analysis. Using computational methods similar or identical to those used in the previous work, a qualitatively equivalent result was found in just a few seconds on the same dataset (collection of 18 drugs). We use the user-friendly Galaxy framework to enable users to analyze their own datasets. Hopefully, this will make the QuantMap method more practical and accessible and help achieve its goals to provide substantial assistance to drug repositioning, pharmacology evaluation and toxicology risk assessment.

    AVAILABILITY:

    http://galaxy.predpharmtox.org

    CONTACT:

    mats.gustafsson@medsci.uu.se or ola.spjuth@farmbio.uu.se

    SUPPLEMENTARY INFORMATION:

    Supplementary data are available at Bioinformatics online.

  • 39.
    Spjuth, Ola
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Eklund, Martin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Lapins, Maris
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Junaid, Muhammad
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Wikberg, Jarl
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Services for prediction of drug susceptibility for HIV proteases and reverse transcriptases at the HIV Drug Research Centre. 2011. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 27, no 12, p. 1719-1720. Article in journal (Refereed)
    Abstract [en]

    Summary: The HIV Drug Research Centre (HIVDRC) has established Web services for prediction of drug susceptibility for HIV proteases and reverse transcriptases. The services are based on two proteochemometric models that accept a protease or reverse transcriptase sequence in amino acid form and output the predicted drug susceptibility values. The predictions are based on a comprehensive analysis where all the relevant inhibitors are included, resulting in models with excellent predictive capabilities.

    Availability and Implementation: The services are implemented as interoperable Web services (REST and XMPP), with supporting web pages to allow for individual analyses. A set of plugins was also developed which makes the services available from the Bioclipse workbench for life science. Services are available at http://www.hivdrc.org/services.

  • 40.
    Spjuth, Ola
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Georgiev, Valentin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Carlsson, Lars
    Global Safety Assessment, AstraZeneca R&D.
    Alvarsson, Jonathan
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Berg, Arvid
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Willighagen, Egon
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Wikberg, Jarl E S
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Eklund, Martin
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences.
    Bioclipse-R: Integrating management and visualization of life science data with statistical analysis. 2013. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 29, no 2, p. 286-289. Article in journal (Refereed)
    Abstract [en]

    Bioclipse, a graphical workbench for the life sciences, provides functionality for managing and visualizing life science data. We introduce Bioclipse-R, which integrates Bioclipse and the statistical programming language R. The synergy between Bioclipse and R is demonstrated by the construction of a decision support system for anticancer drug screening and mutagenicity prediction, which shows how Bioclipse-R can be used to perform complex tasks from within a single software system.

  • 41.
    Stenberg, J
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Zhang, M
    Ji, H
    Disperse-a software system for design of selector probes for exon resequencing applications. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 5, p. 666-667. Article in journal (Refereed)
    Abstract [en]

    Selector probes enable the amplification of many selected regions of the genome in multiplex. Disperse is a software pipeline that automates the procedure of designing selector probes for exon resequencing applications.

  • 42. Studham, Matthew E.
    et al.
    Tjarnberg, Andreas
    Nordling, Torbjörn
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Cancer and Vascular Biology.
    Nelander, Sven
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Cancer and Vascular Biology.
    Sonnhammer, Erik L. L.
    Functional association networks as priors for gene regulatory network inference. 2014. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 30, no 12, p. 130-138. Article in journal (Refereed)
    Abstract [en]

    Motivation: Gene regulatory network (GRN) inference reveals the influences genes have on one another in cellular regulatory systems. If the experimental data are inadequate for reliable inference of the network, informative priors have been shown to improve the accuracy of inferences. Results: This study explores the potential of undirected, confidence-weighted networks, such as those in functional association databases, as a prior source for GRN inference. Such networks often erroneously indicate symmetric interaction between genes and may contain mostly correlation-based interaction information. Despite these drawbacks, our testing on synthetic datasets indicates that even noisy priors reflect some causal information that can improve GRN inference accuracy. Our analysis on yeast data indicates that using the functional association databases FunCoup and STRING as priors can give a small improvement in GRN inference accuracy with biological data.

  • 43.
    Sundström, Görel
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Zamani, Neda
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Grabherr, Manfred G.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology.
    Mauceli, Evan
    Whiteboard: a framework for the programmatic visualization of complex biological analyses. 2015. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 31, no 12, p. 2054-2055. Article in journal (Refereed)
    Abstract [en]

    Summary: Whiteboard is a class library implemented in C++ that enables visualization to be tightly coupled with computation when analyzing large and complex datasets.

  • 44. Szpiech, Z.A.
    et al.
    Jakobsson, Mattias
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Evolutionary Biology.
    Rosenberg, N.A.
    ADZE: a rarefaction approach for counting alleles private to combinations of populations. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 21, p. 2498-2504. Article in journal (Refereed)
    Abstract [en]

    Motivation: Analysis of the distribution of alleles across populations is a useful tool for examining population diversity and relationships. However, sample sizes often differ across populations, sometimes making it difficult to assess allelic distributions across groups. Results: We introduce a generalized rarefaction approach for counting alleles private to combinations of populations. Our method evaluates the number of alleles found in each of a set of populations but absent in all remaining populations, considering equal-sized subsamples from each population. Applying this method to a worldwide human microsatellite dataset, we observe a high number of alleles private to the combination of African and Oceanian populations. This result supports the possibility of a migration out of Africa into Oceania separate from the migrations responsible for the majority of the ancestry of the modern populations of Asia, and it highlights the utility of our approach to sample size correction in evaluating hypotheses about population history.
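
    The counting rule described in the abstract above (alleles found in every population of a chosen set, absent from all others, in equal-sized subsamples) can be illustrated with a short sketch. This is not the ADZE implementation: the function names, the data layout (a dictionary of per-allele counts at a single locus) and the toy counts are assumptions made for illustration. The probability that an allele is missing from a random subsample of size g drawn without replacement is taken as the ratio of binomial coefficients C(N - n, g) / C(N, g), where N is the population's sample size and n the allele's count.

```python
from math import comb


def prob_absent(count, sample_size, g):
    """Probability that an allele with `count` copies is missing from a
    subsample of g copies drawn without replacement from `sample_size`."""
    return comb(sample_size - count, g) / comb(sample_size, g)


def private_to_combination(counts, combo, g):
    """Expected number of alleles present in every population in `combo`
    and absent from every other population, at subsample size g.

    counts: dict mapping population name -> list of allele counts
            (same allele order in every population; hypothetical layout).
    g must not exceed the smallest sample size.
    """
    sizes = {pop: sum(c) for pop, c in counts.items()}
    n_alleles = len(next(iter(counts.values())))
    total = 0.0
    for i in range(n_alleles):
        term = 1.0
        for pop, c in counts.items():
            q = prob_absent(c[i], sizes[pop], g)
            # Present in populations of the combination, absent elsewhere.
            term *= (1.0 - q) if pop in combo else q
        total += term
    return total


# Toy counts for one locus with three alleles (made up for illustration).
toy = {"A": [4, 0, 2], "B": [3, 3, 0], "C": [0, 6, 0]}
private_ab = private_to_combination(toy, {"A", "B"}, 2)
```

    With g equal to the full sample size the function reduces to a plain count of private alleles; smaller g gives the sample-size-corrected expectation.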

  • 45.
    Tammi, Martti
    et al.
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Arner, Erik
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Britton, Tom
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Mathematics.
    Andersson, Björn
    Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology.
    Separation of Nearly Identical Repeats in Shotgun Assemblies using Defined Nucleotide Positions, DNPs. 2002. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 18, no 3, p. 379-388. Article in journal (Refereed)
    Abstract [en]

    An increasingly important problem in genome sequencing is the failure of the commonly used shotgun assembly programs to correctly assemble repetitive sequences. The assembly of non-repetitive regions or regions containing repeats considerably shorter than the average read length is in practice easy to solve, while longer repeats have been a difficult problem. We here present a statistical method to separate arbitrarily long, almost identical repeats, which makes it possible to correctly assemble complex repetitive sequence regions. The differences between repeat units may be as low as 1% and the sequencing error may be up to ten times higher. The method is based on the realization that a comparison of only a part of all overlapping sequences at a time in a data set does not generate enough information for a conclusive analysis. Our method uses optimal multi-alignments consisting of all the overlaps of each read. This makes it possible to determine defined nucleotide positions, DNPs, which constitute the differences between the repeat units. Differences between repeats are distinguished from sequencing errors using statistical methods, where the probabilities of obtaining certain combinations of candidate DNPs are calculated using the information from the multi-alignments. The use of DNPs and combinations of DNPs will allow for optimal and rapid assemblies of repeated regions. This method can solve repeats that differ in only two positions in a read length, which is the theoretical limit for repeat separation. We predict that this method will be highly useful in shotgun sequencing in the future.

  • 46.
    Thollesson, Mikael
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Evolution, Genomics and Systematics, Molecular Evolution.
    LDDist: a Perl module for calculating LogDet pair-wise distances for protein and nucleotide sequences. 2004. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 20, no 3, p. 416-418. Article in journal (Refereed)
    Abstract [en]

    Summary: LDDist is a Perl module implemented in C++ that allows the user to calculate LogDet pair-wise genetic distances for amino acid as well as nucleotide sequence data. It can handle site-to-site rate variation by treating a proportion of the sites as invariant and/or by assigning sites to different, presumably homogenous, rate categories. The rate-class assignments and invariant proportion can be set explicitly, or estimated by the program; the latter using either of two different capture–recapture methods. The assignment to rate categories in lieu of a phylogeny can be done using Shannon–Wiener index as a crude token for relative rate.

  • 47. Van Belle, Vanya
    et al.
    Pelckmans, Kristiaan
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Automatic control.
    Van Huffel, Sabine
    Suykens, Johan A. K.
    Improved performance on high-dimensional survival data by application of Survival-SVM. 2011. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 27, no 1, p. 87-94. Article in journal (Refereed)
  • 48.
    van der Spoel, David
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational and Systems Biology.
    van Maaren, Paul J.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology.
    Caleman, Carl
    Coherent Imaging Division, Center for Free-Electron Laser Science, Deutsches Elektronen-Synchrotron Notkestrasse 85, DE-22607 Hamburg, Germany.
    GROMACS molecule & liquid database. 2012. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 28, no 5, p. 752-753. Article in journal (Refereed)
    Abstract [en]

    Motivation:

    The molecular dynamics simulation package GROMACS is a widely used tool in a broad range of different applications within physics, chemistry and biology. It is freely available, user friendly and extremely efficient. The GROMACS software is force field agnostic, and compatible with many molecular dynamics force fields: coarse-grained, unified atom, all atom as well as polarizable models based on the charge on a spring concept. To validate simulations, it is necessary to compare results from the simulations to experimental data. To ease the process of setting up topologies and structures for simulations, as well as providing pre-calculated physical properties along with experimental values for the same, we provide a web-based database, containing 145 organic molecules at present.

    Results:

    Liquid properties of 145 organic molecules have been simulated using two different force fields, OPLS all atom and Generalized Amber Force Field. So far, eight properties have been calculated (the density, enthalpy of vaporization, surface tension, heat capacity at constant volume and pressure, isothermal compressibility, volumetric expansion coefficient and the static dielectric constant). The results, together with experimental values, are available through the database, along with liquid structures and topologies for the 145 molecules, in the two force fields.

  • 49. Wabnik, Krzysztof
    et al.
    Hvidsten, Torgeir R
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kedzierska, Anna
    Van Leene, Jelle
    De Jaeger, Geert
    Beemster, Gerrit T S
    Komorowski, Jan
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
    Kuiper, Martin T R
    Gene expression trends and protein features effectively complement each other in gene function prediction. 2009. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 25, no 3, p. 322-330. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Genome-scale 'omics' data constitute a potentially rich source of information about biological systems and their function. There is a plethora of tools and methods available to mine omics data. However, the diversity and complexity of different omics data types is a stumbling block for multi-data integration, hence there is a dire need for additional methods to exploit potential synergy from integrated orthogonal data. Rough Sets provide an efficient means to use complex information in classification approaches. Here, we set out to explore the possibilities of Rough Sets to incorporate diverse information sources in a functional classification of unknown genes. RESULTS: We explored the use of Rough Sets for a novel data integration strategy where gene expression data, protein features and Gene Ontology (GO) annotations were combined to describe general and biologically relevant patterns represented by If-Then rules. The descriptive rules were used to predict the function of unknown genes in Arabidopsis thaliana and Schizosaccharomyces pombe. The If-Then rule models showed success rates of up to 0.89 (discriminative and predictive power for both modeled organisms); whereas, models built solely of one data type (protein features or gene expression data) yielded success rates varying from 0.68 to 0.78. Our models were applied to generate classifications for many unknown genes, of which a sizeable number were confirmed either by PubMed literature reports or electronically inferred annotations. Finally, we studied cell cycle protein-protein interactions derived from both tandem affinity purification experiments and in silico experiments in the BioGRID interactome database and found strong experimental evidence for the predictions generated by our models. The results show that our approach can be used to build very robust models that create synergy from integrating gene expression data and protein features.

    AVAILABILITY: The Rough Set-based method is implemented in the Rosetta toolkit kernel version 1.0.1 available at: http://rosetta.lcb.uu.se/

  • 50.
    Whelan, Simon
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology.
    Irisarri, Iker
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Systematic Biology.
    Burki, Fabien
    Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Systematic Biology. Uppsala University, Science for Life Laboratory, SciLifeLab.
    PREQUAL: detecting non-homologous characters in sets of unaligned homologous sequences. 2018. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 34, no 22, p. 3929-3930. Article in journal (Refereed)
    Abstract [en]

    Summary: Phylogenomic datasets invariably contain undetected stretches of non-homologous characters due to poor-quality sequences or erroneous gene models. The large-scale multi-gene nature of these datasets renders impractical or impossible detailed manual curation of sequences, but few tools exist that can automate this task. To address this issue, we developed a new method that takes as input a set of unaligned homologous sequences and uses an explicit probabilistic approach to identify and mask regions with non-homologous adjacent characters. These regions are defined as sharing no statistical support for homology with any other sequence in the set, which can result from e.g. sequencing errors or gene prediction errors creating frameshifts. Our methodology is implemented in the program PREQUAL, which is a fast and accurate tool for high-throughput filtering of sequences. The program is primarily aimed at amino acid sequences, although it can handle protein coding DNA sequences as well. It is fully customizable to allow fine-tuning of the filtering sensitivity.
