Uppsala University Publications
Bunikis, Ignas
Publications (10 of 12)
Wallberg, A., Bunikis, I., Vinnere, O., Mosbech, M.-B., Childers, A. K., Evans, J. D., . . . Webster, M. T. (2019). A hybrid de novo genome assembly of the honeybee, Apis mellifera, with chromosome-length scaffolds. BMC Genomics, 20, Article ID 275.
2019 (English). In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 20, article id 275. Article in journal (Refereed). Published.
Abstract [en]

Background

The ability to generate long sequencing reads and access long-range linkage information is revolutionizing the quality and completeness of genome assemblies. Here we use a hybrid approach that combines data from four genome sequencing and mapping technologies to generate a new genome assembly of the honeybee Apis mellifera. We first generated contigs based on PacBio sequencing libraries, which were then merged with linked-read 10x Chromium data followed by scaffolding using a BioNano optical genome map and a Hi-C chromatin interaction map, complemented by a genetic linkage map.

Results

Each of the assembly steps reduced the number of gaps and incorporated a substantial amount of additional sequence into scaffolds. The new assembly (Amel_HAv3) is significantly more contiguous and complete than the previous one (Amel_4.5), based mainly on Sanger sequencing reads. N50 of contigs is 120-fold higher (5.381 Mbp compared to 0.053 Mbp) and we anchor >98% of the sequence to chromosomes. All of the 16 chromosomes are represented as single scaffolds with an average of three sequence gaps per chromosome. The improvements are largely due to the inclusion of repetitive sequence that was unplaced in previous assemblies. In particular, our assembly is highly contiguous across centromeres and telomeres and includes hundreds of AvaI and AluI repeats associated with these features.

Conclusions

The improved assembly will be of utility for refining gene models, studying genome function, mapping functional genetic variation, identification of structural variants, and comparative genomics.

Place, publisher, year, edition, pages
BMC, 2019
Keywords
Genome assembly, Single-molecule real-time (SMRT) sequencing, Linked-read sequencing, Optical mapping, Hi-C, Telomeres, Centromeres
National Category
Genetics
Identifiers
urn:nbn:se:uu:diva-382559 (URN); 10.1186/s12864-019-5642-0 (DOI); 000464118800001 (ISI); 30961563 (PubMedID)
Funder
Swedish Research Council Formas, 2013-722; Swedish Research Council, 2014-5096
Note

Andreas Wallberg and Ignas Bunikis contributed equally to this work.

Available from: 2019-05-03. Created: 2019-05-03. Last updated: 2019-05-03. Bibliographically approved.
Christmas, M. J., Wallberg, A., Bunikis, I., Olsson, A., Wallerman, O. & Webster, M. T. (2019). Chromosomal inversions associated with environmental adaptation in honeybees. Molecular Ecology, 28(6), 1358-1374
2019 (English). In: Molecular Ecology, ISSN 0962-1083, E-ISSN 1365-294X, Vol. 28, no 6, p. 1358-1374. Article in journal (Refereed). Published.
Abstract [en]

Chromosomal inversions can facilitate local adaptation in the presence of gene flow by suppressing recombination between well-adapted native haplotypes and poorly adapted migrant haplotypes. East African mountain populations of the honeybee Apis mellifera are highly divergent from neighbouring lowland populations at two extended regions in the genome, despite high similarity in the rest of the genome, suggesting that these genomic regions harbour inversions governing local adaptation. Here, we utilize a new highly contiguous assembly of the honeybee genome to characterize these regions. Using whole-genome sequencing data from 55 highland and lowland bees, we find that the highland haplotypes at both regions are present at high frequencies in three independent highland populations but extremely rare elsewhere. The boundaries of both divergent regions are characterized by regions of high homology with each other positioned in opposite orientations and contain highly repetitive, long inverted repeats with homology to transposable elements. These regions are likely to represent inversion breakpoints that participate in nonallelic homologous recombination. Using long-read data, we confirm that the lowland samples are contiguous across breakpoint regions. We do not find evidence for disruption of functional sequence by these breakpoints, which suggests that the inversions are likely maintained due to their allelic content conferring local adaptation in highland environments. Finally, we identify a third divergent genomic region, which contains highly divergent segregating haplotypes that also may contain inversion variants under selection. The results add to a growing body of evidence indicating the importance of chromosomal inversions in local adaptation.

Place, publisher, year, edition, pages
WILEY, 2019
Keywords
chromosomal inversion, honeybee, local adaptation, long-read sequencing, nonallelic homologous recombination, structural variation
National Category
Genetics
Identifiers
urn:nbn:se:uu:diva-383207 (URN); 10.1111/mec.14944 (DOI); 000465219200012 (ISI); 30431193 (PubMedID)
Funder
Swedish Research Council Formas, 2013-722; Swedish Research Council, 2014-5096
Available from: 2019-05-23. Created: 2019-05-23. Last updated: 2019-08-12. Bibliographically approved.
Martijn, J., Lind, A. E., Schön, M. E., Spiertz, I., Juzokaite, L., Bunikis, I., . . . Ettema, T. J. G. (2019). Confident phylogenetic identification of uncultured prokaryotes through long read amplicon sequencing of the 16S-ITS-23S rRNA operon. Environmental Microbiology, 21(7), 2485-2498
2019 (English). In: Environmental Microbiology, ISSN 1462-2912, E-ISSN 1462-2920, Vol. 21, no 7, p. 2485-2498. Article in journal (Refereed). Published.
Abstract [en]

Amplicon sequencing of the 16S rRNA gene is the predominant method to quantify microbial compositions and to discover novel lineages. However, traditional short amplicons often do not contain enough information to confidently resolve their phylogeny. Here we present a cost-effective protocol that amplifies a large part of the rRNA operon and sequences the amplicons with PacBio technology. We tested our method on a mock community and developed a read-curation pipeline that reduces the overall read error rate to 0.18%. Applying our method to four environmental samples, we captured near full-length rRNA operon amplicons from a large diversity of prokaryotes. The method operated at moderately high throughput (22,286-37,850 raw CCS reads) and generated a large number of putative novel archaeal 23S rRNA gene sequences compared to the archaeal SILVA database. These long amplicons allowed for higher resolution during taxonomic classification by means of long (~1000 bp) 16S rRNA gene fragments and for substantially more confident phylogenies by means of combined near full-length 16S and 23S rRNA gene sequences, compared to shorter traditional amplicons (250 bp of the 16S rRNA gene). We recommend our method to those who wish to cost-effectively and confidently estimate the phylogenetic diversity of prokaryotes in environmental samples at high throughput.

Place, publisher, year, edition, pages
John Wiley & Sons, 2019
National Category
Microbiology
Identifiers
urn:nbn:se:uu:diva-390901 (URN); 10.1111/1462-2920.14636 (DOI); 000474294900020 (ISI); 31012228 (PubMedID)
Funder
EU, European Research Council, 310039-PUZZLE_CELL; Swedish Foundation for Strategic Research, SSF-FFL5; Swedish Research Council, 2015-04959
Available from: 2019-08-19. Created: 2019-08-19. Last updated: 2019-08-19. Bibliographically approved.
Andrade, P., Pinho, C., Perez i de Lanuza, G., Afonso, S., Brejcha, J., Rubin, C.-J., . . . Carneiro, M. (2019). Regulatory changes in pterin and carotenoid genes underlie balanced color polymorphisms in the wall lizard. Proceedings of the National Academy of Sciences of the United States of America, 116(12), 5633-5642
2019 (English). In: Proceedings of the National Academy of Sciences of the United States of America, ISSN 0027-8424, E-ISSN 1091-6490, Vol. 116, no 12, p. 5633-5642. Article in journal (Refereed). Published.
Abstract [en]

Reptiles use pterin and carotenoid pigments to produce yellow, orange, and red colors. These conspicuous colors serve a diversity of signaling functions, but their molecular basis remains unresolved. Here, we show that the genomes of sympatric color morphs of the European common wall lizard (Podarcis muralis), which differ in orange and yellow pigmentation and in their ecology and behavior, are virtually undifferentiated. Genetic differences are restricted to two small regulatory regions near genes associated with pterin [sepiapterin reductase (SPR)] and carotenoid [beta-carotene oxygenase 2 (BCO2)] metabolism, demonstrating that a core gene in the housekeeping pathway of pterin biosynthesis has been coopted for bright coloration in reptiles and indicating that these loci exert pleiotropic effects on other aspects of physiology. Pigmentation differences are explained by extremely divergent alleles, and haplotype analysis revealed abundant transspecific allele sharing with other lacertids exhibiting color polymorphisms. The evolution of these conspicuous color ornaments is the result of ancient genetic variation and cross-species hybridization.

Keywords
Podarcis muralis, carotenoid pigmentation, pterin pigmentation, balanced polymorphism, introgression
National Category
Evolutionary Biology
Identifiers
urn:nbn:se:uu:diva-381075 (URN); 10.1073/pnas.1820320116 (DOI); 000461679000067 (ISI); 30819892 (PubMedID)
Funder
Knut and Alice Wallenberg Foundation; Swedish Research Council, 2017-02907; Swedish Research Council, E0446501; EU, Horizon 2020, PTDC/BIA-EVL/30288/2017 -NORTE -01-0145-FEDER-30288; EU, FP7, Seventh Framework Programme, 286431; Swedish Research Council
Available from: 2019-04-23. Created: 2019-04-23. Last updated: 2019-04-23. Bibliographically approved.
Ameur, A., Che, H., Martin, M., Bunikis, I., Dahlberg, J., Höijer, I., . . . Gyllensten, U. B. (2018). De Novo Assembly of Two Swedish Genomes Reveals Missing Segments from the Human GRCh38 Reference and Improves Variant Calling of Population-Scale Sequencing Data. Genes, 9(10), Article ID 486.
2018 (English). In: Genes, ISSN 2073-4425, E-ISSN 2073-4425, Vol. 9, no 10, article id 486. Article in journal (Refereed). Published.
Abstract [en]

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.

Keywords
de novo assembly, SMRT sequencing, GRCh38, human reference genome, human whole-genome sequencing, population sequencing, Swedish population
National Category
Genetics
Identifiers
urn:nbn:se:uu:diva-369762 (URN); 10.3390/genes9100486 (DOI); 000448656700024 (ISI); 30304863 (PubMedID)
Funder
Knut and Alice Wallenberg Foundation, 2014.0272; Swedish Research Council
Available from: 2018-12-17. Created: 2018-12-17. Last updated: 2018-12-17. Bibliographically approved.
Weissensteiner, M. H., Pang, A. W. C., Bunikis, I., Höijer, I., Pettersson, O. V., Suh, A. & Wolf, J. B. W. (2017). Combination of short-read, long-read, and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications. Genome Research, 27(5), 697-708
2017 (English). In: Genome Research, ISSN 1088-9051, E-ISSN 1549-5469, Vol. 27, no 5, p. 697-708. Article in journal (Refereed). Published.
Abstract [en]

Accurate and contiguous genome assembly is key to a comprehensive understanding of the processes shaping genomic diversity and evolution. Yet, it is frequently constrained by constitutive heterochromatin, usually characterized by highly repetitive DNA. As a key feature of genome architecture associated with centromeric and subtelomeric regions, it locally influences meiotic recombination. In this study, we assess the impact of large tandem repeat arrays on the recombination rate landscape in an avian speciation model, the Eurasian crow. We assembled two high-quality genome references using single-molecule real-time sequencing (long-read assembly [LR]) and single-molecule optical maps (optical map assembly [OM]). A three-way comparison including the published short-read assembly (SR) constructed for the same individual allowed assessing assembly properties and pinpointing misassemblies. By combining information from all three assemblies, we characterized 36 previously unidentified large repetitive regions in the proximity of sequence assembly breakpoints, the majority of which contained complex arrays of a 14-kb satellite repeat or its 1.2-kb subunit. Using whole-genome population resequencing data, we estimated the population-scaled recombination rate (rho) and found it to be significantly reduced in these regions. These findings are consistent with an effect of low recombination in regions adjacent to centromeric or subtelomeric heterochromatin and add to our understanding of the processes generating widespread heterogeneity in genetic diversity and differentiation along the genome. By combining three different technologies, our results highlight the importance of adding a layer of information on genome structure that is inaccessible to each approach independently.

Place, publisher, year, edition, pages
COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT, 2017
National Category
Biological Sciences
Identifiers
urn:nbn:se:uu:diva-323040 (URN); 10.1101/gr.215095.116 (DOI); 000400392400005 (ISI); 28360231 (PubMedID)
Funder
Knut and Alice Wallenberg Foundation; Swedish National Infrastructure for Computing (SNIC); Swedish Research Council, 621-2010-5553; EU, European Research Council, ERCStG-336536
Available from: 2017-06-01. Created: 2017-06-01. Last updated: 2019-01-07. Bibliographically approved.
Olsen, R.-A., Bunikis, I., Tiukova, I., Holmberg, K., Lotstedt, B., Pettersson, O. V., . . . Vezzi, F. (2015). De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping. GigaScience, 4, Article ID 56.
2015 (English). In: GigaScience, ISSN 2047-217X, E-ISSN 2047-217X, Vol. 4, article id 56. Article in journal (Refereed). Published.
Abstract [en]

Background

It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome.

Methods

In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work.

Results

We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. With the inclusion of PacBio data we were able to fill about 5 % of the optical mapped genome not covered by the Illumina data.

National Category
Medical Genetics
Identifiers
urn:nbn:se:uu:diva-271435 (URN); 10.1186/s13742-015-0094-1 (DOI); 000365669400002 (ISI); 26617983 (PubMedID)
Available from: 2016-01-08. Created: 2016-01-08. Last updated: 2018-01-10. Bibliographically approved.
Ameur, A., Bunikis, I., Enroth, S. & Gyllensten, U. (2014). CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects. Database: The Journal of Biological Databases and Curation, bau098
2014 (English). In: Database: The Journal of Biological Databases and Curation, ISSN 1758-0463, E-ISSN 1758-0463, p. bau098. Article in journal (Refereed). Published.
Abstract [en]

CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome-(WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server.

National Category
Medical Genetics
Identifiers
urn:nbn:se:uu:diva-235613 (URN); 10.1093/database/bau098 (DOI); 000342753100001 (ISI)
Available from: 2014-11-12. Created: 2014-11-06. Last updated: 2018-01-11. Bibliographically approved.
Ameur, A., Meiring, T. L., Bunikis, I., Häggqvist, S., Lindau, C., Lindberg, J. H., . . . Gyllensten, U. (2014). Comprehensive profiling of the vaginal microbiome in HIV positive women using massive parallel semiconductor sequencing. Scientific Reports, 4, 4398
2014 (English). In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 4, p. 4398. Article in journal (Refereed). Published.
Abstract [en]

Infections by HIV increase the risk of acquiring secondary viral and bacterial infections and methods are needed to determine the spectrum of co-infections for proper treatment. We used rolling circle amplification (RCA) and Ion Proton sequencing to investigate the vaginal microbiome of 20 HIV positive women from South Africa. A total of 46 different human papillomavirus (HPV) types were found, many of which are not detected by existing genotyping assays. Moreover, the complete genomes of two novel HPV types were determined. Abundance of HPV infections was highly correlated with real-time PCR estimates, indicating that the RCA-Proton method can be used for quantification of individual pathogens. We also identified a large number of other viral, bacterial and parasitic co-infections and the spectrum of these co-infections varied widely between individuals. Our method provides rapid detection of a broad range of pathogens and the ability to reconstruct complete genomes of novel infectious agents.

National Category
Medical and Health Sciences
Identifiers
urn:nbn:se:uu:diva-223522 (URN); 10.1038/srep04398 (DOI); 000332937300007 (ISI)
Available from: 2014-04-30. Created: 2014-04-22. Last updated: 2017-12-05. Bibliographically approved.
Sevov, M., Bunikis, I., Häggqvist, S., Höglund, M., Rosenquist, R., Ameur, A. & Cavelier, L. (2014). Targeted RNA Sequencing Assay Efficiently Identifies Cryptic KMT2A (MLL)-Fusions in Acute Leukemia Patients. Paper presented at 56th Annual Meeting of the American-Society-of-Hematology, DEC 06-09, 2014, San Francisco, CA. Blood, 124(21)
2014 (English). In: Blood, ISSN 0006-4971, E-ISSN 1528-0020, Vol. 124, no 21. Article in journal, Meeting abstract (Other academic). Published.
National Category
Hematology
Identifiers
urn:nbn:se:uu:diva-247856 (URN); 000349233808076 (ISI)
Conference
56th Annual Meeting of the American-Society-of-Hematology, DEC 06-09, 2014, San Francisco, CA
Available from: 2015-03-27. Created: 2015-03-24. Last updated: 2017-12-04. Bibliographically approved.