Predicting haplogroups using a versatile machine learning program (PredYMaLe) on a new mutationally balanced 32 Y-STR multiplex (CombYplex): Unlocking the full potential of the human STR mutation rate spectrum to estimate forensic parameters
UMR5288 CNRS, Lab Anthropol Mol & Imagerie Synth AMIS, 37 Allees Jules Guesde, F-31073 Toulouse 3, France; Univ Toulouse III, 37 Allees Jules Guesde, F-31073 Toulouse 3, France; Inst Natl Police Sci, Lab Police Sci Lyon, 31 Ave Franklin Roosevelt, F-69134 Ecully, France.
UMR5288 CNRS, Lab Anthropol Mol & Imagerie Synth AMIS, 37 Allees Jules Guesde, F-31073 Toulouse 3, France; Univ Toulouse III, 37 Allees Jules Guesde, F-31073 Toulouse 3, France; UMR 5505 CNRS, REVA Unit, F-31400 Toulouse, France; Univ Toulouse, Inst Rech Informat Toulouse, F-31400 Toulouse, France.
UMR5288 CNRS, Lab Anthropol Mol & Imagerie Synth AMIS, 37 Allees Jules Guesde, F-31073 Toulouse 3, France; Univ Toulouse III, 37 Allees Jules Guesde, F-31073 Toulouse 3, France.
UMR5288 CNRS, Lab Anthropol Mol & Imagerie Synth AMIS, 37 Allees Jules Guesde, F-31073 Toulouse 3, France; Univ Toulouse III, 37 Allees Jules Guesde, F-31073 Toulouse 3, France.
2020 (English). In: Forensic Science International: Genetics, ISSN 1872-4973, E-ISSN 1878-0326, Vol. 48, article id 102342. Article in journal (Refereed). Published.
Abstract [en]

We developed a new mutationally well-balanced 32 Y-STR multiplex (CombYplex) together with a machine learning (ML) program, PredYMaLe, to assess the impact of STR mutability on haplogroup prediction while respecting forensic community criteria (high DC/HD). We designed CombYplex around two sub-panels, M1 and M2, characterized by average and high-mutation STR panels. Using these two sub-panels, we tested how our program PredYMaLe reacts to mutability when considering basal branches and, moving down, terminal branches. We first tested the discrimination capacity of CombYplex on 996 human samples using various forensic and statistical parameters and showed that its resolution is sufficient to separate haplogroup classes. In parallel, PredYMaLe was designed and used to test whether an ML approach can predict haplogroup classes from Y-STR profiles. Applied to our kit, SVM and Random Forest classifiers perform very well (average 97 %), better than Neural Network (average 91 %) and Bayesian methods (< 90 %). We observe heterogeneity in haplogroup assignation accuracy among classes, with most haplogroups having high prediction scores (99-100 %) and two (E1b1b and G) having lower scores (67 %). The small sample sizes of these classes explain the high tendency to misclassify the Y-profiles of these haplogroups; results were measurably improved as soon as more training data were added. We provide evidence that our ML approach is a robust method to accurately predict haplogroups when it is combined with a sufficient number of markers, well-balanced mutation rate Y-STR panels, and large ML training sets. Further research on confounding factors (such as CNV-STR or gene conversion) and ideal STR panels in regard to the branches analysed can be developed to help classifiers further optimize prediction scores.
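
The abstract refers to the forensic criteria DC (discrimination capacity) and HD (haplotype diversity) without defining them. As an illustration only, the sketch below computes both from a list of Y-STR haplotypes using the textbook definitions: DC as the fraction of distinct haplotypes in the sample, and HD as Nei's unbiased estimator. It is an assumption that the paper uses exactly these formulas; the function name `forensic_parameters`, the three-locus profiles, and the toy sample are hypothetical and not taken from the study.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Return (DC, HD) for a sample of haplotypes.

    DC = number of distinct haplotypes / sample size.
    HD = n / (n - 1) * (1 - sum of squared haplotype frequencies).
    Both are standard definitions, assumed here rather than quoted
    from the paper.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    dc = len(counts) / n
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    hd = n / (n - 1) * (1 - sum_sq)
    return dc, hd


# Hypothetical toy sample: each tuple holds repeat counts at three loci.
sample = [(14, 13, 29), (14, 13, 29), (15, 12, 30), (13, 14, 28)]
dc, hd = forensic_parameters(sample)
print(f"DC = {dc:.3f}, HD = {hd:.3f}")
```

In this toy sample two of the four profiles are identical, so both values fall below 1; a panel that separates every individual would give DC = 1 and HD = 1.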

Place, publisher, year, edition, pages
Elsevier BV, 2020. Vol. 48, article id 102342
Keywords [en]
Y-STR, Machine learning, Assignation accuracy and haplogroup prediction (Hg prediction), Incremental mutation rates
National Category
Genetics and Genomics
Identifiers
URN: urn:nbn:se:uu:diva-423041
DOI: 10.1016/j.fsigen.2020.102342
ISI: 000569444200004
PubMedID: 32818722
OAI: oai:DiVA.org:uu-423041
DiVA, id: diva2:1478060
Funder
EU, FP7, Seventh Framework Programme, 290344
Available from: 2020-10-21. Created: 2020-10-21. Last updated: 2025-02-07. Bibliographically approved.
By author/editor
Fortes-Lima, Cesar A.; Migot-Nabias, Florence; Theves, Catherine