1
Cai M, Lei F, Chen M, Lan Q, Wu X, Mao C, Shi M, Zhu B. Systematic analyses of AISNPs screening and classification algorithms based on genome-wide data for forensic biogeographic ancestry inference. Forensic Sci Int 2024; 357:111975. [PMID: 38547686 DOI: 10.1016/j.forsciint.2024.111975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 01/23/2024] [Accepted: 03/01/2024] [Indexed: 04/06/2024]
Abstract
Identifying the biogeographic ancestral origin of a biological sample left at a crime scene can provide important evidence for judicial cases, as well as clues for narrowing down suspects. Ancestry informative single nucleotide polymorphisms (AISNPs) have become one of the most important classes of genetic markers in recent years for screening ancestry information loci and analyzing population genetic background and structure, owing to their large number and wide distribution in the human genome. In this study, based on data from 26 populations in the 1000 Genomes Project Phase 3, a Random Forest classification model was constructed with a one-vs-rest classification strategy for embedded feature selection in order to obtain a panel with a small number of efficient AISNPs. The research aim was to clarify the differentiation of population genetic structures among continents and among subregions of East Asia. ADMIXTURE results showed that, based on the 58 AISNPs selected by the machine learning algorithm, the 26 populations involved in the study could be categorized into six intercontinental ancestry components: North East Asia, South East Asia, Africa, Europe, South Asia, and America. In total, 24 continental-specific AISNPs and 34 East Asian-specific AISNPs were obtained and used to construct ancestry prediction models with the XGBoost algorithm, resulting in Matthews correlation coefficients of 0.94 and 0.89, and accuracies of 0.94 and 0.92, respectively. The machine learning models that we constructed using population-specific AISNPs were able to accurately predict the ancestral origins of continental and intra-East Asian populations. To summarize, screening a set of high-performance AISNPs to infer biogeographical ancestral information using embedded feature selection has potential application in creating a layered inference system that differentiates populations from the intercontinental level down to local subpopulations.
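The Matthews correlation coefficient reported in this abstract is a multiclass metric. As a rough illustration of how such a value can be computed from predicted and true ancestry labels, the following is a minimal sketch using the standard multiclass generalization of the MCC. The function name `multiclass_mcc` and the example labels are assumptions made here for illustration; the abstract does not state which implementation the authors used.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (illustrative sketch).

    Uses the standard generalization:
        MCC = (c*s - sum_k p_k*t_k)
              / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s is the number of samples, c the number of correct predictions,
    t_k the number of samples truly in class k, and p_k the number of
    samples predicted as class k.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_true = s * s - sum(n * n for n in t_counts.values())
    cov_pred = s * s - sum(n * n for n in p_counts.values())
    denominator = sqrt(cov_true * cov_pred)

    # By convention, return 0.0 when the coefficient is undefined
    # (for example, when only one class is ever present or predicted).
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical labels, not data from the study.
    truth = ["AFR", "EUR", "EAS", "EUR", "SAS", "EAS"]
    guess = ["AFR", "EUR", "EAS", "SAS", "SAS", "EAS"]
    print(round(multiclass_mcc(truth, guess), 3))
```

Unlike plain accuracy, this coefficient accounts for how predictions are distributed across all classes, which is presumably why the abstract reports both figures side by side.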
Affiliation(s)
- Meiming Cai
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong, China
- Fanzhang Lei
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong, China
- Man Chen
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong, China
- Qiong Lan
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong, China; Microbiome Medicine Center, Department of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Xiaolian Wu
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong, China
- Chen Mao
  - Department of Epidemiology, School of Public Health, Southern Medical University, Guangzhou, Guangdong, China
- Meisen Shi
  - Criminal Justice College of China University of Political Science and Law, Beijing, China
- Bofeng Zhu
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong, China
2
Jin XY, Liu YF, Cui W, Chen C, Zhang XR, Huang J, Zhu BF. Development a multiplex panel of AISNPs, multi-allelic InDels, microhaplotypes and Y-SNP/InDel loci for multiple forensic purposes via the NGS. Electrophoresis 2021; 43:632-644. [PMID: 34859475 DOI: 10.1002/elps.202100253] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/18/2021] [Revised: 11/11/2021] [Accepted: 11/16/2021] [Indexed: 11/09/2022]
Abstract
Recently, next generation sequencing (NGS) has shown promising application value in forensic research. In this study, we constructed a multiplex system of different molecular genetic markers based on previously selected AISNPs, multi-allelic InDels, microhaplotypes and Y-SNP/InDel loci, and evaluated the forensic efficiency of the system in Chinese Shaanxi Han, Hui and Mongolian groups on the NGS platform. Ancestry informative analyses of the Shaanxi Han, Hui and Mongolian groups revealed that most Mongolian individuals could be differentiated from Shaanxi Han and Hui individuals based on the selected AISNPs. Multi-allelic InDels and microhaplotypes showed multiple allelic variants and relatively high genetic polymorphism in these three groups, indicating that these loci could also provide high forensic efficiency for individual identification and paternity testing. Based on Y-SNPs, different haplogroup distributions were observed among the Shaanxi Han, Hui and Mongolian groups. In conclusion, the self-developed system could be used to simultaneously carry out individual identification, paternity analysis, mixture deconvolution, forensic ancestry information analysis and Y-chromosomal haplogroup inference, which could provide more investigative clues in forensic practice.
Affiliation(s)
- Xiao-Ye Jin
  - Xi'an Jiaotong University Health Science Center, Xi'an, P. R. China; Department of Forensic Medicine, Guizhou Medical University, Guiyang, P. R. China
- Yan-Fang Liu
  - Multi-Omics Innovative Research Center of Forensic Identification, Department of Forensic Genetics, School of Forensic Medicine, Southern Medical University, Guangzhou, P. R. China
- Wei Cui
  - Multi-Omics Innovative Research Center of Forensic Identification, Department of Forensic Genetics, School of Forensic Medicine, Southern Medical University, Guangzhou, P. R. China
- Chong Chen
  - Xi'an Jiaotong University Health Science Center, Xi'an, P. R. China
- Xing-Ru Zhang
  - Xi'an Jiaotong University Health Science Center, Xi'an, P. R. China
- Jiang Huang
  - Department of Forensic Medicine, Guizhou Medical University, Guiyang, P. R. China
- Bo-Feng Zhu
  - Multi-Omics Innovative Research Center of Forensic Identification, Department of Forensic Genetics, School of Forensic Medicine, Southern Medical University, Guangzhou, P. R. China
3
Guo XY, Sun CC, Xue SY, Zhao H, Jiang L, Li CX. 49 AISNP: a study on the ancestry inference of the three ethnic groups in the north of East Asia. Yi Chuan 2021; 43:880-889. [PMID: 34702700 DOI: 10.16288/j.yczz.21-073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/31/2022]
Abstract
The ancestry inference of unknown samples plays an important role in forensic investigations. An ideal panel is a small set of markers with high ancestry inference accuracy. We collected 428 AISNPs (ancestry informative SNPs) that can distinguish three ethnic groups in the north of East Asia: northern Han, Japanese and Korean. The genotypes of the 428 AISNPs were obtained in 307 samples from these three ethnic groups. Based on Fst values and clustering by allele frequency, the panel was further refined into a 49-AISNP smart panel. The inference accuracy of the 49 AISNPs was verified by the leave-one-out method with the 307 samples, and the results showed that accuracy was higher than 99% in the northern Han, Japanese and Korean ethnic groups. This panel can also help to further distinguish ethnic sub-groups in East Asia.
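To make the Fst-based refinement step described in this abstract more concrete, the following is a minimal sketch that ranks biallelic SNPs by a simple unweighted Wright's Fst, computed as (H_T - H_S) / H_T from per-population allele frequencies. The abstract does not say which Fst estimator was used, and the sketch leaves out the allele-frequency clustering step entirely. The function names `fst_biallelic` and `rank_snps`, the SNP identifiers and the frequencies are all hypothetical.

```python
def fst_biallelic(freqs):
    """Unweighted Wright's Fst for one biallelic SNP (illustrative sketch).

    freqs: reference-allele frequency in each population, each in [0, 1].
    H_S is the mean expected heterozygosity within populations and
    H_T is the expected heterozygosity of the pooled (averaged) frequency.
    """
    if len(freqs) < 2:
        raise ValueError("need allele frequencies from at least two populations")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")

    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        # The SNP is monomorphic across all populations: no differentiation.
        return 0.0
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


def rank_snps(freq_table, top_n):
    """Return the top_n SNP identifiers with the highest Fst."""
    ranked = sorted(
        freq_table.items(),
        key=lambda item: fst_biallelic(item[1]),
        reverse=True,
    )
    return [snp_id for snp_id, _ in ranked[:top_n]]


if __name__ == "__main__":
    # Made-up frequencies for three populations; not data from the study.
    table = {
        "snp_a": [0.1, 0.9, 0.5],
        "snp_b": [0.5, 0.5, 0.5],
        "snp_c": [0.4, 0.6, 0.5],
    }
    print(rank_snps(table, top_n=2))
```

In a real panel design, a ranking of this kind would typically be combined with further filters, such as the clustering step the abstract mentions, before the reduced panel is validated with a leave-one-out procedure.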
Affiliation(s)
- Xiao-Yuan Guo
  - Shanxi Medical University, Taiyuan 030001, China; Key Laboratory of Forensic Genetics, Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China
- Chang-Chun Sun
  - Shanxi Medical University, Taiyuan 030001, China; Key Laboratory of Forensic Genetics, Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China
- Si-Yao Xue
  - Shanxi Medical University, Taiyuan 030001, China; Key Laboratory of Forensic Genetics, Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China
- Hui Zhao
  - Key Laboratory of Forensic Genetics, Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China
- Li Jiang
  - Key Laboratory of Forensic Genetics, Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China
- Cai-Xia Li
  - Shanxi Medical University, Taiyuan 030001, China; Key Laboratory of Forensic Genetics, Beijing Engineering Research Center of Crime Scene Evidence Examination, National Engineering Laboratory for Forensic Science, Institute of Forensic Science, Beijing 100038, China