A comparative analysis of mathematical methods for homogeneity estimation of the Lithuanian population

  • Alma Molytė
  • Alina Urnikytė
  • Vaidutis Kučinskas
Keywords: genotyping, multidimensional scaling, principal components, principal coordinate analysis, genotype data visualization

Abstract

Background. Population genetic structure is one of the most important parameters in population genetics because it reveals a population's demographic features. The aim of this study was to evaluate the homogeneity of the Lithuanian population on the basis of genome-wide genotyping data. Three methods for visualizing multidimensional genetic data (multidimensional scaling, principal component analysis, and principal coordinate analysis) were compared. The resulting visualizations (mapping images) are also presented.

Materials and methods. The data set consisted of 425 samples from six ethnolinguistic groups of the Lithuanian population. Genomic DNA was extracted from whole venous blood using either the phenol-chloroform extraction method or the automated DNA extraction platform TECAN Freedom EVO. Genotyping was performed at the Department of Human and Medical Genetics, Institute of Biomedical Sciences, Faculty of Medicine, Vilnius University, Lithuania, with the Illumina HumanOmniExpress-12 v1.1 and Infinium OmniExpress-24 arrays. To estimate the homogeneity of the Lithuanian population, a PLINK data file was generated using PLINK v1.07. The Past3 software was used to visualize the genotype data with the multidimensional scaling and principal coordinate methods. SmartPCA from the EIGENSOFT 7.2.1 package was used for principal component analysis to determine the population structure.

Conclusions. Multidimensional scaling, principal coordinate analysis, and principal component analysis were investigated and compared as methods for examining the genetic structure of the Lithuanian population. The principal coordinate and principal component methods can be used for genotyping data visualization, since no essential differences were observed in the results obtained when compared with multidimensional scaling. The Lithuanian population is homogeneous, as the points lie very close together when the principal coordinate or principal component methods are used.
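
The relationship between principal component and principal coordinate projections can be illustrated with a minimal sketch. It is not the study's pipeline and does not reproduce the preprocessing of Past3 or SmartPCA. It assumes a synthetic samples-by-SNPs matrix of 0/1/2 allele counts and plain Euclidean distances between samples; the function names `pca_scores` and `pcoa_scores` are hypothetical. Under these assumptions the two methods yield the same two-dimensional coordinates up to the sign of each axis, which is one reason their plots can look alike.

```python
import numpy as np


def pca_scores(genotypes, n_components=2):
    """Principal component scores of a samples x SNPs allele-count matrix."""
    centred = genotypes - genotypes.mean(axis=0)  # centre each SNP column
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def pcoa_scores(distances, n_components=2):
    """Classical principal coordinate analysis of a square distance matrix."""
    n = distances.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centring @ (distances ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    kept = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(kept)


# Synthetic data (an assumption for illustration): 30 samples, 200 SNPs.
rng = np.random.default_rng(0)
allele_freqs = rng.uniform(0.1, 0.9, size=200)
genotypes = rng.binomial(2, allele_freqs, size=(30, 200)).astype(float)

# Pairwise Euclidean distances between samples.
diff = genotypes[:, None, :] - genotypes[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=2))

pca = pca_scores(genotypes)
pcoa = pcoa_scores(distances)

# Axis signs are arbitrary, so compare absolute coordinates.
same_up_to_sign = np.allclose(np.abs(pca), np.abs(pcoa), atol=1e-6)
print("PCA and PCoA agree up to axis sign:", same_up_to_sign)
```

With a different distance measure or a different scaling of the genotype columns, the two projections would generally no longer coincide exactly.
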
Published
2020-04-14
Section
Genetics