Anders Bergström et al.
Science. 2020 Mar 20
This paper sequences to a depth of 35x, which is amply deep, and it is a recent study that compares populations under uniform criteria.
(Note: Koreans are not among the sampled populations in either the 1000 Genomes Project Phase 3 or the HGDP, the project behind this paper.)
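With 929 samples, the totals are:
① SNPs: 67,300,000
② Indels: 8,800,000
③ CNVs: 40,736
Dividing each total by the 929 samples gives the following (a simple ratio, not the number of variants carried by a single genome):
① 72,443
② 9,473
③ 43.85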
In this set of 929 genomes we identified 67.3 million single-nucleotide polymorphisms (SNPs), 8.8 million small insertions or deletions (indels) and 40,736 copy number variants (CNVs) (15). This is nearly as many as the 84.7 million SNPs discovered in 2504 individuals by the 1000 Genomes Project (2), reflecting increased sensitivity due to high-coverage sequencing as well as the greater diversity of human ancestries covered by the HGDP-CEPH panel.
So the authors boast; in effect, these are largely newly discovered SNPs and other variants.
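As a quick arithmetic check, the per-sample figures in these notes are simply each total divided by the 929 genomes. The Python sketch below reproduces that division from the totals quoted in the paper (67.3 million SNPs, 8.8 million indels, 40,736 CNVs). The names `TOTALS` and `per_sample_ratio` are illustrative only, and the result is a naive average, not the number of variants any one genome carries.

```python
# Totals as quoted from the paper for the 929-genome set.
N_SAMPLES = 929
TOTALS = {
    "SNPs": 67_300_000,
    "indels": 8_800_000,
    "CNVs": 40_736,
}


def per_sample_ratio(total: int, n_samples: int = N_SAMPLES) -> float:
    """Return total / n_samples (a naive average, not a per-genome variant count)."""
    return total / n_samples


# Compute the ratio for every variant class.
ratios = {name: per_sample_ratio(total) for name, total in TOTALS.items()}

for name, value in ratios.items():
    print(f"{name}: {value:,.2f} per sample")
```

Run as written, this gives roughly 72,443 SNPs, 9,473 indels and 43.85 CNVs per sample.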
The analysis focuses mainly on the influence of the Neanderthal and Denisovan genomes on present-day humans.
Short abstract
We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented common genetic variation private to each of southern Africa, central Africa, Oceania and the Americas, but an absence of such variants fixed between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the last 10,000 years, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations.
Structured Abstract
We sequence 929 genomes from 54 geographically, linguistically and culturally diverse human populations to an average of 35x coverage, and analyze the variation among them.
Results
We identify 67.3 million single-nucleotide polymorphisms (SNPs), 8.8 million small insertions or deletions (indels) and 40,736 copy number variants (CNVs).
Populations in central and southern Africa, the Americas and Oceania each harbour tens to hundreds of thousands of private, common genetic variants.
The majority of these variants arose as novel mutations rather than through archaic introgression, except in Oceanian populations where many private variants derive from Denisovan admixture. While some reach high frequencies, no variants are fixed between major geographical regions.
We estimate that the genetic separation between present-day human populations occurred mostly within the last 250,000 years.
Most populations expanded in size over the last 10,000 years, but hunter-gatherer groups did not.
Despite their very low levels or absence of archaic ancestry, African populations share many Neanderthal and Denisovan variants that are absent from Eurasia, reflecting how a larger proportion of the ancestral human variation has been maintained in Africa.
A consensus view of the history of our species includes divergence from the ancestors of the archaic Neanderthal and Denisovan groups 500,000-700,000 years ago, the appearance of anatomical modernity in Africa in the last few hundred thousand years, an expansion out of Africa and the Near East 50,000-70,000 years ago, with a reduction in genetic diversity in the descendant populations, admixture with archaic groups in Eurasia shortly after this and large-scale population growth, migration and admixture following multiple independent transitions from hunter-gatherer to food producing lifestyles in the last 10,000 years (1).
1. Nielsen R, et al. Tracing the peopling of the world through genomics. Nature. 2017;541:302–310.
Large-scale genome sequencing efforts to date have been restricted to large, metropolitan populations and employed low-coverage sequencing (2), while those sampling human groups more widely have mostly been limited to 1-3 genomes per population (3, 4).
The Human Genome Diversity Project (HGDP)-CEPH panel (5) has constituted a key resource to which several iterations of genetic assays have been applied (3, 6–12).
Genetic variant discovery across diverse human populations
We performed Illumina sequencing to an average coverage of 35x (min: 25x) and mapped reads to the GRCh38 reference assembly.
While the vast majority of the variants discovered by one of the studies but not the other are very low in frequency, the HGDP dataset contains substantial numbers of variants that were not identified by the 1000 Genomes Project but are common or even high-frequency in some populations:
~1 million variants at ≥20%, ~100,000 variants at ≥50% and even ~1000 variants fixed at 100% frequency in at least one sampled population (Fig. 1B). This highlights the importance of anthropologically informed sampling for uncovering human genetic diversity.
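The counts above are tallied by the highest frequency a variant reaches in any one sampled population. As a minimal sketch of that kind of tally, the Python below counts variants whose peak per-population frequency meets each threshold (20%, 50%, 100%). The variant IDs, population names and frequencies are hypothetical toy values, and the sketch assumes the input has already been restricted to variants absent from the 1000 Genomes call set. It is not the authors' pipeline.

```python
from typing import Dict, List

# Hypothetical toy input: variant ID -> allele frequency in each population.
# All names and values are made up for illustration.
FREQS: Dict[str, Dict[str, float]] = {
    "var1": {"PopA": 0.00, "PopB": 0.25, "PopC": 0.01},
    "var2": {"PopA": 0.60, "PopB": 0.10, "PopC": 0.00},
    "var3": {"PopA": 0.00, "PopB": 0.00, "PopC": 1.00},
    "var4": {"PopA": 0.02, "PopB": 0.05, "PopC": 0.01},
}


def count_by_max_frequency(
    freqs: Dict[str, Dict[str, float]], thresholds: List[float]
) -> Dict[float, int]:
    """Count variants whose highest per-population frequency is >= each threshold."""
    counts = {t: 0 for t in thresholds}
    for per_pop in freqs.values():
        peak = max(per_pop.values())  # frequency in the population where it is most common
        for t in thresholds:
            if peak >= t:
                counts[t] += 1
    return counts


counts = count_by_max_frequency(FREQS, [0.2, 0.5, 1.0])
print(counts)
```

The thresholds are cumulative, so a variant fixed in one population is also counted in the ≥20% and ≥50% bins, which matches the way the figures are quoted above.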
Effective population size histories
In Europe and East Asia, most populations are inferred to have experienced major growth in the last 10,000 years, but less so in more isolated groups, including the European Sardinians, Basques, Orkney islanders, the southern Chinese Lahu and the Siberian Yakut.
The time depth and mode of human population separations
We used the 26 genomes physically phased by linked-read technology to study the time-course of population separations using the MSMC2 method (22, 29). Assuming a mutation rate of 1.25×10⁻⁸ per base-pair per generation (30) and a generation time of 29 years (31),