The KPGP data contains a monozygotic twin pair (KPGP88/KPGP89), a dizygotic twin pair (KPGP90/KPGP91) and a multicultural family (KPGP1-KPGP12) with a Caucasian mother from the US (KPGP10), a Korean father (KPGP9) and two children (KPGP11 and KPGP12). The family relationships are shown in the following pedigree charts.
The Korean Personal Genome Project (KPGP) is part of the international Personal Genome Project (PGP) established by the Genome Research Foundation (GRF). A total of 39 human genomes were sequenced on an Illumina HiSeq 2000 platform at 30x to 40x coverage.
KPGP data (vcf format)
All Chromosomes vcf.gz (1008MB), tbi (3MB)
Chromosome 1 vcf.gz (78MB), tbi (1MB)
Chromosome 2 vcf.gz (80MB), tbi (1MB)
Chromosome 3 vcf.gz (68MB), tbi (1MB)
KPGP / 1000 Genomes merged
The genotypes of 38 Koreans and one Caucasian female are merged with the genotype data of the 1000 Genomes Project (for additional information see the pipeline). Due to the Fort Lauderdale agreement for pre-publication data, we only provide data for the merged chromosome 1 (3.1 million SNVs of 1,134 individuals).
KPGP / 1000 Genomes merged Chromosome 1 vcf.gz (11.7GB), tbi (1MB)
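As a quick sanity check after downloading, the number of individuals in a VCF file can be read from its `#CHROM` header line, where sample IDs start at column 10. The following is a minimal sketch using only the Python standard library; the function name `vcf_sample_names` and the file name in the usage note are hypothetical and not part of the KPGP distribution.

```python
import gzip


def vcf_sample_names(path):
    """Return the sample IDs listed in the #CHROM header line of a gzipped VCF.

    bgzip output is gzip-compatible, so the standard gzip module can read it.
    """
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Columns 1-9 are the fixed VCF fields; samples start at column 10.
                return line.rstrip("\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # reached data records without seeing a header line
    raise ValueError("no #CHROM header line found in " + path)
```

For example, `len(vcf_sample_names("merged_chr1.vcf.gz"))` (hypothetical file name) gives the number of individuals in the merged chromosome 1 file.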
Participants who want to analyze other chromosomes need to download the corresponding vcf data from the 1000 Genomes Project and the KPGP data (vcf format) from the first download box. The example below shows how chromosome 2 can be merged using these data sets.
vcf-merge kpgpB_chr2.vcf.gz ALL.chr2.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz | bgzip -c > kpgp1000_chr2.vcf.gz
gunzip kpgp1000_chr2.vcf.gz
sed -e 's/\t\./\t0|0:0:0,0,0/g' kpgp1000_chr2.vcf > KPGP1000_chr2.vcf
bgzip KPGP1000_chr2.vcf
tabix -p vcf KPGP1000_chr2.vcf.gz
The command line tools bgzip and tabix are distributed with samtools (HTSlib); vcf-merge is part of VCFtools.
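The sed pattern `\t\.` in the merge example matches any tab followed by a dot, so it can also rewrite a `.` in the ID, QUAL, FILTER or INFO column. If only missing genotype entries should be replaced (an assumption about the intent, not a statement about the pipeline used for the published files), the substitution can be restricted to the sample columns. The sketch below uses the same placeholder string as the sed command; the function names are hypothetical.

```python
PLACEHOLDER = "0|0:0:0,0,0"  # same replacement string as in the sed command


def fill_missing_genotypes(line, placeholder=PLACEHOLDER):
    """Replace sample fields that are exactly '.' in one VCF line.

    Header lines and the nine fixed columns (CHROM..FORMAT) are left unchanged.
    """
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    # Sample genotype fields start at column 10 (index 9).
    fields[9:] = [placeholder if f == "." else f for f in fields[9:]]
    return "\t".join(fields) + "\n"


def fill_stream(src, dst):
    """Copy an uncompressed VCF from src to dst, filling missing genotypes."""
    for record in src:
        dst.write(fill_missing_genotypes(record))
```

Calling `fill_stream(sys.stdin, sys.stdout)` from a small script would let it take the place of the sed step between gunzip and bgzip.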