genepi/23andme-tools

23andMe Tools - Convert your 23andMe genotypes to VCF files

This project provides a Java implementation to convert your 23andMe genotype data into compressed VCFs (vcf.gz). VCF files can be useful for imputation with the Michigan Imputation Server or for mitochondrial haplogroup classification with HaploGrep2.

Download your data

Your personal genome can be downloaded from here. After entering your secure answer, the complete dataset can be downloaded at once.

Generate VCF files

The VCF files are generated by combining the information in your 23andMe data (.zip or .txt) with a human reference genome in FASTA format. This project uses HTSJDK for many tasks.

git clone https://github.com/genepi/23andme-tools.git
cd 23andme-tools
mvn install
java -jar vcf-tools-0.1.jar vcf-generator --in </path/to/genome> --ref </path/to/human_g1k_v37.fasta> \
  --out <vcf-destination-folder> [--exclude <chromosomes to exclude>] [--split false]

The current 23andMe genotyping chip (v4, in use since November 2013) is based on the GRCh37 reference (also known as g1k v37), which is distributed as human_g1k_v37.fasta. Setting --ref v37 downloads this reference automatically.
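To make the combination step more concrete, here is a minimal sketch of how a single 23andMe record could be turned into a VCF data line once the reference base at that position is known. It is not the project's actual code: the class `GenotypeToVcfSketch` and the method `toVcfLine` are hypothetical names, the reference base is passed in directly instead of being read from the FASTA file, and only plain A/C/G/T calls are handled (no-calls such as `--` and indel codes are simply skipped). It assumes the usual tab-separated 23andMe raw-data layout of rsid, chromosome, position, genotype.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; hypothetical helper, not part of 23andme-tools. */
final class GenotypeToVcfSketch {

    /**
     * Converts one raw-data line (rsid, chromosome, position, genotype) into a
     * VCF data line. Returns null for comments, no-calls and indel codes.
     * refBase is assumed to be the reference allele at that position.
     */
    static String toVcfLine(String rawLine, char refBase) {
        String line = rawLine.trim();
        if (line.isEmpty() || line.startsWith("#")) {
            return null; // header comment or blank line
        }
        String[] fields = line.split("\t");
        if (fields.length < 4) {
            return null; // malformed record
        }
        String genotype = fields[3].trim().toUpperCase();
        if (!genotype.matches("[ACGT]{1,2}")) {
            return null; // skip "--", "D", "I" and anything else unexpected
        }

        char ref = Character.toUpperCase(refBase);
        List<Character> alts = new ArrayList<>();
        StringBuilder gt = new StringBuilder();
        for (char allele : genotype.toCharArray()) {
            int index = 0; // 0 means "matches the reference"
            if (allele != ref) {
                if (!alts.contains(allele)) {
                    alts.add(allele);
                }
                index = alts.indexOf(allele) + 1;
            }
            if (gt.length() > 0) {
                gt.append('/'); // array genotypes are unphased
            }
            gt.append(index);
        }

        StringBuilder alt = new StringBuilder();
        for (char a : alts) {
            if (alt.length() > 0) {
                alt.append(',');
            }
            alt.append(a);
        }
        if (alt.length() == 0) {
            alt.append('.'); // homozygous reference: no ALT allele
        }

        // CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
        return String.join("\t", fields[1], fields[2], fields[0],
                String.valueOf(ref), alt.toString(), ".", ".", ".", "GT", gt.toString());
    }

    public static void main(String[] args) {
        System.out.println(toVcfLine("rs123\t1\t1000\tAG", 'A'));
    }
}
```

Single-letter genotypes (as reported for MT or for X/Y in males) come out as haploid GT values in this sketch. How the real tool treats those cases is not stated here, so check the generated files rather than relying on this example.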

Usage Examples

Default Command

This command downloads the FASTA reference and writes each chromosome (chr1-22, X, Y, MT) to a separate vcf.gz file:

java -jar vcf-tools-0.1.jar vcf-generator --in /path/to/genome \
  --ref v37 --out /path/to/vcfDir

The chromosome MT file can then be used with, for example, HaploGrep2. Add --split false to write a single VCF file instead of one file per chromosome.

Generating VCFs for Michigan Imputation Server

The Imputation Server accepts vcf.gz files split by chromosome, which can be created with the following command:

java -jar vcf-tools-0.1.jar vcf-generator --in /path/to/23andMe-genome.txt \
  --ref /path/to/human_g1k_v37.fasta --out /path/to/vcfDir --exclude Y,MT
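As a rough illustration of what `--exclude Y,MT` leaves behind, the sketch below starts from the chromosome set named above (1-22, X, Y, MT) and removes the names given in a comma-separated list. The class `ChromosomeSelectionSketch` and its methods are hypothetical, and the assumption that the option takes exactly these chromosome names is inferred from the example command, not from the tool's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; hypothetical helper, not part of 23andme-tools. */
final class ChromosomeSelectionSketch {

    /** Chromosomes the default command writes: 1-22, X, Y, MT. */
    static List<String> allChromosomes() {
        List<String> chromosomes = new ArrayList<>();
        for (int i = 1; i <= 22; i++) {
            chromosomes.add(String.valueOf(i));
        }
        chromosomes.addAll(Arrays.asList("X", "Y", "MT"));
        return chromosomes;
    }

    /** Chromosomes left after applying a comma-separated exclude list such as "Y,MT". */
    static List<String> remaining(String excludeArg) {
        List<String> result = allChromosomes();
        if (excludeArg != null && !excludeArg.trim().isEmpty()) {
            for (String name : excludeArg.split(",")) {
                result.remove(name.trim()); // names not in the list are ignored
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Lists the chromosomes expected as separate vcf.gz files for the command above.
        System.out.println(remaining("Y,MT"));
    }
}
```

With `Y,MT` excluded, 23 chromosomes remain (1-22 and X), so that many vcf.gz files would be expected in the output folder before uploading to the Imputation Server.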