Inbred Laboratory Mouse Haplotype Map
Last update: February 2009
We are pleased to release updated genotype data from the inbred mouse
haplotype map project, led by the Daly Laboratory at The Broad Institute
of Harvard and MIT and the Massachusetts General Hospital, in
collaboration with The Jackson Laboratories and with funding from the
National Human Genome Research Institute.
Complete descriptions of the data and analyses can be found in (manuscript reference to be added).
Downloadable files
Genotypes
The file below is a gzip-compressed, tab-delimited text file containing
the post-QC Affymetrix dataset merged with the Wellcome-CTC Mouse
Strain SNP Genotype Set (the large fraction of it that placed well on
the NCBI-build-37 assembly in our analysis) for jointly typed strains.
Each SNP appears as a single line, and columns in the file are:
1) SNP name/location according to the NCBI-build-37 mouse assembly (mm37-chr-position)
2) Wellcome Trust marker name, if any
3+) genotype data for each strain, in the form "genotype:affy-confidence-score" (0 is the best confidence score)
All genotypes discordant between the hapmap and Wellcome Trust (WT) data were removed
Only homozygous WT calls were considered for well-placed build-37 SNPs
Only high-quality (uppercase) genotypes from WT's "extra" typing were used
Confidence-score special cases:
* (asterisk after the Affymetrix confidence score) .. genotype confirmed by the WT genotype
-1 .. a discordancy-based NoCall
-2 .. a WT call that has no Affy confidence score
-3 .. a NoCall for a WT-only marker
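As an illustration of the "genotype:affy-confidence-score" encoding described above, the following Python sketch splits one strain cell into its parts and labels the special cases. It is a minimal sketch under assumptions: the names GenotypeCall and parse_cell, the status labels, and the example cells are hypothetical, and the exact spelling of genotypes and scores in the file may differ.

```python
from typing import NamedTuple, Optional


class GenotypeCall(NamedTuple):
    genotype: str            # text before the colon, kept verbatim
    score: Optional[float]   # confidence score, or None if absent
    confirmed_by_wt: bool    # True if the score carried a trailing asterisk
    status: str              # hypothetical label for the kind of call


# Special confidence-score codes from the column description.
# The label strings are illustrative names, not part of the file format.
SPECIAL_CODES = {
    -1.0: "discordancy_nocall",
    -2.0: "wt_call_without_affy_score",
    -3.0: "wt_only_nocall",
}


def parse_cell(cell: str) -> GenotypeCall:
    """Parse one 'genotype:score' cell from a strain column."""
    genotype, _, raw_score = cell.strip().partition(":")
    confirmed = raw_score.endswith("*")
    if confirmed:
        raw_score = raw_score[:-1]
    if not raw_score:
        # Assumption: a cell without a score is reported, not rejected.
        return GenotypeCall(genotype, None, confirmed, "missing_score")
    score = float(raw_score)
    status = SPECIAL_CODES.get(score, "affy_call")
    return GenotypeCall(genotype, score, confirmed, status)
```

For example, `parse_cell("A:0.02*")` would yield a call with genotype "A", score 0.02, and `confirmed_by_wt` set to True. Each data line can be split on tabs first, with columns 3 onward passed to `parse_cell`.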
Annotated list of well-behaved SNPs on NspI and StyI chips
The file below is a gzip-compressed, tab-delimited text file listing the
well-behaved SNPs on the two Affymetrix mouse arrays. Each SNP appears
on one line for each mouse array on which it is well behaved.
Columns in the file are:
1) SNP name on Affy chip, which is the NCBI-build-33 assembly location (e.g. mm33-1-123456)
2) enzyme - N = NspI, S = StyI
3) NCBI-build-37 assembly location
4) A-to-build-37-F-strand - the build-37-assembly, forward-strand-oriented base which corresponds to the Affy "A" allele
5) B-to-build-37-F-strand - the build-37-assembly, forward-strand-oriented base which corresponds to the Affy "B" allele
6) C57BL/6J-build-37-F-strand call
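The six columns above can be used to translate an Affymetrix "A" or "B" allele into a build-37 forward-strand base. The Python sketch below shows one way to do that. The names ChipSnp, parse_annotation_line, and to_forward_strand are hypothetical, as is the example row; column 6 is kept as an opaque string because its exact encoding is not spelled out here.

```python
from typing import NamedTuple

# Enzyme codes as given in column 2 of the description.
ENZYMES = {"N": "NspI", "S": "StyI"}


class ChipSnp(NamedTuple):
    affy_name: str          # column 1, build-33-based SNP name
    enzyme: str             # column 2, expanded to the enzyme name
    build37_location: str   # column 3
    a_base: str             # column 4, forward-strand base for Affy "A"
    b_base: str             # column 5, forward-strand base for Affy "B"
    b6_call: str            # column 6, kept verbatim


def parse_annotation_line(line: str) -> ChipSnp:
    """Split one tab-delimited annotation line into named fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    name, enzyme_code, location, a_base, b_base, b6_call = fields[:6]
    enzyme = ENZYMES.get(enzyme_code, enzyme_code)
    return ChipSnp(name, enzyme, location, a_base, b_base, b6_call)


def to_forward_strand(snp: ChipSnp, affy_allele: str) -> str:
    """Map an Affy 'A' or 'B' allele to its build-37 forward-strand base."""
    if affy_allele == "A":
        return snp.a_base
    if affy_allele == "B":
        return snp.b_base
    raise ValueError(f"unknown Affy allele: {affy_allele!r}")
```

Because a SNP appears once per array on which it is well behaved, a lookup keyed on both the SNP name and the enzyme avoids overwriting one array's entry with the other's.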
Flanking sequence and NCBI build 33 -> NCBI build 37 SNP mappings
The Affymetrix mouse chips were designed using the 2004 mouse assembly
(NCBI build 33). SNPs on the chips are named according to their
build-33 location (mm33-chr-base_position). The downloadable file
below contains the mapping of these SNPs to the NCBI-build-37 assembly,
along with flanking sequence for each SNP based on the build-37
assembly.
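The naming convention above (assembly build, chromosome, base position, joined by hyphens) can be unpacked with a short Python sketch. The names SnpLocation and parse_snp_name are hypothetical, and the assumption that the chromosome field contains no hyphen (for example "1" or "X") is not confirmed by this page.

```python
import re
from typing import NamedTuple

# Pattern for names such as mm33-1-123456 or mm37-1-123456.
SNP_NAME = re.compile(r"^mm(\d+)-([^-]+)-(\d+)$")


class SnpLocation(NamedTuple):
    build: int        # NCBI assembly build number, e.g. 33 or 37
    chromosome: str   # kept as text so non-numeric labels survive
    position: int     # base position on that chromosome


def parse_snp_name(name: str) -> SnpLocation:
    """Split a build-prefixed SNP name into build, chromosome, position."""
    match = SNP_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized SNP name: {name!r}")
    build, chromosome, position = match.groups()
    return SnpLocation(int(build), chromosome, int(position))
```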
The file is a gzip'd tab-delimited text file, with each SNP as a row, and with the following columns:
1) SNP name on Affy chip, which is the 2004-assembly location (e.g. mm33-1-123456)
2) B6 (assembly) allele relative to the + strand of the 2004 assembly
3) alternate allele relative to the + strand of the 2004 assembly
All of the following will be "N/A" if the SNP doesn't map well to NCBI build 37:
4) NCBI-build-37 assembly location (e.g. mm37-1-123456)
5) B6 (assembly) allele relative to the + strand of the NCBI build-37 assembly
6) alternate allele relative to the + strand of the NCBI build-37 assembly
7) NCBI build-37 discovery has different alleles relative to that of NCBI build-33? .. Y/N
8) NCBI build-37 discovery is reverse-complemented relative to that of NCBI build-33? .. Y/N ("N/A" if previous column is "Y")
9) local sequence change relative to NCBI build-33 within 16 bases of the SNP (Affy probe footprint)? .. Y/N
10) passes QC and is NCBI-build-37-well-mapped/behaved for at least one enzyme chip (i.e. THE GOOD ONES) .. Y/N
11) number of called strains, relative to original 94-strain typing
12) average Affy DM confidence ("N/A" if previous column is 0)
13) flanking sequence (bracketed-SNP notation with the B6 allele as the numerator, 500 flanking bases to each side; sequence is soft repeat-masked, and nearby known SNPs are N'd out)
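To make the 13-column layout above concrete, here is a minimal Python sketch that reads one row, treats "N/A" as missing, checks column 10, and splits the bracketed flanking sequence. All field names in COLUMNS and the function names are hypothetical. The sketch also assumes the bracketed notation looks like LEFT[B6/ALT]RIGHT and that soft repeat-masking is shown as lowercase bases, which is the usual convention but is not stated explicitly on this page.

```python
import re
from typing import Dict, NamedTuple, Optional

# Hypothetical field names for the 13 columns, in file order.
COLUMNS = [
    "affy_name", "b6_allele_b33", "alt_allele_b33",
    "build37_location", "b6_allele_b37", "alt_allele_b37",
    "alleles_differ", "reverse_complemented", "local_sequence_change",
    "passes_qc", "called_strains", "mean_dm_confidence", "flank",
]

# Assumed bracketed-SNP form: left flank, [B6/alternate], right flank.
BRACKETED = re.compile(r"^([A-Za-z]*)\[([^/\]]+)/([^/\]]+)\]([A-Za-z]*)$")


class Flank(NamedTuple):
    left: str
    b6_allele: str
    alt_allele: str
    right: str


def parse_mapping_line(line: str) -> Dict[str, Optional[str]]:
    """Split one row into named fields, mapping "N/A" to None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    return {name: (None if value == "N/A" else value)
            for name, value in zip(COLUMNS, fields)}


def is_good(row: Dict[str, Optional[str]]) -> bool:
    """True when column 10 marks the SNP as passing QC and well mapped."""
    return row["passes_qc"] == "Y"


def parse_flank(sequence: str) -> Flank:
    """Split a bracketed flanking sequence into its four parts."""
    match = BRACKETED.match(sequence.strip())
    if match is None:
        raise ValueError("flanking sequence is not in bracketed-SNP form")
    return Flank(*match.groups())


def soft_masked_fraction(flank: Flank) -> float:
    """Fraction of flanking bases that are lowercase (assumed repeat-masked)."""
    bases = flank.left + flank.right
    if not bases:
        return 0.0
    return sum(1 for base in bases if base.islower()) / len(bases)
```

A typical use would be to keep only rows for which `is_good` returns True and then pass `row["flank"]` to `parse_flank`, for instance when designing assays that should avoid heavily repeat-masked flanks.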
Useful links
Previous data release
High-density resequencing genotypes and imputation of non-resequenced strains
High-density resequencing data generation
Acknowledgements
Broad Institute of Harvard and MIT
University of California, Los Angeles
Harvard Medical School
Massachusetts General Hospital
Jackson Laboratories (Mouse Phenome Project)
Affymetrix
Perlegen Sciences
NHGRI
NIEHS
Wellcome Trust Center of Human Genetics