Inbred Laboratory Mouse Haplotype Map

Last Update: February 24, 2006

 

We are pleased to make a preliminary release of genotype data from the inbred mouse haplotype map project. The project is led by the Daly Laboratory at The Broad Institute of Harvard and MIT and the Massachusetts General Hospital, in collaboration with The Jackson Laboratory and with funding from the National Human Genome Research Institute.

 

Abstract

We aim to create a genome-wide haplotype map in all commonly used inbred lab strains of mice in order to enable efficient positional cloning and genotype-phenotype correlation studies. Recent data have established that the genomes of commonly used inbred lab mice, the primary mammalian model system, are simple mosaics of long segments from a limited number of distinct subspecies of Mus musculus. By providing a more complete catalog of the variation patterns in each modern strain and of their origins, the map will allow QTL mapping data from many crosses, as well as strain phenotype data, to be used simultaneously to accelerate the fine mapping and identification of genes responsible for medically relevant phenotypes.

 

Strains examined to date

"Classical" Inbred Laboratory Strains

129S1/SvImJ
129S4/SvJae
129X1/SvJ
A/J
AKR/J
BALB/cByJ
BTBR+Ttftf
BUB/BnJ
C3H/HeJ
C57BL/6J
C57BLKS/J
C57BR/cdJ
C57L/J
C58/J
CBA/J
CE/J
DBA/1J
DBA/2J
DDK/Pas
FVB/NJ
I/LnJ
KK/HIJ
LG/J
LP/J
MA/MyJ
NOD/LtJ
NON/LtJ
NZB/B1NJ
NZW/LacJ
O20
PL/J
Qsi5
RIIIS/J
SEA/GnJ
SJL/J
SM/J
ST/bJ
SWR/J

Wild-derived Strains

Mus m. castaneus
CAST/Ei

Mus m. musculus
CZECHII/Ei
PWD/Ph

Mus m. molossinus
JF1/Ms
MAI/Pas
MOLF/Ei
MSM/Ms

Mus m. domesticus
PERA/Ei
WSB/Ei

Mus spretus
SEG/Pas
SPRET/Ei


Markers

The initial data release consists of 138,793 single nucleotide polymorphisms (SNPs) spaced at ~20 kb intervals across the entire euchromatic genome (with the exception of the Y chromosome). Markers for the HapMap chips were based on SNPs discovered in several inbred laboratory strains, with supplemental discovery from a wild-derived musculus strain (CZECHII/Ei). They were selected to be as evenly spaced as possible, with additional reinforcement in sparsely covered regions to help guarantee successful assays. All markers are placed relative to Mouse Build 33 (May 2004 assembly) and are reported on the forward strand of that assembly.

Raw genotypes from the Affymetrix arrays designed for this project were filtered to remove:

1) SNPs showing excess homology to the interrogated genome fraction,

2) Calls with Affymetrix quality scores greater than 0.25 (range 0-0.5, 0 being the best),

3) SNPs for which the average quality score of the retained calls for each allele, taken across all typed strains, was greater than 0.1 and the two allele averages were not within a factor of 2 of each other,

4) SNPs for which the particular strains used to discover the SNP did not show the expected allele calls.
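
The quality-score filters (2 and 3) can be sketched as below. This is only an illustration under stated assumptions, not the project's actual pipeline: the function names and the (allele, score) data layout are hypothetical, and criterion 3 is read as "both per-allele mean scores exceed 0.1 and they differ by more than a factor of 2", which is one possible interpretation of the wording above.

```python
from statistics import mean

MAX_CALL_SCORE = 0.25  # criterion 2: calls scoring above this are dropped
MAX_MEAN_SCORE = 0.1   # criterion 3: per-allele mean score threshold
MAX_RATIO = 2.0        # criterion 3: allowed ratio between the allele means


def retain_calls(calls):
    """Keep calls whose quality score is at most MAX_CALL_SCORE.

    `calls` is a list of (allele, score) pairs, one per typed strain
    (a hypothetical layout chosen for this sketch).
    """
    return [(allele, score) for allele, score in calls
            if score <= MAX_CALL_SCORE]


def fails_allele_balance(calls):
    """Return True if the SNP is removed under the assumed reading of criterion 3."""
    by_allele = {}
    for allele, score in retain_calls(calls):
        by_allele.setdefault(allele, []).append(score)
    if len(by_allele) != 2:
        return False  # the check is only applied when both alleles were called
    low, high = sorted(mean(scores) for scores in by_allele.values())
    both_poor = low > MAX_MEAN_SCORE    # both allele means exceed 0.1
    unbalanced = high > MAX_RATIO * low  # means differ by more than 2x
    return both_poor and unbalanced


# Made-up scores for illustration only
noisy = [("A", 0.11), ("A", 0.11), ("G", 0.24), ("G", 0.24)]
clean = [("A", 0.02), ("G", 0.03)]
print(fails_allele_balance(noisy), fails_allele_balance(clean))
```

Filters 1 and 4 depend on sequence homology and on the SNP discovery panel, so they are not covered by this sketch.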

 

Important Notes

More details about the design and performance of these arrays (two arrays similar in nature to the 500K human arrays), as well as SNP ascertainment and flanking sequence, will be added to this site soon. The data here are still under development. Internal consistency suggests that genotyping accuracy is high (~6500 SNPs duplicated on the two arrays show 99.8% consistency), but further extraction of data from the arrays and additional QC may be performed, so the data should be considered preliminary at this point. The final version of these data will be integrated with other existing SNP data and made available through public resources at The Jackson Laboratory, NCBI and elsewhere in short order, so this site should be viewed as a transient data release site.

 

Download

Files are tab-delimited and range in length from 1,239 lines (Chromosome X) to 12,104 lines (Chromosome 1), so they should be readable in Excel. Markers are in rows and strains are in columns.

 

Chromosome        File Size
Chromosome 1      1325956
Chromosome 2      1144265
Chromosome 3      1058666
Chromosome 4      951744
Chromosome 5      927466
Chromosome 6      889708
Chromosome 7      859335
Chromosome 8      849330
Chromosome 9      835596
Chromosome 10     801542
Chromosome 11     793624
Chromosome 12     711231
Chromosome 13     699475
Chromosome 14     698529
Chromosome 15     582587
Chromosome 16     567558
Chromosome 17     519499
Chromosome 18     495696
Chromosome 19     376238
Chromosome X      135965
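
For readers who prefer a script to Excel, a per-chromosome file could be loaded roughly as follows. This is a sketch only: it assumes the first row is a header of strain names and the first column identifies the marker, neither of which is specified here, and the function name and sample rows are made up for illustration.

```python
import csv
import io


def read_genotypes(handle):
    """Return {marker: {strain: call}} from a tab-delimited table.

    Assumes (hypothetically) a header row of strain names and a first
    column holding the marker identifier.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    strains = header[1:]
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = dict(zip(strains, row[1:]))
    return table


# Made-up example rows, not real project data
sample = io.StringIO("marker\tA/J\tC57BL/6J\nsnp_1\tA\tG\nsnp_2\tC\tC\n")
genotypes = read_genotypes(sample)
print(genotypes["snp_1"]["C57BL/6J"])
```

With a real file, the `io.StringIO` object would be replaced by an opened file handle, and the column assumptions should be checked against the actual header.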

 

 

Download SNP Flank Sequences: zipped file of SNP flanks (35.1 MB)

        Each SNP has one line:  <snp name/position>TAB<bracket notation snp sequence>

        All sequence is relative to the + strand of the 2004 sequence (NCBI Build 33).

        Sequence is 500 bases + SNP + 500 bases.

        Within each SNP bracket, the B6 (C57BL/6J) allele is listed first, as the numerator.

        Nearby SNPs (whether on the HapMap chips or not) are masked with N.  As a result, the sequence is intended for assay design rather than genomic placement per se.

        Lowercase sequence indicates repeat-masked bases.
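
A line in this format could be split into its parts along these lines. The sketch assumes that "bracket notation" means a `[X/Y]` token embedded in the sequence, which is not spelled out above; the function name, field names, and the example record are hypothetical.

```python
import re

# Assumed bracket notation: [first_allele/second_allele]
BRACKET = re.compile(r"\[([^/\]]+)/([^\]]+)\]")


def parse_flank_line(line):
    """Split one '<name>TAB<sequence>' line into name, flanks, and alleles."""
    name, seq = line.rstrip("\n").split("\t", 1)
    match = BRACKET.search(seq)
    if match is None:
        raise ValueError(f"no bracketed SNP found for {name!r}")
    return {
        "name": name,
        "left_flank": seq[:match.start()],
        "b6_allele": match.group(1),     # first allele in the bracket
        "other_allele": match.group(2),
        "right_flank": seq[match.end():],
    }


# Made-up record with 5-base flanks for brevity; real records carry
# 500 bases on each side of the SNP.
record = parse_flank_line("snp_example\tACgtN[A/G]TTnCA\n")
print(record["b6_allele"], record["other_allele"])
```

Case is preserved in the flanks so that repeat-masked (lowercase) bases and N-masked neighbouring SNPs remain visible to downstream assay-design tools.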

 

NOTE: This file is a simple tab-delimited text file, but it CANNOT be opened using Excel.

 

Acknowledgements

Broad Institute of Harvard and MIT

Harvard Medical School

Massachusetts General Hospital

The Jackson Laboratory (Mouse Phenome Project)

Affymetrix

Perlegen Sciences

NHGRI

NIEHS