DEPICT

1. Download DEPICT (2.9 GB)

Decompress the archive with your file manager, or run tar xvfj depict_140611.tar.bz2 at the command line.

2. Run DEPICT

Change directory into depict and issue the command ./depict.py from the terminal.

3. Browse results

DEPICT results for a test run will be found in depict/results/.
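
The three steps above can also be scripted. The following is a minimal Python sketch, assuming the archive depict_140611.tar.bz2 has already been downloaded and unpacks into a depict/ directory as described. The helper names extract_archive and run_depict are illustrative only and are not part of DEPICT.

```python
import subprocess
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Unpack a tar archive; "r:*" lets tarfile detect bz2 or gz compression."""
    with tarfile.open(archive_path, "r:*") as archive:
        archive.extractall(dest_dir)
    return Path(dest_dir)


def run_depict(depict_dir="depict"):
    """Run ./depict.py from inside the DEPICT directory (step 2 above)."""
    # check=True raises CalledProcessError if depict.py exits with an error.
    return subprocess.run(["./depict.py"], cwd=depict_dir, check=True)
```

Calling extract_archive("depict_140611.tar.bz2") followed by run_depict("depict") would mirror steps 1 and 2; the test-run output should then appear in depict/results/.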

Prepare your study-specific input files

  1. Use PLINK to identify independent loci:
    --clump-p1 5e-8 --clump-kb 500 --clump-r2 0.05
  2. We recommend two DEPICT runs: one based on independent genome-wide significant SNPs, and another based on all independent SNPs with P value < 1e-5.
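
The two recommended runs differ only in the significance threshold given to PLINK. The sketch below builds both command lines in Python. It assumes a PLINK binary fileset for LD estimation (--bfile) and a summary-statistics file (--clump), and it assumes the same --clump-kb and --clump-r2 values are reused for the P < 1e-5 run. The file names and the helper plink_clump_args are placeholders, not part of DEPICT.

```python
def plink_clump_args(bfile, assoc_file, out_prefix, p1):
    """Build a PLINK clumping command using the locus settings listed above."""
    return [
        "plink",
        "--bfile", bfile,        # genotype data used to estimate LD (assumed)
        "--clump", assoc_file,   # GWAS summary statistics (assumed file name)
        "--clump-p1", p1,        # index-SNP significance threshold
        "--clump-kb", "500",
        "--clump-r2", "0.05",
        "--out", out_prefix,
    ]


# One command per recommended DEPICT run; all file names are hypothetical.
runs = {
    "genome_wide_significant": plink_clump_args(
        "reference_panel", "my_gwas.assoc", "clumped_5e-8", "5e-8"),
    "suggestive": plink_clump_args(
        "reference_panel", "my_gwas.assoc", "clumped_1e-5", "1e-5"),
}

for label, args in runs.items():
    print(label + ":", " ".join(args))
```

Each argument list can be passed to subprocess.run once the placeholder file names are replaced with real ones.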

Customize depict.py

param_analysislabel
Set this to the label you want to appear in the result filenames.
path_snpfile
Path to your file with associated SNPs (rsIDs must be used to specify SNPs).
flag_loci
Construct loci based on your associated SNPs? (Yes, 1; No, 0). This parameter must be set to 1 the first time the analysis is run.
flag_genes
Should genes be prioritized? (Yes, 1; No, 0).
flag_genesets
Conduct reconstituted gene set enrichment analysis? (Yes, 1; No, 0).
flag_tissues
Conduct tissue/cell type enrichment analysis? (Yes, 1; No, 0).
param_ncores
Number of CPU cores used by DEPICT.
path_locusgenerator_jar
Path to the JAR file used to construct loci (should not be changed).
path_depict_jar
Path to DEPICT JAR file (Should not be changed).
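
Before launching a run it can help to sanity-check the edited values. The following Python sketch is not part of depict.py: it assumes the settings have been copied into a plain dictionary, and the helper check_settings and the example values are hypothetical. The two JAR paths are left out because they should not be changed.

```python
FLAG_NAMES = ("flag_loci", "flag_genes", "flag_genesets", "flag_tissues")


def check_settings(settings, first_run):
    """Return a list of problems found in a dict of depict.py settings."""
    problems = []
    for name in FLAG_NAMES:
        if settings.get(name) not in (0, 1):
            problems.append(name + " must be 0 or 1")
    # Loci must be constructed the first time an analysis is run.
    if first_run and settings.get("flag_loci") != 1:
        problems.append("flag_loci must be 1 the first time the analysis is run")
    if not settings.get("param_analysislabel"):
        problems.append("param_analysislabel is empty")
    ncores = settings.get("param_ncores")
    if not isinstance(ncores, int) or ncores < 1:
        problems.append("param_ncores must be a positive integer")
    return problems


example = {
    "param_analysislabel": "my_trait",    # hypothetical label
    "path_snpfile": "my_trait_snps.txt",  # hypothetical path
    "flag_loci": 0,
    "flag_genes": 1,
    "flag_genesets": 1,
    "flag_tissues": 1,
    "param_ncores": 2,
}

print(check_settings(example, first_run=True))
```

In this example the only reported problem is that flag_loci is 0 on a first run.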

Description of columns in result files

In the gene prioritization results file there will be one row for each gene in the associated loci. Column descriptions:
Locus
The rsID of the associated locus. Merged loci will be referred to by a list of rsIDs.
Nr of genes in locus
The number of genes within the associated locus.
Chromosome and position
Chromosome and boundaries of the associated locus (HG19/GRCh37 genome build).
Ensembl Gene ID
Ensembl database gene identifier.
Gene symbol
Hugo gene symbol.
Nominal P value
The nominal gene prioritization P value.
Gene closest to lead SNP
Whether the gene is the one nearest to the associated SNP (Yes, 1; No, 0).
Gene bio-type
Type of gene.
Top cis eQTL SNP (Westra et al. Nature Genetics 2014)
SNPs that are eQTLs for the gene in whole blood (Westra et al. Nature Genetics 2014)
False discovery rate
The estimated false discovery rate of the gene.
In the reconstituted gene set enrichment results file there will be one row for each gene set. Column descriptions:
Original gene set ID
Identifier of the predefined gene set.
Original gene set description
Description of the predefined gene set.
Nominal P value
Nominal enrichment P value of the reconstituted gene set (Null hypothesis: Genes in associated loci do not enrich for the reconstituted gene set).
False discovery rate
Estimated false discovery rate for the reconstituted gene set.
In the tissue/cell type enrichment results file there will be one row for each tissue or cell type annotation. Column descriptions:
MeSH term
Medical Subject Heading term for the tissue or cell type annotation.
MeSH first level term
Description of the tissue or cell type annotation.
MeSH second level term
More general description of the tissue or cell type annotation.
Nominal P value
Nominal enrichment P value of the tissue/cell type annotation (Null hypothesis: Genes in associated loci are not highly expressed in the given tissue or cell type).
False discovery rate
Estimated false discovery rate of the enrichment P value for the tissue or cell type.
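
All three result files share a "Nominal P value" column, so one small helper can rank rows in any of them. The Python sketch below rests on assumptions not stated above: that the files are tab-delimited, that the header row uses the column names listed here, and that P values are written as plain numbers. The helper rows_below_p is illustrative only.

```python
import csv


def rows_below_p(path, threshold, p_column="Nominal P value"):
    """Return rows with a nominal P value below threshold, smallest first."""
    with open(path, newline="") as handle:
        # Assumption: tab-delimited text with a header row.
        rows = list(csv.DictReader(handle, delimiter="\t"))
    kept = [row for row in rows if float(row[p_column]) < threshold]
    return sorted(kept, key=lambda row: float(row[p_column]))
```

Each returned row is a dictionary keyed by column name, so for the gene prioritization file row["Gene symbol"] would give the gene symbol, provided the header matches the names above.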

System requirements

JAVA
Download JAVA (please use version 8.0).
Python
Download Python.
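
A quick way to confirm that both requirements are present is sketched below in Python. It only checks that a java executable can be found on the PATH and reports the running Python version; it does not verify the Java version. The helper missing_tools is illustrative and not part of DEPICT.

```python
import shutil
import sys


def missing_tools(tools=("java",)):
    """Return the names in tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print("Python version:", sys.version.split()[0])
print("Missing tools:", missing_tools())
```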

Directories in DEPICT Tarball

Depict/
Directory where the DEPICT Java JAR files are located.
LocusGenerator/
Directory where the Java JAR files for the locus creation are located.
data/
Directory where files needed by DEPICT are located.
results/
Directory where the result files will appear.
testfiles/
Directory where test files are located.

GWAS Catalog Locus Definitions

Genome-wide associations from 63 GWAS Catalog traits used in DEPICT

Decompress the archive with your file manager, or run tar xvfz gwascatalog_140201.tar.gz at the command line.

DEPICT 1.1 beta input files

DEPICT (Pers et al. Nature Communications 2015) reconstituted gene sets, 2.3G.

DEPICT (Pers et al. Nature Communications 2015) tissue expression data, 31M.

DEPICT seed gene sets, 2M.

DEPICT reconstituted gene sets (plain text), 2.7G.

DEPICT v1 beta version rel194 for 1KG imputed GWAS, 8.2G.

DEPICT collection file for 1000 Genomes Project pilot phase data, 220M.