Neph 2012, DNaseI Digital Genomic Footprinting from ENCODE/University of Washington

Description

This experiment, produced as part of the ENCODE Project, contains deep-sequencing DNaseI data that will be used to identify sites where regulatory factors bind to the genome (footprints). Footprinting is a technique for defining the DNA sequences bound by DNA-binding proteins such as transcription factors, zinc-finger proteins, hormone-receptor complexes, and other chromatin-modulating factors like CTCF. The technique depends on the strength and tightness of protein-DNA interactions: in native chromatin, DNA sequences that interact directly with DNA-binding proteins are relatively protected from DNA-degrading endonucleases, whereas the exposed, unbound portions are readily degraded.

A massively parallel next-generation sequencing approach was adopted to define the DNase hypersensitive sites in the genome. The DNase samples were sequenced to significantly higher depths, 300-fold or greater, which yields DNase susceptibility maps of the native chromatin state at base-pair resolution. These maps depend on the nature and specificity of the interactions between the DNA and the regulatory/modulatory proteins bound at specific loci, and thus reflect the native chromatin state of the genome under investigation. The deep-sequencing approach has been used to define the footprint landscape of the genome by identifying DNA motifs that interact with known or novel DNA-binding proteins.
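
To make the idea of a footprint concrete, the following Python sketch scores a candidate window by comparing the mean number of DNaseI cuts inside it with the mean in its flanks. It is a simplified illustration only, not the scoring method used to produce this dataset; the function name `depletion_score`, the flank width, and the pseudocount of 1 are all assumptions chosen for the example.

```python
from statistics import mean


def depletion_score(cuts, start, end, flank=10):
    """Ratio of mean cleavage inside [start, end) to mean cleavage in the flanks.

    `cuts` is a list of per-base DNaseI cut counts (hypothetical input).
    Values well below 1 suggest a protected (footprint-like) window.
    A pseudocount of 1 avoids division by zero; it is an arbitrary choice.
    """
    if start >= end or start < flank or end + flank > len(cuts):
        raise ValueError("window and flanks must lie inside the profile")
    center = mean(cuts[start:end])
    left = mean(cuts[start - flank:start])
    right = mean(cuts[end:end + flank])
    flanks = (left + right) / 2
    return (center + 1) / (flanks + 1)


# Toy profile: 6 protected bases flanked by accessible, heavily cut DNA.
protected = [8] * 10 + [0] * 6 + [8] * 10
# Toy profile with uniform cleavage and therefore no footprint.
uniform = [5] * 26

protected_score = depletion_score(protected, 10, 16)
uniform_score = depletion_score(uniform, 10, 16)
```

In this toy example the protected window scores far below 1, while the uniform profile scores exactly 1.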

Source

Files downloaded from the UCSC Genome Browser via URL: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeUwDgf
Input file format: BAM

Samples

From Human Feb. 2009 (GRCh37/hg19) Assembly

No. Filename Description Feature GEO-ID
1 wgEncodeUwDgfA549Aln.sga DNaseI DGF - A549 DNaseI-DGF -
2 wgEncodeUwDgfAg10803Aln.sga DNaseI DGF - AG10803 DNaseI-DGF -
3 wgEncodeUwDgfAoafAln.sga DNaseI DGF - AoAF DNaseI-DGF -
4 wgEncodeUwDgfCd20ro01778Aln.sga DNaseI DGF - CD20+_RO01778 DNaseI-DGF -
5 wgEncodeUwDgfCd4naivewb11970640Aln.sga DNaseI DGF - CD4+_Naive_Wb11970640 DNaseI-DGF -
6 wgEncodeUwDgfGm06990Aln.sga DNaseI DGF - GM06990 DNaseI-DGF -
7 wgEncodeUwDgfGm12865Aln.sga DNaseI DGF - GM12865 DNaseI-DGF -
8 wgEncodeUwDgfH7esAln.sga DNaseI DGF - H7-hESC DNaseI-DGF -
9 wgEncodeUwDgfHaeAln.sga DNaseI DGF - HAEpiC DNaseI-DGF -
10 wgEncodeUwDgfHahAln.sga DNaseI DGF - HA-h DNaseI-DGF -
11 wgEncodeUwDgfHaspAln.sga DNaseI DGF - HA-sp DNaseI-DGF -
12 wgEncodeUwDgfHcfAln.sga DNaseI DGF - HCF DNaseI-DGF -
13 wgEncodeUwDgfHcfaaAln.sga DNaseI DGF - HCFaa DNaseI-DGF -
14 wgEncodeUwDgfHcmAln.sga DNaseI DGF - HCM DNaseI-DGF -
15 wgEncodeUwDgfHcpeAln.sga DNaseI DGF - HCPEpiC DNaseI-DGF -
16 wgEncodeUwDgfHeeAln.sga DNaseI DGF - HEEpiC DNaseI-DGF -
17 wgEncodeUwDgfHepg2Aln.sga DNaseI DGF - HepG2 DNaseI-DGF -
18 wgEncodeUwDgfHffAln.sga DNaseI DGF - HFF DNaseI-DGF -
19 wgEncodeUwDgfHgfAln.sga DNaseI DGF - HGF DNaseI-DGF -
20 wgEncodeUwDgfHipeAln.sga DNaseI DGF - HIPEpiC DNaseI-DGF -
21 wgEncodeUwDgfHmfAln.sga DNaseI DGF - HMF DNaseI-DGF -
22 wgEncodeUwDgfHmvecbAln.sga DNaseI DGF - HMVEC-LLy DNaseI-DGF -
23 wgEncodeUwDgfHmvecdAln.sga DNaseI DGF - HMVEC-dBl-Neo DNaseI-DGF -
24 wgEncodeUwDgfHmvecdbladAln.sga DNaseI DGF - HMVEC-dBl-Ad DNaseI-DGF -
25 wgEncodeUwDgfHmvecfAln.sga DNaseI DGF - HMVEC-dLy-Neo DNaseI-DGF -
26 wgEncodeUwDgfHmvechAln.sga DNaseI DGF - HMVEC-LBl DNaseI-DGF -
27 wgEncodeUwDgfHpafAln.sga DNaseI DGF - HPAF DNaseI-DGF -
28 wgEncodeUwDgfHpdlfAln.sga DNaseI DGF - HPdLF DNaseI-DGF -
29 wgEncodeUwDgfHpfAln.sga DNaseI DGF - HPF DNaseI-DGF -
30 wgEncodeUwDgfHrceAln.sga DNaseI DGF - HRCEpiC DNaseI-DGF -
31 wgEncodeUwDgfHsmmAln.sga DNaseI DGF - HSMM DNaseI-DGF -
32 wgEncodeUwDgfHuvecAln.sga DNaseI DGF - HUVEC DNaseI-DGF -
33 wgEncodeUwDgfHvmfAln.sga DNaseI DGF - HVMF DNaseI-DGF -
34 wgEncodeUwDgfK562Aln.sga DNaseI DGF - K562 rep1 DNaseI-DGF -
35 wgEncodeUwDgfK562Znfa41c6Aln.sga DNaseI DGF - K562 rep2 DNaseI-DGF -
36 wgEncodeUwDgfK562Znfp5Aln.sga DNaseI DGF - K562 rep3 DNaseI-DGF -
37 wgEncodeUwDgfLhcnm2Aln.sga DNaseI DGF - LHCN-M2 rep1 DNaseI-DGF -
38 wgEncodeUwDgfLhcnm2Diff4dAln.sga DNaseI DGF - LHCN-M2 rep2 DNaseI-DGF -
39 wgEncodeUwDgfM059jAln.sga DNaseI DGF - M059J DNaseI-DGF -
40 wgEncodeUwDgfMonocd14ro1746Aln.sga DNaseI DGF - Monocytes-CD14+_RO01746 DNaseI-DGF -
41 wgEncodeUwDgfNb4Aln.sga DNaseI DGF - NB4 DNaseI-DGF -
42 wgEncodeUwDgfNhaAln.sga DNaseI DGF - NH-A DNaseI-DGF -
43 wgEncodeUwDgfNhdfadAln.sga DNaseI DGF - NHDF-Ad DNaseI-DGF -
44 wgEncodeUwDgfNhdfneoAln.sga DNaseI DGF - NHDF-neo DNaseI-DGF -
45 wgEncodeUwDgfNhlfAln.sga DNaseI DGF - NHLF DNaseI-DGF -
46 wgEncodeUwDgfRpmi7951Aln.sga DNaseI DGF - RPMI-7951 DNaseI-DGF -
47 wgEncodeUwDgfSaecAln.sga DNaseI DGF - SAEC DNaseI-DGF -
48 wgEncodeUwDgfSkmcAln.sga DNaseI DGF - SKMC DNaseI-DGF -
49 wgEncodeUwDgfSknshraAln.sga DNaseI DGF - SK-N-SH_RA DNaseI-DGF -
50 wgEncodeUwDgfT47dAln.sga DNaseI DGF - T-47D DNaseI-DGF -
51 wgEncodeUwDgfTh17Aln.sga DNaseI DGF - Th17 DNaseI-DGF -
52 wgEncodeUwDgfTh1Aln.sga DNaseI DGF - Th1 DNaseI-DGF -
53 wgEncodeUwDgfTh1AlnRep2.sga DNaseI DGF - Th1 DNaseI-DGF -
54 wgEncodeUwDgfTh1wb33676984Aln.sga DNaseI DGF - Th1_Wb33676984 DNaseI-DGF -
55 wgEncodeUwDgfTh2Aln.sga DNaseI DGF - Th2 DNaseI-DGF -
56 wgEncodeUwDgfTh2wb54553204Aln.sga DNaseI DGF - Th2_Wb54553204 DNaseI-DGF -
57 wgEncodeUwDgfTregwb78495824Aln.sga DNaseI DGF - Treg_Wb78495824 DNaseI-DGF -

Technical Notes

BAM files were downloaded from UCSC and converted to SGA format using dedicated Perl scripts.
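
The Perl scripts themselves are not reproduced here. As a rough illustration of what such a conversion involves, the Python sketch below turns SAM-formatted alignment lines (for example, the text output of `samtools view`) into SGA lines. It assumes the five-column, tab-separated SGA layout (sequence name, feature, position, strand, count), assumes that each read is represented by its 5' end, and keeps the chromosome names found in the alignments; the original scripts may differ on any of these points. The function names are hypothetical.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Number of reference bases covered by an alignment.

    Only the M, D, N, = and X operations consume reference positions.
    """
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def sam_to_sga(sam_lines, feature="DNaseI-DGF"):
    """Count read 5' ends per (sequence, position, strand) and emit SGA lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":  # skip unmapped reads
            continue
        if flag & 0x10:
            # Reverse strand: the 5' end is the rightmost aligned base.
            counts[(chrom, pos + reference_span(cigar) - 1, "-")] += 1
        else:
            # Forward strand: the 5' end is the leftmost aligned base.
            counts[(chrom, pos, "+")] += 1
    # Sort by sequence name, then position, then strand.
    return [
        f"{chrom}\t{feature}\t{pos}\t{strand}\t{n}"
        for (chrom, pos, strand), n in sorted(counts.items())
    ]


# Made-up alignments: two forward reads at the same position, one reverse
# read, and one unmapped read that is ignored.
example_sam = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t36M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t100\t60\t36M\t*\t0\t0\t*\t*",
    "r3\t16\tchr1\t100\t60\t36M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
sga_lines = sam_to_sga(example_sam)
```

Parsing SAM text keeps the sketch free of third-party dependencies; a real conversion would more likely read the BAM files directly through a dedicated library.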

References

  1. Neph S, Vierstra J, Stergachis AB, Reynolds AP, Haugen E, Vernot B, Thurman RE, John S, Sandstrom R, Johnson AK, Maurano MT, Humbert R, Rynes E, Wang H, Vong S, Lee K, Bates D, Diegel M, Roach V, Dunn D, Neri J, Schafer A, Hansen RS, Kutyavin T, Giste E, Weaver M, Canfield T, Sabo P, Zhang M, Balasundaram G, Byron R, MacCoss MJ, Akey JM, Bender MA, Groudine M, Kaul R, Stamatoyannopoulos JA.
    An expansive human regulatory lexicon encoded in transcription factor footprints.
    Nature 2012 Sep 6;489(7414):83-90
    PMID: 22955618

  2. GEO series GSE26328 DNaseI Digital Genomic Footprinting from ENCODE/University of Washington.