Header Name |
Description |
hg19_pos |
The genomic position of the CpG on human genome assembly hg19 (or
GRCh37). |
hg38_pos |
The genomic position of the CpG on human genome assembly hg38 (or
GRCh38). |
strand |
Strand of the CpG. Values: “R” (reverse strand) or “F” (forward strand). |
geneSymbol |
Genes the CpG has been assigned to. “N/A” indicates no genes were found.
This is retrieved from the Illumina MethylationEPIC v1.0 B4 manifest file. |
CpGisland |
The CpG island (CGI) that overlaps with this CpG. “N/A” indicates no
CGIs were found. |
with_450K |
Boolean indicating whether this CpG probe is also included on the 450K array.
“0” - No, “1” - Yes. |
SNP_ID |
SNPs (rsID) that are close to this CpG. Multiple SNPs are separated
by “;”. “N/A” indicates no SNPs were found. |
SNP_distance |
The nucleotide distances between SNPs and the CpG. |
SNP_MAF |
The minor allele frequencies (MAF) of SNPs. |
Cross_Reactive |
Boolean (“0” - No, “1” - Yes) indicating whether this CpG could be
affected by cross-hybridization or underlying genetic variation as
reported by this paper. |
ENCODE_TF_ChIP |
Transcription factor (TF) binding sites identified from ChIP-seq
experiments performed by the ENCODE
project. Peaks from 1264 experiments representing 338 transcription
factors in 130 cell types are combined (N = 10,560,472).
The BED format file was downloaded from the UCSC Table Browser, and a detailed description
is provided here. |
ENCODE_DNaseI |
DNase I hypersensitivity sites identified from ENCODE DNase-seq experiments. Peaks from
125 cell types are combined (N = 1,867,665). The BED format file was
downloaded from the UCSC Table Browser, and a detailed description
is provided here. |
ENCODE_H3K27ac_ChIP |
H3K27ac peaks identified from ENCODE histone ChIP-seq experiments. Peaks
from 11 cell types (GM12878, H1-hESC, HMEC, HSMM, HUVEC, HeLaS3, HepG2,
K562, Monocytes-CD14+_RO01746, NHEK, NHLF) are combined (N = 665,650). |
ENCODE_H3K4me1_ChIP |
H3K4me1 peaks identified from ENCODE histone ChIP-seq experiments. Peaks
from 11 cell types (GM12878, H1-hESC, HMEC, HSMM, HUVEC, HeLaS3, HepG2,
K562, Monocytes-CD14+_RO01746, NHEK, NHLF) are combined (N = 1,435,550). |
ENCODE_H3K4me3_ChIP |
H3K4me3 peaks identified from ENCODE histone ChIP-seq experiments. Peaks
from 11 cell types (GM12878, H1-hESC, HMEC, HSMM, HUVEC, HeLaS3, HepG2,
K562, Monocytes-CD14+_RO01746, NHEK, NHLF) are combined (N = 525,824). |
ENCODE_chromHMM |
Chromatin State Segmentation by chromHMM from ENCODE. Chromatin states across 9 cell types
(GM12878, H1-hESC, K562, HepG2, HUVEC, HMEC, HSMM, NHEK, NHLF) were
learned computationally by integrating 9 factors (CTCF, H3K27ac,
H3K27me3, H3K36me3, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H4K20me1)
plus input. A total of 15 states were identified: state-1
(Active Promoter), state-2 (Weak Promoter), state-3 (Inactive/poised
Promoter), state-4 and 5 (Strong enhancer), state-6 and 7
(Weak/poised enhancer), state-8 (Insulator), state-9 (Transcriptional
transition), state-10 (Transcriptional elongation), state-11 (Weakly
transcribed), state-12 (Polycomb-repressed), state-13 (Heterochromatin or
low signal), state-14 and 15 (Repetitive/Copy Number Variation).
The original chromatin state BED file was downloaded from the UCSC Table Browser, and a detailed description
is provided here. |
FANTOM_enhancer |
FANTOM5 human enhancers, downloaded from here. |
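
The SNP and cross-reactivity columns above are often used together to filter probes. The following is a minimal Python sketch of how one row might be interpreted, not a reference implementation. It assumes each row has already been read into a dictionary keyed by the header names above, and that `SNP_distance` and `SNP_MAF` use the same “;” separator and ordering as `SNP_ID` (the table only states this for `SNP_ID`). It also assumes distances are integers. The helper names `parse_snps` and `keep_probe`, the `max_maf` threshold, and the example rows are hypothetical.

```python
from typing import Dict, List

MISSING = "N/A"  # placeholder the table uses when nothing was found


def parse_snps(row: Dict[str, str]) -> List[dict]:
    """Split the SNP columns of one annotation row into per-SNP records."""
    snp_ids = row.get("SNP_ID", MISSING)
    if snp_ids in ("", MISSING):
        return []
    ids = snp_ids.split(";")
    # Assumption: distances and MAFs are ";"-separated in the same order as SNP_ID.
    distances = row["SNP_distance"].split(";")
    mafs = row["SNP_MAF"].split(";")
    if not (len(ids) == len(distances) == len(mafs)):
        raise ValueError("SNP_ID, SNP_distance and SNP_MAF have different lengths")
    return [
        {"id": snp_id, "distance": int(dist), "maf": float(maf)}
        for snp_id, dist, maf in zip(ids, distances, mafs)
    ]


def keep_probe(row: Dict[str, str], max_maf: float = 0.01) -> bool:
    """Hypothetical filter: drop cross-reactive probes and probes near a common SNP."""
    if row["Cross_Reactive"] == "1":
        return False
    return all(snp["maf"] < max_maf for snp in parse_snps(row))


# Made-up rows for illustration only; not taken from the real annotation.
example_rows = [
    {"SNP_ID": "N/A", "SNP_distance": "N/A", "SNP_MAF": "N/A", "Cross_Reactive": "0"},
    {"SNP_ID": "rs1;rs2", "SNP_distance": "2;35", "SNP_MAF": "0.25;0.004", "Cross_Reactive": "0"},
    {"SNP_ID": "N/A", "SNP_distance": "N/A", "SNP_MAF": "N/A", "Cross_Reactive": "1"},
]
kept = [keep_probe(row) for row in example_rows]
```

The length check is there so that a row violating the ordering assumption fails loudly instead of silently pairing a SNP with the wrong distance or MAF.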