CpG Islands Tracks
 
CpG Islands (Islands < 300 Bases are Light Green) tracks   (All Expression and Regulation tracks)

CpG Islands  CpG Islands (Islands < 300 Bases are Light Green)  
Unmasked CpG  CpG Islands on All Sequence (Islands < 300 Bases are Light Green)  

Description

CpG islands are associated with genes, particularly housekeeping genes, in vertebrates. They are common near transcription start sites and may be associated with promoter regions. Normally a C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the Cs in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over evolutionary time, methylated Cs tend to turn into Ts because of spontaneous deamination. The result is that CpGs are relatively rare unless there is selective pressure to keep them or a region is not methylated for some other reason, perhaps having to do with the regulation of gene expression. CpG islands are regions where CpGs are present at significantly higher levels than is typical for the genome as a whole.

The unmasked version of the track displays potential CpG islands that lie in repeat regions and would therefore not be visible in the repeat-masked version.

By default, only the masked version of the track is displayed. To view the unmasked version, change the visibility settings in the track controls at the top of this page.

Methods

CpG islands were predicted by searching the sequence one base at a time, scoring each dinucleotide (+17 for CG and -1 for others) and identifying maximally scoring segments. Each segment was then evaluated for the following criteria:

  • GC content of 50% or greater
  • length greater than 200 bp
  • ratio greater than 0.6 of observed number of CG dinucleotides to the expected number on the basis of the number of Gs and Cs in the segment
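The scan and the three filters above can be sketched in Python. This is a simplified illustration, not the program used to build the track: the function names are hypothetical, and the Kadane-style scan shown here returns only the single best-scoring segment, which is an assumption about how "maximally scoring" could be implemented.

```python
def dinucleotide_scores(seq):
    """Score each adjacent base pair: +17 for CG, -1 for anything else."""
    seq = seq.upper()
    return [17 if seq[i:i + 2] == "CG" else -1 for i in range(len(seq) - 1)]


def max_scoring_segment(seq):
    """Return (start, end, score) of the best-scoring run, end exclusive.

    Kadane-style scan (an assumed simplification): the running segment
    restarts whenever its accumulated score is zero or negative.
    """
    best = (0, 0, 0)
    run_start, run_score = 0, 0
    for i, score in enumerate(dinucleotide_scores(seq)):
        if run_score <= 0:
            run_start, run_score = i, 0
        run_score += score
        if run_score > best[2]:
            # The dinucleotide at index i covers bases i and i + 1.
            best = (run_start, i + 2, run_score)
    return best


def passes_island_criteria(segment):
    """Apply the three filters listed above to a candidate segment."""
    segment = segment.upper()
    n = len(segment)
    num_c, num_g = segment.count("C"), segment.count("G")
    num_cpg = segment.count("CG")
    if n <= 200 or num_c == 0 or num_g == 0:
        return False  # too short, or the expected CpG count is undefined
    gc_fraction = (num_c + num_g) / n
    obs_exp = num_cpg * n / (num_c * num_g)
    return gc_fraction >= 0.5 and obs_exp > 0.6
```

For example, `max_scoring_segment("ATCGCGAT")` picks out the `CGCG` core at positions 2 to 6, which would then be rejected by `passes_island_criteria` because it is far shorter than 200 bp.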

The entire genome sequence, masked regions included, was used for the construction of the track Unmasked CpG. The track CpG Islands is constructed on the sequence after all masked sequence is removed.

The CpG count is the number of CG dinucleotides in the island. The Percentage CpG is the ratio of CpG nucleotide bases (twice the CpG count) to the length. The ratio of observed to expected CpG is calculated according to the formula (cited in Gardiner-Garden and Frommer (1987)):

    Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G)
where N = length of sequence.
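As a worked illustration of these definitions, the following Python sketch computes the three quantities for a sequence. The function name `cpg_stats` and the dictionary keys are hypothetical, and scaling Percentage CpG by 100 is an assumption based on the word "Percentage".

```python
def cpg_stats(seq):
    """Compute CpG count, Percentage CpG and Obs/Exp CpG for a sequence."""
    seq = seq.upper()
    n = len(seq)
    num_c, num_g = seq.count("C"), seq.count("G")
    cpg_count = seq.count("CG")

    # Each CG dinucleotide contributes two bases, hence the factor of 2.
    per_cpg = 100.0 * 2 * cpg_count / n if n else 0.0

    # Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G);
    # reported as 0.0 here when the denominator would be zero.
    denom = num_c * num_g
    obs_exp = cpg_count * n / denom if denom else 0.0

    return {"cpg_count": cpg_count, "per_cpg": per_cpg, "obs_exp": obs_exp}
```

For the 8-base sequence `ACGTCGGC` (3 Cs, 3 Gs, 2 CG dinucleotides), this gives a CpG count of 2, a Percentage CpG of 50, and Obs/Exp CpG = 2 * 8 / (3 * 3), or about 1.78.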

Data access

The CpG Islands track and its associated tables can be explored interactively using the REST API, the Table Browser or the Data Integrator. All the tables can also be queried directly from our public MySQL servers, with more information available on our help page as well as on our blog.
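A minimal Python sketch of a REST API query for one region follows. Several details are assumptions not stated on this page: the `https://api.genome.ucsc.edu/getData/track` endpoint and its parameter names, the table name `cpgIslandExt`, and the `hg38` assembly used as a default. Check them against the REST API help page before relying on this. Note also that heavy program-driven access may be throttled, so large batch jobs are better served by the download files or the public MySQL server.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.genome.ucsc.edu/getData/track"  # assumed endpoint


def build_cpg_url(genome, chrom, start, end, track="cpgIslandExt"):
    """Build a query URL for CpG island records in one region.

    The track name 'cpgIslandExt' is an assumption; substitute the table
    name reported for this track in your assembly.
    """
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return API_BASE + "?" + urllib.parse.urlencode(params)


def fetch_cpg_islands(genome, chrom, start, end):
    """Fetch and decode the JSON response (requires network access)."""
    url = build_cpg_url(genome, chrom, start, end)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

A call such as `fetch_cpg_islands("hg38", "chr1", 0, 100000)` would return the decoded JSON document; its exact layout is not described here, so inspect the top-level keys before extracting records.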

Credits

This track was generated using a modification of a program developed by G. Micklem and L. Hillier (unpublished).

References

Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987 Jul 20;196(2):261-82. PMID: 3656447