MAVID Alignment Track Settings
 
MAVID Alignments   (All ENCODE Comparative Genomics tracks)

Species selection:

  primate

chimp
colobus_monkey
baboon
macaque
dusky_titi
owl_monkey
marmoset
mouse_lemur
galago

  placental

rat
mouse
rabbit
cow
dog
rfbat
hedgehog
shrew
armadillo
elephant
tenrec

  mammal

monodelphis
platypus

  vertebrate

chicken
xenopus
tetraodon
fugu
zebrafish

Multiple alignment base-level:
Display bases identical to reference as dots
Display chains between alignments

Source data version: ENCODE Oct 2005 Freeze
Assembly: Human May 2004 (NCBI35/hg17)
Data last updated at UCSC: 2006-01-19

Description

This track displays human-centric multiple sequence alignments in the ENCODE regions for the 28 vertebrates included in the September 2005 ENCODE MSA freeze, based on comparative sequence data generated for the ENCODE project as well as whole-genome assemblies residing at UCSC, as listed:

  • human (May 2004, hg17)
  • armadillo (NISC and May 2005 Broad Assisted Assembly v 1.0)
  • baboon (NISC)
  • chicken (Feb 2004, galGal2)
  • chimp (Nov 2003, panTro1)
  • colobus_monkey (NISC)
  • cow (BCM)
  • dog (July 2004, canFam1)
  • dusky_titi (NISC)
  • elephant (NISC and May 2005 Broad Assisted Assembly v 1.0)
  • fugu (Aug 2002, fr1)
  • galago (NISC)
  • hedgehog (NISC)
  • macaque (Jan 2005, rheMac1)
  • marmoset (NISC)
  • monodelphis (Oct 2004, monDom1)
  • mouse (Mar 2005, mm6)
  • mouse_lemur (NISC)
  • owl_monkey (NISC)
  • platypus (NISC and Aug 2005 Mullikin Phusion Assembly of WUGSC Traces)
  • rabbit (NISC and May 2005 Broad Assisted Assembly v 1.0)
  • rat (June 2003, rn3)
  • rfbat (NISC)
  • shrew (NISC and Sep 2005 Mullikin Phusion Assembly of Broad Traces)
  • tenrec (Apr 2005 Mullikin Phusion Assembly of Broad Traces)
  • tetraodon (Feb 2004, tetNig1)
  • xenopus (Oct 2004, xenTro1)
  • zebrafish (June 2004, danRer2)

The alignments in this track were generated using the Mercator orthology mapping program and the MAVID multiple global alignment program. The Genome Browser companion tracks, MAVID Cons and MAVID Elements, display conservation scoring and conserved elements for these alignments based on various conservation methods.

Display Conventions and Configuration

In full display mode, this track shows pairwise alignments of each species aligned to the human genome. In dense mode, the alignments are depicted using a gray-scale density gradient. The checkboxes in the track configuration section allow the exclusion of species from the pairwise display.

When zoomed in to the base-display level, the track shows the base composition of each alignment. The numbers and symbols on the "Gaps" line indicate the lengths of gaps in the human sequence at those alignment positions, relative to the longest non-human sequence. If there is sufficient space in the display, the size of the gap is shown. If there is not, a "*" is displayed when the gap size is a multiple of 3, and a "+" otherwise. To view detailed information about the alignments at a specific position, zoom the display in to 30,000 or fewer bases, then click on the alignment.
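
The "Gaps" line rule described above can be illustrated with a short sketch. This is not the Genome Browser's own code; the function name gap_label and the width_chars parameter are hypothetical, and "sufficient space" is assumed here to mean that the gap size, written as a decimal number, fits in the character cells available at that position.

```python
def gap_label(gap_size: int, width_chars: int) -> str:
    """Return the text for one position on the "Gaps" line.

    gap_size    -- length of the gap in the human sequence (in bases)
    width_chars -- hypothetical number of character cells available
    """
    if gap_size <= 0:
        # No gap in the human sequence at this position: nothing to show.
        return ""
    text = str(gap_size)
    if len(text) <= width_chars:
        # Enough room: show the gap size itself.
        return text
    # Not enough room: "*" for multiples of 3, "+" for everything else.
    return "*" if gap_size % 3 == 0 else "+"


if __name__ == "__main__":
    for size, width in [(3, 1), (12, 2), (12, 1), (10, 1)]:
        print(size, width, repr(gap_label(size, width)))
```

The "*" versus "+" distinction is useful because a gap whose length is a multiple of 3 does not shift the reading frame of a coding sequence, whereas any other gap length does.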

Methods

Mercator was first used to identify the colinear and orthologous segments in the sequences given for each ENCODE region. Input to Mercator was generated by using Genscan to predict genes in all sequences, Blat to compare predicted coding exons, and MUMmer to identify non-coding exact matches between all pairs of sequences. The output of Mercator was a small-scale one-to-one orthology map for each ENCODE region, as well as a set of alignment constraints based on matched landmarks (e.g., exons and long non-coding exact matches).

MAVID was then used to construct a global multiple alignment of each colinear orthologous segment set specified in the orthology map. As part of its input, MAVID used a phylogenetic tree determined from alignments of four-fold degenerate sites in the ENCODE regions.
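
For readers unfamiliar with the term, a four-fold degenerate site is a third codon position at which any of the four bases encodes the same amino acid. The sketch below shows one way to locate such sites in a single in-frame coding sequence using the standard genetic code. It is an illustration only, not the procedure used to build the tree for this track; the function names are hypothetical, and a real analysis would typically also require the site to be four-fold degenerate in every aligned species.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fourfold_prefixes() -> set:
    """Two-base prefixes whose four codons all encode one amino acid."""
    prefixes = set()
    for first, second in product(BASES, repeat=2):
        encoded = {CODON_TABLE[first + second + third] for third in BASES}
        if len(encoded) == 1:
            prefixes.add(first + second)
    return prefixes


def fourfold_site_indices(cds: str) -> list:
    """0-based indices of four-fold degenerate third positions in an
    in-frame coding sequence (hypothetical helper for illustration)."""
    prefixes = fourfold_prefixes()
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return [
        start + 2
        for start in range(0, usable, 3)
        if cds[start:start + 2] in prefixes
    ]


if __name__ == "__main__":
    print(sorted(fourfold_prefixes()))
    print(fourfold_site_indices("ATGGCTTAA"))
```

Because substitutions at these sites do not change the encoded protein, they are commonly treated as evolving close to neutrally, which is why they are a popular choice for estimating branch lengths.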

Credits

Generation of the MAVID alignments was engineered by Colin Dewey at the Pachter Lab Comparative Genomics Group at UC Berkeley.

Mercator was written by Colin Dewey and Lior Pachter.

MAVID was authored by Nicholas Bray and Lior Pachter.

The phylogenetic tree is based on Murphy et al. (2001).

References

Bray, N. and Pachter, L. MAVID: Constrained Ancestral Alignment of Multiple Sequences. Genome Res 14(4), 693-699 (2004).

Burge, C. and Karlin, S. Prediction of complete gene structures in human genomic DNA. J Mol Biol 268(1), 78-94 (1997).

Dewey, C.N. and Pachter, L. Mercator: multiple whole-genome orthology map construction. In preparation.

Kent, W.J. BLAT - the BLAST-like alignment tool. Genome Res 12(4), 656-664 (2002).

Kurtz, S., Phillippy, A., Delcher, A.L., Smoot, M., Shumway, M., Antonescu, C. and Salzberg, S.L. Versatile and open software for comparing large genomes. Genome Biol 5(2), R12 (2004).

Murphy, W.J., et al. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294(5550), 2348-2351 (2001).