data about single nucleotide variant alleles in the SARS-CoV-2 RNA and protein sequences that have
occurred in different samples of the virus during the current 2019/2020 outbreak.
Nextstrain has a powerful user interface for viewing the time-stamped phylogenetic tree
that it infers from the patterns of variants in sequences worldwide.
Nextstrain maintains an ongoing pipeline that continuously obtains SARS-CoV-2 genome sequences
and metadata from
aligns them against the reference genome,
collects single-nucleotide variants (SNVs), and infers a phylogenetic tree.
A parsimony score
can be computed for each mutation as the minimum number of nucleotide changes along branches
of the tree that would lead to the observed sample genotypes at the leaves of the tree.
For example, if there is a branch for which all leaves have a mutation, and no other leaves of
the tree have the mutation, then the mutation presumably occurred once on that branch and the
parsimony score would be one. However, when a mutation appears on leaves belonging to several
branches whose other leaves do not have the mutation, then the mutation would need to occur
on multiple branches in the tree, increasing the parsimony score. Mutations with a parsimony
score that is relatively high, especially when compared to alternate allele count (the number
of samples/leaves with the mutation), may be of interest when identifying systematic errors
and/or sites of recurrent mutations.
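The two cases described above can be illustrated with a minimal sketch of Fitch's small-parsimony algorithm. This is not necessarily how UCSC computes the scores. It assumes a strictly binary tree written as nested tuples, and the tree, the sample names (A to F) and the alleles are hypothetical.

```python
from typing import Dict, Set, Tuple, Union

# Assumption: a tree is either a leaf name (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_parsimony(tree: Tree, leaf_alleles: Dict[str, str]) -> int:
    """Return the minimum number of allele changes on a binary tree."""
    changes = 0

    def visit(node: Tree) -> Set[str]:
        nonlocal changes
        if isinstance(node, str):
            # Leaf: the candidate state set is the observed allele.
            return {leaf_alleles[node]}
        if len(node) != 2:
            raise ValueError("this sketch only handles binary trees")
        left, right = visit(node[0]), visit(node[1])
        common = left & right
        if common:
            # Children agree on at least one state: no change needed here.
            return common
        # Children disagree: one change must occur below this node.
        changes += 1
        return left | right

    visit(tree)
    return changes


# Hypothetical six-sample tree: ((A,B),((C,D),(E,F)))
tree: Tree = (("A", "B"), (("C", "D"), ("E", "F")))

# Case 1: the mutation (T) is carried by every leaf of one branch (C, D)
# and by no other leaf, so a single change explains it.
clade_only = {"A": "C", "B": "C", "C": "T", "D": "T", "E": "C", "F": "C"}

# Case 2: the mutation appears on leaves of two different branches (B, D)
# whose sibling leaves lack it, so it must arise twice.
scattered = {"A": "C", "B": "T", "C": "C", "D": "T", "E": "C", "F": "C"}

score_clade = fitch_parsimony(tree, clade_only)
score_scattered = fitch_parsimony(tree, scattered)
alt_count_scattered = sum(1 for allele in scattered.values() if allele == "T")
```

In the second case the parsimony score equals the alternate allele count (both are 2), which is the pattern the text flags as potentially interesting. Real trees contain polytomies, which this sketch rejects rather than handle.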
This track shows the parsimony score of each SNV reported by Nextstrain as a bar graph
with the height indicating the score.
The Nextstrain Variants track
displays the phylogenetic tree and sample genotypes
from which the parsimony scores were generated.
Nextstrain downloads SARS-CoV-2 genomes from
as they are submitted by labs worldwide.
The sequences are processed by an
and annotations are written to a data file
that UCSC downloads and from which it extracts annotations for display.
UCSC computes parsimony scores using the phylogenetic tree and variants extracted
You can download the bigWig file underlying this track (nextstrainParsimony.bw) from our
Download Server. The data can be explored interactively with the
or the Data Integrator. The data can be
accessed from scripts through our API.
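As a rough sketch of script access, the snippet below builds a request URL for the UCSC REST API's `getData/track` endpoint. Several values are assumptions and should be checked against the API documentation. The track name `nextstrainParsimony` is inferred from the bigWig file name. The genome `wuhCor1` and sequence name `NC_045512v2` are assumed to be the UCSC identifiers for SARS-CoV-2. The helper names `build_track_url` and `fetch_scores` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# UCSC REST API endpoint for retrieving track data.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(genome: str, track: str, chrom: str,
                    start: int, end: int) -> str:
    """Build a getData/track request URL for one region of one track."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}?{urlencode(params)}"


def fetch_scores(url: str) -> dict:
    """Download and decode the JSON response (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


# Assumed identifiers; verify them before relying on this request.
url = build_track_url("wuhCor1", "nextstrainParsimony", "NC_045512v2", 0, 1000)
```

Calling `fetch_scores(url)` would then return the decoded JSON. The layout of that response is not shown here because it depends on the track type.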
offers phylogenetic trees and metadata files:
scroll to the bottom of the page and click "DOWNLOAD DATA",
and a dialog with download options appears.
This work is made possible by the open sharing of genetic data by research
groups from all over the world. We gratefully acknowledge their contributions.
Special thanks to
sharing its analysis of genomes collected by
Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, Sagulenko P, Bedford T, Neher RA.
Nextstrain: real-time tracking of pathogen evolution.
Bioinformatics. 2018 Dec 1;34(23):4121-4123.
PMID: 29790939; PMC: PMC6247931