Description
Exon usage data collected from this cell line. The green marks represent 5' splice sites, and the blue marks 3' splice sites.
This track is best displayed in pack format.
Methods
The usage of a 5' or 3' splice site estimates the proportion of a gene's transcripts that undergo a splicing event using that particular site.
To accomplish this, we analyze exon-exon junction reads obtained by mapping RNA-seq data. For a given splice site, three categories of junction reads go into calculating its splice site usage:
(A) reads that have the splice site as one side of the junction;
(B) reads that span the splice site, i.e. the junction joins a site upstream of the site in question to a site downstream of it;
(C) for a 5' (3') splice site, reads with a junction that has its 5' (3') end in the downstream (upstream) intron.
From the counts of these three classes of reads, splice site usage is defined as A/(A+B+C).
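The definition above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scripts used to build the track: coordinates are plus-strand with donor < acceptor, junction counts arrive as a dict keyed by (donor, acceptor), and the caller supplies the far boundary of the intron adjacent to the site (`intron_other_end`), which is only one possible reading of "downstream (upstream) intron". All function and parameter names are hypothetical.

```python
from collections import Counter


def classify_junction(donor, acceptor, site, site_type, intron_other_end):
    """Return 'A', 'B', 'C', or None for one junction relative to one splice site.

    Assumes plus-strand coordinates with donor < acceptor. `intron_other_end`
    is the far boundary of the intron adjacent to `site` (an assumed input).
    """
    if site_type == "5p":
        if donor == site:
            return "A"  # the site is the 5' side of the junction
        if donor < site < acceptor:
            return "B"  # the junction spans the site
        if site < donor < intron_other_end:
            return "C"  # the junction's 5' end lies in the downstream intron
    elif site_type == "3p":
        if acceptor == site:
            return "A"  # the site is the 3' side of the junction
        if donor < site < acceptor:
            return "B"  # the junction spans the site
        if intron_other_end < acceptor < site:
            return "C"  # the junction's 3' end lies in the upstream intron
    else:
        raise ValueError("site_type must be '5p' or '3p'")
    return None  # the junction is uninformative for this site


def splice_site_usage(junction_counts, site, site_type, intron_other_end):
    """Compute A / (A + B + C); return None when no informative reads exist."""
    counts = Counter()
    for (donor, acceptor), n_reads in junction_counts.items():
        label = classify_junction(donor, acceptor, site, site_type, intron_other_end)
        if label is not None:
            counts[label] += n_reads
    total = counts["A"] + counts["B"] + counts["C"]
    return counts["A"] / total if total else None


# Toy 5' splice site at position 100 whose intron ends at 200.
example = {
    (100, 200): 80,  # class A
    (50, 200): 15,   # class B
    (150, 200): 5,   # class C
    (300, 400): 40,  # ignored: unrelated to the site
}
usage = splice_site_usage(example, 100, "5p", 200)  # 80 / (80 + 15 + 5) = 0.8
```

Returning None for sites without informative reads keeps "no data" distinct from a usage of 0, which would otherwise suggest the site is never used.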
To limit distortion of this metric by the false-positive splice junctions that RNA-seq aligners frequently report, we considered only junction reads that contained splice sites present in the GENCODE v29 annotation.
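A filter of this kind might look like the sketch below. It assumes that both ends of a junction must be annotated, which the text does not state explicitly, and that the annotated donor and acceptor sites have already been extracted from the annotation into sets. The function name and the (chromosome, position, strand) site representation are hypothetical.

```python
def filter_annotated(junction_counts, annotated_donors, annotated_acceptors):
    """Keep only junctions whose donor and acceptor sites are both annotated.

    `junction_counts` maps (donor_site, acceptor_site) to a read count, where
    each site is a hashable tuple such as (chromosome, position, strand).
    Requiring both ends to be annotated is an assumption of this sketch.
    """
    return {
        (donor, acceptor): n_reads
        for (donor, acceptor), n_reads in junction_counts.items()
        if donor in annotated_donors and acceptor in annotated_acceptors
    }


# Toy example: the second junction uses an unannotated acceptor and is dropped.
raw_counts = {
    (("chr1", 100, "+"), ("chr1", 200, "+")): 12,
    (("chr1", 100, "+"), ("chr1", 237, "+")): 1,
}
donors = {("chr1", 100, "+")}
acceptors = {("chr1", 200, "+")}
kept = filter_annotated(raw_counts, donors, acceptors)
```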
If there were multiple RNA-seq replicates for a particular cell line or condition, we collapsed the junction read counts from all replicates before calculating splice site usages.
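Collapsing replicates amounts to summing the read count of each junction across replicates before any usage value is computed. A minimal sketch, assuming each replicate is a dict of junction counts keyed by (donor, acceptor); the function name is hypothetical:

```python
from collections import Counter


def pool_replicates(replicates):
    """Sum junction read counts across replicates, junction by junction."""
    pooled = Counter()
    for replicate_counts in replicates:
        pooled.update(replicate_counts)  # adds counts for matching keys
    return dict(pooled)


# Two toy replicates that share one junction.
rep1 = {(100, 200): 40, (50, 200): 5}
rep2 = {(100, 200): 45, (150, 200): 3}
pooled = pool_replicates([rep1, rep2])
```

Pooling counts first differs from averaging per-replicate usage values: deeper replicates contribute proportionally more reads to the final estimate.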
Splice site usage is bimodally distributed: in K562, for example, the vast majority of splice sites have usage values either greater than 0.9 (~85%) or less than 0.1 (~6%).
Put simply, splice site usage measures how often a particular splice site is used during splicing relative to other potential splice sites in the same gene, with values ranging from 0 (never used) to 1 (every transcript seen in the RNA-seq data is spliced at this site).
Scripts used to generate usage data can be found here
Credits
This track was created at the Fairbrother Laboratory at Brown University by Luke Buerer, Camillo Saueressig, and David Glidden.
References
ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414), 57.
Contact
william_fairbrother@brown.edu