DASHR Called human sncRNA loci Track Settings
 
DASHR Called human small non-coding RNA loci

List of subtracks:
  Tissue  Track Name
  Adipose1  Adipose1 GSE45159 peak[+]
  Adipose1  Adipose1 GSE45159 peak[-]
  Bcellgerminalcenter1  Bcellgerminalcenter1 GSE22898 peak[+]
  Bcellgerminalcenter1  Bcellgerminalcenter1 GSE22898 peak[-]
  Bcellmemory1  Bcellmemory1 GSE22898 peak[+]
  Bcellmemory1  Bcellmemory1 GSE22898 peak[-]
  Bcellnaive1  Bcellnaive1 GSE22898 peak[+]
  Bcellnaive1  Bcellnaive1 GSE22898 peak[-]
  Bcellplasma1  Bcellplasma1 GSE22898 peak[+]
  Bcellplasma1  Bcellplasma1 GSE22898 peak[-]
  Bladder1  Bladder1 GSE31616 peak[+]
  Bladder1  Bladder1 GSE31616 peak[-]
  Brainog1  Brainog1 SRA012516 peak[+]
  Brainog1  Brainog1 SRA012516 peak[-]
  Brainpfc1  Brainpfc1 GSE43335 peak[+]
  Brainpfc1  Brainpfc1 GSE43335 peak[-]
  Brainpfc2  Brainpfc2 GSE48552 peak[+]
  Brainpfc2  Brainpfc2 GSE48552 peak[-]
  Braintgm1  Braintgm1 GSE46131 peak[+]
  Braintgm1  Braintgm1 GSE46131 peak[-]
  Breast1  Breast1 GSE39162 peak[+]
  Breast1  Breast1 GSE39162 peak[-]
  Cd4plustcell1  Cd4plustcell1 GSE59944 peak[+]
  Cd4plustcell1  Cd4plustcell1 GSE59944 peak[-]
  Colon1  Colon1 peak[+]
  Colon1  Colon1 peak[-]
  Colon2  Colon2 GSE43550 peak[+]
  Colon2  Colon2 GSE43550 peak[-]
  Colonascendens1  Colonascendens1 GSE46622 peak[+]
  Colonascendens1  Colonascendens1 GSE46622 peak[-]
  Coloncoecum1  Coloncoecum1 GSE46622 peak[+]
  Coloncoecum1  Coloncoecum1 GSE46622 peak[-]
  Colonrectum1  Colonrectum1 GSE46622 peak[+]
  Colonrectum1  Colonrectum1 GSE46622 peak[-]
  Heart1  Heart1 ERP000773 peak[+]
  Heart1  Heart1 ERP000773 peak[-]
  Heart2  Heart2 SRA012516 peak[+]
  Heart2  Heart2 SRA012516 peak[-]
  Kidney1  Kidney1 ERP000773 peak[+]
  Kidney1  Kidney1 ERP000773 peak[-]
  Kidney2  Kidney2 GSE24457 peak[+]
  Kidney2  Kidney2 GSE24457 peak[-]
  Liver1  Liver1 ERP000773 peak[+]
  Liver1  Liver1 ERP000773 peak[-]
  Liver2  Liver2 SRA012516 peak[+]
  Liver2  Liver2 SRA012516 peak[-]
  Liver3  Liver3 GSE21279 peak[+]
  Liver3  Liver3 GSE21279 peak[-]
  Lung1  Lung1 GSE33858 peak[+]
  Lung1  Lung1 GSE33858 peak[-]
  Lung2  Lung2 SRA012516 peak[+]
  Lung2  Lung2 SRA012516 peak[-]
  Monocytemacrophage1  Monocytemacrophage1 GSE59944 peak[+]
  Monocytemacrophage1  Monocytemacrophage1 GSE59944 peak[-]
  Muscle1  Muscle1 SRA012516 peak[+]
  Muscle1  Muscle1 SRA012516 peak[-]
  Pancreas1  Pancreas1 SRA012516 peak[+]
  Pancreas1  Pancreas1 SRA012516 peak[-]
  Pancreaticbetacell1  Pancreaticbetacell1 GSE47720 peak[+]
  Pancreaticbetacell1  Pancreaticbetacell1 GSE47720 peak[-]
  Pancreaticislet1  Pancreaticislet1 GSE47720 peak[+]
  Pancreaticislet1  Pancreaticislet1 GSE47720 peak[-]
  Peripheralbmc1  Peripheralbmc1 GSE19812 peak[+]
  Peripheralbmc1  Peripheralbmc1 GSE19812 peak[-]
  Peripheralbmc2  Peripheralbmc2 GSE37710 peak[+]
  Peripheralbmc2  Peripheralbmc2 GSE37710 peak[-]
  Plasma1  Plasma1 GSE52981 peak[+]
  Plasma1  Plasma1 GSE52981 peak[-]
  Serum1  Serum1 GSE53439 peak[+]
  Serum1  Serum1 GSE53439 peak[-]
  Serum2  Serum2 GSE34891 peak[+]
  Serum2  Serum2 GSE34891 peak[-]
  Skin1  Skin1 GSE31037 peak[+]
  Skin1  Skin1 GSE31037 peak[-]
  Skin2  Skin2 GSE53600 peak[+]
  Skin2  Skin2 GSE53600 peak[-]
  Spermatozoa1  Spermatozoa1 GSE21191 peak[+]
  Spermatozoa1  Spermatozoa1 GSE21191 peak[-]
  Testiculargerm1  Testiculargerm1 GSE31616 peak[+]
  Testiculargerm1  Testiculargerm1 GSE31616 peak[-]
  Wholeblood1  Wholeblood1 GSE46579 peak[+]
  Wholeblood1  Wholeblood1 GSE46579 peak[-]
  Wholeislet1  Wholeislet1 GSE52314 peak[+]
  Wholeislet1  Wholeislet1 GSE52314 peak[-]
    
Assembly: Human Feb. 2009 (GRCh37/hg19)

Description

This set of data tracks represents a comprehensive collection of processed human small non-coding RNAs (sncRNAs) based on over 180 high-throughput small RNA-seq (smRNA-seq) experiments generated by over 30 independent groups. The tracks show raw signal (expression) and peaks (regions of enrichment), both generated with a uniform processing pipeline by the Wang lab at UPenn. The tracks are based on an integrated analysis of data from over 40 normal human tissues and cell types (see the DASHR database). We also provide sncRNA processing information for each peak (locus).

Methods

Data collection and curation of smRNA-seq

We manually curated Illumina smRNA-seq datasets on normal human tissue samples and cell types from GEO and SRA. The smRNA-seq samples were categorized into different groups of tissues and cell types, according to study ID (GSE accession).

Processing smRNA-seq datasets

We standardized the processing of smRNA-seq datasets and generated expression levels both for sncRNA genes and for the mature sncRNA products derived from these larger RNAs. The pipeline can be summarized in three parts: adapter trimming, read mapping, and the segmentation and quantification described below. We first identified the correct adapter sequence and trimmed the sequencing reads using cutadapt. We then mapped the trimmed reads corresponding to small RNAs to a standardized version of the human reference genome (GRCh37/hg19). The reads were aligned with STAR using an 'all-matches' strategy, i.e. multi-mapping reads were allowed but mismatches were not.
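The trimming and 'all-matches' mapping ideas can be illustrated with a small, self-contained sketch. This is a toy illustration only, not the cutadapt or STAR algorithms and not the pipeline's actual code: the function names, the `min_overlap` parameter, and the exact-string search are all assumptions made for the example.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter (toy version, exact matching only).

    Cuts at a full adapter occurrence, or at a partial adapter prefix
    of at least `min_overlap` bases sitting at the very end of the read.
    """
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Look for the longest adapter prefix that ends the read.
    for k in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def all_exact_matches(read: str, reference: str) -> list:
    """Report every (0-based position, strand) where the read matches
    the reference with zero mismatches, keeping all multi-mapping hits."""
    hits = []
    for strand, query in (("+", read), ("-", reverse_complement(read))):
        start = reference.find(query)
        while start != -1:
            hits.append((start, strand))
            start = reference.find(query, start + 1)  # allow overlapping hits
    return sorted(hits)
```

For example, `all_exact_matches(trim_adapter(raw_read, adapter), chrom_seq)` would return every perfect-match location of the trimmed read on both strands, which is the behaviour the 'all-matches' strategy describes; a real aligner does this with an index rather than a linear scan.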

Segmentation and quantification

We used a customized approach to identify peaks with evidence of specific processing into mature sncRNA products at base-pair resolution. We scanned the genomic sequence and identified the start of a peak as two adjacent positions between which the number of mapped reads increases at least 2-fold. Similarly, the corresponding end of the peak is the point at which the number of mapped reads decreases at least 2-fold. Additionally, detected peaks were required to have at least 10 reads. After identifying the mature sncRNA locations, we quantified the number of reads falling within these regions as the expression (raw read count) of each sncRNA. To enable comparison across tissues, we took into account the library size of each sequencing experiment and reported the read count in 'reads per million' (RPM). The BED score (Score field) gives the log-transformed RPM expression score in the [0,1000] range, computed as max(0,min(100*log(100*RPM+0.05)/log(10),1000)), i.e. 100*log10(100*RPM+0.05) clamped to [0,1000].
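The segmentation rule, RPM normalization, and score formula above can be sketched as follows. This is a minimal illustration under stated assumptions, not the DASHR implementation: the names `call_peaks`, `to_rpm`, and `bed_score` are hypothetical, a rise from zero coverage is treated as satisfying the 2-fold increase, and the "at least 10 reads" requirement is interpreted here as a maximum per-base coverage of at least 10 within the peak.

```python
import math


def call_peaks(cov, fold=2.0, min_reads=10):
    """Segment a per-base coverage list into half-open (start, end) peaks.

    A peak opens at position i when cov[i] is at least `fold` times
    cov[i-1], and closes after position i when cov[i+1] is at most
    cov[i] / fold. Positions outside the list count as zero coverage.
    """
    peaks = []
    start = None
    n = len(cov)
    for i in range(n):
        prev_cov = cov[i - 1] if i > 0 else 0
        next_cov = cov[i + 1] if i + 1 < n else 0
        # Assumption: any rise from zero counts as a >= 2-fold increase.
        if start is None and cov[i] > 0 and cov[i] >= fold * prev_cov:
            start = i
        if start is not None and next_cov * fold <= cov[i]:
            # Assumption: "at least 10 reads" means max coverage >= 10.
            if max(cov[start:i + 1]) >= min_reads:
                peaks.append((start, i + 1))
            start = None
    return peaks


def to_rpm(read_count, library_size):
    """Normalize a raw read count to reads per million mapped reads."""
    return read_count * 1_000_000 / library_size


def bed_score(rpm):
    """Score from the formula in the text:
    max(0, min(100*log(100*RPM+0.05)/log(10), 1000))."""
    raw = 100 * math.log(100 * rpm + 0.05) / math.log(10)
    return max(0.0, min(raw, 1000.0))
```

With these definitions, a locus with 50 reads in a library of 2,000,000 reads has an RPM of 25, and `bed_score(to_rpm(50, 2_000_000))` gives a value of roughly 340. The sketch returns a floating-point score; whether and how the tracks round it to an integer is not stated in the text.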

A detailed description of the data processing pipeline and of the considerations for evaluating the quality of small RNA-seq data is available in [1].

References

  1. Yuk Yee Leung, Pavel P. Kuksa, Alexandre Amlie-Wolf, Otto Valladares, Lyle H. Ungar, Sampath Kannan, Brian D. Gregory, and Li-San Wang (2015) DASHR: database of small human noncoding RNAs. Nucleic Acids Research (Database Issue). doi:10.1093/nar/gkv1188 PMID: 26553799
  2. Yuk Yee Leung, Paul Ryvkin, Lyle H. Ungar, Brian D. Gregory, and Li-San Wang (2013) CoRAL: predicting non-coding RNAs from small RNA-sequencing data. Nucleic Acids Research, 41, e137. PMID: 23700308

Data Release Policy

There are no restrictions on the use of the tracks.

Contact

Li-San Wang