Gold Standard LncRNAs and Reliable LncRNAs

 

Unlike protein-coding genes, publicly available lncRNAs are often sparsely annotated, partial in scope, and scattered across collections. Arraystar maintains high-quality proprietary transcriptome and lncRNA databases that collect lncRNAs extensively through our lncRNA discovery pipelines, external data sources, and knowledge-based mining of scientific publications. Arraystar Human LncRNA Array V4.0 covers a total of 40,173 lncRNAs in two major collections, 7,506 Gold Standard LncRNAs and 32,667 Reliable LncRNAs, drawn from more than 47 Tb of RNA-seq data and all major public databases and repositories, such as RefSeq, UCSC Known Genes, GENCODE, lincRNA catalogs, lncRNAdb, T-UCRs, RNAdb, NRED, and scientific publications.

The Gold Standard LncRNAs are well-annotated, experimentally validated, genuine lncRNAs, in contrast to the very large numbers of partial fragments, incomplete UTRs, and less reliable lncRNA sequences deposited in public databases. Each comes with information on its lncRNA transcription unit, transcript isoforms, functional molecular mechanisms, and subcellular localization. They are selected from:

• lncRNAdb v2.0 compilation as the reference database for functional lncRNAs [1];

• Arraystar's continuous scientific publication review and selection;

• Level 1 GENCODE v21 LncRNAs with experimental support by RT-PCR-Seq and manual curation [2];

• RefSeq full-length, high-confidence LncRNAs under stringent selection;

• Arraystar lncRNA complete transcripts with 5' TSS, 3' ends, and expression data defined by ENCODE CAGE clusters, PolyA-seq, deep RNA-seq, and capture-seq [3, 4].

The Reliable LncRNAs are the remaining lncRNA sequences from databases and publications, merged into transcription units (TUs). One best representative lncRNA is selected from each TU based on transcript source, length, and other available information. The 32,667 Reliable LncRNAs are constructed from 308,525 lncRNA sequences.
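The TU merging and representative selection described above can be illustrated with a minimal Python sketch. It is not Arraystar's actual pipeline: the `Transcript` record, the `SOURCE_RANK` ordering, the overlap rule (same chromosome and strand, overlapping genomic spans), and the use of genomic span as a stand-in for transcript length are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical source preference (lower is better); illustrative only.
SOURCE_RANK = {"RefSeq": 0, "GENCODE": 1, "UCSC": 2, "other": 3}


@dataclass(frozen=True)
class Transcript:
    name: str
    chrom: str
    strand: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    source: str


def group_into_units(transcripts):
    """Group same-strand transcripts with overlapping spans into TUs."""
    ordered = sorted(transcripts,
                     key=lambda t: (t.chrom, t.strand, t.start, t.end))
    units, current, current_key, current_end = [], [], None, -1
    for t in ordered:
        key = (t.chrom, t.strand)
        if current and key == current_key and t.start < current_end:
            # Overlaps the running unit: extend it.
            current.append(t)
            current_end = max(current_end, t.end)
        else:
            # Start a new unit (first transcript, new locus, or no overlap).
            if current:
                units.append(current)
            current, current_key, current_end = [t], key, t.end
    if current:
        units.append(current)
    return units


def pick_representative(unit):
    """Prefer the better-ranked source, then the longer span, then name."""
    return min(
        unit,
        key=lambda t: (SOURCE_RANK.get(t.source, SOURCE_RANK["other"]),
                       -(t.end - t.start),
                       t.name),
    )


transcripts = [
    Transcript("a", "chr1", "+", 100, 500, "GENCODE"),
    Transcript("b", "chr1", "+", 400, 900, "RefSeq"),
    Transcript("c", "chr1", "+", 450, 1200, "RefSeq"),
    Transcript("d", "chr1", "-", 450, 700, "other"),
    Transcript("e", "chr2", "+", 100, 200, "UCSC"),
]
units = group_into_units(transcripts)
representatives = [pick_representative(u).name for u in units]
```

In this toy input, transcripts a, b, and c collapse into one TU, while d (opposite strand) and e (different chromosome) each form their own. A production pipeline would likely compare exon structures and spliced lengths rather than raw genomic spans.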




References
1. Quek X.C. et al. (2015) Nucleic Acids Res. 43(Database issue):D168-73 [PMID: 25332394]
2. Howald C. et al. (2012) Genome Res. 22(9):1698-710 [PMID: 22955982]
3. Clark M.B. et al. (2015) Nat. Methods 12(4):339-42 [PMID: 25751143]
4. Iyer M.K. et al. (2015) Nat. Genet. 47(3):199-208 [PMID: 25599403]

 

 
