Unique genomic features and deeply-conserved functions of long non-coding RNAs in the Cancer LncRNA Census (CLC)

Carlevaro-Fita, Joana; Lanzos, Andrés; Feuerbach, Lars; Hong, Chen; Mas-Ponte, David; Skou Pedersen, Jakob; Johnson, Rory (2019). Unique genomic features and deeply-conserved functions of long non-coding RNAs in the Cancer LncRNA Census (CLC) (bioRxiv). Cold Spring Harbor Laboratory 10.1101/152769

152769.full.pdf - Published Version
Available under License Creative Commons: Attribution-Noncommercial (CC-BY-NC).


Long non-coding RNAs (lncRNAs) that drive tumorigenesis are a growing focus of cancer genomics studies. To facilitate further discovery, we have created the “Cancer LncRNA Census” (CLC), a manually-curated and strictly-defined compilation of lncRNAs with causative roles in cancer. CLC has two principal applications: first, as a resource for training and benchmarking de novo identification methods; and second, as a dataset for studying the fundamental properties of these genes. CLC Version 1 comprises 122 lncRNAs implicated in 29 distinct cancers. LncRNAs are included based on functional or genetic evidence for causative roles in cancer progression. All belong to the GENCODE reference annotation, to enable integration across projects and datasets. For each entry, the evidence type, biological activity (oncogene or tumour suppressor), source reference and cancer type are recorded. Supporting its usefulness, CLC genes are significantly enriched amongst de novo predicted driver genes from PCAWG. CLC genes are distinguished from other lncRNAs by a series of features consistent with biological function, including gene length, high expression and sequence conservation of both exons and promoters. We identify a trend for CLC genes to be co-localised with known protein-coding cancer genes along the human genome. Finally, by integrating data from transposon-mutagenesis functional screens, we show that mouse orthologues of CLC genes tend also to be cancer genes. Thus, CLC represents a valuable resource for research into long non-coding RNAs in cancer. The evolutionary and genomic properties of CLC genes have implications for understanding disease mechanisms and point to conserved functions across ~80 million years of evolution.

Item Type:

Working Paper


04 Faculty of Medicine > Department of Haematology, Oncology, Infectious Diseases, Laboratory Medicine and Hospital Pharmacy (DOLS) > Clinic of Medical Oncology

Graduate School:

Graduate School for Cellular and Biomedical Sciences (GCB)

UniBE Contributor:

Lanzos Camaioni, Andrés Arturo


600 Technology > 610 Medicine & health




Cold Spring Harbor Laboratory




Andrés Arturo Lanzos Camaioni

Date Deposited:

03 Oct 2019 14:25

Last Modified:

23 Oct 2019 05:39

Publisher DOI:

10.1101/152769

Additional Information:

bioRxiv (pronounced "bio-archive") is a free online archive and distribution service for unpublished preprints in the life sciences.




