Description
Many thousands of long non-coding (lnc) RNAs have been mapped in the human genome. Time-consuming studies using reverse genetic approaches, either post-transcriptional knock-down or genetic modification of the locus, have demonstrated diverse biological functions for a few of these transcripts. The Human Gene Trap Mutant Collection in haploid KBM7 cells is a ready-to-use tool for studying protein-coding gene function. As lncRNAs show remarkable differences in RNA biology compared to protein-coding genes, it is unclear whether this gene trap collection is useful for functional analysis of lncRNAs. Here we use the uncharacterized LOC100288798 lncRNA as a model to answer this question. Using public RNA-seq data we show that LOC100288798 is ubiquitously expressed, but inefficiently spliced. The minor spliced LOC100288798 isoforms are exported to the cytoplasm, whereas the major unspliced isoform is localized to the nucleus. This shows that LOC100288798 RNA biology differs markedly from that of typical mRNAs. De novo assembly from RNA-seq data suggests that LOC100288798 extends 289kb beyond its annotated 3' end and overlaps the downstream SLC38A4 gene. Three cell lines with independent gene trap insertions in LOC100288798 were available from the KBM7 gene trap collection. RT-qPCR and RNA-seq confirmed successful lncRNA truncation and its extended length. Expression analysis from RNA-seq data shows significant deregulation of 41 protein-coding genes upon LOC100288798 truncation. Our data show that gene trap collections in human haploid cell lines are useful tools to study lncRNAs, and identify the previously uncharacterized LOC100288798 as a potential gene regulator. Overall design: We cultured and processed 8 KBM7 cell lines in one batch.
These cell lines were: two wild type KBM7 cell lines (WT2 and WT3), two monoclonal KBM7 cell lines with gene trap cassette insertions outside of the body of LOC100288798 (C1 and C2), two independently obtained KBM7 clones with a gene trap cassette insertion 3kb downstream of the LOC100288798 transcriptional start site (TSS) (3kb1 and 3kb2), and one independently obtained KBM7 clone with a gene trap cassette insertion 100kb downstream of the LOC100288798 TSS, replicated twice at the thawing step (100kb1 and 100kb2). We isolated total RNA from all 8 cell lines, applied DNaseI treatment and ribosomal RNA depletion, and then prepared strand-specific RNA-seq libraries, which were pooled in equal molarities and sequenced on an Illumina HiSeq 2000 (the 8 pooled samples were sequenced on 2 lanes). We performed 50bp single-end RNA-seq. We used these 8 samples (4 untreated: WT2, WT3, C1, C2 and 4 treated: 3kb1, 3kb2, 100kb1, 100kb2) to analyze genome-wide gene deregulation associated with LOC100288798 lncRNA truncation.
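
The grouping of the 8 samples described above can be written down as a small sample sheet. The following is only an illustrative Python sketch: the sample names and group labels come from the design text, while the field names (`sample`, `insertion`, `condition`) and the helper `samples_by_condition` are assumptions introduced here and are not part of the submitted data.

```python
# Sample sheet for the 8 KBM7 RNA-seq libraries described in the design text.
# Field names are hypothetical; sample names and groups follow the description.
SAMPLES = [
    {"sample": "WT2", "insertion": "none (wild type)", "condition": "untreated"},
    {"sample": "WT3", "insertion": "none (wild type)", "condition": "untreated"},
    {"sample": "C1", "insertion": "outside LOC100288798", "condition": "untreated"},
    {"sample": "C2", "insertion": "outside LOC100288798", "condition": "untreated"},
    {"sample": "3kb1", "insertion": "3kb downstream of TSS", "condition": "treated"},
    {"sample": "3kb2", "insertion": "3kb downstream of TSS", "condition": "treated"},
    {"sample": "100kb1", "insertion": "100kb downstream of TSS", "condition": "treated"},
    {"sample": "100kb2", "insertion": "100kb downstream of TSS", "condition": "treated"},
]


def samples_by_condition(samples):
    """Group sample names by condition, preserving the listed order."""
    groups = {}
    for row in samples:
        groups.setdefault(row["condition"], []).append(row["sample"])
    return groups


if __name__ == "__main__":
    for condition, names in samples_by_condition(SAMPLES).items():
        print(f"{condition}: {', '.join(names)}")
```

Note that 100kb1 and 100kb2 derive from a single clone split at the thawing step, whereas 3kb1 and 3kb2 are independently obtained clones, so the two treated pairs are not equivalent kinds of replicate.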