Some groups have claimed that the majority of long noncoding RNAs in mammals are likely to be functional, while other groups have claimed the opposite; this remains an active area of research. Some lncRNAs have been functionally annotated in LncRNAdb (a database of lncRNAs described in the literature), with the majority of these being described in humans. Over 2600 human lncRNAs with experimental evidence have been community-curated in LncRNAWiki (a wiki-based, publicly editable and open-content platform for community curation of human lncRNAs). According to literature-based curation of their functional mechanisms, lncRNAs are extensively reported to be involved in ceRNA regulation, transcriptional regulation, and epigenetic regulation.
=== In the regulation of gene transcription ===
==== In gene-specific transcription ====
In eukaryotes,
RNA transcription is a tightly regulated process. Noncoding RNAs act upon different aspects of this process, targeting transcriptional modulators,
RNA polymerase (RNAP) II and even the DNA duplex to regulate gene expression. NcRNAs modulate transcription by several mechanisms, including functioning themselves as co-regulators, modifying
transcription factor activity, or regulating the association and activity of co-regulators. For example, the noncoding RNA
Evf-2 functions as a co-activator for the
homeobox transcription factor
Dlx2, which plays important roles in
forebrain development and
neurogenesis.
Sonic hedgehog induces transcription of Evf-2 from an
ultra-conserved element located between the
Dlx5 and
Dlx6 genes during forebrain development. Indeed, the transcription and expression of similar non-coding ultraconserved elements were shown to be abnormal in human leukaemia and to contribute to apoptosis in colon cancer cells, suggesting their involvement in tumorigenesis in a manner similar to protein-coding RNA. Local ncRNAs can also recruit transcriptional programmes to regulate adjacent protein-coding
gene expression. The
RNA binding protein TLS binds and inhibits the
CREB binding protein and
p300 histone acetyltransferase activities on a repressed gene target,
cyclin D1. The recruitment of TLS to the promoter of cyclin D1 is directed by long ncRNAs expressed at low levels and tethered to 5' regulatory regions in response to DNA damage signals. Moreover, these local ncRNAs act cooperatively as ligands to modulate the activities of TLS. In the broad sense, this mechanism allows the cell to harness
RNA-binding proteins, which make up one of the largest classes within the mammalian
proteome, and integrate their function into transcriptional programs. Nascent long ncRNAs have been shown to increase the activity of CREB binding protein, which in turn increases the transcription of that ncRNA. One study found that a lncRNA transcribed antisense to the apolipoprotein A1 (APOA1) gene regulates the transcription of APOA1 through
epigenetic modifications. Recent evidence has raised the possibility that transcription of genes that escape from
X-inactivation might be mediated by expression of long non-coding RNA within the escaping
chromosomal domains.
==== Regulating basal transcription machinery ====
NcRNAs also target general transcription factors required for the RNAP II transcription of all genes. Regulation through the formation of RNA-DNA triplexes may represent a widespread method of controlling promoter usage, as thousands of RNA-DNA triplexes exist in eukaryotic chromosomes. The
U1 ncRNA can induce transcription by binding to and stimulating
TFIIH to
phosphorylate the C-terminal domain of RNAP II. In contrast, the ncRNA 7SK represses transcription elongation: in combination with HEXIM1/2, it forms an inactive complex that prevents P-TEFb from phosphorylating the C-terminal domain of RNAP II, repressing global elongation under stressful conditions. These examples, which bypass specific modes of regulation at individual promoters, provide a means of quickly effecting global changes in
gene expression. The ability to quickly mediate global changes is also apparent in the rapid expression of non-coding
repetitive sequences. The Alu short interspersed nuclear elements (SINEs) in humans and the analogous B1 and B2 elements in mice have become the most abundant mobile elements within their genomes, comprising ~10% of the human and ~6% of the mouse genome, respectively. RNAs transcribed from these elements, for example in response to heat shock, bind to RNAP II with high affinity and prevent the formation of active pre-initiation complexes. This allows for the broad and rapid repression of gene expression in response to stress. The Alu RNA contains two 'arms', each of which may bind one RNAP II molecule, as well as two regulatory domains that are responsible for RNAP II transcriptional repression in vitro. In addition to
heat shock, the expression of
SINE elements (including Alu, B1, and B2 RNAs) increases during cellular stress such as
viral infection in some
cancer cells where they may similarly regulate global changes to gene expression. The ability of Alu and B2 RNA to bind directly to RNAP II provides a broad mechanism to repress transcription. The activation of heat shock genes, in contrast, involves a conformational alteration of the ncRNA HSR-1 in response to rising temperatures, permitting its interaction with the
transcriptional activator HSF-1, which trimerizes and induces the expression of heat shock genes. Many of the ncRNAs described here are transcribed by RNAP III, uncoupling their expression from that of RNAP II, which they regulate. RNAP III also transcribes other ncRNAs, such as BC2,
BC200 and some microRNAs and snoRNAs, in addition to
housekeeping ncRNA genes such as
tRNAs,
5S rRNAs and
snRNAs. One of these ncRNAs, 21A, has been shown to regulate the expression of its antisense partner gene, CENP-F, in trans.
=== In post-transcriptional regulation ===
In addition to regulating transcription, ncRNAs also control various aspects of post-transcriptional
mRNA processing. Similar to small regulatory RNAs such as
microRNAs and
snoRNAs, these functions often involve
complementary base pairing with the target mRNA. The formation of RNA duplexes between complementary ncRNA and mRNA may mask key elements within the mRNA required to bind trans-acting factors, potentially affecting any step in post-transcriptional
gene expression including pre-mRNA processing and
splicing, transport, translation, and degradation.
==== In splicing ====
The
splicing of mRNA can induce its translation and functionally diversify the repertoire of
proteins it encodes. The
Zeb2 mRNA requires the retention of a 5'UTR
intron that contains an
internal ribosome entry site for efficient translation. The retention of the intron depends on the expression of an
antisense transcript that complements the intronic 5'
splice site. Another well-characterized lncRNA involved in splicing is
MALAT1 (metastasis associated lung adenocarcinoma transcript 1). MALAT1 localizes predominantly to
nuclear speckles and has been reported to regulate the distribution and activity of splicing factors, thereby influencing alternative splicing of a subset of pre-mRNAs.
==== In translation ====
NcRNAs may also apply additional regulatory pressures during
translation, a property particularly exploited in
neurons where the
dendritic or
axonal translation of mRNA in response to
synaptic activity contributes to changes in
synaptic plasticity and the remodelling of neuronal networks. The RNAP III-transcribed BC1 and BC200 ncRNAs, which were originally derived from tRNAs, are expressed in the mouse and human central nervous system, respectively. BC1 expression is induced in response to synaptic activity and synaptogenesis and is specifically targeted to dendrites in neurons. Sequence complementarity between BC1 and regions of various neuron-specific mRNAs also suggests a role for BC1 in targeted translational repression. Indeed, BC1 has been shown to be associated with translational repression in dendrites to control the efficiency of dopamine D2 receptor-mediated transmission in the striatum, and BC1 RNA-deleted mice exhibit behavioural changes with reduced exploration and increased
anxiety.
=== In siRNA-directed gene regulation ===
In addition to masking key elements within single-stranded
RNA, the formation of
double-stranded RNA duplexes can also provide a substrate for the generation of endogenous
siRNAs (endo-siRNAs) in
Drosophila and mouse
oocytes. The
annealing of complementary sequences, such as antisense or repetitive regions between
transcripts, forms an RNA duplex that may be processed by
Dicer-2 into endo-siRNAs. Also, long ncRNAs that form extended intramolecular hairpins may be processed into siRNAs, compellingly illustrated by the esi-1 and esi-2 transcripts. Endo-siRNAs generated from these transcripts seem particularly useful in suppressing the spread of
mobile transposon elements within the genome in the germline. However, the generation of endo-siRNAs from antisense transcripts or
pseudogenes may also silence the expression of their functional counterparts via
RISC effector complexes, acting as an important node that integrates various modes of long and short RNA regulation, as exemplified by Xist and Tsix (see below).
=== In epigenetic regulation ===
Epigenetic modifications, including
histone and
DNA methylation,
histone acetylation and
sumoylation, affect many aspects of chromosomal biology, primarily the regulation of large numbers of genes by remodeling broad chromatin domains. While it has been known for some time that RNA is an integral component of chromatin, the means by which RNA is involved in pathways of chromatin modification have only recently begun to be appreciated. For example,
Oplr16 epigenetically induces the activation of
stem cell core factors by coordinating intrachromosomal
looping and recruitment of
DNA demethylase TET2. In
Drosophila, long ncRNAs induce the expression of the homeotic gene,
Ubx, by recruiting and directing the chromatin modifying functions of the trithorax protein Ash1 to
Hox regulatory elements. This example nicely illustrates a broader theme whereby ncRNAs recruit the function of a generic suite of chromatin modifying proteins to specific
genomic loci, underscoring the complexity of recently published genomic maps. Indeed, the prevalence of long ncRNAs associated with protein coding genes may contribute to localised patterns of chromatin modifications that regulate gene expression during development. For example, the majority of
protein-coding genes have antisense partners, including many tumour suppressor genes that are frequently silenced by epigenetic mechanisms in cancer. One study observed an inverse expression profile of the p15 gene and an antisense ncRNA in leukaemia.
Detailed analysis has also revealed a crucial role for the ncRNAs Kcnq1ot1 and Igf2r/Air in directing imprinting. Almost all the genes at the Kcnq1 locus are maternally expressed, except the paternally expressed antisense ncRNA Kcnq1ot1. Transgenic mice with truncated Kcnq1ot1 fail to silence the adjacent genes, suggesting that Kcnq1ot1 is crucial to the imprinting of genes on the paternal chromosome. It appears that Kcnq1ot1 is able to direct the trimethylation of lysine 9 (H3K9me3) and lysine 27 (H3K27me3) of histone 3 to an imprinting centre that overlaps the Kcnq1ot1 promoter and resides within a Kcnq1 sense exon. Similar to HOTAIR (see above), Eed-Ezh2 Polycomb complexes are recruited to the Kcnq1 locus on the paternal chromosome, possibly by Kcnq1ot1, where they may mediate gene silencing through repressive histone methylation. The presence of allele-specific histone methylation at the Igf2r locus suggests Air also mediates silencing via chromatin modification.
==== Xist and X-chromosome inactivation ====
The inactivation of an X-chromosome in female
placental mammals is directed by one of the earliest and best characterized long ncRNAs,
Xist. The expression of Xist from the future inactive X-chromosome, and its subsequent coating of the inactive X-chromosome, occurs during early
embryonic stem cell differentiation. Xist expression is followed by irreversible layers of chromatin modifications that include the loss of the histone H3K9 acetylation and H3K4 methylation associated with active chromatin, and the induction of repressive chromatin modifications including H4 hypoacetylation and H3K27 trimethylation. Xist RNA also localises the histone variant macroH2A to the inactive X-chromosome. There are additional ncRNAs present at the Xist locus, including an antisense transcript, Tsix, which is expressed from the future active chromosome and is able to repress Xist expression by the generation of endogenous siRNA.
Telomeres were long considered transcriptionally inert DNA-protein complexes until it was shown in the late 2000s that telomeric repeats may be transcribed as telomeric RNAs (TelRNAs) or
telomeric repeat-containing RNAs. These ncRNAs are heterogeneous in length, transcribed from several sub-telomeric loci and physically localise to telomeres. Their association with chromatin, which suggests an involvement in regulating telomere-specific heterochromatin modifications, is repressed by SMG proteins that protect chromosome ends from telomere loss.
Deletion of any one of the
genetic loci containing ASAR6, ASAR15, or ASAR6-141 results in the same phenotype of delayed replication timing and delayed
mitotic condensation (DRT/DMC) of the entire chromosome. DRT/DMC results in chromosomal segregation errors that lead to increased frequency of secondary rearrangements and an unstable chromosome. Similar to
Xist, ASARs show random
monoallelic expression and exist in asynchronous DNA replication domains. Although the mechanism of ASAR function is still under investigation, it is hypothesized that they work via mechanisms similar to those of the Xist lncRNA, but on smaller autosomal domains, resulting in allele-specific changes in gene expression.
Incorrect repair of DNA double-strand breaks (DSBs), leading to chromosomal rearrangements, is one of the primary causes of oncogenesis. A number of lncRNAs are crucial at different stages of the main pathways of DSB repair in
eukaryotic cells: nonhomologous end joining (
NHEJ) and homology-directed repair (
HDR). Gene mutations or variation in expression levels of such RNAs can lead to local DNA repair defects, increasing the chromosome aberration frequency. Moreover, it has been demonstrated that some RNAs can stimulate long-range chromosomal rearrangements.
== Structure ==