Non-coding RNA—an added layer of gene regulation
The emerging significance of non-coding RNA species is revolutionizing how we view eukaryotic genome complexity. The direct synthesis of proteins from the DNA blueprint via mRNA, described by the well-known “central dogma” model, is only part of the story, as it becomes increasingly evident that RNA plays a much bigger role in the life of the cell than simply acting as a template for protein translation. Large numbers of newly discovered RNAs are now recognized as having important functions in gene regulation. While some of these RNAs have similar lengths and splicing structures to mRNA, they do not encode proteins. Already, many thousands of these non-coding RNA species have been described, which include long intergenic non-coding RNAs (lincRNAs), long non-coding RNAs (lncRNAs), and small nucleolar RNAs (snoRNAs).
A meta-analysis of the amount of protein-coding RNA versus non-coding RNA relative to the size and complexity of an organism reveals an intriguing trend. Very simple organisms have a high proportion of protein-coding RNAs, but as organisms increase in size and complexity, the number of protein-coding genes plateaus at about 20,000–25,000. The number of non-coding RNA species, however, continues to increase and, in humans, there may be over 60,000 non-coding RNAs compared to only 20,000 protein-coding genes. The majority of these are intronic or intergenic sequences that are differentially transcribed, thereby expanding transcriptome diversity alongside physiological complexity [1,2]. One way of interpreting these data is that beyond a certain point it may not be necessary to increase the number of proteins available to an organism to increase complexity; rather, increased complexity in large multicellular organisms may arise from additional layers of temporal-spatial regulation of existing genes during development and across tissues, which may be partly directed by non-coding RNAs. For example, a number of lncRNAs are human-specific and may contribute to the degree of phenotypic diversity between humans and non-human primates despite these organisms’ relatively similar genomes.
Recent research is clarifying some of the mechanisms behind this relationship, and has demonstrated, for some lncRNAs, a tight spatial and temporal regulation of protein-coding genes. Dynamic expression of lncRNAs has been found in numerous systems during differentiation, from embryonic stem cells to neurons, and has even been implicated in cancer. Evidence is growing to support the theory that the extra layer of regulation in which lncRNAs are involved enables the complexity and sophistication of an organism to increase.
The lncRNA field is relatively young and only a small fraction of the thousands of known lncRNA species have been functionally characterized. The lack of protein counterparts for antibody-based methods makes functional studies challenging. The primary method to study lncRNA function is to employ tools that suppress expression of the lncRNA and examine the resulting phenotypic changes. We discuss here the relative merits of the 2 main approaches to experimentally reduce lncRNA levels: antisense and RNA interference (RNAi).
Dissecting function through gene knockdown
Antisense technology was the first oligonucleotide-based approach to gene knockdown. The finding that researchers could use a synthetic oligonucleotide to quell gene expression was an exciting revelation. However, antisense methods have been used less than was originally anticipated, possibly due to the absence of good computer-assisted design tools to identify active sites in the target RNA and, more recently, the introduction of easier-to-use RNAi methods.
The antisense reagent is a short (18–25 base), synthetic oligonucleotide (antisense oligonucleotide, or ASO), usually a DNA chimera in which a central DNA region is flanked by modified residues, such as 2′-O-methyl (2′OMe) or locked nucleic acid residues. Such chemical modifications enhance function by increasing nuclease resistance and improving binding affinity.
Once delivered into cells, ASOs enter the nucleus and bind to their complementary, endogenous RNA target. Hybridization of the ASO to the target RNA forms a DNA:RNA heteroduplex, which becomes a substrate for cleavage by the enzyme RNase H1 [8–10]. For highest potency, it is vital to have strong binding between the antisense oligonucleotide and RNA target.
However, destruction of mRNA or lncRNA targets is not the natural role of RNase H1. The enzyme exists in the nucleus to maintain integrity of the genome during DNA replication, helping to remove unwanted RNA residues from the nascent DNA strand and initiate repair processes. Thus, there are no pre-existing cellular proteins or co-factors to facilitate ASO binding and subsequent destruction of lncRNA targets. lncRNAs can range from 200 to 118,000 bases long and are often difficult to access due to protein binding or inherent RNA secondary structure. Identifying a good antisense target site can be challenging and may require the empirical testing of 10 or more candidate ASOs to find an effective reagent. Computational methods to predict tertiary structure and protein binding sites within long RNAs are often inaccurate, so computer-assisted ASO design algorithms have had limited success in helping researchers with site selection.
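To make the idea of empirically screening candidate ASOs concrete, the following minimal Python sketch tiles an RNA target and writes out the complementary DNA sequence for each window. It is an illustration only, not a validated design method: the function names, the 20-base length, the 5-base step, and the GC-content filter are all assumptions chosen for the example, and the sketch ignores chemical modifications, RNA structure, and protein binding, which the text identifies as the hard part of site selection.

```python
# Illustrative sketch only: names, window size, step, and GC limits are
# assumptions for this example, not a published ASO design algorithm.

# Map each RNA base in the target to the DNA base that pairs with it.
COMPLEMENT_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(rna_site):
    """Return the DNA antisense sequence (5'->3') for an RNA site (5'->3')."""
    return "".join(COMPLEMENT_DNA[base] for base in reversed(rna_site.upper()))


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def candidate_asos(target_rna, length=20, step=5, gc_min=0.4, gc_max=0.6):
    """Tile the target and return (start, ASO) pairs passing a crude GC filter."""
    candidates = []
    for start in range(0, len(target_rna) - length + 1, step):
        aso = antisense_dna(target_rna[start:start + length])
        if gc_min <= gc_fraction(aso) <= gc_max:
            candidates.append((start, aso))
    return candidates


if __name__ == "__main__":
    print(antisense_dna("AUGC"))  # GCAT
    # Hypothetical target sequence, made up for demonstration.
    demo_target = "AUGGCUACGUAGCUAGCUAGGCUAACGUAGCUAGGCUAGCAUCGAUCGGAUCGAUGCUAGC"
    for start, aso in candidate_asos(demo_target):
        print(start, aso)
```

A list produced this way is only a starting set for bench testing; nothing in the sketch predicts which sites are actually accessible to an ASO in the cell.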
Harnessing natural gene-regulatory pathways with RNAi
RNAi was discovered in 1998 and offered an entirely different way to use synthetic oligonucleotides to suppress expression of a specific gene [12,13]. The RNAi pathway is a highly conserved, naturally occurring process used by most eukaryotes as a global method to regulate gene expression and, at the same time, protect cells against potentially dangerous sequences. Scientists quickly realized they could co-opt this process for gene knockdown.
Double-stranded RNA (dsRNA) is cleaved by the cellular endoribonuclease Dicer into small interfering RNAs (siRNAs), which usually have 21-base strands that form a central 19 bp duplex domain with 2-base 3′ overhangs. In mammals, the siRNA associates with the proteins Dicer, the trans-activation response RNA-binding protein (TRBP), and another ribonuclease, Argonaute 2 (Ago2), to form the RNA-Induced Silencing Complex, or RISC. Once in RISC, one strand of the siRNA (the passenger strand) is degraded or discarded while the other strand (the guide strand) remains to direct sequence specificity of the silencing complex. Hybridization of the siRNA guide strand to a complementary RNA target leads to cleavage of that RNA by Ago2.
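The siRNA geometry described above (two 21-base strands, a 19 bp duplex, and 2-base 3′ overhangs) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, and both overhangs are taken directly from the target sequence, whereas real reagents often use other overhang bases and design rules that are not modeled here.

```python
# Illustrative sketch only: function names are hypothetical, and overhangs
# are assumed to follow the target sequence.

COMPLEMENT_RNA = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the RNA reverse complement (5'->3') of an RNA sequence."""
    return "".join(COMPLEMENT_RNA[base] for base in reversed(seq.upper()))


def sirna_from_site(target_rna, core_start):
    """Build (guide, passenger) 21-mers around a 19-base core.

    core_start is the 0-based position of the core in the target. Two bases
    are needed upstream of the core and two downstream for the overhangs.
    """
    target = target_rna.upper()
    if core_start < 2 or core_start + 21 > len(target):
        raise ValueError("need 2 bases on each side of the 19-base core")
    # Passenger (sense): the 19-base core plus 2 downstream bases as 3' overhang.
    passenger = target[core_start:core_start + 21]
    # Guide (antisense): pairs with the core, then 2 more bases as 3' overhang.
    guide = reverse_complement_rna(target[core_start - 2:core_start + 19])
    return guide, passenger


if __name__ == "__main__":
    # Hypothetical target sequence, made up for demonstration.
    demo_target = "GGAUCCGAUUACGCUAGGCUUAACGGAUCGAUUGCAAGCU"
    guide, passenger = sirna_from_site(demo_target, core_start=5)
    print("guide    ", guide)
    print("passenger", passenger)
```

The first 19 bases of each strand pair with one another, and the last 2 bases of each strand are left unpaired at its 3′ end, which matches the structure in the text.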
The RNAi machinery has been continuously evolving over the last 500 million years and can regulate genes in eukaryotic cells through a variety of mechanisms. As a result, this complex protein apparatus is highly efficient in targeting cellular RNAs for degradation (or regulation by other means). Researchers have been able to determine key design features that enable dsRNAs to function efficiently as siRNA-mimics, enter RISC, and direct sequence-specific gene suppression. siRNAs can have amazing potency, with some duplexes having an EC50 in the sub-picomolar range. Working together with Prof John Rossi from the Beckman Research Institute at the City of Hope, IDT has developed an improved siRNA variant called Dicer-substrate siRNA (DsiRNA), which engages Dicer prior to RISC formation and shows enhanced potency (Figure 1) [15–17].
Computer design tools, such as IDT’s Custom RNAi Design Tool, enable researchers to screen only 2 or 3 sequences to find an effective gene knockdown reagent, making RNAi more accessible to researchers than antisense methods. Further increasing the accessibility of this method, complete kits are available containing predesigned DsiRNAs for any desired gene in the human, mouse, or rat transcriptomes. The RNAi method nearly always works to provide some level of gene expression knockdown, and its ease and accessibility have made this approach extremely popular.
The challenge of lncRNA
One key difference between antisense and RNAi approaches is the location in the cell where these 2 mechanisms are most effective. The Ago2-based degradative RNAi machinery is localized primarily in the cytoplasm, which is fine for targeting mRNAs in the cytoplasmic phase of their life cycle. However, spatial localization of lncRNA is much more complex than that of mRNA and, importantly, lncRNAs can also localize to the nucleus, limiting the effectiveness of RNAi against some lncRNA targets. RNase H1 is mostly present in the nucleus and can therefore be engaged via antisense mechanisms to degrade nuclear lncRNAs. Further, if the localization of a lncRNA of interest is unknown, ASOs can target these RNAs in the nucleus at the site of transcription and should be effective regardless of where steady-state lncRNA pools accumulate.
Ongoing studies are examining the efficacy of ASO methods versus RNAi to knock down lncRNAs. It may well be that, considering the variable localization of lncRNAs, there are added benefits to combining both technologies to rapidly deplete both nuclear and cytoplasmic pools of the lncRNA under study. Driven by the need to understand lncRNA function, it appears we have now come full circle in the story of antisense, to a place where the technology once again may be the knockdown agent of choice.
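The reasoning in this section can be summarized as a small lookup, sketched below in Python. It simply restates the rule of thumb from the text (RNAi for cytoplasmic targets, ASOs for nuclear or unknown localization, both for mixed pools); the function name and category labels are hypothetical, and the output is a starting suggestion rather than experimental guidance.

```python
# Illustrative sketch only: the function name and category labels are
# hypothetical, and the mapping restates the rule of thumb from the text.

SUGGESTED_METHODS = {
    "cytoplasmic": ["RNAi"],      # Ago2-based machinery is mainly cytoplasmic
    "nuclear": ["ASO"],           # RNase H1 is mostly present in the nucleus
    "both": ["ASO", "RNAi"],      # combine to deplete both pools
    "unknown": ["ASO"],           # ASOs can act at the site of transcription
}


def suggest_knockdown(localization):
    """Return the suggested knockdown method(s) for a lncRNA localization."""
    key = localization.strip().lower()
    if key not in SUGGESTED_METHODS:
        raise ValueError("unrecognized localization: " + localization)
    return list(SUGGESTED_METHODS[key])


if __name__ == "__main__":
    for where in ("cytoplasmic", "nuclear", "both", "unknown"):
        print(where, "->", " + ".join(suggest_knockdown(where)))
```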
The future of antisense
Although good predictive site selection algorithms for ASOs are currently lacking, this is not where the story ends. With active collaborations, work is well underway at IDT to develop machine-learning algorithms that can predict active ASO sites, using the same approaches that were successful in developing effective DsiRNA design tools. We may yet see antisense technologies come to fruition as an accessible and insightful tool. Our advice for researchers involved in the world of long non-coding RNA … watch this space.