This figure was used as an advertisement for the seminar club event. The portrait photograph was kindly provided by Dr. Kataoka. The figures on the left and right are by LadyofHats on Wikimedia Commons and from Kataoka et al. 2021, Int J Mol Sci 22:7789, respectively.

Summary of CMS Seminar Club presentation on Friday, May 20, 2022

Title: Misregulation of RNA splicing and possible therapies for RNA diseases

Speaker: Dr. Naoyuki Kataoka, Associate Professor, Graduate School of Agriculture and Life Sciences, University of Tokyo

On Friday, May 20, Dr. Kataoka gave a presentation at Fujita Health University. He showed us the principles of RNA splicing and how errors therein can lead to disease.

Recording: For members of Fujita Health University, a recording of the meeting (without the discussion part) will be available on our Manabi system. Unfortunately, we cannot open the recording to a wider audience.

There were 25 participants who enjoyed the meeting. Several people expressed afterward that they especially liked that Dr. Kataoka kept speaking in a calm, educating voice. That is certainly true, as sometimes in the past we also had speakers who in the middle of their talk forgot that they were addressing a non-specialist audience and then accelerated.

What I particularly liked about Dr. Kataoka’s presentation were his kind character and passion for research, and the way he made RNA splicing easy to understand. I found it intriguing that the splicing of specific RNAs can be manipulated by chemical compounds.

Unlike most protein functions, RNA splicing participates in most biological functions and therefore has an exciting “few methods serve all” therapeutic potential. Dr. Kataoka did not discuss the antisense strand route for manipulating splicing (e.g., Pitout et al. 2019), but focused on chemical approaches, which seemingly work by targeting spliceosome protein functions. I found it fascinating that this is possible, although I would argue that there is a lot of trial and error involved and an enormous potential to also affect the splicing of non-target mRNAs (but see below). Dr. Kataoka explained that the advantage of the chemical approach is the easier in vivo administration as compared to oligonucleotide treatment (as also discussed in Nishida, Kataoka et al. 2011).

Prof. Akila Mayeda, who suggested Dr. Kataoka as a speaker, also enjoyed the event and is further helping us to assemble an impressive set of seminars on RNA topics. In February 2023, we will even have his friend, the 1993 Nobel Prize laureate Sir Richard J. Roberts, one of the discoverers of RNA splicing, as a speaker!

 

The contents of the presentation

Here, I summarize large parts of Dr. Kataoka’s presentation. It is divided into the paragraphs “RNA splicing general,” “RNA diseases general,” “RNA disease example: Progressive Muscular Dystrophy,” and “RNA disease example: Familial Dysautonomia.”

Dr. Kataoka also talked about the Exon Junction Complex (EJC), which is a complex of proteins that binds to the exon-exon junctions during splicing and remains bound there in the cytoplasm. These complexes differ per RNA and can participate in their differential regulation. Dr. Kataoka discovered several EJC proteins. However, EJC description would make the below story too complicated, and therefore, for EJC information, I just refer to the reviews by Dreyfuss, Kim, and Kataoka, 2002, and Asthana et al. 2022.

We also had a short discussion of why introns exist, and why they can be so long. Possible arguments are:

● Evolutionary diversity can be created by using exons as building blocks (encoding protein modules) that can be newly arranged in new alleles or new genes by recombination in intron regions

● Longer introns promote recombination (see the above argument)

● Intron processing provides an extra level for controlling gene expression

● Alternative splicing allows the production of multiple different proteins from a single gene

● The length of introns may partly be driven by selfish DNA (such as transposons) and thus not always have a function

● Some introns are a source of microRNAs or long non-coding RNAs

 

RNA splicing general

In Eukaryotes, mRNA is transcribed from DNA in the nucleus and undergoes several modifications before it can be transported to the cytoplasm where it can be translated into protein (Fig. 1). One of those modifications is the splicing of “introns,” which are located between the protein-coding “exons” (Fig. 1B).

Except for a GU dinucleotide sequence at the start of an intron and an AG dinucleotide sequence at its end, the sequence requirements for intron splicing in vertebrate species do not show a highly conserved sequence motif, although there are some additional preferences for nearby nucleotides (Fig. 2). An adenine (A), usually located 18-40 nt upstream from the AG-end, almost invariably serves as the “branchpoint” nucleophile by forming a 2′-5′ phosphodiester bond with the guanosine (G) of the GU-end of the intron after its cleavage from the upstream exon, shaping the intron into a “lariat” form (another name for a lariat, as used by cowboys, is “lasso”) (step 1 in Fig. 3). This lariat is cleaved from the transcript at the intron AG-end in a second step in which the 3’ end of the upstream exon is connected to the 5’-end of the downstream exon (Fig. 3).
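To make the sequence rules above a bit more tangible, here is a minimal Python sketch that scans a toy RNA string for stretches that start with GU, end with AG, and have an A located 18-40 nt upstream of the AG. This is only an illustration and not a splice-site predictor: real splice-site recognition depends on the additional nucleotide preferences and protein factors described in this post. The function names, the toy sequence, and the choice to measure the branchpoint distance from the A of the terminal AG are all my own assumptions.

```python
def find_candidate_introns(rna, min_dist=18, max_dist=40):
    """Return (start, branch, end) index triples for toy intron candidates.

    start  : index of the G of the GU at the intron start
    branch : index of the nearest A lying min_dist..max_dist nt upstream
             of the terminal AG (distance convention is an assumption)
    end    : index just past the terminal AG
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for start in range(len(rna) - 1):
        if rna[start:start + 2] != "GU":
            continue
        for ag in range(start + 2, len(rna) - 1):
            if rna[ag:ag + 2] != "AG":
                continue
            for dist in range(min_dist, max_dist + 1):
                branch = ag - dist
                if branch < start + 2:
                    break  # the branchpoint must lie inside the intron
                if rna[branch] == "A":
                    hits.append((start, branch, ag + 2))
                    break  # keep only the nearest qualifying A
    return hits


def splice_out(rna, start, end):
    """Join the upstream and downstream exons by removing rna[start:end]."""
    return rna[:start] + rna[end:]


# Hypothetical toy transcript: exon "CCC", a 28-nt intron, exon "CCC".
toy = "CCC" + "GU" + "CCCC" + "A" + "C" * 19 + "AG" + "CCC"

for start, branch, end in find_candidate_introns(toy):
    print(start, branch, end, splice_out(toy, start, end))
```

In this toy transcript, the only candidate has its branchpoint A exactly 20 nt upstream of the AG, and removing it joins the two "CCC" exons. On real sequences, such a naive scan would report a very large number of false candidates, which is one way to appreciate why the cell needs the elaborate recognition machinery described below.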

The splicing process is performed by a huge RNA-protein complex called the “spliceosome,” which consists of the five uridine-rich small nuclear ribonucleoproteins (snRNPs) U1, U2, U4, U5, and U6 that engage in the process at various stages (Fig. 4), plus numerous non-snRNP proteins (Will and Lührmann 2011). Fig. 5 shows, as an example, the U1 snRNP, which has a 164 nt snRNA scaffold including a nucleotide fragment complementary to the GU-end of the intron, the seven different Sm proteins also found in other snRNPs, and the U1-specific proteins U1A, U1C, and U1-70K.

In vertebrates, introns are much longer than exons, and their average lengths in human are 5849 nt and 163 nt, respectively (Fig. 6) (Zhu et al. 2009). For this reason, unlike in lower eukaryotes which have much shorter introns that can be directly used for “intron recognition,” in vertebrates the splicing process is initiated by “exon recognition” which involves crosstalk between U1 and U2 snRNPs at either side of an exon (Fig. 7) (Berget 1995).

Within exons, there are also different small fragments that serve as Exonic Splicing Enhancers (ESEs) by binding proteins of the SR (serine- and arginine-rich) protein family or as Exonic Splicing Silencers (ESSs) by binding proteins of the hnRNP (heterogeneous nuclear ribonucleoprotein) A/B type protein family (Fig. 8 and Fig. 9). When bound to ESE sites, SR proteins stabilize complexes including the U1 and U2 snRNPs, and thereby promote splicing; in contrast, when bound to the ESS sites, the hnRNP A/B proteins recruit other hnRNP A/B proteins to cover the exon and inhibit spliceosome formation (Fig. 9).

Figure 1. The basic steps of gene expression in higher eukaryotes, shown in two different simple figures (A and B). The mRNA strand is transcribed from DNA in the nucleus and bound by various proteins, and undergoes several forms of processing, among which the splicing of introns, before it is transported to the cytoplasm where it can be translated into protein. These figures were used by Dr. Kataoka in his presentation.
Figure 2. The intron sequence requirements for splicing are more relaxed in higher eukaryotes than in lower eukaryotes. Almost invariant in higher eukaryotes are a GU at the start and an AG at the end, and an A upstream of the AG end that serves as branching point, but there are additional requirements in the neighboring nucleotides for optimizing splicing efficiency. For more information see Moore et al. 2000 and Rogozin et al. 2012. This figure is a slight modification of a slide in Dr. Kataoka’s presentation.
Figure 3. A scheme for the splicing reaction with two steps. An adenine (A), usually located 18-40 nt upstream from the AG-end, almost invariably serves as the “branchpoint” nucleophile by forming a 2′-5′ phosphodiester bond with the guanosine (G) of the GU-end of the intron after its cleavage from the upstream exon, shaping the intron into a “lariat” form (step 1). This lariat is cleaved from the transcript at the intron AG-end in a second step in which the 3’ end of the upstream exon is connected to the 5’-end of the downstream exon. This figure is a slight modification of a slide in Dr. Kataoka’s presentation. See also Kataoka et al. 2021.
Figure 4. Stepwise formation of spliceosome and splicing reaction. Spliceosomal Uridine-rich small nuclear ribonucleoproteins (U snRNPs) are indicated with their names. This figure was used as a slide in Dr. Kataoka’s presentation. See also Kataoka 2017.
Figure 5. Schematic representation of human U1 snRNP. U1 snRNP is composed of one U1 snRNA, seven common Sm proteins and three U1 snRNP-specific proteins (U1-70K, U1A and U1C). The 5’-end of U1 snRNA is tri-methyl-guanosine-capped. ψ indicates a post-transcriptional modification of uridine, and is located in a stretch complementary to the start of an intron. The secondary structure of U1 snRNA includes four stem-loops (I-to-IV). The sequence in red is used for Sm protein binding. This figure is a slight modification of a slide in Dr. Kataoka’s presentation and was made by Dr. Kataoka himself.
Figure 6. The average lengths in human of introns and exons are 5849 nt and 163 nt, respectively (Zhu et al. 2009), which is shown in scale in the drawing above that text. This means that situations can occur as exemplified in the drawing at the bottom, so that in human it would be very hard for spliceosomes to be initiated by intron recognition (instead of by exon recognition). This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 7. Unlike in lower eukaryotes, which have much shorter introns that can be directly used for “intron recognition,” in vertebrates the splicing process is initiated by “exon recognition” which involves crosstalk between U1 and U2 snRNPs at either side of an exon (Berget 1995). This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 8. Exons include small fragments that serve as Exonic Splicing Enhancers (ESEs) by binding proteins of the SR (serine- and arginine-rich) protein family or as Exonic Splicing Silencers (ESSs) by binding proteins of the hnRNP (heterogeneous nuclear ribonucleoprotein) A/B type protein family (see Fig. 9). When bound to ESE sites, SR proteins stabilize complexes including the U1 and U2 snRNPs, and thereby promote splicing; in contrast, when bound to the ESS sites, the hnRNP A/B proteins recruit other hnRNP A/B proteins to cover the exon and inhibit spliceosome formation. This figure was used as a slide in Dr. Kataoka’s presentation. See also Kataoka et al. 2021.
Figure 9. List of splicing regulators that belong to the SR (serine- and arginine-rich) protein family or hnRNP (heterogeneous nuclear ribonucleoprotein) A/B type protein family. The former can bind to ESE sites and stimulate splicing, whereas the latter bind to ESS sites and inhibit splicing (see Fig. 8). Both families are similar in that they have RNA binding domains (RBD), but the SR proteins have RS (arginine serine) rich domains and the hnRNP A/B type proteins have arginine-glycine-glycine (RGG) repeats. This figure was used as a slide in Dr. Kataoka’s presentation.

 

RNA diseases general

If anything goes wrong at the level of RNA production, editing, splicing, translation, or decay, this can lead to disease; such diseases can be collectively named “RNA diseases.” Fig. 10 shows several examples in which the disease is caused either by mutations in the misregulated mRNA itself or by mutations in one of the factors participating in splicing. Of the former category, Dr. Kataoka gave detailed explanations about Progressive Muscular Dystrophy and Familial Dysautonomia, which I will summarize below. Dr. Kataoka also explained how mutations in the SR protein SRSF2 can promote myelodysplastic syndromes (MDS), but for this I just refer to his article Masaki et al. 2019.

Figure 10. Examples of RNA diseases. In his seminar, Dr. Kataoka gave detailed explanations about Progressive Muscular Dystrophy, Familial Dysautonomia, and myelodysplastic syndromes (MDS). This is a slight modification of a slide used in Dr. Kataoka’s presentation. The data have also been used in Kataoka 2017.

 

RNA disease example: Progressive Muscular Dystrophy

Muscular dystrophies (MD) are a genetically and clinically heterogeneous group of rare neuromuscular diseases that cause progressive weakness and breakdown of skeletal muscles over time (Wikipedia). The disorders differ as to which gene is mutated, which muscles are primarily affected, the degree of weakness, how fast they worsen, and when symptoms begin.

Some types of MD are caused by mutations in the gene for dystrophin, which is a vital part of a protein complex that connects the cytoskeleton of a muscle fiber to the surrounding extracellular matrix through the cell membrane (Blake et al. 2002). The gene is exceptionally long, namely >2 Mb with 79 exons (Fig. 11), and has been described as the largest gene in the human genome (Blake et al. 2002). Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD) are caused by various mutations that cause (near-)absence of dystrophin protein and dystrophin protein with modifications, respectively; whereas the clinical symptoms of DMD are severe, those of BMD are milder. Because the dystrophin gene is situated on the X chromosome, there is no allelic compensation at the cellular level in women or at all in men.

Dr. Kataoka and co-workers followed a strategy to convert DMD to BMD by promoting skipping of an exon carrying a stop codon or a frameshift mutation, so that instead of no dystrophin, dystrophin with an internal deletion (but continuous reading frame) would be produced. At least in vitro, they could also do this with antisense oligonucleotides (discussed in Nishida, Kataoka et al. 2011), but Dr. Kataoka presented us a case in which this was achieved by chemical treatment.
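The reading-frame logic behind this strategy can be illustrated with a small Python sketch. Skipping an exon leaves the downstream reading frame intact only if the length of the skipped exon is a multiple of three; in that case, a premature stop codon inside that exon is removed while the remainder of the protein is still encoded. The exon sequences below are short, made-up strings and not dystrophin sequences, and the function names are my own.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a coding sequence into complete codons, starting at position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop(seq):
    """Return the index of the first in-frame stop codon, or None."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


def skip_exon(exons, index):
    """Return the transcript that lacks the exon at the given list index."""
    return "".join(exon for i, exon in enumerate(exons) if i != index)


def skip_preserves_frame(exons, index):
    """Downstream exons stay in frame only if the skipped length is 3n."""
    return len(exons[index]) % 3 == 0


# Hypothetical toy gene: the middle exon (9 nt) carries a premature TAA stop.
toy_exons = ["ATGGCA", "GGTTAAGCC", "CATTGGTGA"]

full_length = "".join(toy_exons)
skipped = skip_exon(toy_exons, 1)

print(codons(full_length), first_stop(full_length))
print(codons(skipped), first_stop(skipped))
print(skip_preserves_frame(toy_exons, 1))
```

In the full-length toy transcript, translation would stop at the fourth codon; after skipping the 9-nt middle exon, the downstream codons are read unchanged up to the normal stop codon at the end. If the middle exon had, for example, 10 nt instead, skipping it would shift the downstream frame relative to the normal transcript.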

In a male DMD patient, designated KUCG797, they found that a mutation in exon 31 caused a stop codon but also a partial skipping of exon 31 that resulted in the production of an internally truncated dystrophin protein (Fig. 12 and Fig. 13). By sequence analysis they realized that the mutation had deleted an Exonic Splicing Enhancer (ESE) site for an SR protein (SRp30c) and created an Exonic Splicing Silencer (ESS) site for an hnRNP A/B type protein (hnRNP-A1) (Fig. 14), so that there would be fewer spliceosome factors associated with exon 31 and the U1 snRNP immediately downstream of exon 30 would more often form a complex with the U2 snRNP immediately upstream of exon 32, splicing out exon 31. They established a reporter system using transfected cells and indeed found that increasing the amount of SRp30c or hnRNP-A1 reduced or increased the exon 31 skipping frequency, respectively (Fig. 15).

Dr. Kataoka and co-workers then tried the chemical TG003 (Fig. 16), a kinase inhibitor specific for Cdc-like kinases that were also known to phosphorylate and thereby activate SR proteins, so that their inhibition by TG003 can induce exon skipping (Muraki et al. 2004; Yomoda et al. 2008). And indeed, Dr. Kataoka and co-workers found that TG003 promoted exon 31 skipping in dystrophin mRNA of patient KUCG797-derived muscle cells (Fig. 17) (Nishida, Kataoka et al. 2011).

Notably, it is hard to see how such a direct targeting of splicing factors will not have many side-effects on the splicing of other genes. However, as the next paragraph shows, with some luck a predominantly positive effect and few side-effects may be achieved.

Figure 11. The human dystrophin gene is defective in patients with Duchenne or Becker muscular dystrophy. This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 12. In patient KUCG797, a G-to-T mutation was found in dystrophin exon 31 that introduced a premature stop codon. However, although because of the stop codon no dystrophin protein was expected, some dystrophin protein (though less than in a healthy control) with both the usual N-terminal and C-terminal ends was found in immunohistochemistry micrographs of muscle as shown at the right (from Nishida, Kataoka et al. 2011). This indicates that apart from introducing a stop codon, the G-to-T mutation also promoted a partial skipping of exon 31 (see Fig. 13). This figure is a slight modification of a slide in Dr. Kataoka’s presentation.
Figure 13. In patient KUCG797, by RT-PCR and sequencing of the amplified band, an additional smaller dystrophin transcript was found that revealed skipping of exon 31. Because in this alternative transcript the exons 30 and 32 remain in frame, a smaller dystrophin protein lacking the exon 31 encoded part can be made (Fig. 12). The data are from Nishida, Kataoka et al. 2011. This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 14. The dystrophin gene exon 31 mutation in patient KUCG797 disrupts a SRp30c binding site and creates an hnRNP A1 binding site, which can explain the promotion of exon skipping. This figure shows a SpliceAid software prediction of the RNA-binding protein candidates that can bind to the wild-type or c.4303G>T exon 31 of the dystrophin gene. A positive score was assigned to the sequences that facilitate the defining of exons, such as ESE motifs. With the same criteria, a negative score was assigned to the target sequences that facilitate intron definition, namely ESS motifs. The nucleotide mutated in c.4303G>T is highlighted in both panels. Two proteins, SRp30c/SRSF9 and hnRNP A1, whose scores are drastically changed by this mutation are highlighted by open squares. The surrounding sequences that show high homology to the SRp30c/SRSF9 (left panel) or hnRNP A1 (right panel) SELEX consensus sequence are underlined. The data are from Nishida, Kataoka et al. 2011. This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 15. Dr. Kataoka and co-workers established a reporter system using transfected cells and indeed found (see the prediction in Fig. 14) that increasing the amount of SRp30c or hnRNP-A1 reduced or increased the exon 31 skipping frequency, respectively. The data are from Nishida, Kataoka et al. 2011. This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 16. TG003 is a chemical compound that inhibits the activity of several kinases, amongst which kinases that participate in the activation of SR proteins. The data are from Muraki et al. 2004. This figure is a slight modification of a slide in Dr. Kataoka’s presentation.
Figure 17. TG003 promotes exon 31 skipping in dystrophin mRNA of patient KUCG797-derived muscle cells in an ex vivo experiment. As indicated by RT-PCR experiments, exon 31 skipping increases with increasing concentrations of TG003. This results in an increased amount of exon 31-truncated dystrophin protein, as indicated by Western blotting. The data are from Nishida, Kataoka et al. 2011. This figure was used as a slide in Dr. Kataoka’s presentation.

 

RNA disease example: Familial Dysautonomia

Familial Dysautonomia, also known as Riley-Day Syndrome, is a rare, progressive, recessive genetic disorder of the autonomic nervous system that affects the development and survival of neurons (Wikipedia). It is usually caused by missplicing of exon 20, resulting from an intronic mutation in the inhibitor of kappa light polypeptide gene enhancer in B cells, kinase complex-associated protein (IKBKAP) gene encoding IKK complex-associated protein (IKAP) (which is currently known as elongator protein 1 [ELP1]) (Fig. 18). One of the functions of the Elongator complex (including IKAP/ELP1) is the formation of the C5-substituent of 5-carbamoylmethyl (ncm5), 5-methoxycarbonylmethyl (mcm5), and its derivatives at the wobble uridine in tRNAs recognizing purine-ending codons (Huang et al. 2005); overall, the partial absence of these modifications has only minor effects on translation, but in some neurons the effects are presumably large enough to cause Familial Dysautonomia (Karlsborn et al. 2014; Yoshida et al. 2015).

The IKBKAP gene is located on Chr. 9, and Familial Dysautonomia is inherited in an autosomal recessive manner, meaning that patients have two mutated alleles. This means that the two IKBKAP splicing variants, one with exon 20 and one without it, which are commonly found among patients (Fig. 19) (Cuajungco et al. 2003), are caused by partial skipping of exon 20 of the same mutant IKBKAP mRNA. The frequency of exon 20 skipping differs per cell type and per tissue, and is highest in neural tissues (Fig. 20) (Cuajungco et al. 2003). The frequent skipping of exon 20 in this disease, which leads to less protein, is caused by a mutation that deteriorated the quality of the splicing site downstream of it (compare Fig. 18 with Fig. 2) (Slaugenhaupt et al. 2001).

It was also already known that kinetin, a plant cytokinin, could rescue the above-discussed splicing defect (Slaugenhaupt et al. 2004), even in clinical trials in which only minimal side effects were reported (Axelrod et al. 2011). Thus, in principle, medical treatment of splicing defects with chemical compounds seems feasible. However, I have been unable to find updates on the progress of kinetin as a possible drug against Familial Dysautonomia.

Dr. Kataoka and co-workers tried to find a chemical for optimizing the frequency of normal transcripts (including exon 20) in a Familial Dysautonomia genetic background beyond the power of kinetin (Yoshida et al. 2015). For this, they created a “green-red” reporter system using transfection of cells of the SH-SY5Y neuroblastoma cell line (Fig. 21) and a high throughput screening system (Fig. 22). In total, they screened 638 chemical compounds, of which at least some were similar to kinetin. They found that a kinetin analog that they called RECTAS (RECTifier of Aberrant Splicing; 2-chloro-N-(furan-2-ylmethyl)-7H-purin-6-amine) (Salani et al. 2019) was superior to kinetin in eliciting the proper splicing around IKBKAP exon 20 (Fig. 23 and Fig. 24), and transcriptome analysis revealed that it did not have a major effect on other transcripts (Yoshida et al. 2015). A later study proposed that the effect of RECTAS was mediated by its enhancement of the activity of the SR protein SRSF6 (Ajiro et al. 2021).

As explained at the beginning of this paragraph, in Familial Dysautonomia the wobble uridine in tRNAs is not properly modified (Karlsborn et al. 2014; Yoshida et al. 2015), which is known to interfere with proper translation. Dr. Kataoka showed us how they found this, and how RECTAS could restore proper modification of the tRNA wobble uridine. However, that is a bit complicated to show and explain here, so for that I just refer to their paper (Yoshida et al. 2015).

In summary, affecting splicing by chemical compounds shows a lot of promise, but there seems to be no solid clinical information yet that shows that this really works for treating disease and without side effects. It is exciting that chemical compounds may either enhance or reduce exon skipping, as shown in the examples of progressive muscular dystrophy and familial dysautonomia, respectively.

Figure 18. Familial Dysautonomia (FD) is commonly caused by a mutation in the intron just downstream of the IKBKAP gene exon 20. This leads to a partial skipping of exon 20, which in that case introduces a premature stop codon (the directly connected exons 19 and 21 are not in frame) so that no protein can be made. This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 19. In Familial Dysautonomia, two IKBKAP splicing variants, one with exon 20 and one without it, are commonly found among patients. The data are from Cuajungco et al. 2003. This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 20. The frequency of exon 20 skipping differs per cell type and per tissue. The data are from Cuajungco et al. 2003. This figure was used as a slide in Dr. Kataoka’s presentation.
Figure 21. Dr. Kataoka and co-workers established a red-green fluorescent reporter system, using a reporter gene construct and transfected HeLa cells, for measuring the ratio of exon 20 splicing. This figure was used as a slide in Dr. Kataoka’s presentation. For detailed information see Yoshida et al. 2015.
Figure 22. Dr. Kataoka and co-workers used their reporter system (see Fig. 21) for a high throughput screening of 638 chemical compounds. This figure was used as a slide in Dr. Kataoka’s presentation. For detailed information see Yoshida et al. 2015.
Figure 23. RECTAS works better than kinetin in promoting inclusion of IKBKAP exon 20 in a reporter system. The top half of the figure shows the structure of kinetin and refers to the articles Slaugenhaupt et al. 2004 and Axelrod et al. 2011. The micrographs in the lower half are from the same study as described in Yoshida et al. 2015 and show the green-red screening for measuring exon 20 inclusion (green) by a reporter system as explained in Fig. 21. This figure is a modification of a slide used in Dr. Kataoka’s presentation.
Figure 24. RECTAS works better than kinetin in promoting inclusion of IKBKAP exon 20 in fibroblasts of FD patients. At the left, for two different patients, RT-PCR analyses for IKBKAP transcript are shown. At the right, the results for IKBKAP protein are shown, with GAPDH as a control. The data are also shown in Yoshida et al. 2015. This figure was used as a slide in Dr. Kataoka’s presentation.
