Diversity and Plasticity of RNA
Beyond the One-Sequence-One-Structure Paradigm

Peter Schuster
Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien

Chemistry towards Biology, Portorož, 8.–12.09.2002

[Figure: chemical formula of an RNA chain from the 5'-end to the 3'-end with nucleobases N1 to N4; Nk = A, U, G, C]
The chemical formula of RNA consisting of nucleobases, ribose rings, phosphate groups, and sodium counterions. Magnesium ions play a special role and act as coordination centers which are indispensable for the formation of full three-dimensional structures.

[Figure: from sequence to structure, 5'-end to 3'-end, for the sequence GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA; methods: biochemical and chemical probing, crystallography, structure prediction, NMR, FRET, ...]

The one sequence – one structure paradigm
One day, when biomolecular structures are understood in sufficient detail, we will be able to design molecules with predefined structures for purposes given a priori.
Biomolecular structures are not fully understood yet, but the lack of knowledge of structure and function can be compensated for by applying selection methods.

[Figure: sequence of the tobramycin-binding RNA aptamer, 5'-end to 3'-end; A = adenylate, U = uridylate, C = cytidylate, G = guanylate]
Combinatorial diversity of sequences: N = 4^27 = 1.801 × 10^16 possible different sequences.
Number of (different) sequences created by common scale random synthesis: 10^15 to 10^16.
Combinatorial diversity of heteropolymers illustrated by means of an RNA aptamer that binds to the antibiotic tobramycin

Taming of sequence diversity through selection and evolutionary design of RNA molecules
A.D. Ellington, J.W. Szostak, In vitro selection of RNA molecules that bind specific ligands.
Nature 346 (1990), 818-822
C. Tuerk, L. Gold, SELEX - Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249 (1990), 505-510
D.P. Bartel, J.W. Szostak, Isolation of new ribozymes from a large pool of random sequences. Science 261 (1993), 1411-1418
R.D. Jenison, S.C. Gill, A. Pardi, B. Polisky, High-resolution molecular discrimination by RNA. Science 263 (1994), 1425-1429

[Figure: selection cycle starting from genetic diversity, with the steps amplification, diversification, and selection; if the desired properties are not yet reached (no) the cycle is repeated, otherwise (yes) it ends]
Selection cycle used in applied molecular evolution to design molecules with predefined properties

[Figure: chromatographic column; retention of binders, elution of binders]
The SELEX technique for the evolutionary design of aptamers

[Figure: tobramycin aptamer sequence, 5'-end to 3'-end, folding into its hairpin secondary structure]
Formation of secondary structure of the tobramycin binding RNA aptamer
L. Jiang, A.K. Suri, R. Fiala, D.J. Patel, Chemistry & Biology 4:35-50 (1997)

The three-dimensional structure of the tobramycin aptamer complex
L. Jiang, A.K. Suri, R. Fiala, D.J. Patel, Chemistry & Biology 4:35-50 (1997)

Mapping RNA sequences onto RNA structures
The attempt to investigate this mapping is understood as a search for the relations between all 4^n possible sequences of chain length n and all thermodynamically stable structures, which are the structures of minimal free energy. Sequence-structure mappings of RNA molecules have been studied by a variety of experimental and in silico techniques.

[Figure: the sequence GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA (5'-end to 3'-end) shown as sequence, secondary structure, tertiary structure, and symbolic notation]

What is an RNA structure?
The secondary structure is a listing of base pairs, and it is understood in contrast to the full 3D structure, which deals with atomic coordinates. An intermediate level of structural detail is provided by RNA threading or other toy models.

RNA Secondary Structures and their Properties
RNA secondary structures are listings of Watson-Crick and GU wobble base pairs, which are free of knots and pseudoknots. Secondary structures are folding intermediates in the formation of full three-dimensional structures.
D. Thirumalai, N. Lee, S.A. Woodson, and D.K. Klimov. Annu. Rev. Phys. Chem. 52:751-762 (2001)

RNA Minimum Free Energy Structures
Efficient algorithms based on dynamic programming are available for the computation of secondary structures for given sequences. Inverse folding algorithms compute sequences for given secondary structures.
M. Zuker and P. Stiegler. Nucleic Acids Res. 9:133-148 (1981)
Vienna RNA Package: http://www.tbi.univie.ac.at (includes inverse folding, suboptimal structures, kinetic folding, etc.)
I.L. Hofacker, W. Fontana, P.F. Stadler, L.S. Bonhoeffer, M. Tacker, and P. Schuster. Mh. Chem. 125:167-188 (1994)

Criterion of Minimum Free Energy
UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC
GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG
UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG
CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG
GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG
[Figure: sequence space mapped onto shape space]
Many sequences form the same minimum free energy secondary structure

[Figure: S_k = ψ(I.) maps sequence space into phenotype space; f_k = f(S_k) maps phenotype space into the non-negative numbers]
Mapping from sequence space into phenotype space and into fitness values
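The dynamic programming idea behind secondary structure prediction can be illustrated with a much simpler stand-in. The sketch below maximizes the number of base pairs (a Nussinov-style recursion) instead of minimizing free energy, so it is not the Zuker-Stiegler algorithm and not the Vienna RNA Package. The function names, the dot-bracket output, and the minimum hairpin loop size of 3 are assumptions made for this illustration.

```python
# Allowed pairs: Watson-Crick plus GU wobble, as in the definition of secondary structures.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def max_pairs_fold(seq, min_loop=3):
    """Return (number of pairs, dot-bracket string) of one pair-maximal structure.

    Simplified illustration: counts base pairs, ignores all energy parameters.
    """
    n = len(seq)
    # best[i][j] = maximum number of base pairs in the segment seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one structure that reaches the optimal pair count.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # segment too short to contain a pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = best[i][k - 1] if k > i else 0
                if left + 1 + best[k + 1][j - 1] == best[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)


print(max_pairs_fold("GGGAAAACCC"))
```

For realistic predictions the pair count would have to be replaced by an energy model with stacking and loop contributions, which is what the cited minimum free energy algorithms do.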
[Figure: neutral network in sequence space]
A connected neutral network

[Figure: neutral network with a giant component]
A multi-component neutral network

[Figure: free energy diagram of the conformations S0 to S10 together with a barrier tree of the suboptimal conformations]
T = 0 K, t → ∞: minimum free energy structure (S0)
T > 0 K, t → ∞: suboptimal structures (S0, S1, ...)
T > 0 K, t finite: kinetic structures
Different notions of RNA structure including suboptimal conformations

Partition Function of RNA Secondary Structures
John S. McCaskill. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29 (1990), 1105-1119
Ivo L. Hofacker, Walter Fontana, Peter F. Stadler, L. Sebastian Bonhoeffer, Manfred Tacker, Peter Schuster. Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie 125 (1994), 167-188

Example of a small RNA molecule with two low-lying suboptimal conformations which contribute substantially to the partition function
Example of a small RNA molecule: n = 28
5'-UUGGAGUACACAACCUGUACACUCUUUC-3'
[Figure: the three lowest configurations of this sequence]
Minimum free energy configuration: ΔG_0 = -5.39 kcal/mole
First suboptimal configuration: ΔE(0→1) = 0.50 kcal/mole
Second suboptimal configuration: ΔE(0→2) = 0.55 kcal/mole

"Dot plot" of the minimum free energy structure (lower triangle) and the partition function (upper triangle) of a small RNA molecule (n=28) with low energy suboptimal
configurations

[Figure: phenylalanyl-tRNA shown as sequence (5'-end to 3'-end), secondary structure, and symbolic notation]
GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA
Phenylalanyl-tRNA as an example for the computation of the partition function

[Figure: dot plot for tRNAphe without modified bases; first suboptimal configuration: ΔE(0→1) = 0.43 kcal/mole]
[Figure: dot plot for tRNAphe with modified bases; first suboptimal configuration: ΔE(0→1) = 0.94 kcal/mole]

Kinetic Folding of RNA at Elementary Step Resolution
The RNA folding process is resolved to base pair closure, base pair cleavage, and base pair shift. The kinetic folding behavior is determined by computing a sufficiently large ensemble of individual folding trajectories and averaging over them. The folding behavior is illustrated by barrier trees showing the path of lowest energy between two local minima of free energy.
C. Flamm, W. Fontana, I.L. Hofacker and P. Schuster.
RNA, 6:325-338 (2000)

[Figure: the three elementary moves closure, cleavage, and shift]
Move set for elementary steps in kinetic RNA folding

Mean folding curves for three small RNA molecules with n=15 and very different folding behavior

[Figure: free energy of suboptimal conformations S(h) around a local minimum S_h; saddle points T_k between minima S_k, drawn along a "reaction coordinate" and as a "barrier tree"]
Search for local minima in conformation space

I1 = ACUGAUCGUAGUCAC
Example of an inefficiently folding small RNA molecule with n = 15

I2 = AUUGAGCAUAUUCAC
Example of an easily folding small RNA molecule with n = 15

I3 = CGGGCUAUUUAGCUG
Example of an easily folding and especially stable small RNA molecule with n = 15

Folding dynamics of the sequence GGCCCCUUUGGGGGCCAGACCCCUAAAAAGGGUC
[Figure: minimum free energy conformation S0 and suboptimal conformation S1 of this sequence]
One sequence is compatible with two structures

[Figure: barrier tree with the two basins S0 and S1]
Barrier tree of a sequence with two conformations

Is there experimental evidence for structural multiplicity of RNA sequences?
Are there RNA molecules with multiple functions?
How can RNA molecules with multiple functions be designed?
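The statement that one sequence is compatible with two structures can be made concrete with a small sketch. Here a sequence counts as compatible with a structure when every listed base pair can be formed from Watson-Crick or GU wobble partners. The second function builds one sequence that is compatible with two given structures by two-colouring the union of their base pairs with G and U. The dot-bracket input format, the function names, and the choice of the G/U alphabet are assumptions made for this illustration; compatibility says nothing about whether either structure is thermodynamically stable.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_table(structure):
    """Map every paired position of a dot-bracket string to its partner."""
    stack, partner = [], {}
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            i = stack.pop()
            partner[i], partner[pos] = pos, i
    if stack:
        raise ValueError("unbalanced structure")
    return partner


def is_compatible(seq, structure):
    """True if every base pair of the structure can be formed by the sequence."""
    if len(seq) != len(structure):
        return False
    return all((seq[i], seq[j]) in PAIRS for i, j in pair_table(structure).items())


def sequence_for_both(s1, s2):
    """Construct one sequence compatible with both structures.

    Each position has at most one partner per structure, so the union of the
    two pair lists consists of paths and even cycles and can be two-coloured.
    Colour 'G' against colour 'U' makes every edge a valid GU pair.
    """
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    tables = (pair_table(s1), pair_table(s2))
    seq = [None] * len(s1)
    for start in range(len(s1)):
        if seq[start] is not None:
            continue
        seq[start] = "G"
        todo = [start]
        while todo:
            i = todo.pop()
            other = "U" if seq[i] == "G" else "G"
            for table in tables:
                j = table.get(i)
                if j is not None and seq[j] is None:
                    seq[j] = other
                    todo.append(j)
    return "".join(seq)


s_a, s_b = "((((....))))", "((...))(...)"
candidate = sequence_for_both(s_a, s_b)
print(candidate, is_compatible(candidate, s_a), is_compatible(candidate, s_b))
```

A design procedure would start from such a compatible sequence and then search, for example with a folding program, for variants in which both structures are also energetically favourable.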
[Figure: secondary structure of the "hammerhead" ribozyme with the cleavage site marked]
The "hammerhead" ribozyme
The smallest known catalytically active RNA molecule

A ribozyme switch
E.A. Schultes, D.P. Bartel, One sequence, two ribozymes: Implications for the emergence of new ribozyme folds. Science 289 (2000), 448-452

Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
The sequence at the intersection: an RNA molecule which is 88 nucleotides long and can form both structures
Reference for the definition of the intersection and the proof of the intersection theorem
Two neutral walks through sequence space with conservation of structure and catalytic activity
Sequence of mutants from the intersection to both reference ribozymes
Reference for postulation and in silico verification of neutral networks

[Figure: secondary and three-dimensional structure of phenylalanyl-transfer-RNA, 5'-end to 3'-end]
From RNA secondary structures to full three-dimensional structures. Example: Phenylalanyl-transfer-RNA

What are the perspectives of RNA structure modelling and elaborate sequence-structure analysis?
Secondary structures are based on the identification of base pairs with defined and only marginally varying geometries that fit into A- or A'-type helices. By now a great variety of other classifiable base pairs has been found by crystallography and NMR. They can be readily included in structure prediction methods which are similar to the current algorithms for conventional secondary structures. What is needed, however, is the determination of thermodynamic parameters for these unconventional base-base interactions, as was done in the 1970s for DNA and RNA double helical and loop structures. So far these data are scarce, except for H-type pseudoknots and end-to-end stacking of helices. It seems that the prediction of RNA structures will be an easier task than that of proteins.
Classification of purine-pyrimidine base pairs
Classification of purine-purine base pairs
Classification of pyrimidine-pyrimidine base pairs
General classification of base pairs
N.B. Leontis and E. Westhof, RNA 7:499-512 (2001)

Coworkers
Walter Fontana, Santa Fe Institute, NM
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Peter Stadler, Universität Leipzig, GE
Ivo L. Hofacker, Christoph Flamm, Universität Wien, AT
Bärbel Stadler, Andreas Wernitznig, Universität Wien, AT
Michael Kospach, Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty
Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber