Systematic Identification of
Protein Domains for Structure
Determination
Ming Luo, Ph.D.
University of Alabama at Birmingham
March 29, 2004
NIH
Current Progress on
C. elegans Proteins
[Figure: four time-series panels, Oct 2000 to Apr 2004: Clone & expression; Protein expressed (1 mL); Soluble confirmed (1 L); Purified (6 L).]
            Selected ORFs   Cloned   Expressed   Soluble (1 L)   Purified * (6 L)
4/7/2003    14,440          2,342    1,369       268             110
3/7/2004    15,556          7,326    3,218       503             189

* Unique ORFs, each expressed and purified multiple times.

                Nov, 03   Mar, 04   INCREASE
CLONED          4762      7326      2564
EXPRESSED       2293      3128      835
SOLUBLE         368       503       135
PURIFIED        152       189       37
CRYSTALLIZED    58        65        7
X-RAY DATA      14        18        4
STRUCTURE       9         11        2+1
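The attrition between pipeline stages can be read off the Mar, 04 counts with a little arithmetic. The following is a minimal sketch, not part of the original slides; the names `MAR_2004` and `conversion_rates` are hypothetical, and the counts are copied from the Nov, 03 / Mar, 04 table.

```python
# Stage counts from the "Mar, 04" column of the progress table.
MAR_2004 = [
    ("CLONED", 7326),
    ("EXPRESSED", 3128),
    ("SOLUBLE", 503),
    ("PURIFIED", 189),
    ("CRYSTALLIZED", 65),
    ("X-RAY DATA", 18),
    ("STRUCTURE", 11),
]


def conversion_rates(stages):
    """Return (from_stage, to_stage, fraction) for each consecutive pair of stages."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        rates.append((name_a, name_b, count_b / count_a))
    return rates


for src, dst, frac in conversion_rates(MAR_2004):
    print(f"{src:>12} -> {dst:<12} {frac:6.1%}")
```

Each printed fraction is simply the later count divided by the earlier one, so the step with the smallest fraction is the step where most targets are lost.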
Domain Identification
Methods
1. Conserved Sequence (e.g. Pfam)
   Markley
2. Spontaneous Degradation
3. Proteolysis
4. Functional Data
Predict Domains by Sequence
[Figure: predicted domain boundaries (residue numbers) drawn on six ORFs: 2-H9, 11-D11, 11-D5, 11-E3, 20-D7, 28-C5.]
Program used: SMART (http://smart.embl-heidelberg.de/)
Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864
Letunic et al. (2002) Nucleic Acids Res 30, 242-244
Four expressed
One soluble
None purified
Spontaneous Degradation
1F11
76F6
3D2
Purified protein samples were stored at 4°C for over one month.
Mass Spectrometry
[Figure: two mass spectra (TOF MS ES+; x-axis: mass, 0 to 50000; y-axis: %). Panel titles: Solution Specimen; Eluted from Gel. Samples: 3D2_NATIVE and 18H1. Acquisition timestamps: 24-Sep-2003 15:28:43 and 04-Nov-2003 06:43:10.]
MS + AA Sequencing
Clone   MS      AA Code    Fragment   Length
76F6    21279   GSQSTSL    18-210     261
3D2     19695   SAIKD      140-309    379
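A quick plausibility check on a residue range is to compare the measured mass with a rule-of-thumb estimate of about 110 Da per residue. The sketch below is illustrative only: the 110 Da figure is a generic approximation, not a value from the slides, the helper names are hypothetical, and the pairing of each residue range with its clone is read from the slide layout.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: generic average residue mass, not from the slides


def residue_count(span):
    """Number of residues in an inclusive range written as 'start-end'."""
    start, end = (int(part) for part in span.split("-"))
    return end - start + 1


def estimated_mass(span):
    """Rough mass estimate (Da) for a fragment covering the given residue range."""
    return residue_count(span) * AVG_RESIDUE_MASS_DA


# Clone -> (residue range, mass reported on the slide)
FRAGMENTS = {"76F6": ("18-210", 21279), "3D2": ("140-309", 19695)}

for clone, (span, observed) in FRAGMENTS.items():
    estimate = estimated_mass(span)
    print(f"{clone}: {residue_count(span)} residues, "
          f"estimate {estimate:.0f} Da, observed {observed} Da")
```

An exact comparison would sum the real residue masses of the sequence; the estimate only shows whether a range is of the right order.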
Proteolysis
[Gel: digestion time course; lanes at 0, 5, 10, 15, 20 and 60 min, plus MW marker.]
Trypsin Digestion
1. Trypsin:protein 1:200, 10 mM Tris, pH 7.6, 37°C.
2. N-terminal sequencing after transfer to PVDF: ELTSAEK---
3. Mass spectrometry using solution mixture: 19277, 17774
9H3
Result: 59-212
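The reasoning behind this result can be sketched in code: the N-terminal sequence fixes where the fragment starts, and the measured mass selects the end among the possible trypsin cut sites. This is a minimal illustration under stated assumptions, not the analysis actually used. The function names and the toy sequence are hypothetical (the 9H3 sequence is not given in the slides), the cleavage rule is the usual "after K or R, not before P" simplification, and the residue masses are standard average values.

```python
# Standard average residue masses in Da (peptide-bonded residues, water removed).
AVG_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def average_mass(peptide):
    """Average mass (Da) of a linear peptide: residue masses plus one water."""
    return sum(AVG_MASS[aa] for aa in peptide) + WATER


def tryptic_ends(seq):
    """Exclusive end indices where trypsin may cut (after K/R, not before P), plus the C-terminus."""
    ends = [i + 1 for i, aa in enumerate(seq[:-1]) if aa in "KR" and seq[i + 1] != "P"]
    ends.append(len(seq))
    return ends


def map_fragment(seq, n_term_tag, observed_mass, tol_da=5.0):
    """Return the 1-based inclusive (start, end) of the fragment that begins with
    n_term_tag and whose mass is closest to observed_mass, or None if none is within tol_da."""
    start = seq.find(n_term_tag)
    if start < 0:
        return None
    best = None
    for end in tryptic_ends(seq):
        if end < start + len(n_term_tag):
            continue  # fragment would be shorter than the sequenced N-terminus
        error = abs(average_mass(seq[start:end]) - observed_mass)
        if error <= tol_da and (best is None or error < best[0]):
            best = (error, start + 1, end)
    return None if best is None else (best[1], best[2])


# Toy example with a made-up sequence (hypothetical, for illustration only).
toy = "MSGKELTSAEKLVDGRAAFPW"
print(map_fragment(toy, "ELTSAEK", average_mass("ELTSAEKLVDGR"), tol_da=1.0))
```

With a real sequence, the tolerance would need to reflect the accuracy of the instrument, and more than one candidate end may fall inside it.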
Functional Data
1D10
Predicted Signal Peptide parameters from
Soren Brunak's SignalP server:
Signal peptide predicted:
HMM-cleavage prediction:
MPKLPLLLSFPLLFFASFAYA-(22)DEDFVT
ANN-cleavage prediction:
MPKLPLLLSFPLLFFASFAYA-(22)DEDFVT
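A cleavage prediction of this form can be turned into a construct boundary by dropping the signal peptide. The sketch below assumes that "(22)" is the number of the first residue after the cleavage site, which is consistent with the 21-residue signal sequence shown; the function names are hypothetical and this is not SignalP's own output parser.

```python
PREDICTION = "MPKLPLLLSFPLLFFASFAYA-(22)DEDFVT"


def parse_cleavage(prediction):
    """Split 'SIGNAL-(N)MATURE' into (signal peptide, N, start of mature protein)."""
    signal, rest = prediction.split("-(", 1)
    position_text, mature_start = rest.split(")", 1)
    return signal, int(position_text), mature_start


def trim_signal(full_sequence, first_mature_residue):
    """Return the sequence starting at the given 1-based residue number."""
    return full_sequence[first_mature_residue - 1:]


signal, first_mature, mature_start = parse_cleavage(PREDICTION)
print(len(signal), first_mature, mature_start)

# Applied to the only residues shown on the slide (signal peptide + first mature residues).
print(trim_signal(signal + mature_start, first_mature))
```

The same `trim_signal` call would be applied to the full-length ORF sequence when designing the expression construct.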
79D4
SUMMARY
SUMMARY OF DOMAIN IDENTIFICATION
Domain ID by   # of ORFs   (unlabeled)   Expressed   Soluble   Purified   Xtal        Structure
Degradation    14          11            10          10        7          5/(1 NMR)   3
Proteolysis    23          11            7           7         1          1           0
Functional     8           8             5           5         3          2           2
Sequence       6           7             4           1         0          0           0
Total          51          37            26          23        11         8           5
CONCLUSIONS
• Smaller structural domains are most suitable for HTP structure determination.
• Domains experimentally identified from folded proteins are most reliable.
• Spontaneously occurring or limited proteolysis, followed by N-terminal sequencing and mass spectrometry, are the most efficient approaches.
Our Team
TARGET SCREEN AND AUTOMATION
* Chi-Hao Luan, Team Leader
ShiHong Qiu
Rita Gray
PROTEIN PRODUCTION
* Robert Bunzel, Team Leader
Danlin Luo
Jennifer Zhou
Alireza Arabshahi
Elizaveta Karpova
Annette McKinstry
X-RAY CRYSTALLOGRAPHY
* Songlin Li, Team Leader
Jindrich Symersky
Norbert Schormann
Guangda Lin
Shanyun Lu
Wen Ying Huang
Zhuhua Cao
Qiao Shang
CRYSTALLIZATION
* Larry DeLucas, Team Leader
Songlin Li
Youhong Zhang
BIOINFORMATICS
* Mike Carson, Team Leader
David Johnson
Jun Tsao