ABRF2008 Poster - Association of Biomolecular Resource


Protein Sequencing Research Group (PSRG): Results of the PSRG 2012 Study
Terminal Sequencing of Standard Proteins in a Mixture: Year 1 of the 2-Year Study
R.D. English1, P.R. Jalili2, V. Katta3, K. Mawuenyega4, H.A. Remmer5, J. Simpson6, D. Suckau7, J.J. Walters2, B. Xiang8
1University of Texas Medical Branch, United States; 2Sigma-Aldrich, St. Louis, MO, United States; 3Genentech Inc., San Francisco, CA, United States; 4Washington University School of Medicine, St. Louis, MO, United States; 5University of Michigan, Ann Arbor, MI, United States; 6United States Pharmacopeia, Rockville, MD, United States; 7Bruker Daltonics, Bremen, Germany; 8Monsanto Company, St. Louis, MO, United States
OBJECTIVE: Obtain N-terminal sequence of three standard proteins supplied as separated samples.
Table 3. Summary of Mass Spectrometry Results
BSA Sequence
INTRODUCTION
Establishing N-terminal sequence of intact proteins plays a critical role in
biochemistry and drug development. N-terminal sequence analysis is necessary for
QC of protein biologics, for determining sites of signal peptide cleavage events, for
characterizing monoclonal antibodies, and in elucidating sequences of genes from
uncommon species. N-terminal sequencing is in the midst of a technology
transition from classical Edman sequencing to mass spectrometry (MS) based
terminal sequencing. Protein homogeneity (absence of interfering protein, non-protein contaminants and buffer components) is critical to the success of
analysis. While it has been very straightforward to isolate the protein of interest out
of protein mixtures in preparation for Edman sequencing, core facilities now need
to adopt novel sample preparation techniques to isolate proteins in high purity and
make them amenable to terminal sequencing by MS. Easy-to-use, field-proven methods for sample clean-up are lacking.
To address the upcoming change in technology platforms, the PSRG is conducting
a two-year study with the ultimate goal of sample preparation and terminal
sequencing of a protein mixture. This year’s study is the first phase towards the
overall goal and entails terminal sequencing and identification of (fusion) proteins,
which were provided as separated, homogeneous proteins. Participants used
Edman and/or MS techniques, along with bioinformatics tools to derive the termini
of the sample proteins. Study participants were directed to a website to
anonymously upload sequences and data.
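One of the bioinformatics steps mentioned above, locating a sequenced N-terminal tag within a precursor sequence, can be sketched as follows. This is a minimal illustration, not the procedure any participant used. The precursor string joins the signal/propeptide residues and the first mature BSA residues as they appear in the Table 3 panel; the function name `locate_n_terminus` is hypothetical.

```python
from typing import Optional

# Precursor N-terminus assembled from the BSA panel of Table 3:
# signal peptide and propeptide, followed by the first mature residues.
BSA_PRECURSOR_START = (
    "MKWVTFISLLLLFSSAYSRGVFRR"               # residues preceding the mature protein
    "DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPF"   # start of the mature BSA sequence
)


def locate_n_terminus(precursor: str, tag: str) -> Optional[int]:
    """Return the 1-based precursor position where the sequenced tag starts.

    Returns None when the tag is not found (hypothetical helper, exact match only).
    """
    index = precursor.find(tag)
    return None if index < 0 else index + 1


# First ten residues of the BSA sequence listed in Table 3.
tag = "DTHKSEIAHR"
position = locate_n_terminus(BSA_PRECURSOR_START, tag)
# Everything before the tag was removed from the precursor during processing.
removed = BSA_PRECURSOR_START[: position - 1] if position else ""
print(f"Mature N-terminus starts at precursor residue {position}")
print(f"Residues removed: {removed}")
```

An exact-match search is the simplest choice; real Edman or Top-Down reads with uncertain calls would need a tolerant alignment instead.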
MATERIALS AND METHODS
Table 1. Study Design – The Samples
MASS SPECTROMETRY RESULTS

Panel: BSA
Participant  Instrument               Software                  Separation  MS
PSRG-123     ultrafleXtreme           BioTools, Mascot, BLAST   N/A         MALDI, TD-ISD
Z10          maXis                    BioTools, Mascot, BLAST   N/A         ESI-MW, TD-ETD
K38          maXis, LTQ-XL Orbitrap   Mascot                    N/A         ESI, TD-CID
C10          ultraflex                BioTools                  N/A         MALDI, TD-ISD, Edman
E20          4800                     manual, ISDetect, BLAST   N/A         MALDI, TD-ISD
L36          ultrafleXtreme           BioTools, Mascot          N/A         MALDI, TD-ISD
MW [Da]: 66427.38
Sequence, residues 1-60: DTHKSEIAHR FKDLGEEHFK GLVLIAFSQY LQQCPFDEHV KLVNELTEFA KTCVADESHA
One row of the Calls block reads MKWVTFISLLLLFSSAYSRGVFRRDT.

Panel: Protein A
Participant  Instrument               Software                  Separation  MS
PSRG-123     ultrafleXtreme           BioTools, Mascot, BLAST   N/A         MALDI, TD-ISD/T3
Z10          maXis                    BioTools, Mascot, BLAST   N/A         ESI-MW, TD-ETD
K38          maXis, LTQ-XL Orbitrap   Mascot                    N/A         ESI-MW, TD-CID, BU
C10          ultraflex                BioTools                  N/A         MALDI, TD-ISD, Edman
E20          4800                     manual, ISDetect, BLAST   N/A         MALDI, TD-ISD/T3, BU
L36          ultrafleXtreme           BioTools, Mascot          N/A         MALDI, TD-ISD, BU
P30A         (not listed)             (not listed)              SDS-PAGE    BU, MS2
P30B         (not listed)             (not listed)              SDS-PAGE    BU, MS2
MW [Da]: 44612.6; 44621
Sequence, residues 1-50: MLRPVETPTR EIKKLDGLAQ HDEAQQNAFY QVLNMPNLNA DQRNGFIQSL

Panel: Endostatin (Sequence 1 and Sequence 2; entries apply to both sequence columns unless noted)
Participant  Instrument               Software                  Separation                                            MS
PSRG-123     ultrafleXtreme           BioTools, Mascot, BLAST   N/A                                                   MALDI-MW, TD-ISD
Z10          maXis                    BioTools, Mascot, BLAST   N/A                                                   ESI-MW, TD-ETD
K38          maXis, LTQ-XL Orbitrap   Mascot                    N/A                                                   ESI-MW, TD-CID, BU
C10          ultraflex                BioTools                  N/A                                                   MALDI, TD-ISD, Edman
E20          4800                     manual, ISDetect, BLAST   N/A                                                   MALDI, TD-ISD/T3, Edman
L36          ultrafleXtreme           BioTools, Mascot          on/off purification (Sequence 1), HPLC (Sequence 2)   MALDI, TD-ISD
P30A         (not listed)             (not listed)              SDS-PAGE                                              BU, CID
P30B         (not listed)             (not listed)              SDS-PAGE                                              BU, CID
MW [Da], listed as pairs: 19445.9 / 19963.4; 19445.3 / 19963.0 (3.2 ppm / 3.1 ppm); 19447 / 19964
Residues shared by both sequence rows as transcribed (preceded by a single unpaired R): DFQPVLHLVA LNSPLSGGMR GIRGADFQCF QQARAVGLAG TFRAFLSSLQ

C-term column: "Lys-excision" is entered five times (listed next to Protein A in the transcript).

Annotations on individual entries:
- Did not report the full sequence, but could have done so after careful evaluation.
- Fusion protein was not considered, causing the erroneous sequence.
- Gus Luciferase assigned to N-term: a wrong hypothesis caused wrong data interpretation.

[The color-coded per-residue call rows for each participant, and isolated labels such as met, met2, H and K, cannot be assigned to rows from this transcript.]

LEGEND
Legend entries: correct N-terminal call; Top-Down calls; Edman calls; wrong call; Bottom-Up; β-glucuronidase recognized; marked with "-"
Cell colors: blue/green; yellow; red; white
Abbreviations: TD = Top-Down; BU = Bottom-Up; ISD = In-Source Decay; T3 = T³-Sequencing
• Top-Down with ETD or ISD provides reliable N-terminal sequences.
• Edman and Top-Down complement each other very well: Edman for the first ~10 residues, Top-Down for the inexpensive extension of calls.
• Edman was successfully replaced in validating the N-terminal sequence by either T³-sequencing or Bottom-Up work.
• Successful analysis of the Protein A fusion protein required greater user expertise.
• Top-Down CID provided the least robust assignments, which were also the most easily misinterpreted.
• Top-Down by ETD or ISD enabled detection of the C-terminal removal of lysine, and intact MW determination validated these findings.
• Ragged N-termini in Endostatin were recognized by those who determined the intact molecular weight(s) or who used Edman.
• Use of protein HPLC allowed the separation of Endostatin fragments but resulted in shortened readouts.
• Bottom-Up alone did not assign the termini correctly, and uncertainty in assigning sequences near the Protein A fusion site caused incorrect assignments.
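The intact-mass reasoning behind these findings can be sketched in a few lines. The example below computes a ppm mass error and checks whether the difference between two intact masses matches the summed average residue masses of a candidate terminal stretch. For scale, 3.2 ppm at roughly 19,445 Da corresponds to about 0.06 Da. The residue masses are standard average values. The function names, the 1 Da tolerance and the candidate list are assumptions made for illustration; the candidate "HSHR" is taken from the public human endostatin sequence, not from the poster, which does not state which residues distinguish the two forms.

```python
# Standard average residue masses in Da (only the residues used below).
AVG_RESIDUE_MASS = {
    "H": 137.1411,
    "S": 87.0782,
    "R": 156.1875,
    "K": 128.1741,
}


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def residues_mass(sequence: str) -> float:
    """Summed average residue mass of a peptide stretch (no terminal water)."""
    return sum(AVG_RESIDUE_MASS[residue] for residue in sequence)


def match_terminal_loss(delta: float, candidates: list[str], tol: float = 1.0) -> list[str]:
    """Return the candidate stretches whose mass is within tol Da of delta."""
    return [c for c in candidates if abs(residues_mass(c) - delta) <= tol]


# Difference between the two Endostatin masses reported in Table 3.
delta = 19963.4 - 19445.9
# Hypothetical candidates: single Lys, and growing N-terminal stretches (assumption).
candidates = ["K", "HS", "HSH", "HSHR"]
print(f"Mass difference: {delta:.1f} Da")
print("Consistent candidates:", match_terminal_loss(delta, candidates))
```

The same check applies to a C-terminal lysine removal, where the expected difference is the average residue mass of Lys (about 128.17 Da).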
EDMAN DEGRADATION RESULTS
Table 4. Average # of Residues Identified Using Edman

Participants were asked to analyze the samples for terminal sequencing
using any technology available.

Participants received all three proteins with identification in sufficient
amounts to sequence each protein utilizing all three technologies. Feasibility
of analysis had been previously validated by PSRG members.

Participants also filled out a survey; all responses were kept anonymous.
Table 5. Summary of N-Terminal Sequencing Results
• Edman sequencing allows for direct determination of the N-terminal sequence.
• Labs returned N-terminal data that correlated well with published protein sequences.
• Edman can produce data with or without prior separation (SDS-PAGE and chromatography). However, the less abundant second Endostatin sequence in the mixture was more difficult to handle: it led to shorter read lengths and erroneous assignments, and 2 participants failed to detect it.
• No C-terminal data were produced with Edman.
• The N-terminus (methylated Met-1) of Protein A was not detected, but enzymatic and chemical methods yielded correct sequences from residue 2 onwards. Calling the correct β-glucuronidase sequence was straightforward for all who used Edman, whereas a correct Top-Down sequencing result depended much more on the right working hypothesis and on additional validation efforts.
• Some Edman participants reported an N-terminal Phe (F) or an unknown N-terminal amino acid (X) rather than methyl-Met, presumably because PTH-Phe and PTH-methyl-Met may co-elute on Edman sequencers (under investigation while the poster was in print).
• All participants used ABI instrumentation, with the exception of Participant A00, who used a Shimadzu product.
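Why a second, ragged N-terminus complicates Edman reads can be illustrated with a small sketch: when two forms of the same protein start at different positions, every cycle releases one residue from each form, so the two signals must be told apart by abundance. The sequence below is the start of the Endostatin sequence as listed in Table 3. The function name `edman_cycles` and the start offsets (0 and 2) are hypothetical values chosen for illustration, not the offsets observed in the study.

```python
def edman_cycles(sequence: str, start_offsets: tuple[int, ...], n_cycles: int) -> list[tuple[str, ...]]:
    """Residues released per Edman cycle from forms starting at different offsets.

    Each tuple holds one residue per form; "-" marks a form that has run out.
    """
    cycles = []
    for cycle in range(n_cycles):
        released = []
        for start in start_offsets:
            position = start + cycle
            released.append(sequence[position] if position < len(sequence) else "-")
        cycles.append(tuple(released))
    return cycles


# Start of the Endostatin sequence from Table 3.
endostatin_start = "DFQPVLHLVALNSPLSGG"
# Hypothetical mixture: one form starts at residue 1, the other two residues later.
for number, residues in enumerate(edman_cycles(endostatin_start, (0, 2), 5), start=1):
    print(f"cycle {number}: {' + '.join(residues)}")
```

With a low-abundance second form, the weaker residue in each cycle is easy to miss or to misassign, which is consistent with the shorter read lengths reported above.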
Table 2. Instrumentation used by Participating Laboratories

PARTICIPATION AND SURVEY RESULTS
• 25 laboratories from 12 countries requested samples for Edman sequencing, and most of the labs (23) also requested samples for MS sequencing.
• 14 of the 25 participating laboratories (56%) completed the survey.
• Out of 14 respondents, 7 used Edman sequencing, 6 top-down MS and 1 bottom-up MS (5 used bottom-up for confirmation).
• 9 labs analyzed the reference protein BSA; 8 correctly determined the N-terminus.
• 13 labs analyzed Protein A; 4 correctly determined the N-terminus (methyl-Met).
• 14 labs analyzed Endostatin; 12 correctly determined the N-terminus, but only 7 identified the presence of the second N-terminus.

2012/2013 PSRG: TIMELINE OF THE 2-YEAR STUDY
Phases: Year 1 (2012); Year 2 (2013)
Date markers: Feb ‘11, May ‘11, Aug ‘11, Sep ‘11, Oct ‘11, Jan ‘12, Mar ‘12, May ‘12, Jun ‘12, Oct ‘12, Feb ‘13
Meeting markers: ABRF 2011, ABRF 2012, ABRF 2013
Milestones (the pairing of milestones with dates is not recoverable from the transcript):
• Discussed ideas for 2012 study.
• Agreement upon a study design.
• Settled on 3 standard proteins for distribution as separated proteins in Year 1 of the study.
• Study announcement (listed twice).
• Samples sent to participants.
• Deadline for returning data.
• Extended deadline for returning data.
• Data analysis (listed twice).
• Distribution of proteins in mixture for Year 2 of the study.
ACKNOWLEDGEMENTS
RepliGen for the generous gift of rec. Protein A.
Sigma for the generous gift of Endostatin.
Xuemei Luo (University of Texas Medical Branch) for data accumulation
and anonymity.
Steve Smith (University of Texas Medical Branch) and
Larry Dangott (Texas A&M University) for Edman sequencing to provide
reference data for this study.
Anja Resemann (Bruker Daltonics) for Top-Down MS sequencing to
provide reference data for this study.
The ABRF Executive Board for support and scrutiny of study proposal.
All Participating Labs for analyzing samples and returning data.