Transcript Biological Computation
Chapter 8 Proteomics
National Chi Nan University, Department of Computer Science and Information Engineering 黃光璿 2004/06/07
proteome: the sum total of an organism’s proteins
genome: the sum total of an organism’s genetic material
8.1 From Genomes to Proteomes
We want to know what proteins are present in cells, what those proteins do, and how they function.
However, it’s not easy.
Why?
1. The longevity (lifespan) of an mRNA and that of the protein it codes for are very different.
2. Many proteins are extensively modified after translation.
3. Many proteins are not functionally relevant until they are assembled into larger complexes or delivered to an appropriate location.
4. Proteins require more careful handling than DNA.
Function may change.
Protein identification requires mass spectrometric analysis and specific antibodies.
Obtaining large numbers of protein molecules requires chemical isolation from living cells.
8.2 Protein Classification
Based on
protein function: six categories
evolutionary history & structural similarity: ~1000 homologous families
8.2.1 Enzyme Nomenclature
Started in the 1950s
International Union of Biochemistry and Molecular Biology
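As an illustration of the nomenclature, an EC number has four dot-separated fields, and the first field names the top-level category. The following is a minimal sketch, assuming the six classical top-level classes; the names `EC_CLASSES` and `ec_class` are hypothetical and not part of any official tool.

```python
# Top-level classes of the Enzyme Commission (EC) scheme, keyed by the
# first field of an EC number. Only the six classical classes are listed
# (assumption: this matches the "six categories" mentioned in the slides).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


def ec_class(ec_number):
    """Return the top-level class name for an EC number such as '3.4.21.4'."""
    text = ec_number.strip()
    # Accept an optional leading "EC" label, e.g. "EC 1.1.1.1".
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4 or not fields[0].isdigit():
        raise ValueError(f"not a four-field EC number: {ec_number!r}")
    top = int(fields[0])
    if top not in EC_CLASSES:
        raise ValueError(f"unknown top-level class in {ec_number!r}")
    return EC_CLASSES[top]


if __name__ == "__main__":
    # Trypsin is catalogued as EC 3.4.21.4.
    print(ec_class("3.4.21.4"))
```

Only the first field is interpreted here; the remaining three fields narrow the classification and are left untouched.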
8.2.2 Family and Superfamily
Modern-day proteins may be derived from ~ 1000 original proteins.
Hierarchy: folds, superfamilies, families
Databases: SCOP, CATH, DALI
fold: the same major secondary structure & topological connections
superfamily: probable evolutionary relationships
family: clear evolutionary relationships
8.3 Experimental Techniques
2D Electrophoresis
Mass Spectrometry
8.3.1 2D Electrophoresis
Example 2D gel maps (liver, kidney): http://tw.expasy.org/cgi-bin/map1
Problems
A cell contains tens of thousands of proteins, vs. the thousands of spots a gel can resolve.
Underrepresentation of membrane-bound proteins.
It is difficult to determine exactly which protein is represented by a spot.
8.3.2 Mass Spectrometry
2D electrophoresis followed by mass spectrometry, for identification
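One common way mass spectrometry data are used for identification is to compare a measured peptide mass with masses computed from candidate sequences. The sketch below illustrates that comparison only. The residue masses are standard monoisotopic values, not taken from the slides, and the names `peptide_mass` and `match_candidates` as well as the tolerance are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values, an assumption
# added for this example).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide given as one-letter codes."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def match_candidates(observed_mass, candidates, tolerance=0.01):
    """Return candidates whose computed mass lies within tolerance (Da)."""
    return [p for p in candidates
            if abs(peptide_mass(p) - observed_mass) <= tolerance]


if __name__ == "__main__":
    print(match_candidates(132.0535, ["GG", "AA", "GA"]))
```

Real identification software also accounts for charge states, isotopes, and modifications, which this sketch ignores.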
8.3.3 Protein Microarrays
Use antibodies as probes.
Problems
Single proteins will interact with multiple probes.
The binding kinetics of each probe are different.
Proteins are sensitive to their environment.
8.4 Inhibitors and Drug Design
Development & testing of a new drug: ~15 years, US$ 700 million
discovery
    target identification
    lead discovery & optimization
    toxicology
    pharmacokinetics
testing
HIV protease has an active site; it cuts a single, large polypeptide chain into many proteins.
8.5 Ligand Screening
8.5.1 Ligand Docking
Determine how two molecules of known structure will interact.
Three issues:
1. Identify the energy of a particular molecular conformation.
2. Search for the conformation that minimizes the free energy.
3. Deal with flexibility in both the protein and the putative ligand.
Lock-and-key approaches: rigid protein structure, flexible ligand structure
Induced-fit docking: flexible in both protein and ligand
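The energy and search issues above can be illustrated with a toy example. This sketch is not how any real docking program works: it assumes a Lennard-Jones 12-6 term as the only energy contribution, treats both molecules as rigid sets of points, and tries ligand translations on a cubic grid. All function names and parameters are hypothetical.

```python
import itertools
import math


def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 energy for two atoms at distance r."""
    if r == 0.0:
        return math.inf  # overlapping atoms: treat as infinitely bad
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def pose_energy(protein, ligand):
    """Score a pose as the sum of pairwise energies (toy scoring function)."""
    return sum(lj_energy(math.dist(p, q)) for p in protein for q in ligand)


def translate(points, shift):
    """Move every point of a rigid body by the same shift vector."""
    return [tuple(c + s for c, s in zip(pt, shift)) for pt in points]


def grid_search(protein, ligand, steps, spacing):
    """Try every translation on a cubic grid; return (best_energy, best_shift)."""
    offsets = [i * spacing for i in range(-steps, steps + 1)]
    best_energy, best_shift = math.inf, None
    for shift in itertools.product(offsets, repeat=3):
        energy = pose_energy(protein, translate(ligand, shift))
        if energy < best_energy:
            best_energy, best_shift = energy, shift
    return best_energy, best_shift


if __name__ == "__main__":
    protein = [(0.0, 0.0, 0.0)]   # one "protein atom" at the origin
    ligand = [(3.0, 0.0, 0.0)]    # one "ligand atom" placed far away
    print(grid_search(protein, ligand, steps=4, spacing=0.5))
```

An exhaustive grid is used only because it is easy to follow; it scales poorly, and it ignores rotation and flexibility, which is exactly the third issue in the list.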
Software: AutoDock, FTDock, DOCK, Hammerhead, GOLD, FlexX
8.5.2 Database Screening
Primary consideration: a complete and accurate search with reasonable computational complexity
SLIDE (Fig. 8.4)
8.6 X-Ray Crystal Structures
W. C. Roentgen (1895) discovered X-rays.
M. von Laue (1912) discovered that crystals diffract X-rays.
D. Hodgkin et al. (1950s) crystallized complex organic molecules and determined their structures.
Grow a crystal of the protein.
File formats
PDB: formatted text
mmCIF (MacroMolecular Crystallographic Information File)
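Because the PDB format is fixed-column text, an atom record can be read by slicing each line at known column positions. The following is a minimal sketch that handles only a few fields of ATOM and HETATM records; the function name `parse_atom_line` and the dictionary keys are hypothetical.

```python
def parse_atom_line(line):
    """Parse one ATOM/HETATM record using the fixed PDB column layout.

    Returns None for any other record type. Column numbers in the
    comments are 1-based, as in the format description.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21],                # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
    }


def read_atoms(lines):
    """Collect the parsed atom records from an iterable of PDB lines."""
    atoms = []
    for line in lines:
        atom = parse_atom_line(line)
        if atom is not None:
            atoms.append(atom)
    return atoms
```

Splitting on whitespace is avoided on purpose: adjacent fields in a PDB line may touch without a separating space, so column slicing is the safer choice.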
Databases & resources: PDB, PIR, ExPASy
Visualization tools (Fig. 8.8)
RasMol, Swiss PDB viewer, VMD (Visual Molecular Dynamics), Spock, Protein Explorer, DINO
8.7 NMR Structures
Limited to proteins of ~200 amino acids.
The structures determined are not unique.
8.8 Empirical Methods and Prediction Techniques
Example: Fig. 8.9
extracting features
learning (training)
testing
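The three steps can be sketched with a very small classifier. This is an illustration only, assuming amino acid composition as the feature vector and a one-nearest-neighbor rule as the learner; it is not the method of any tool named in these slides. The sequences and labels are made-up toy data, and all function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Feature extraction: fraction of each of the 20 amino acids."""
    seq = sequence.upper()
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def train(labelled_sequences):
    """'Training' for nearest neighbor just stores the feature vectors."""
    return [(composition(seq), label) for seq, label in labelled_sequences]


def predict(model, sequence):
    """Return the label of the stored example closest to the query."""
    query = composition(sequence)
    best_label, best_dist = None, math.inf
    for features, label in model:
        dist = math.dist(features, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def accuracy(model, labelled_sequences):
    """Testing: fraction of held-out examples predicted correctly."""
    hits = sum(predict(model, seq) == label for seq, label in labelled_sequences)
    return hits / len(labelled_sequences)


if __name__ == "__main__":
    # Toy data invented for this example (not real proteins).
    model = train([("LLVVAILLFFIVLA", "membrane"), ("KDEEKRDSEKDENK", "soluble")])
    held_out = [("AVLLIFFLVVIA", "membrane"), ("EEKKDDRSENKE", "soluble")]
    print(accuracy(model, held_out))
```

Keeping the test examples separate from the training examples is the point of the third step: accuracy measured on the training data itself would be misleadingly high.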
8.9 Post-Translational Modification Prediction
Remove segments of a protein.
Covalently attach sugars, phosphates, or sulfate groups to surface residues.
Cross-link residues within a protein (disulfide bonds).
8.9.1 Protein Sorting
associated with membranes
not associated with membranes
Table 8.3 (Case 2)
PSORT: nearest neighbor classifier; prediction of protein subcellular localization
SignalP: artificial neural networks; prediction of signal peptide cleavage sites
8.9.2 Proteolytic Cleavage
chymotrypsin: cleaves polypeptides on the C-terminal side of bulky and aromatic residues
trypsin: cleaves on the carboxyl side of positively charged residues (lysine, arginine)
elastase: cleaves on the C-terminal side of small residues
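These cleavage rules can be sketched as a simple in-silico digestion. The residue sets below are simplifying assumptions made for this example (K and R for trypsin, F, W and Y for chymotrypsin, A, G, S and V for elastase); real specificities are less clear-cut, and exceptions such as a following proline are ignored. The names `CLEAVE_AFTER` and `digest` are hypothetical.

```python
# Residues after which each protease is assumed to cut (C-terminal side).
# These sets are simplifications chosen for illustration.
CLEAVE_AFTER = {
    "trypsin": set("KR"),
    "chymotrypsin": set("FWY"),
    "elastase": set("AGSV"),
}


def digest(sequence, protease):
    """Split a one-letter protein sequence after every matching residue."""
    residues = CLEAVE_AFTER[protease]
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        # Cut after position i, unless it is already the last residue.
        if aa in residues and i + 1 < len(sequence):
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    print(digest("MKTAYIAKQR", "trypsin"))
    print(digest("MKTAYIAKQR", "chymotrypsin"))
```

Joining the fragments always reproduces the input sequence, which is a quick sanity check on any cleavage rule added to the table.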
Prediction: proteasome cleavage, > 98%, by neural network
8.9.3 Glycosylation
The process of covalently linking an oligosaccharide to the side chain of a protein surface residue (see 科學人, the Chinese-language edition of Scientific American)
Prediction by neural network: N-linked, 75%; O-linked, 85%
8.9.4 Phosphorylation
kinases: add phosphate groups
phosphatases: remove phosphate groups
Phosphorylation is used as a signal.
NetPhos: > 70%, neural network
References and Figure Sources
1. Dan E. Krane and Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin/Cummings, 2003.
2. Merriam-Webster Dictionary