Transcript Document
History of the PDB
1970s
• Community discussions about how to establish an archive of protein
structures
• Cold Spring Harbor meeting on protein crystallography
• PDB established at Brookhaven (October 1971; 7 structures)
1980s
• Number of structures increases as technology improves
• Community discussions about requiring depositions
• IUCr guidelines established
• Number of structures deposited increases
• Independent biological databases established – e.g., the NDB
Recent History
1990s
• mmCIF project began
• First formal definition of PDB format (1996)
• Structural genomics begins
• PDB moves to RCSB (1998)
2000s
• New PDB entries conform to standards
• Legacy files remediated (9000 entries)
• wwPDB formed
Figure: Number of released entries by year
PDB Timeline
                                   1993     1998     2003     2008
Total structures                   1727     8942    23793    60000?
# of structures deposited/year      792     2178     4831     9000?
Average # of Web hits/day           N/A    57000   180000        ?
Macromolecular X-ray crystallography
Cryo-EM
High Field NMR
PDB has always been an international resource
• Data come freely from all over the world
• Data distribution has been global since the inception
• Data distribution is now unconditionally free
• Data processing tasks have been distributed since the 1990s
Distribution of Data Processing
July 1, 2003 - June 30, 2004
Month    RCSB-Rutgers    Osaka    EBI    Total
Jul           260           63     58      381
Aug           267           46     39      352
Sept          254           54     82      390
Oct           264           58     65      387
Nov           277          109     36      422
Dec           371           64     60      495
Jan           272           54     53      379
Feb           241           53     50      344
Mar           362           92    104      558
Apr           241           69     50      360
May           332          238     49      619
Jun           342           48     76      466
Total        3483          948    722     5153
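The monthly counts above can be cross-checked, and each site's share of the processing load worked out, with a few lines of Python. This is only an illustrative sketch: the numbers are copied from the table, while the names `PROCESSED`, `site_totals`, `site_shares`, and `monthly_totals` are made up for this example and are not part of any wwPDB software.

```python
# Structures processed per month, July 2003 through June 2004,
# copied from the "Distribution of Data Processing" table.
# All names below are hypothetical helpers for this sketch.
MONTHS = ["Jul", "Aug", "Sept", "Oct", "Nov", "Dec",
          "Jan", "Feb", "Mar", "Apr", "May", "Jun"]

PROCESSED = {
    "RCSB-Rutgers": [260, 267, 254, 264, 277, 371, 272, 241, 362, 241, 332, 342],
    "Osaka":        [63, 46, 54, 58, 109, 64, 54, 53, 92, 69, 238, 48],
    "EBI":          [58, 39, 82, 65, 36, 60, 53, 50, 104, 50, 49, 76],
}


def site_totals(processed):
    """Sum the twelve monthly counts for each site."""
    return {site: sum(counts) for site, counts in processed.items()}


def monthly_totals(processed):
    """Sum across sites for each month, in table order."""
    return [sum(month) for month in zip(*processed.values())]


def site_shares(processed):
    """Return each site's fraction of all processed structures."""
    totals = site_totals(processed)
    grand_total = sum(totals.values())
    return {site: total / grand_total for site, total in totals.items()}


if __name__ == "__main__":
    for site, share in site_shares(PROCESSED).items():
        print(f"{site}: {site_totals(PROCESSED)[site]} structures, {share:.1%}")
    for month, total in zip(MONTHS, monthly_totals(PROCESSED)):
        print(f"{month}: {total}")
```

With the table's figures, the per-site sums reproduce the Total row (3483, 948, 722; 5153 overall), and Osaka's share works out to roughly 18% of the structures processed in that year.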
http://www.wwpdb.org/
• Formalization of current working practice
• Members
– RCSB (Research Collaboratory for Structural Bioinformatics)
– PDBj (Osaka University)
– Macromolecular Structure Database (EBI)
• MOU signed July 1, 2003
• Announced in NSB November 21, 2003
Mission of wwPDB
Maintain a single archive of macromolecular
structural data that is freely and openly available
to the global community
Guidelines and Responsibilities
• All members act as distribution sites for data
• One member is the archive keeper (RCSB)
– Issues entry IDs
– Sole write access
• All format documentation publicly available
• Strict rules for redistribution of PDB files
• All sites can create their own web sites
Maintain format standards
• PDB
• PDB Exchange (mmCIF)
– Mechanism for extension based on new demands
• PDBML
– Derived from mmCIF
Operation
• New web site for wwPDB that points to three sites
• Planning meeting March 29, 2004
• Constitute an Advisory Board
– 2 representatives from each site
– 1 IUCr representative
– 1 ICMRBS representative
– Funding representatives invited to attend meetings
• First wwPDBAC meeting November 21, 2004
Blue Sky
Further cooperation among data centers
Coordination among members of Advisory Committees
Coordination among funding agencies
RCSB
Helen M. Berman
Mission
To provide the most accurate, well-annotated data in the most
timely and efficient way possible to facilitate new
discoveries and advances in science
Operated by three members of the RCSB:
Rutgers, The State University of New Jersey; San Diego Supercomputer Center at the University of
California, San Diego; Center for Advanced Research in Biotechnology/UMBI/NIST.
Supported by:
NIGMS
Data Processing Data Flow
Tracking and Assembling Data
Target Tracking
Status Codes
TargetDB
Target and Protocol Tracking
Protocols
Target Selection
Sample preparation
PepcDB
Status Codes
Data Collection
Data Processing
Structure Solution
Refinement
PDB Deposition
Merging and integration
Incremental Assembly
RCSB-PDB Team
RCSB PDB Team: Ken Addess, Helen M. Berman, Wolfgang F. Bluhm, Phil Bourne, Kyle Burkhardt, Li Chen, Sharon Cousin,
Jim Croker, Nita Deshpande, Shuchismita Dutta, Zukang Feng, Lew-Christiane Fernandez, Judith L. Flippen-Anderson, Gary
Gilliland, Rachel Kramer Green, Vladimir Guranovic, Shri Jain, Ann Kagehiro, Charlie Knezevich, Andrei Kouranov, Kevin
Lwinmoe, Jeff Merino-Ott, Irina Persikova, Suzanne Richman, Melcoir Rosas, Kathryn Rosecrans, Bohdan Schneider, Wayne
Townsend-Merino, Susan Van Arnum, Elizabeth Walker, John Westbrook, Alice Xenachis, Huanwang Yang, Jasmin Yang,
Christine Zardecki, Cindy Zhang
www.pdb.org • [email protected]
RCSB responsibilities with respect to wwPDB
• Distribute PDB identifiers to all sites
• Maintain PDB exchange dictionary (sole write access)
• Update ftp site weekly with files from RCSB, MSD, and PDBj (sole write access)
EBI
Kim Henrick
Macromolecular Structure Database
MSD
Structural database infrastructure services for Europe
http://www.embl.de/ (from 1974)
http://www.ebi.ac.uk/ (from 1996)
The European Molecular Biology Laboratory
(EMBL) is supported by sixteen countries.
It consists of the main Laboratory in Heidelberg
(Germany), Outstations in Hamburg (Germany),
Grenoble (France) and Hinxton (U.K.), and an
external Research Programme in Monterotondo
(Italy).
http://www.ebi.ac.uk/Information/sitemap.html
http://www.ebi.ac.uk/msd/
Tasks
Deposition
Annotation
Relational Database for PDB
Training/Outreach
Tasks
Standards
Harvest pipeline
Integration
WWW Search & Retrieval systems
Visualisation
API
EU-TEMBLOR
BBSRC-UK
BBSRC-UK
EU-BIOXHIT
MRC-UK
EU-IIMS
EU-NMRQUAL (RECOORD)
EU-3DEM
EU-AUTOSTRUCT
BBSRC-UK (Databank for Experimental NMR spectra)
EU-BIOSAPIENS
EU-SPINE
The MSD group
Harry Boutselakis        search database
Dimitris Dimitropoulos   MSDChem
Joel Fillon              eHTPX
Adel Golovin             active site
Kim Henrick              group leader
Ayzaz Hussain            PDB depositions
John Ionides             NMR / data model
Melford John             DBA
Peter Keller             deposition database leader
Eugene Krissinel         MSDfold
Phil McNeil              database development
Avi Naim                 EM validation
Richard Newman           EM / PDB depositions
Tom Oldfield             search system leader
Anne Pajon               NMR / data model
Jorge Pineda             database development
Abdel-Krim Rachedi       visualization
Janet Roser-Copeland     outreach
Andre Sitnov             deposition system
Siamak Sobhany           API
Antonio Suarez-Uruena    mapping
Jawahar Swaminathan      PDB depositions
Mohammed Tagari          EM / deposition system
John Tate                search system
Swen Tromm               validation
Sameer Velankar          search system
Wim Vranken              NMR
PDBj
Haruki Nakamura
PDBj
Protein Data Bank Japan
http://www.pdbj.org/
At the Institute for Protein Research,
Osaka Univ., supported by a 5-year grant
since 2001 from the Institute for
Bioinformatics Research and Development
(BIRD), Japan Science and Technology
Agency (JST).
http://www-bird.jst.go.jp/
Mission of PDBj as the BIRD-JST project
1) To continuously collect structural data by curating, processing,
and editing them, in collaboration with the other members of wwPDB.
2) To develop an advanced database and viewers (a browser and a
molecular graphics viewer) based on PDBML, with additional
data items annotated from the literature and other databases.
3) To construct several secondary databases and service programs
relating to the protein structure database.
4) To construct, as an educational activity, a protein encyclopedia
for high-school students, undergraduate students, and the
general public.
5) To start a biomolecular NMR database, in collaboration with
BMRB, by opening a mirror site and curating NMR data.
Organization of PDBj
wwPDB
• PDBj committee, at Institute for Protein Research, Osaka Univ.
– T. Tsukihara, S. Hase, H. Nakamura (IPR)
– N. Go (JAERI)
– Y. Kai (Fac. Engineering, Osaka Univ.)
– Y. Nishimura (Yokohama City Univ.)
– S. Wakatsuki (Photon Factory)
– T. Iizuka (SPring-8)
• PDBj office: Programmer, SE, Secretary
• PDB curation and data processing group (6), IPR, Osaka Univ.
• BMRB database management group (3), IPR, Osaka Univ.
• PDBML development and data annotation group (5), IPR, Osaka Univ.
• eProtS development group (2+temporary), IPR, Osaka Univ.
• Secondary DB development group
– eF-site, jV (H. Nakamura, K. Kinoshita, IPR, Osaka Univ.)
– ProMode (H. Wako et al., Waseda Univ.)
– ASH (H. Toh et al., Kyoto Univ.)
– Seq. Navigator/Str. Navigator (D. Standley, IPR, Osaka Univ.)
Figure: Yearly and total registration numbers by year (1972-2002)
Series: Yearly PDBj registration number (processed structures by PDBj); Yearly wwPDB registration number (processed structures in the world); Total wwPDB registration number (total available structures)
Axes: year; yearly registration number (0-4500); total registration number (0-25000)
We process nearly 20% of the data deposited worldwide, mainly from the Asian and Oceania regions
Monthly Processing Number at PDBj
Figure: Processed data numbers at PDBj per month (months 1-10), monthly processed data in 2004
PDBjViewer
(jV)
Additional information in xPSSS (as of October 2004)
• Total number in PDBML: 27,855
• GO Information (Biological Process, Molecular Function, Cellular Component): 20,570
• Extracted from Literature by Annotators: 10,181
• Function Information from eF-site: 10,051
• Function Information from Swiss-Prot (ACT_SITE, BINDING, DNA_BIND, NP_BIND, ZN_FING, TRANSMEM): 20,391
• Function Information from CATRES-EBI: 172
• extCATRES by Sequence Homology: 2,411
• Medline Information: 22,336
Development of Secondary Databases
• Protein Molecular Surface Database, eF-site (Kinoshita & Nakamura)
• Protein Dynamics Database, ProMode (Wako & Endo)
• Sequence Navigator & Structure Navigator (Standley)
• Alignment of Structural Homologues, ASH (Standley & Toh)
• Encyclopedia of Protein Structures, eProtS (Ito & Nakamura)
PDBj members
H. Nakamura, C. Kamada, H. Sakamoto, D. Standley, T. Kosada,
E. Nakatani
A. Paehler, R. Yamashita, A. Yoshihara, Y. Matsuki (BIRD-JST)
H. Akutsu (Institute for Protein Research, Osaka Univ.)
N. Ito (School of Biomedical Science, Tokyo Medical & Dental Univ.)
K. Kinoshita (Institute of Medical Science, Univ. Tokyo)
H. Wako (Waseda Univ.), S. Endo (Kitasato Univ.)
H. Toh (Institute for Chem. Research, Kyoto Univ.)
T. Okawa (Graduate School of Informatics Science, Osaka Univ.)
S. Saeki, A. Takahashi, Y. Shimizu, K. Kobayashi
Y. Ikegawa, R. Igawashi, Y. Kengaku, M. Kusunoki
wwPDBAC
Roles
Roles and Responsibilities of the wwPDBAC
• Advise on policy issues
– When to “retire” a format
– Versioning (yes/no)
– Scope of PDB 3D data
– Copyright
• Review adherence to guidelines as per wwPDB agreement
• Provide advice on how best to work with funding agencies
• Advise on new global projects
• Advocacy