Transcript Document
History of the PDB
1970s
• Community discussions about how to establish an archive of protein structures
• Cold Spring Harbor meeting in protein crystallography
• PDB established at Brookhaven (October 1971; 7 structures)
1980s
• Number of structures increases as technology improves
• Community discussions about requiring depositions
• IUCr guidelines established
• Number of structures deposited increases
• Independent biological databases established – e.g., the NDB

Recent History
1990s
• mmCIF project began
• First formal definition of PDB format (1996)
• Structural genomics begins
• PDB moves to RCSB (1998)
2000s
• New PDB entries conform to standards
• Legacy files remediated (9000 entries)
• wwPDB formed

[Chart: number of released entries by year]

PDB Timeline
                                  1993    1998     2003     2008
Total structures                  1727    8942    23793   60000?
# of structures deposited/year     792    2178     4831    9000?
Average # of Web hits/day          N/A   57000   180000        ?

Macromolecular X-ray crystallography, Cryo-EM, High Field NMR

PDB has always been an international resource
• Data come freely from all over the world
• Data distribution has been global since the inception
• Data distribution is now unconditionally free
• Data processing tasks have been distributed since the 1990s

Distribution of Data Processing, July 1 2003-June 30 2004
Month          Jul  Aug  Sept  Oct  Nov  Dec  Jan  Feb  Mar  Apr  May  Jun  Total
RCSB-Rutgers   260  267   254  264  277  371  272  241  362  241  332  342   3483
Osaka           63   46    54   58  109   64   54   53   92   69  238   48    948
EBI             58   39    82   65   36   60   53   50  104   50   49   76    722
Total          381  352   390  387  422  495  379  344  558  360  619  466   5153

http://www.wwpdb.org/
• Formalization of current working practice
• Members
– RCSB (Research Collaboratory for Structural Bioinformatics)
– PDBj (Osaka University)
– Macromolecular Structure Database (EBI)
• MOU signed July 1, 2003
• Announced in NSB November 21, 2003

Mission of wwPDB
Maintain a single archive of macromolecular structural data that is freely and openly available to the global community

Guidelines and Responsibilities
• All members act as distribution sites for data
• One member is the archive keeper (RCSB)
– Issues entry IDs
– Sole write access
• All format documentation publicly available
• Strict rules for redistribution of PDB files
• All sites can create their own web sites

Maintain format standards
• PDB
• PDB Exchange (mmCIF)
– Mechanism for extension based on new demands
• PDBML
– Derived from mmCIF

Operation
• New web site for wwPDB that points to three sites
• Planning meeting March 29, 2004
• Constitute an Advisory Board
– 2 representatives from each site
– 1 IUCr representative
– 1 ICMRBS representative
– Funding representatives invited to attend meetings
• First wwPDBAC meeting November 21, 2004

Blue Sky
Further cooperation among data centers
Coordination among members of Advisory Committees
Coordination among funding agencies

RCSB
Helen M. Berman

Mission
To provide the most accurate, well-annotated data in the most timely and efficient way possible to facilitate new discoveries and advances in science

Operated by three members of the RCSB: Rutgers, The State University of New Jersey; San Diego Supercomputer Center at the University of California, San Diego; Center for Advanced Research in Biotechnology/UMBI/NIST.
Supported by: NIGMS

Data Processing
Data Flow

Tracking and Assembling Data
[Diagram labels: Target Tracking, Status Codes, TargetDB; Target and Protocol Tracking, Protocols, PepcDB, Status Codes; Target Selection, Sample preparation, Data Collection, Data Processing, Structure Solution, Refinement, PDB Deposition; Merging and integration; Incremental Assembly]

RCSB PDB Team
Ken Addess, Helen M. Berman, Wolfgang F. Bluhm, Phil Bourne, Kyle Burkhardt, Li Chen, Sharon Cousin, Jim Croker, Nita Deshpande, Shuchismita Dutta, Zukang Feng, Lew-Christiane Fernandez, Judith L. Flippen-Anderson, Gary Gilliland, Rachel Kramer Green, Vladimir Guranovic, Shri Jain, Ann Kagehiro, Charlie Knezevich, Andrei Kouranov, Kevin Lwinmoe, Jeff Merino-Ott, Irina Persikova, Suzanne Richman, Melcoir Rosas, Kathryn Rosecrans, Bohdan Schneider, Wayne Townsend-Merino, Susan Van Arnum, Elizabeth Walker, John Westbrook, Alice Xenachis, Huanwang Yang, Jasmin Yang, Christine Zardecki, Cindy Zhang
www.pdb.org • [email protected]

RCSB responsibilities with respect to wwPDB
• Distribute PDB identifiers to all sites
• Maintain PDB exchange dictionary – sole write access
• Update ftp site weekly with files from RCSB, MSD, and PDBj – sole write access

EBI
Kim Henrick

Macromolecular Structure Database (MSD)
Structural database infrastructure services for Europe
http://www.embl.de/ from 1974
http://www.ebi.ac.uk/ from 1996
The European Molecular Biology Laboratory (EMBL) is supported by sixteen countries. It consists of the main Laboratory in Heidelberg (Germany), Outstations in Hamburg (Germany), Grenoble (France) and Hinxton (U.K.), and an external Research Programme in Monterotondo (Italy).
http://www.ebi.ac.uk/Information/sitemap.html
http://www.ebi.ac.uk/msd/

Tasks
• Deposition
• Annotation
• Relational Database for PDB
• Training/Outreach

Tasks
• Standards
• Harvest pipeline
• Integration
• WWW Search & Retrieval systems
• Visualisation
• API

EU-TEMBLOR, BBSRC-UK, BBSRC-UK, EU-BIOXHIT, MRC-UK, EU-IIMS, EU-NMRQUAL (RECOORD), EU-3DEM, EU-AUTOSTRUCT, BBSRC-UK (Databank for Experimental NMR spectra), EU-BIOSAPIENS, EU-SPINE

The MSD group
Harry Boutselakis       search database
Dimitris Dimitropoulos  MSDChem
Joel Fillon             eHTPX
Adel Golovin            active site
Kim Henrick             group leader
Ayzaz Hussain           PDB depositions
John Ionides            NMR / data model
Melford John            DBA
Peter Keller            deposition database leader
Eugene Krissinel        MSDfold
Phil McNeil             database development
Avi Naim                EM validation
Richard Newman          EM / PDB depositions
Tom Oldfield            search system leader
Anne Pajon              NMR / data model
Jorge Pineda            database development
Abdel-Krim Rachedi      visualization
Janet Roser-Copeland    outreach
Andre Sitnov            deposition system
Siamak Sobhany          API
Antonio Suarez-Uruena   mapping
Jawahar Swaminathan     PDB depositions
Mohammed Tagari         EM / deposition system
John Tate               search system
Swen Tromm              validation
Sameer Velankar         search system
Wim Vranken             NMR

PDBj
Haruki Nakamura

PDBj: Protein Data Bank Japan
http://www.pdbj.org/
At the Institute for Protein Research, Osaka Univ., financially assisted by a 5-year grant since 2001 from the Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST).
http://www-bird.jst.go.jp/

Mission of PDBj as the BIRD-JST project
1) To continuously collect structural data by curating, processing, and editing them, in collaboration with the other members of wwPDB.
2) To develop an advanced database and viewers (a browser and a molecular graphics viewer) based on PDBML, with additional data items annotated from the literature and other databases.
3) To construct several secondary databases and service programs relating to the protein structure database.
4) To construct a protein encyclopedia for high-school students, undergraduate students, and the general public as an educational activity.
5) To start a biomolecular NMR database, in collaboration with BMRB, by opening a mirror site and curating NMR data.

Organization of PDBj
• PDBj committee, at Institute for Protein Research, Osaka Univ. (liaison with wwPDB)
– T. Tsukihara, S. Hase, H. Nakamura (IPR)
– N. Go (JAERI)
– Y. Kai (Fac. Engineering, Osaka Univ.)
– Y. Nishimura (Yokohama City Univ.)
– S. Wakatsuki (Photon Factory)
– T. Iizuka (Spring-8)
• PDBj office: Programmer, SE, Secretary
• PDB curation and data processing group (6), IPR, Osaka Univ.
• BMRB database management group (3), IPR, Osaka Univ.
• PDBML development and data annotation group (5), IPR, Osaka Univ.
• eProtS development group (2+temporary), IPR, Osaka Univ.
• Secondary DB development group
– eF-site, jV (H. Nakamura, K. Kinoshita, IPR, Osaka Univ.)
– ProMode (H. Wako et al., Waseda Univ.)
– ASH (H. Toh et al., Kyoto Univ.)
– Seq. Navigator/Str. Navigator (D. Standley, IPR, Osaka Univ.)
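
As a rough check on how processing is shared among the three sites, the totals in the Distribution of Data Processing table (July 1 2003-June 30 2004) can be recomputed from the monthly counts. The following is a minimal Python sketch, not part of any wwPDB software: the names `processed`, `site_totals`, and `site_shares` are illustrative assumptions, and the only data used are the figures from that table.

```python
# Entries processed per site and month, July 2003 through June 2004,
# copied from the Distribution of Data Processing table in this transcript.
MONTHS = ["Jul", "Aug", "Sept", "Oct", "Nov", "Dec",
          "Jan", "Feb", "Mar", "Apr", "May", "Jun"]

processed = {
    "RCSB-Rutgers": [260, 267, 254, 264, 277, 371, 272, 241, 362, 241, 332, 342],
    "Osaka":        [63, 46, 54, 58, 109, 64, 54, 53, 92, 69, 238, 48],
    "EBI":          [58, 39, 82, 65, 36, 60, 53, 50, 104, 50, 49, 76],
}


def site_totals(data):
    """Sum the monthly counts for each site."""
    return {site: sum(counts) for site, counts in data.items()}


def site_shares(data):
    """Return each site's fraction of all processed entries."""
    totals = site_totals(data)
    grand_total = sum(totals.values())
    return {site: total / grand_total for site, total in totals.items()}


totals = site_totals(processed)
grand_total = sum(totals.values())
shares = site_shares(processed)

# Column sums: total entries processed across all sites in each month.
monthly_totals = [sum(column) for column in zip(*processed.values())]

for site in processed:
    print(f"{site:13s} {totals[site]:5d}  {shares[site]:6.1%}")
print(f"{'Total':13s} {grand_total:5d}")
```

Run as written, the row and column sums match the table, and Osaka (PDBj) accounts for roughly 18% of the entries processed in that period, consistent with the "nearly 20%" figure that PDBj quotes.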
Processed data numbers at PDBj
[Chart: yearly registration number (left axis, 0-4500) and total registration number (right axis, 0-25000) by year, 1972-2004. Series: processed structures by PDBj (yearly PDBj registration number); processed structures in the world (yearly wwPDB registration number); total available structures (total wwPDB registration number)]
We process nearly 20% of the deposited data of the entire world, mainly from the Asian and Oceania regions

Monthly Processing Number at PDBj
[Chart: monthly processed data in 2004, by month (1-10); vertical axis 0-250]

PDBjViewer (jV)

Additional information in xPSSS (as of October 2004)
Total number in PDBML                                                          27,855
GO Information (Biological Process, Molecular Function, Cellular Component)   20,570
Extracted from Literatures by Annotators                                       10,181
Function Information from eF-site                                              10,051
Function Information from Swiss-Prot
  (ACT_SITE, BINDING, DNA_BIND, NP_BIND, ZN_FING, TRANSMEM)                    20,391
Function Information from CATRES – EBI                                            172
extCATERS by Sequence Homology                                                  2,411
Medline Information                                                            22,336

Development of Secondary Databases
• Protein Molecular Surface Database, eF-site (Kinoshita & Nakamura)
• Protein Dynamics Database, ProMode (Wako & Endo)
• Sequence Navigator & Structure Navigator (Standley)
• Alignment of Structural Homologues, ASH (Standley & Toh)
• Encyclopedia of Protein Structures, eProtS (Ito & Nakamura)

PDBj members
H. Nakamura, C. Kamada, H. Sakamoto, D. Standley, T. Kosada, E. Nakatani, A. Paehler, R. Yamashita, A. Yoshihara, Y. Matsuki (BIRD-JST)
H. Akutsu (Institute for Protein Research, Osaka Univ.)
N. Ito (School of Biomedical Science, Tokyo Medical & Dental Univ.)
K. Kinoshita (Institute of Medical Science, Univ. Tokyo)
H. Wako (Waseda Univ.), S. Endo (Kitasato Univ.)
H. Toh (Institute for Chem. Research, Kyoto Univ.)
T. Okawa (Graduate School of Informatics Science, Osaka Univ.)
S. Saeki, A. Takahashi, Y. Shimizu, K. Kobayashi, Y. Ikegawa, R. Igawashi, Y. Kengaku, M. Kusunoki

wwPDBAC Roles

Roles and Responsibilities of the wwPDBAC
• Advice on policy issues
– When to “retire” a format
– Versioning (yes/no)
– Scope of PDB 3D data
– Copyright
• Review adherence to guidelines as per wwPDB agreement
• Provide advice on how best to work with funding agencies
• Advice on new global projects
• Advocacy