Protein-Nucleic Acid Interactions Helen M. Berman
Data Management in the
http://www.pdb.org/ • [email protected]
History of the PDB
1970s
Community discussions about how to establish a PDB
Cold Spring Harbor meeting in protein crystallography
PDB established at Brookhaven (October 1971; 7 structures)
1980s
Number of structures increases as technology improves
Community discussions about requiring depositions
IUCr guidelines established
Number of structures deposited increases
Independent biological databases established – e.g., the NDB
1990s
mmCIF project completed
Structural genomics begins
PDB moves to RCSB
2000s
RCSB PDB renewed
wwPDB established
PDB Mission
To provide the most accurate, well-annotated
data about macromolecular structure in the
most timely and efficient way possible to
facilitate new discoveries and advances in
science
[Figure: number of released entries by year]
The Data Pipeline
Structure Determination Pipeline (X-ray)
Hypothesis-Driven Target Selection
Crystallomics: Isolation, Expression, Purification, Crystallization
Data Collection
Structure Determination
Data Deposition
Publication
Data Release
Data Processing Data Flow
System for Data Collection and Archiving
Depositor
MAXIT
Validation
Data
ADIT (AutoDep Input Tool)
Reports
Final Files
Data Views
Metadata Dictionaries
Database Loader
Data Processing System
Features
Different dictionaries without software changes
Simple customization of both functionality and content
Automatically scales with changes in content
Can be distributed to multiple deposition sites
Reference data and standard nomenclature (ERFs)
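To illustrate what "different dictionaries without software changes" can mean in practice, here is a minimal sketch of dictionary-driven validation. It is not the RCSB implementation; the item names, the dictionary layout, and the `validate` function are all hypothetical and chosen only for illustration.

```python
# Hypothetical dictionary: item name -> rule. A real deposition dictionary
# would be far larger and would be loaded from a file, not hard-coded.
DICTIONARY = {
    "entry_id": {"type": str, "required": True},
    "resolution": {"type": float, "required": True},
    "title": {"type": str, "required": False},
}


def validate(record, dictionary):
    """Return a list of problems in `record`, driven only by `dictionary`."""
    problems = []
    for name, rule in dictionary.items():
        if name not in record:
            if rule["required"]:
                problems.append(f"missing required item: {name}")
            continue
        if not isinstance(record[name], rule["type"]):
            expected = rule["type"].__name__
            problems.append(f"wrong type for {name}: expected {expected}")
    for name in record:
        if name not in dictionary:
            problems.append(f"unknown item: {name}")
    return problems


# Extending the content means extending the dictionary, not the code.
EM_DICTIONARY = dict(DICTIONARY, detector={"type": str, "required": False})
```

Because the rules live in data, the same `validate` function accepts a record with a `detector` item once it is given `EM_DICTIONARY`, and rejects it under the original `DICTIONARY`.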
Data Content
of Each PDB Entry
1970s
Name, source, reference, resolution, sequence, secondary structure, crystal data, coordinates, unstructured remarks
1990s
Name, source, reference, resolution, refinement details, data collection and processing details, symmetry details, biological unit information, missing residues, related entries, sequence, ligands and ions, secondary structure, crystal data, coordinates, few unstructured remarks
Annotation and Validation
ADIT
Reviewing, adding, correcting entry information
MAXIT
File format conversions
Blast Automation Tool results
Validation Server Reports
Ligand Depot, ChemDraw
RasMol for Visualization
PubMed, Citation Tracker, Citation Tool
Extending Data Dictionaries for Deposition
X-ray
Structure determination data items
http://deposit.pdb.org/mmcif/sg-data/xstal.html
NMR
Structure determination data items
http://deposit.pdb.org/mmcif/sg-data/nmr.html
Protein Production
http://deposit.pdb.org/mmcif/sg-data/protprod.html
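Data items in an mmCIF-style file are named `_category.item`, which is what lets a dictionary add new categories for a new method. The sketch below reads only the simplest case, single-line `_category.item value` pairs. The `parse_items` function is hypothetical, the sample values are made up, and real mmCIF files also contain loops and multi-line values that this sketch ignores.

```python
import shlex

SAMPLE = """
data_EXAMPLE
_entry.id EXAMPLE
_exptl.method 'X-RAY DIFFRACTION'
_cell.length_a 50.0
"""


def parse_items(text):
    """Parse single-line `_category.item value` pairs.

    Returns {category: {item: value}} with all values kept as strings.
    Lines that do not start with "_" (data_ headers, comments, blanks)
    and lines without exactly one value are skipped.
    """
    data = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("_"):
            continue
        tokens = shlex.split(line)  # honours quoted values with spaces
        if len(tokens) != 2:
            continue
        name, value = tokens
        category, _, item = name[1:].partition(".")
        data.setdefault(category, {})[item] = value
    return data
```

Grouping by category keeps method-specific items (X-ray, NMR, protein production) separate, so a reader that does not know a category can skip it without failing.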
Growth of Molecular Complexity
Deposition of X-ray, NMR & EM structures by year
[Figure: structures deposited per year, 1969 to 2003, in three series (X-ray, NMR, EM); vertical axis 0 to 2500; March 2005]
Cryo-EM Dictionary Proposal
Sample Description
Biochemical Preparation
EM Specimen Preparation
em_sample_preparation
em_vitrification
em_assembly
em_sample_support
em_entity_assembly
em_array_formation
em_entity_assembly_list
em_solution_composition
em_virus_entity
em_stain
em_cryo_stain
em_embedding_agent
em_filaments
em_imaging
em_detector
em_image_scans
em_microscope
em_micrographs
em_electron_diffraction
em_icos_virus_shells
em_single_particle
EM Data Collection
Image Processing
em_singleparticle_selection
Structure Analysis
em_electron_diffraction_phase
em_3d_fitting
em_electron_diffraction_pattern
em_3d_reconstruction
em_2d_crystal
em_3d_fitting_list
em_particle_picking
em_classes
em_particle_picking_list
em_refinement
em_filament_selection
em_fsc_curve
em_filament_reconstruction
New categories recommended at the Oct 2004 workshop are in pink
Target Registration Database
TargetDB • http://targetdb.pdb.org/
All targets downloadable in XML (~51,000 Targets)
Targets downloaded from 18 centers weekly
Target search by:
Sequence (FASTA), project target ID, project site, status (selected, cloned, expressed, … in PDB), update date, protein name, source organism
Report output in HTML, FASTA, and XML
Integrates PDB entry sequences (~55,600 sequences)
Includes PDB pre-release sequence data
Provides links to related sequence databases
Open to all Structural Genomics projects
Summary reports of target or project progress
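The status search and summary reports described above can be sketched against a downloaded XML file. The XML layout here (`targets`, `target`, the `id` and `status` attributes) is an assumption made for illustration and is not the actual TargetDB schema; the target IDs and names are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real TargetDB download format may differ.
SAMPLE_XML = """
<targets>
  <target id="T0001" status="selected"><name>protein A</name></target>
  <target id="T0002" status="cloned"><name>protein B</name></target>
  <target id="T0003" status="cloned"><name>protein C</name></target>
  <target id="T0004" status="in PDB"><name>protein D</name></target>
</targets>
"""


def targets_by_status(xml_text, status):
    """Return the IDs of all targets whose status matches exactly."""
    root = ET.fromstring(xml_text.strip())
    return [t.get("id") for t in root.iter("target") if t.get("status") == status]


def status_summary(xml_text):
    """Count targets per status, as a simple progress report."""
    counts = {}
    for target in ET.fromstring(xml_text.strip()).iter("target"):
        status = target.get("status")
        counts[status] = counts.get(status, 0) + 1
    return counts
```

For a file with tens of thousands of targets, `ET.iterparse` would avoid holding the whole tree in memory; the in-memory version is kept here for brevity.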
Protein Expression Purification and Crystallization Database (PepcDB)
Extends content of TargetDB
All protocols for cloning, expression, purification are stored and are searchable
Reports provide links to status history, related protocols, project, sequence and domain databases
Tracking, Assembling and Archiving Data
Target Tracking
TargetDB
Target and Protocol Tracking
Protocols
Target Selection
Sample Preparation
PepcDB
Data Collection
Data Processing
Structure Solution
Refinement
PDB
Merging and Integration
Incremental Assembly
Current Query System
Reengineered Web Site
pdbbeta.rcsb.org
Built on curated data
Three-tier architecture
Database tier
Middle tier
Presentation tier
Feedback from users
Help desk
Usability engineering
Focus groups
Went into public beta testing in July 2004
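As a rough illustration of the three-tier split named above, the sketch below keeps storage, query logic, and rendering in separate functions. It is not the RCSB site's code; the table layout, entry IDs, titles, and function names are all hypothetical, and an in-memory SQLite database stands in for the database tier.

```python
import sqlite3
from html import escape


def make_database():
    """Database tier: an in-memory table of made-up entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (id TEXT PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO entry VALUES (?, ?)",
        [("X001", "example protein-DNA complex"), ("X002", "example tRNA")],
    )
    return conn


def find_entries(conn, keyword):
    """Middle tier: turn a keyword query into plain Python records."""
    rows = conn.execute(
        "SELECT id, title FROM entry WHERE title LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return [{"id": row[0], "title": row[1]} for row in rows]


def render(entries):
    """Presentation tier: format records as an HTML list."""
    items = "".join(
        f"<li>{escape(e['id'])}: {escape(e['title'])}</li>" for e in entries
    )
    return f"<ul>{items}</ul>"
```

The presentation function never sees SQL and the database tier never sees HTML, so either can be replaced without touching the other.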
Navigation and Query
Persistent
Integrated Help
Search Box (Context-sensitive)
Persistent Navigation Bar
Site Search
Getting Started
Hierarchical Menu Items
Worldwide PDB (wwPDB)
RCSB (Research Collaboratory for Structural Bioinformatics)
PDBj (Osaka University)
Macromolecular Structure Database (EBI)
To ensure that PDB files remain in a single archive to best serve the worldwide community of depositors and users
http://www.wwpdb.org/
Acknowledgements
Operated by the Research Collaboratory for Structural Bioinformatics
Supported by:
NIGMS
RCSB PDB Team: Ken Addess, Helen M. Berman, Wolfgang F. Bluhm, Phil Bourne, Kyle Burkhardt, Li Chen, Sharon Cousin,
Jim Croker, Nita Deshpande, Shuchismita Dutta, Zukang Feng, Lew-Christiane Fernandez, Judith L. Flippen-Anderson, Gary
Gilliland, Rachel Kramer Green, Vladimir Guranovic, Shri Jain, Ann Kagehiro, Charlie Knezevich, Andrei Kouranov, Kevin
Lwinmoe, Jeff Merino-Ott, Irina Persikova, Suzanne Richman, Melcoir Rosas, Kathryn Rosecrans, Bohdan Schneider, Wayne
Townsend-Merino, Susan Van Arnum, Elizabeth Walker, John Westbrook, Alice Xenachis, Huanwang Yang, Jasmin Yang,
Christine Zardecki, Cindy Zhang