Transcript Slide 1

http://www.pdb.org/

Experimental approaches for structural biology

• X-ray crystallography
• NMR
• cryoEM

cryoEM

Where to get structural data?

• biological molecules (free)
– PDB – Protein Data Bank http://www.pdb.org
– NDB – Nucleic Acid Database http://ndbserver.rutgers.edu/
• organic molecules (paid)
– CSD – Cambridge Structural Database

PDB History

1957

• Myoglobin structure determined

1970’s

• Discussions on how to establish an archive of protein structures
• PDB established at Brookhaven – Oct 1971, 7 structures

1980’s

• Technology takes off – molecular biology, instrumentation, computer hardware and software
• Number of structures increases
• Structural biology is able to focus on medical problems
• IUCr requires data deposition to the PDB

1990’s

• Complexity of structures increases
• Structural genomics begins

Current state of the PDB

• 23. 11. 2014 – 105 025 structures in the PDB archive
• 8 550 new structures deposited in 2014 so far
• Depositions by macromolecule type:
– 92.6 % Proteins (97 089 structures)
– 2.8 % Nucleic acids (2 769 structures)
– 4.5 % Protein-nucleic acid complexes (5 143 structures)
• Depositions by experimental technique:
– 88.0 % X-ray diffraction (93 200 structures)
– 11.2 % solution NMR (10 705 structures)
– 0.5 % cryo-EM (864 structures)

data as of 26. 11. 2012 http://www.pdb.org/pdb/static.do?p=general_information/pdb_statistics/index.html
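The shares on this slide are simple ratios of a category count to the archive total. A minimal sketch of that arithmetic, using only the counts listed above (the function and variable names are illustrative, not part of any PDB tool). Values recomputed this way can differ slightly from the percentages quoted on the slide.

```python
# Counts copied from the slide (PDB archive, 23. 11. 2014).
TOTAL = 105_025

BY_MOLECULE_TYPE = {
    "Proteins": 97_089,
    "Nucleic acids": 2_769,
    "Protein-nucleic acid complexes": 5_143,
}

BY_TECHNIQUE = {
    "X-ray diffraction": 93_200,
    "Solution NMR": 10_705,
    "Cryo-EM": 864,
}


def shares(counts, total):
    """Return each count as a percentage of total, rounded to one decimal."""
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for table in (BY_MOLECULE_TYPE, BY_TECHNIQUE):
        for name, pct in shares(table, TOTAL).items():
            print(f"{pct:5.1f} %  {name}")
```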

PDB ID

• Each structure in the PDB is represented by a 4-character identifier of the form [0-9][a-z,0-9][a-z,0-9][a-z,0-9]
• Example: 1B3T
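The identifier pattern above maps directly onto a regular expression. A minimal sketch, assuming that upper and lower case are equivalent (the slide's pattern uses lower case, its example upper case); `is_pdb_id` is an illustrative helper name.

```python
import re

# Pattern from the slide: one digit followed by three letters or digits.
# Case-insensitive matching is an assumption, made because the slide's own
# example (1B3T) is written in upper case.
PDB_ID = re.compile(r"[0-9][a-z0-9]{3}", re.IGNORECASE)


def is_pdb_id(text: str) -> bool:
    """Return True if text has the shape of a 4-character PDB identifier."""
    return PDB_ID.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("1B3T", "1b3t", "B13T", "1B3T5"):
        print(candidate, is_pdb_id(candidate))
```

This checks only the shape of the identifier; it does not tell whether an entry with that ID exists in the archive.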

Data formats of PDB

• PDB format, mmCIF (and derived XML PDBML)
• Dictionary resources at: http://mmcif.pdb.org/
• mmCIF is the PDB archival format
– all data released in all three formats

PDB Format

• legacy format http://www.wwpdb.org/docs.html
• Fortran-like, 80 columns wide
• not structured enough to describe complicated 3D objects
• its limits have been broken several times
– 99,999 atoms, 34 (or 58) chains
• readable by most programs

model – chain – residue – atom
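Because the PDB format is fixed-width, a record is read by column position rather than by splitting on whitespace. The sketch below reads the chain, residue and atom levels of the hierarchy from one ATOM record, using the column positions given in the wwPDB format documentation. The `Atom` class, the function name and the sample record are illustrative assumptions, not data from a real entry.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record using fixed column positions."""
    if line[0:6].strip() not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Hypothetical record, written field by field so the column widths are visible.
SAMPLE = (
    "ATOM  " "    1" " " " N  " " " "MET" " " "A" "   1" " " "   "
    "  27.340" "  24.430" "   2.614" "  1.00" "  9.67"
)

if __name__ == "__main__":
    print(parse_atom_line(SAMPLE))
```

The five-column serial field and the single-character chain field are where the atom and chain limits of the format come from.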

mmCIF language

• based on community-agreed definitions
• allows adding new features and customization
• mmCIF categories are easily transformed to database tables
• not designed to be read by humans, data should be viewed through programs and databases
http://ich.vscht.cz/~cechp/mmcif/
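The point about categories mapping onto database tables can be illustrated with an mmCIF loop_ block: the category name plays the role of a table, the item names are columns, and each data line is a row. The sketch below is deliberately simplified. It assumes one row per line and values without quotes or embedded spaces, which real mmCIF files do not guarantee, so a dedicated mmCIF parser should be used for real files. The function name and the sample data are illustrative.

```python
def loop_to_rows(text: str):
    """Turn one simple mmCIF loop_ block into (category, list of row dicts)."""
    category = None
    columns = []
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_"):
            # "_atom_site.id" -> category "atom_site", column "id"
            category, _, column = line[1:].partition(".")
            columns.append(column)
        else:
            # Simplification: values are split on whitespace only.
            rows.append(dict(zip(columns, line.split())))
    return category, rows


# Hypothetical fragment using item names from the atom_site category.
SAMPLE_LOOP = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
ATOM 1 N MET A
ATOM 2 CA MET A
"""

if __name__ == "__main__":
    table, records = loop_to_rows(SAMPLE_LOOP)
    print(table)
    for record in records:
        print(record)
```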

Pubmed, MEDLINE, Entrez etc.

http://www.pubmed.gov

http://www.pubmed.org

NCBI

National Institutes of Health (NIH) – U. S. government
– National Library of Medicine (NLM)
– National Center for Biotechnology Information (NCBI)

NCBI (founded 1988, http://www.ncbi.nlm.nih.gov/)

• Genomic sequences – GenBank – open access annotated collection of all available nucleotide sequences, doubles every 18 months (October 2008 – 97 381 682 336 bp), new release every 2 months, accession number (e.g. U49845) required upon publication
• OMIM – Online Mendelian Inheritance in Man, database of diseases together with their genetic components
• PubChem (http://pubchem.ncbi.nlm.nih.gov/) – database of small organic molecules, includes information about their bioactivities
• Entrez (http://www.ncbi.nlm.nih.gov/sites/gquery) – federated search engine offering unified access to all NCBI databases
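Besides the web page, Entrez can be queried from a program through NCBI's E-utilities, which the slides do not mention. The sketch below only builds request URLs for the `esearch` and `efetch` endpoints and does not send them. The helper names and the search term are illustrative assumptions; U49845 is the accession number quoted above.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (not mentioned on the slides).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build a search URL that would return record IDs matching term in db."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS}/esearch.fcgi?{query}"


def efetch_url(db: str, record_id: str, rettype: str, retmode: str = "text") -> str:
    """Build a URL that would fetch one record by ID or accession number."""
    query = urlencode(
        {"db": db, "id": record_id, "rettype": rettype, "retmode": retmode}
    )
    return f"{EUTILS}/efetch.fcgi?{query}"


if __name__ == "__main__":
    # Search PubMed for a phrase, then fetch a GenBank record by accession.
    print(esearch_url("pubmed", "protein data bank"))
    print(efetch_url("nuccore", "U49845", "gb"))
```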

MEDLINE

• journal citations and abstracts for biomedical literature
• since 1996 – free access to MEDLINE via PubMed
• PubMed – Web-based retrieval system developed by the NCBI at the NLM; it is part of NCBI's Entrez

• PubMed contains
– abstracts
– links to full-text articles
– links to other databases
– …and much more

What’s in Pubmed

• Most PubMed records are MEDLINE citations.
– citations and author abstracts from approx. 5 200 biomedical journals
– diverse topics: microbiology, delivery of health care, nutrition, pharmacology and environmental health
– currently over 19 million references dating back to 1948
– new material added Tuesday through Saturday
– about 90% of records are from English-language sources or have English abstracts
– approximately 79% of the citations include the published abstract

What’s in Pubmed

• PubMed Central (PMC)
– http://www.pubmedcentral.nih.gov/
– database of free full texts
– since 2007, papers funded by NIH must be freely available through PMC no later than 12 months after publication
• NCBI Bookshelf
– http://www.ncbi.nlm.nih.gov/sites/entrez?db=books
– free biomedical books (biochemistry, molecular biology, …)