Professional Development Course 1 –
Molecular Medicine
Gene/Protein Knowledge Bases
June 14, 2012
Ansuman Chattopadhyay, PhD
Head, Molecular Biology Information Services
Health Sciences Library System
University of Pittsburgh
[email protected]
http://www.hsls.pitt.edu/guides/genetics
Objectives
•Gene-centric information gateways
•Protein-centric information hubs
http://www.hsls.pitt.edu/molbio
Topics
Gold Standards:
•NCBI Gene
•EBI UniProt
Noteworthy Databases
•UCSC Gene Detail Page
•GeneCards
Commercial – HSLS Licensed Knowledge Bases
•NextBio
•BioBase Knowledge Library
•Ingenuity IPA
•MetaCore
Gene/Protein Information
Chromosomal location, mRNA, genomic sequence, orthologs, paralogs, regulatory elements
Amino acid sequence, domain architecture, protein structure, post-translational modifications
Gene expression, biological pathways, protein interaction map, disease association, biomarkers
Common gene questions
•What diseases are associated with it?
•What is its function?
•Which tissues express it?
•What are its neighboring genes?
•What is its genomic sequence?
•How many splice variants are there?
•What is its intron-exon architecture?
•How can I get its cDNA clone?
Common protein questions
What is its:
•Function?
•Amino acid sequence?
… molecular weight? isoelectric point (pI)?
… post-translational modifications?
… presence of domain/pattern/profile?
… hydrophobicity?
… homologs and orthologs? etc.
•Structure?
… secondary and tertiary?
•Interaction partners?
Bioinformatics Databases & Software Providers
National Center for Biotechnology Information (NCBI)
•Home page
•Site map
•Resource Guide
European Bioinformatics Institute (EBI)
•Home page
•Databases
•Software
Entrez Gene
•Each record represents a single gene from a given organism
Statistics
•Gene: 7974 organisms
•GenBank: 160,000 organisms
NCBI: Entrez Gene
[Figure: information linked from an Entrez Gene record: chromosomal localization; genomic, mRNA, and amino acid sequences; homologous sequences; expression profile; disease; 3D structure; SNP; interacting partners]
Entrez Gene
Find:
•gene symbols and aliases
•sequences: genomic, mRNA, protein
•intron-exon architecture
•genomic context: neighboring and antisense genes
•interacting partners
•associated Gene Ontology terms: function, cellular component, and biological process
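As a rough illustration of how a gene record like the one described above could be looked up programmatically, the sketch below only builds a query URL for NCBI's E-utilities ESearch service; it does not send the request. The helper name gene_search_url is hypothetical, and the field tags [sym] and [orgn] are assumed from NCBI Gene's search syntax, so check them against the current Gene help pages before relying on them.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_search_url(symbol, organism="human"):
    """Build (but do not send) an ESearch URL for a gene symbol.

    Hypothetical helper: the [sym] and [orgn] field tags are assumptions
    taken from NCBI Gene's query syntax.
    """
    term = f"{symbol}[sym] AND {organism}[orgn]"
    params = {"db": "gene", "term": term, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


url = gene_search_url("EGFR")
print(url)
```

Opening the printed URL in a browser should return the matching Gene IDs, which can then be used to open the full record.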
Sequence Information
Find sequence information for a gene
-genomic
-mRNA
-promoter
-protein
-intron-exon coordinates
Resources
NCBI Entrez Gene: http://www.ncbi.nlm.nih.gov/gene
Links to the video tutorials:
http://media.hsls.pitt.edu/media/clres2705/sequence.swf
http://media.hsls.pitt.edu/media/clres2705/sequence_2.swf
NCBI Sequence Databases
GenBank
•archival database of nucleotide sequences from >160,000 organisms
GenPept
•conceptual translation of GenBank CDS
RefSeq
•based on GenBank records; non-redundant, expert-verified database of reference sequences
International Nucleotide Sequence Database Collaboration
Primary vs. Derivative Databases
RefSeq Scope & Accessions
Genomic DNA
•NC_123456 - complete genome, complete chromosome, complete plasmid
•NG_123456 - genomic region
•NT_123456 - genomic contig
mRNA - NM_123456
Protein - NP_123456
more about RefSeq scope and accessions...
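The accession prefixes listed above lend themselves to a simple lookup. The following is a minimal sketch that covers only the five prefixes named on this slide; RefSeq defines more prefixes than these, and the function name describe_refseq is hypothetical.

```python
# Prefix-to-molecule mapping, limited to the prefixes shown on the slide.
REFSEQ_PREFIXES = {
    "NC": "complete genome, chromosome, or plasmid",
    "NG": "genomic region",
    "NT": "genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}


def describe_refseq(accession):
    """Return the molecule type for a RefSeq-style accession, or None.

    None is returned when there is no underscore or the prefix is not
    one of the five listed above.
    """
    prefix, sep, _ = accession.partition("_")
    if not sep:
        return None
    return REFSEQ_PREFIXES.get(prefix.upper())


# The placeholder accessions from the slide.
for acc in ("NC_123456", "NM_123456", "NP_123456"):
    print(acc, "->", describe_refseq(acc))
```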

UniProt: Protein Knowledge Base
UniProtKB
Universal Protein Resource: a comprehensive, centralized protein information resource
Developed by a consortium:
•European Bioinformatics Institute (EBI)
•Swiss Institute of Bioinformatics (SIB)
•Protein Information Resource (PIR)
Comprised of:
--Swiss-Prot: biologist-curated annotation data
--TrEMBL: computationally annotated data
--PIR-International Protein Sequence Database (PIR-PSD)
Funded by: NIH, NSF, the European Union, and the Swiss Federal government
Tutorial Video: http://www.youtube.com/watch?v=TCF3qWn7siI&feature=youtube_gdata
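The Swiss-Prot/TrEMBL split is visible in the FASTA files that UniProt distributes: the header begins with sp for Swiss-Prot entries and tr for TrEMBL entries. The sketch below parses one such header. The example line is written in the style of a UniProtKB FASTA header for human EGFR, and its exact trailing fields should be treated as an assumption; parse_uniprot_header is a hypothetical helper.

```python
def parse_uniprot_header(header):
    """Split a UniProtKB-style FASTA header into a few named fields.

    Assumes the layout '>db|ACCESSION|ENTRY_NAME description ...'.
    """
    db_code, accession, rest = header.lstrip(">").split("|", 2)
    entry_name = rest.split(" ", 1)[0]
    # 'sp' marks reviewed Swiss-Prot entries, 'tr' marks TrEMBL entries.
    section = {"sp": "Swiss-Prot", "tr": "TrEMBL"}.get(db_code, "unknown")
    return {"section": section, "accession": accession, "entry_name": entry_name}


# Illustrative header; the fields after the entry name are an assumption.
header = (">sp|P00533|EGFR_HUMAN Epidermal growth factor receptor "
          "OS=Homo sapiens OX=9606 GN=EGFR PE=1 SV=2")
info = parse_uniprot_header(header)
print(info)
```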
PROTEIN
sequences, domains,
post-translational modifications
& structures
Start with a protein sequence and find the following:
-domains
-post-translational modifications
-secondary structures
-calculated molecular weight and isoelectric point
-hydrophobicity plot
-peptide digestion
Resources
UniProt: http://www.uniprot.org/
Links to the video tutorials:
http://media.hsls.pitt.edu/media/clres2705/uniprot.swf
http://media.hsls.pitt.edu/media/clres2705/uniprot2.swf
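Two of the sequence-derived properties in the list above, peptide digestion and hydrophobicity, can be approximated in a few lines. This is a sketch under stated assumptions, not what any particular web tool does: the digestion uses the common trypsin rule (cleave after K or R unless the next residue is P), and the hydrophobicity values are the Kyte-Doolittle scale, chosen here because the slide does not name a scale. The function names and the short test peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the slide names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def trypsin_digest(seq):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal remainder
    return peptides


def gravy(seq):
    """Mean hydropathy over the whole sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of each sliding window; these are the plot's y-values."""
    return [gravy(seq[i:i + window]) for i in range(len(seq) - window + 1)]


peptides = trypsin_digest("MKRPSAKGR")  # hypothetical toy peptide
print(peptides)
```

Plotting the values from hydropathy_profile against window position gives a basic hydrophobicity plot; stretches well above zero are candidates for hydrophobic regions.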
Protein Domains

Wikipedia:

A protein domain is a part of protein sequence and
structure that can evolve, function, and exist independently
of the rest of the protein chain. Each domain forms a
compact three-dimensional structure and often can be
independently stable and folded. Many proteins consist of
several structural domains. One domain may appear in a
variety of evolutionarily related proteins. Domains vary in
length from between about 25 amino acids up to 500 amino
acids in length. The shortest domains such as zinc fingers
are stabilized by metal ions or disulfide bridges. Domains
often form functional units, such as the calcium-binding EF
hand domain of calmodulin.
Protein Domain: SH3
Src homology 3 domains; SH3 domains bind to proline-rich ligands
with moderate affinity and selectivity, preferentially to PxxP motifs;
they play a role in the regulation of enzymes by intramolecular
interactions, change the subcellular localization of signaling pathway
components, and mediate multiprotein complex assemblies.
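The PxxP motif mentioned above (proline, any two residues, proline) is easy to scan for in a sequence. The sketch below reports every occurrence, including overlapping ones. The function name and the toy sequence are hypothetical, and a match is only a candidate site, not evidence of SH3 binding.

```python
import re


def find_pxxp(seq):
    """Return (1-based position, motif) for every PxxP match, overlaps included."""
    # The lookahead lets overlapping motifs such as PPLPP be reported twice.
    return [(m.start() + 1, m.group(1)) for m in re.finditer(r"(?=(P..P))", seq)]


hits = find_pxxp("AAPPLPPRAA")  # hypothetical toy sequence
print(hits)
```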
Homologous Sequences
HomoloGene
What are its homologous genes?
Entrez Gene page: link to HomoloGene, then change Display settings
Find homologous sequence information for a gene
-What % identity does a human gene share with its mouse homologue?
Resources
NCBI Entrez Gene: http://www.ncbi.nlm.nih.gov/gene
Link to the video tutorial:
http://media.hsls.pitt.edu/media/clres2705/homologene.swf
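To make the "% identity" figure concrete, here is a minimal sketch of one common way to compute it from two sequences that have already been aligned. It counts identical residues over the aligned columns that contain no gap. Tools differ in the denominator they use, so this is not necessarily how HomoloGene derives its numbers; the function name and the two short aligned strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


pid = percent_identity("MKT-LV", "MRTALV")  # hypothetical toy alignment
print(pid)
```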
NCBI Entrez Gene
Published Probe Sequences
Retrieve probe sequences published in the literature for a gene
•gene silencing (siRNA)
•real-time PCR
•genotyping
Resources
NCBI Probe Database: http://www.ncbi.nlm.nih.gov/probe
Ingenuity IPA:
Link to the video tutorial:
http://media.hsls.pitt.edu/media/clres2705/probe.swf
Functions
Gene Ontology (GO)
http://www.geneontology.org/
Levels of abstraction
Gene Ontology (GO)
Khatri, P. et al. Bioinformatics 2005 21:3587-3595; doi:10.1093/bioinformatics/bti565
General gene / protein
Information
functions, mutations,
disease associations,
biomarkers & drug interactions
HSLS Licensed Databases
•MetaCore from GeneGo: portal.genego.com
•BKL from BIOBASE: http://goo.gl/9wpwG
•NextBio: http://goo.gl/bpuUC
MetaCore: Search and Browse
… retrieve information for your gene of interest
… find drugs inhibiting kinases involved in apoptosis
… find common genes for breast cancer and colorectal cancer
… find common targets for erlotinib and gefitinib
Resource
MetaCore: http://portal.genego.com/
Links to the video tutorials:
http://media.hsls.pitt.edu/media/molbiovideos/metacore1.swf
http://media.hsls.pitt.edu/media/molbiovideos/metacore2.swf
Gene Expression
Information
Retrieve gene expression information
-What is the expression level of EGFR in human liver tissue, in the HeLa cell line, or in colon cancer?
Resources
EBI Gene Expression Atlas: http://www.ebi.ac.uk/gxa/
BioGPS: http://biogps.gnf.org/#goto=welcome
GeneCards: http://www.genecards.org
Link to the video tutorial:
http://media.hsls.pitt.edu/media/clres2705/expression.swf
GeneCards
http://www.genecards.org/
Protein Interactions &
Biological Pathways
Signaling Pathway Map
Biological Pathways & PPI Databases
•BioGRID: http://thebiogrid.org/
•STRING: http://string-db.org/
-Retrieve the interacting partners of a protein of interest
-What proteins interact with human EGFR?
Resources
BioGRID: http://thebiogrid.org/
STRING: http://string-db.org/
Link to the video tutorial:
http://media.hsls.pitt.edu/media/molbiovideos/ppi.swf
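Interaction databases generally let you download results as a list of protein pairs, and the question of which proteins interact with human EGFR then reduces to filtering that list. The sketch below works on a tiny hand-written edge list that is for illustration only; it is not an export from BioGRID or STRING, and the function name is hypothetical.

```python
def interaction_partners(edges, protein):
    """Return the sorted partners of `protein` in an undirected edge list."""
    partners = set()
    for a, b in edges:
        if a == protein:
            partners.add(b)
        elif b == protein:
            partners.add(a)
    return sorted(partners)


# Toy edge list for illustration only; not downloaded from any database.
toy_edges = [("EGFR", "GRB2"), ("SHC1", "EGFR"), ("GRB2", "SOS1")]
egfr_partners = interaction_partners(toy_edges, "EGFR")
print(egfr_partners)
```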
BioGRID
BioGRID Search Result Page
Thank you!
Any questions?
Carrie Iwema
[email protected]
412-383-6887
Ansuman Chattopadhyay
[email protected]
412-648-1297