OPTICAL MAPPING AS A
METHOD OF WHOLE GENOME
ANALYSIS
MAY 4, 2009
COURSE: 22M:151
PRESENTED BY:
AUSTIN J. RAMME
Presentation Outline
Introduction to Optical Mapping
Definitions for Understanding
Modern Optical Mapping Process
Data Analysis
◦ Overview
◦ Steps to Restriction Map Generation
Applications of Optical Mapping
Conclusions
Optical Mapping (OM) Introduction
The number of identified polygenic diseases continues to
increase
Methods to analyze the entire genome will enhance current
diagnostic and treatment methods for a variety of diseases
Patient-specific genomic analysis has become the goal in
genetics-based medical research
Optical mapping (OM) is an automated method of ordered
restriction map generation with a goal of whole genome
analysis that avoids the limitations inherent to traditional
techniques
Definitions
Restriction Enzymes
◦ Proteins that cleave DNA molecules based on
a specific base pair sequence (e.g. ATCG)
[Figure: restriction enzyme + DNA strand = cleaved DNA]
Image sources:
http://www.belchfire.net/screenshots/Pacman.jpg
http://www.dnavitaminpro.com/wp-content/uploads/2008/07/dna-horizontal.jpg
http://static.rbytes.net/full_screenshots/z/e/zenwaw-pacman.jpg
Definitions
Restriction Map
◦ Representation of the cut sites on a given DNA molecule to
provide spatial information of genetic loci
DNA strand
Optical Mapping
◦ Process to generate ordered restriction maps from single DNA
molecules
Optical Map
◦ Ordered restriction map of a portion of genomic DNA
[2]
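The definitions above can be illustrated with a minimal sketch that builds an ordered restriction map from a known sequence. The function name, the toy sequence, and the `cut_offset` parameter are assumptions made for illustration. The recognition site GAATTC (EcoRI, which cuts after the G) is used as a real-world example. In optical mapping the sequence is not known in advance, so this only shows what a restriction map represents.

```python
def restriction_map(sequence, site, cut_offset=0):
    """Return (cut_positions, fragment_lengths) for a linear DNA sequence.

    cut_offset is the position within the recognition site where the
    enzyme cleaves (hypothetical parameter for this sketch).
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1  # continue searching after this occurrence
    # Fragment boundaries: molecule start, each cut site, molecule end.
    bounds = [0] + cuts + [len(sequence)]
    fragments = [b - a for a, b in zip(bounds, bounds[1:])]
    return cuts, fragments


# Toy 20-bp molecule containing two EcoRI sites.
seq = "AAGAATTCGGGGGAATTCTT"
cuts, frags = restriction_map(seq, "GAATTC", cut_offset=1)
print(cuts, frags)
```

The ordered list of fragment lengths is the "barcode" that an optical map approximates experimentally.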
Slide Removed for Online Posting
Computer Representation of Imaging Data
Imaged datasets are converted into barcode
patterns corresponding to the cleaved
fragments
Lengths are determined using an internal λ
standard and fluorescence intensity values
Imaged Cleaved DNA Fragments
[5]
Computer Representation of
Ordered DNA Fragments
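As a rough sketch of the sizing step, fragment lengths can be estimated by scaling each fragment's integrated fluorescence intensity against the co-imaged λ standard. This assumes intensity is linearly proportional to length and that the standard is full-length λ phage DNA (about 48.5 kb). The function name and the intensity values are hypothetical.

```python
LAMBDA_KB = 48.5  # approximate length of lambda phage DNA, in kilobases


def size_fragments(intensities, lambda_intensity, lambda_kb=LAMBDA_KB):
    """Convert integrated fluorescence intensities to lengths in kb.

    Assumes a linear intensity-to-length relationship calibrated by a
    single lambda standard measured in the same image.
    """
    return [i * lambda_kb / lambda_intensity for i in intensities]


# Hypothetical intensities (arbitrary units) for three ordered fragments.
sizes_kb = size_fragments([1200.0, 3000.0, 600.0], lambda_intensity=2400.0)
print(sizes_kb)
```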
Raw Data
Description
◦ Image collection containing genomic restriction fragments of
known length deposited in an ordered manner
◦ Fragments represent randomly sheared genomic DNA
◦ Each OM imaging study redundantly represents the entire
genomic region of interest
Challenges with analyzing individual DNA molecules:
◦ Extra cut sites - physical breakage
◦ Missing cut sites - partial digestion
◦ Loss of small fragments
◦ Sizing error
◦ Chimeric maps - physically overlapped molecules
Combining multiple OMs gives more accurate
restriction maps
Graph-based methods have been used to accomplish this
Steps to Restriction Map Generation
1. Calculation of OM Overlaps
2. Overlap Graph Construction
3. Graph Correction Procedure
4. Identification of Islands
5. Contig Construction
6. Construction of Draft Consensus Map
7. Consensus Map Refinement
Calculation of Overlaps
A multitude of OMs are collected per optical mapping
experiment
Scoring system used to find overlaps between individual
optical maps:
[6]
Scoring system components:
Matching sites are rewarded
Discordant sites are penalized
Length similarity is rewarded
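The three scoring components can be sketched as a small dynamic program over two maps given as ordered fragment lengths. This is an illustrative stand-in, not the scoring function of reference [6]. The function name, the weights, and the `max_merge` limit are arbitrary assumptions. It scores an end-to-end alignment rather than a partial overlap.

```python
def alignment_score(a, b, match=1.0, miss=0.5, length_weight=3.0, max_merge=3):
    """Best end-to-end alignment score of two fragment-length lists.

    Each aligned pair of cut sites earns `match`; every cut site skipped
    between two aligned sites costs `miss`; the fragments between aligned
    sites earn up to `length_weight` for having similar total length.
    """
    n, m = len(a), len(b)
    neg = float("-inf")
    score = [[neg] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for p in range(max(0, i - max_merge), i):
                for q in range(max(0, j - max_merge), j):
                    if score[p][q] == neg:
                        continue
                    la, lb = sum(a[p:i]), sum(b[q:j])
                    similarity = 1.0 - abs(la - lb) / max(la, lb)
                    skipped = (i - p - 1) + (j - q - 1)  # discordant sites
                    cand = (score[p][q] + match
                            + length_weight * similarity - miss * skipped)
                    score[i][j] = max(score[i][j], cand)
    return score[n][m]


reference = [10, 20, 30]
same = alignment_score(reference, [10, 20, 30])
missing_cut = alignment_score(reference, [10, 50])   # one cut site missing
unrelated = alignment_score(reference, [40, 5, 15])
print(same, missing_cut, unrelated)
```

With these toy weights, the identical map scores highest, the map with one missing cut site scores lower, and the unrelated map scores lowest.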
Overlap Graph Construction
Overlap Graph = G(V,E)
◦ The literature describes it as a graph, but it is technically a digraph
◦ The set of nodes (V) represents individual optical maps
◦ The set of edges (E) represents high-quality overlaps between pairs of
maps
Weighting and orienting the edges of the graph
◦ Edge weights correspond to genomic distances of the overlapping
map regions
◦ Orientation based on the sign of distance measurements from
neighboring map centerpoints
Goal: Heaviest weight path represents the most comprehensive
genomic restriction map
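A minimal sketch of the construction described above, assuming the pairwise overlaps have already been scored. The tuple layout, the score threshold, and all values are hypothetical. Edge direction is taken from the sign of the offset between map centerpoints, and the edge weight is whatever genomic distance the caller supplies.

```python
def build_overlap_graph(overlaps, min_score):
    """Build a weighted digraph {node: {successor: weight_kb}}.

    overlaps: iterable of (map_a, map_b, score, weight_kb, center_offset_kb),
    where a positive center_offset_kb means map_b lies downstream of map_a.
    """
    graph = {}
    for a, b, score, weight_kb, center_offset_kb in overlaps:
        # Every optical map is a node, even if it has no accepted overlap.
        graph.setdefault(a, {})
        graph.setdefault(b, {})
        if score < min_score:
            continue  # keep only high-quality overlaps as edges
        src, dst = (a, b) if center_offset_kb >= 0 else (b, a)
        graph[src][dst] = weight_kb
    return graph


overlaps = [
    ("OM1", "OM2", 9.0, 120.0, 80.0),
    ("OM3", "OM2", 8.0, 100.0, -60.0),  # negative offset: edge OM2 -> OM3
    ("OM3", "OM4", 2.0, 30.0, 10.0),    # below threshold: no edge
]
graph = build_overlap_graph(overlaps, min_score=5.0)
print(graph)
```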
[Figure: optical mapping data (OM1, OM2, OM3, OM4, …) used for graph construction]
Graph Correction Procedure (1)
False edges correspond to
falsely identified overlaps
◦ Spurious edges: connect two nodes to form a cycle, which is
not possible in linear DNA [4]
◦ Orientation-consistent false overlaps (cut edges): connect two
unrelated portions of the genome [4]
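Cycles created by spurious edges can be located with a depth-first search. The sketch below reports the "back edges" that close a cycle in a digraph; removing them leaves an acyclic graph. It only locates cycles and does not decide which edge in a cycle is actually false, so it is not the correction procedure of reference [4]. The function name and toy graph are assumptions.

```python
def find_back_edges(graph):
    """Return edges (u, v) that close a directed cycle.

    graph: {node: iterable of successors}; every node must be a key.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {v: WHITE for v in graph}
    back_edges = []

    def visit(u):
        color[u] = GRAY
        for v in graph[u]:
            if color[v] == GRAY:      # v is an ancestor on the current path
                back_edges.append((u, v))
            elif color[v] == WHITE:
                visit(v)
        color[u] = BLACK

    for v in graph:
        if color[v] == WHITE:
            visit(v)
    return back_edges


# Toy overlap graph in which the edge C -> A closes a cycle.
toy = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
cycle_edges = find_back_edges(toy)
print(cycle_edges)
```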
Graph Correction Procedure (2)
False nodes correspond to chimeric maps
◦ Appear as a single node (cut vertex) that is the only
connection between two groups of nodes
◦ Connect two unrelated portions of the genome
[4]
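Candidate chimeric maps can be flagged by finding cut vertices (articulation points) in the undirected view of the overlap graph. The sketch below uses the standard depth-first low-link method. It assumes a simple graph given as a symmetric adjacency list. The function name and example are hypothetical. A cut vertex is only a candidate, since a genuine map in a sparsely covered region can also be one.

```python
def cut_vertices(adj):
    """Return the set of articulation points of an undirected graph.

    adj: {node: list of neighbours}, symmetric, with no parallel edges.
    """
    disc, low, cuts = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                low[u] = min(low[u], disc[v])  # back edge to an ancestor
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # v's subtree cannot reach above u: u separates it.
                if parent is not None and low[v] >= disc[u]:
                    cuts.add(u)
        if parent is None and children > 1:
            cuts.add(u)  # a DFS root is a cut vertex if it has 2+ subtrees

    for v in adj:
        if v not in disc:
            dfs(v, None)
    return cuts


# Two well-connected groups (A, B) and (C, D) joined only through X.
adj = {
    "A": ["B", "X"], "B": ["A", "X"],
    "X": ["A", "B", "C", "D"],
    "C": ["X", "D"], "D": ["C", "X"],
}
suspects = cut_vertices(adj)
print(suspects)
```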
Identification of Islands
Islands correspond to genomic regions spanned by multiple
overlapping optical maps
[4]
Island 1
Island 2
Island 3
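One simple way to sketch island identification is to treat each island as a connected component of the corrected overlap graph, ignoring edge direction. This interpretation, the function name, and the toy graph are assumptions for illustration.

```python
def find_islands(graph):
    """Group nodes of a digraph {node: {successor: weight}} into
    weakly connected components, returned as sorted lists."""
    undirected = {v: set() for v in graph}
    for u, successors in graph.items():
        for v in successors:
            undirected[u].add(v)
            undirected.setdefault(v, set()).add(u)

    seen, islands = set(), []
    for start in undirected:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            u = stack.pop()
            members.append(u)
            for v in undirected[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        islands.append(sorted(members))
    return sorted(islands)


toy = {
    "OM1": {"OM2": 1.0}, "OM2": {},
    "OM3": {"OM4": 1.0}, "OM4": {},
    "OM5": {},
}
islands = find_islands(toy)
print(islands)
```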
Contig Construction
For each island, “contigs” are defined as paths from sources
to sinks within the overlap graph for the island
The most complete representation of the genomic region is
the heaviest-weight path from source to sink
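Once cycles have been removed, the heaviest path in an island can be found with a topological sort followed by dynamic programming. The sketch below assumes a non-empty acyclic digraph with positive edge weights, so the heaviest path necessarily runs from a source to a sink. The function name and the weights are hypothetical.

```python
def heaviest_path(graph):
    """Return (path, total_weight) of the heaviest path in a DAG.

    graph: {node: {successor: positive weight}}; every node must be a key.
    """
    indegree = {v: 0 for v in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1

    # Kahn's algorithm for a topological order.
    order, ready = [], [v for v in graph if indegree[v] == 0]
    while ready:
        u = ready.pop()
        order.append(u)
        for v in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")

    best = {v: 0.0 for v in graph}   # heaviest path weight ending at v
    prev = {v: None for v in graph}  # predecessor on that path
    for u in order:
        for v, w in graph[u].items():
            if best[u] + w > best[v]:
                best[v] = best[u] + w
                prev[v] = u

    end = max(best, key=best.get)
    path = [end]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1], best[end]


island = {
    "OM1": {"OM2": 80.0, "OM3": 50.0},
    "OM2": {"OM4": 70.0},
    "OM3": {"OM4": 120.0},
    "OM4": {},
}
path, weight = heaviest_path(island)
print(path, weight)
```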
Construction of Draft Consensus Map
Using the determined paths, the nodes and
edges are used to merge the individual
optical maps corresponding to each chosen
island component
Each individual composite optical
map is stored for further analysis
[4]
Consensus Map Refinement (1)
The draft map may contain errors:
◦ Missing cut sites
◦ False cut sites
Hidden Markov Model (HMM) for map refinement
◦ Compares draft map to many other optical maps
◦ Statistics used to identify matching, deleted, and inserted
cut sites
[7]
Hidden Markov Model
Consensus Map Refinement (2)
Sample HMM Path
[7]
The corrected consensus maps for each island are
pieced back together to form a complete genomic
restriction map
Typically takes 13-15 iterations for statistical
correction of the draft map
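The idea behind refinement, using many maps to decide which cut sites are real, can be illustrated with a much simpler stand-in than the HMM of reference [7]: majority voting over cut-site positions. The sketch assumes all maps are already expressed in a common coordinate system (kb from the island start). The function name, tolerance, and support threshold are arbitrary assumptions.

```python
def vote_consensus(maps, tolerance_kb=2.0, min_support=0.5):
    """Cluster nearby cut sites across maps and keep well-supported ones.

    maps: list of lists of cut-site positions (kb) in shared coordinates.
    A cluster is kept if at least `min_support` of the maps contribute
    to it; its consensus position is the mean of its members.
    """
    observations = sorted(
        (pos, idx) for idx, sites in enumerate(maps) for pos in sites
    )
    clusters = []
    for pos, idx in observations:
        # Extend the current cluster while sites stay within tolerance.
        if clusters and pos - clusters[-1][-1][0] <= tolerance_kb:
            clusters[-1].append((pos, idx))
        else:
            clusters.append([(pos, idx)])

    consensus = []
    for cluster in clusters:
        supporters = {idx for _, idx in cluster}
        if len(supporters) / len(maps) >= min_support:
            consensus.append(sum(p for p, _ in cluster) / len(cluster))
    return consensus


maps = [
    [10.0, 30.0, 55.0],
    [10.5, 55.5],             # missing the cut site near 30 kb
    [9.5, 30.5, 42.0, 54.5],  # extra (false) cut site at 42 kb
]
refined = vote_consensus(maps)
print(refined)
```

In this toy example the site near 30 kb is retained because two of three maps support it, while the single-map site at 42 kb is dropped.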
Applications of Optical Mapping
Identification of genetic insertions, deletions,
inversions, and repeats
Establish genotype-phenotype correlations for
advancements in diagnosis and treatment of
genetic disorders
Reduction of the time needed and the cost to
sequence entire strands of DNA
In the future: Patient-specific whole genome
analysis
Conclusions
Optical mapping is a method of restriction map
generation for whole genome analysis
Applications range from clinical phenotype-genotype
correlations to identification of
polymorphisms in a variety of diseases
In the future, optical mapping technology will help
to realize the goal of patient-specific whole
genome analysis
Optical Mapping is a modern application of
discrete mathematics with potential to change
medicine
References
1. Samad A, Huff EF, Cai W, Schwartz DC. Optical mapping: A novel, single-molecule approach to genomic analysis. Genome Res. 1995;5:1-4.
2. Ramme AJ. Personal image collection.
3. Schwartz DC, Samad A. Optical mapping approaches to molecular genomics. Curr Opin Biotechnol. 1997;8:70-74.
4. Valouev A, Schwartz DC, Zhou S, Waterman MS. An algorithm for assembly of ordered restriction maps from single DNA molecules. Proc Natl Acad Sci U S A. 2006;103:15770-15775.
5. Aston C, Mishra B, Schwartz DC. Optical mapping and its potential for large-scale sequencing projects. Trends Biotechnol. 1999;17:297-302.
6. Valouev A, Li L, Liu YC, et al. Alignment of optical maps. J Comput Biol. 2006;13:442-462.
7. Valouev A, Zhang Y, Schwartz DC, Waterman MS. Refinement of optical map assemblies. Bioinformatics. 2006;22:1217-1224.
Questions?
Further information available from:
1.) Laboratory for Molecular and Computational Genetics (http://www.lmcg.wisc.edu/)
2.) Opgen (http://www.opgen.com/)