
OPTICAL MAPPING AS A
METHOD OF WHOLE GENOME
ANALYSIS
MAY 4, 2009
COURSE: 22M:151
PRESENTED BY:
AUSTIN J. RAMME
Presentation Outline
Introduction to Optical Mapping
Definitions for Understanding
Modern Optical Mapping Process
Data Analysis
◦ Overview
◦ Steps to Restriction Map Generation
Applications of Optical Mapping
Conclusions

Optical Mapping (OM) Introduction

The number of identified polygenic diseases is
ever-increasing

Methods to analyze the entire genome will enhance current
diagnostic and treatment methods for a variety of diseases

Patient-specific genomic analysis has become the goal in
genetics-based medical research

Optical mapping (OM) is an automated method of ordered
restriction map generation with a goal of whole genome
analysis that avoids the limitations inherent to traditional
techniques
Definitions

Restriction Enzymes
◦ Proteins that cleave DNA molecules based on
a specific base pair sequence (e.g. ATCG)
[Figure: image equation built from Pac-Man and DNA strand pictures. Image sources:]
http://www.belchfire.net/screenshots/Pacman.jpg
http://www.dnavitaminpro.com/wp-content/uploads/2008/07/dna-horizontal.jpg
http://static.rbytes.net/full_screenshots/z/e/zenwaw-pacman.jpg
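As a rough illustration of the definition above, the sketch below cleaves a DNA string at every occurrence of a recognition sequence and reports the ordered fragment lengths. The function names, the `cut_offset` parameter, and the toy sequence are assumptions made for this example; only the site `ATCG` comes from the slide.

```python
def find_cut_sites(dna: str, site: str, cut_offset: int = 0) -> list[int]:
    """Return 0-based positions where the enzyme cuts `dna`.

    `site` is the recognition sequence; `cut_offset` (hypothetical) is how
    many bases into the site the cut falls.
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    return cuts


def digest(dna: str, site: str, cut_offset: int = 0) -> list[str]:
    """Cleave `dna` at every cut site and return the fragments in order."""
    fragments = []
    previous = 0
    for cut in find_cut_sites(dna, site, cut_offset):
        if cut > previous:  # skip empty fragments, e.g. a cut at position 0
            fragments.append(dna[previous:cut])
            previous = cut
    fragments.append(dna[previous:])
    return fragments


dna = "GGATCGTTTATCGAA"  # toy sequence, not real data
fragments = digest(dna, "ATCG", cut_offset=2)
fragment_lengths = [len(f) for f in fragments]  # ordered fragment sizes
print(fragments)
print(fragment_lengths)
```

The ordered list of fragment lengths is the kind of information a restriction map records.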
Definitions

Restriction Map
◦ Representation of the cut sites on a given DNA molecule to
provide spatial information of genetic loci
DNA strand

Optical Mapping
◦ Process to generate ordered restriction maps from single DNA
molecules

Optical Map
◦ Ordered restriction map of a portion of genomic DNA
[2]
Slide Removed for Online Posting
Computer Representation of Imaging Data


Imaged datasets are converted into barcode
patterns corresponding to the cleaved
fragments
Lengths are determined using an internal λ
standard and fluorescence intensity values
Imaged Cleaved DNA Fragments
[5]
Computer Representation of
Ordered DNA Fragments
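The sizing step can be sketched as follows, assuming (as a simplification) that fragment length is directly proportional to integrated fluorescence intensity and that the internal standard is a full-length λ genome of 48,502 bp. The function names, the intensity values, and the text "barcode" rendering are illustrative assumptions, not the imaging software's actual method.

```python
LAMBDA_LENGTH_BP = 48_502  # length of the bacteriophage lambda genome


def size_fragments(intensities, standard_intensity,
                   standard_length_bp=LAMBDA_LENGTH_BP):
    """Convert fluorescence intensities to lengths in bp.

    Assumes length scales linearly with intensity, calibrated by a
    co-imaged standard of known length.
    """
    if standard_intensity <= 0:
        raise ValueError("standard_intensity must be positive")
    return [round(i * standard_length_bp / standard_intensity)
            for i in intensities]


def barcode(lengths_bp, bp_per_char=10_000):
    """Render ordered fragment lengths as a text barcode.

    Each fragment becomes a run of dashes; '|' marks a cut site.
    """
    return "|".join("-" * max(1, round(length / bp_per_char))
                    for length in lengths_bp)


# Hypothetical intensities (arbitrary units) for three ordered fragments.
lengths = size_fragments([500.0, 2000.0, 1000.0], standard_intensity=1000.0)
print(lengths)
print(barcode(lengths))
```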
Raw Data

Description
◦ Image collection containing genomic restriction fragments of
known length deposited in an ordered manner
◦ Fragments represent randomly sheared genomic DNA
◦ Each OM imaging study redundantly represents the entire
genomic region of interest

Challenges with analyzing individual DNA molecules:
◦ Extra cut sites - physical breakage
◦ Missing cut sites - partial digestion
◦ Loss of small fragments
◦ Sizing error
◦ Chimeric maps - physically overlapped molecules
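To make these error sources concrete, the sketch below applies four of them (missing cuts, extra cuts, sizing error, loss of small fragments) to a known ordered fragment list. Chimeric maps are not simulated. All parameter names and default rates are hypothetical values chosen for illustration, not measured error rates.

```python
import random


def simulate_optical_map(true_fragments, rng, digestion_rate=0.8,
                         false_cut_rate=0.05, sizing_sd=0.1,
                         min_fragment=1_000):
    """Return a noisy copy of `true_fragments` (ordered lengths in bp)."""
    if not true_fragments:
        return []

    # Missing cut sites: a true cut is observed with probability
    # digestion_rate; otherwise the two neighboring fragments merge.
    merged = [true_fragments[0]]
    for length in true_fragments[1:]:
        if rng.random() < digestion_rate:
            merged.append(length)
        else:
            merged[-1] += length

    # Extra cut sites: occasionally break a fragment at a random position.
    split = []
    for length in merged:
        if length > 1 and rng.random() < false_cut_rate:
            cut = rng.randint(1, length - 1)
            split.extend([cut, length - cut])
        else:
            split.append(length)

    # Sizing error: multiplicative Gaussian noise on each length.
    sized = [max(1, round(length * rng.gauss(1.0, sizing_sd)))
             for length in split]

    # Loss of small fragments: drop anything below min_fragment.
    return [length for length in sized if length >= min_fragment]


rng = random.Random(0)  # seeded so the run is repeatable
truth = [12_000, 30_000, 800, 45_000, 9_500]  # made-up fragment lengths
print(simulate_optical_map(truth, rng))
```

Running the simulation many times on the same `truth` shows why a single molecule is unreliable and why many maps must be combined.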

Combining multiple OMs gives more accurate
restriction maps

Graph-based assembly has been used to accomplish this
Steps to Restriction Map Generation
1. Calculation of OM Overlaps
2. Overlap Graph Construction
3. Graph Correction Procedure
4. Identification of Islands
5. Contig Construction
6. Construction of Draft Consensus Map
7. Consensus Map Refinement
Calculation of Overlaps


Many OMs are collected per optical mapping
experiment
A scoring system is used to find overlaps between individual
optical maps:
[6]

Scoring system components:
◦ Matching sites are rewarded
◦ Discordant sites are penalized
◦ Length similarity is rewarded
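The three scoring components can be sketched as a small dynamic program over cut sites. This is a simplified stand-in, not the scoring function of [6]: the reward, penalty, and tolerance values are made-up parameters, and fragment lengths are assumed to be positive.

```python
def overlap_score(a, b, match_reward=1.0, miss_penalty=0.5,
                  size_tolerance=0.1, max_skip=2):
    """Best local alignment score between two maps.

    `a` and `b` are ordered lists of positive fragment lengths.
    """
    # Prefix sums give the position of every cut site along each map.
    pos_a = [0]
    for length in a:
        pos_a.append(pos_a[-1] + length)
    pos_b = [0]
    for length in b:
        pos_b.append(pos_b[-1] + length)

    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cell = 0.0  # an alignment may start anywhere (local alignment)
            for g in range(max(0, i - 1 - max_skip), i):
                for h in range(max(0, j - 1 - max_skip), j):
                    len_a = pos_a[i] - pos_a[g]
                    len_b = pos_b[j] - pos_b[h]
                    rel_diff = abs(len_a - len_b) / max(len_a, len_b)
                    # Length similarity is rewarded (negative if too different).
                    length_term = 1.0 - rel_diff / size_tolerance
                    # Sites skipped inside the chunk are discordant: penalized.
                    skipped = (i - g - 1) + (j - h - 1)
                    candidate = (score[g][h] + match_reward + length_term
                                 - miss_penalty * skipped)
                    cell = max(cell, candidate)
            score[i][j] = cell
            best = max(best, cell)
    return best


print(overlap_score([10, 20, 30], [10, 20, 30]))  # identical maps
print(overlap_score([10, 20, 30], [10, 50]))      # one missing cut in b
```

Pairs of maps whose score exceeds a chosen threshold would then be treated as candidate overlaps.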
Overlap Graph Construction

Overlap Graph = G(V,E)
◦ The literature describes it as a graph, but it's technically a directed graph (digraph)
◦ The set of nodes (V) represents individual optical maps
◦ The set of edges (E) represents high-quality overlaps between pairs of
maps

Weighting and orienting the edges of the graph
◦ Edge weights correspond to genomic distances of the overlapping
map regions
◦ Orientation is based on the sign of the distance between
neighboring map center points

Goal: The heaviest-weight path represents the most comprehensive
genomic restriction map
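A minimal sketch of the construction described above, using a dictionary of dictionaries as the digraph. It assumes each overlap record carries a signed center-to-center distance and a quality score; the record layout, the threshold, and the use of the absolute center distance as the edge weight are assumptions for illustration.

```python
def build_overlap_graph(map_ids, overlaps, min_score):
    """Build a weighted digraph {node: {successor: weight}}.

    `overlaps` holds tuples (first, second, center_offset, score), where
    center_offset is the signed distance from the center of `first` to the
    center of `second` (hypothetical record layout).
    """
    graph = {map_id: {} for map_id in map_ids}
    for first, second, center_offset, score in overlaps:
        if score < min_score:
            continue  # keep only high-quality overlaps as edges
        if center_offset >= 0:
            graph[first][second] = center_offset   # first lies upstream
        else:
            graph[second][first] = -center_offset  # second lies upstream
    return graph


# Made-up overlap records for four optical maps.
overlaps = [
    ("OM1", "OM2", 120.0, 9.5),
    ("OM3", "OM2", -80.0, 8.0),
    ("OM1", "OM4", 50.0, 2.0),  # low score: no edge is created
]
graph = build_overlap_graph(["OM1", "OM2", "OM3", "OM4"], overlaps, min_score=5.0)
print(graph)
```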
[Figure: optical mapping data (OM1, OM2, OM3, OM4, …) used for graph construction]
Graph Correction Procedure (1)

False edges correspond to falsely identified overlaps
◦ Spurious edges
  ▪ Connect two nodes forming a cycle which is not possible in linear DNA [4]
◦ Orientation consistent false overlaps (cut edge)
  ▪ Edges that connect two unrelated portions of the genome [4]
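One way to flag candidate spurious edges is to look for edges that close a cycle in the digraph. The sketch below uses a standard three-color depth-first search; it is a generic cycle check, not necessarily the correction procedure of [4], and it assumes every node appears as a key in the graph dictionary.

```python
def find_back_edges(graph):
    """Return edges (u, v) that close a directed cycle.

    `graph` maps each node to a dict (or set) of its successors.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}
    back_edges = []

    def visit(node):
        color[node] = GRAY
        for successor in graph[node]:
            if color[successor] == GRAY:
                # successor is an ancestor on the current path: a cycle
                back_edges.append((node, successor))
            elif color[successor] == WHITE:
                visit(successor)
        color[node] = BLACK

    for node in graph:
        if color[node] == WHITE:
            visit(node)
    return back_edges


cyclic = {"A": {"B": 1.0}, "B": {"C": 1.0}, "C": {"A": 1.0}}
print(find_back_edges(cyclic))
```

The recursion is fine for small examples; a graph with many thousands of maps would need an iterative traversal.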
Graph Correction Procedure (2)

False nodes: chimeric maps
◦ Appear as a single node (cut vertex) that is the only
connection between two groups of nodes
◦ Connect two unrelated portions of the genome
[4]
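Candidate chimeric maps can be located as cut vertices (articulation points) of the overlap graph with edge directions ignored. The sketch below uses the standard depth-first low-link method; whether [4] uses exactly this procedure is not stated on the slide, and the example graph is made up.

```python
def cut_vertices(graph):
    """Return the set of articulation points of the undirected view."""
    # Build an undirected adjacency structure from the digraph.
    adjacency = {node: set() for node in graph}
    for node, successors in graph.items():
        for successor in successors:
            adjacency[node].add(successor)
            adjacency.setdefault(successor, set()).add(node)

    discovered, low = {}, {}
    result = set()
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        discovered[node] = low[node] = counter
        counter += 1
        children = 0
        for neighbor in adjacency[node]:
            if neighbor == parent:
                continue
            if neighbor in discovered:
                low[node] = min(low[node], discovered[neighbor])
            else:
                children += 1
                dfs(neighbor, node)
                low[node] = min(low[node], low[neighbor])
                # neighbor's subtree cannot reach above node: node is a cut vertex
                if parent is not None and low[neighbor] >= discovered[node]:
                    result.add(node)
        if parent is None and children > 1:
            result.add(node)

    for node in adjacency:
        if node not in discovered:
            dfs(node, None)
    return result


# Two groups of maps, {A, B} and {C, D}, joined only through X.
chimera_graph = {
    "A": {"B": 1.0, "X": 1.0},
    "B": {"X": 1.0},
    "X": {"C": 1.0, "D": 1.0},
    "C": {"D": 1.0},
    "D": {},
}
print(cut_vertices(chimera_graph))
```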
Identification of Islands

Islands correspond to genomic regions spanned by multiple
overlapping optical maps
[4]
Island 1
Island 2
Island 3
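If islands are taken to be the connected components of the corrected overlap graph (edge directions ignored), they can be found with a breadth-first search. That reading of "island", and the `min_size` filter that drops maps with no overlaps, are assumptions of this sketch.

```python
from collections import deque


def find_islands(graph, min_size=2):
    """Return connected components with at least `min_size` maps."""
    # Undirected adjacency built from the digraph.
    adjacency = {node: set() for node in graph}
    for node, successors in graph.items():
        for successor in successors:
            adjacency[node].add(successor)
            adjacency.setdefault(successor, set()).add(node)

    seen = set()
    islands = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        island = []
        while queue:
            node = queue.popleft()
            island.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if len(island) >= min_size:
            islands.append(sorted(island))
    return islands


island_graph = {
    "OM1": {"OM2": 1.0}, "OM2": {},
    "OM3": {"OM4": 2.0}, "OM4": {},
    "OM5": {},  # overlaps nothing, so it is not part of any island
}
print(find_islands(island_graph))
```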
Contig Construction


For each island, “contigs” are defined as paths from sources
to sinks within the island's overlap graph
The heaviest-weight path from source to sink gives the most
complete representation of the genomic region
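Once the island's graph is acyclic, the heaviest path can be found with a topological sort followed by a single relaxation pass. The sketch assumes positive edge weights (so the best path necessarily runs from a source to a sink) and uses made-up weights.

```python
def heaviest_path(graph):
    """Return (path, total_weight) of the heaviest path in a DAG.

    `graph` maps each node to {successor: positive_weight}.
    """
    if not graph:
        return [], 0.0

    # Kahn's algorithm for a topological order.
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for successor in successors:
            indegree[successor] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for successor in graph[node]:
            indegree[successor] -= 1
            if indegree[successor] == 0:
                ready.append(successor)
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle; correct it first")

    # Relax edges in topological order to get the heaviest path to each node.
    best = {node: 0.0 for node in graph}
    previous = {node: None for node in graph}
    for node in order:
        for successor, weight in graph[node].items():
            if best[node] + weight > best[successor]:
                best[successor] = best[node] + weight
                previous[successor] = node

    # Walk back from the node with the largest accumulated weight.
    end = max(best, key=best.get)
    total = best[end]
    path = []
    while end is not None:
        path.append(end)
        end = previous[end]
    return path[::-1], total


dag = {
    "OM1": {"OM2": 3.0, "OM3": 1.0},
    "OM2": {"OM4": 2.0},
    "OM3": {"OM4": 1.0},
    "OM4": {},
}
print(heaviest_path(dag))
```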
Construction of Draft Consensus Map


The nodes and edges along each chosen path
are used to merge the individual optical
maps for the corresponding island
component
Each composite optical map is stored
for further analysis
[4]
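A toy version of the merge step, assuming the alignment between two neighboring maps is already known fragment-for-fragment. Overlapping fragments are averaged and the non-overlapping ends are kept; the real procedure in [4] is more involved, and the `right_start` index is a hypothetical input.

```python
def merge_maps(left, right, right_start):
    """Merge two overlapping maps given as ordered fragment lengths.

    `right_start` is the index in `left` that aligns with right[0].
    """
    if not 0 <= right_start <= len(left):
        raise ValueError("right_start must lie within the left map")
    overlap = min(len(left) - right_start, len(right))
    merged = list(left[:right_start])            # part only left covers
    for k in range(overlap):                     # shared region: average sizes
        merged.append((left[right_start + k] + right[k]) / 2)
    merged.extend(left[right_start + overlap:])  # left reaches past right
    merged.extend(right[overlap:])               # right reaches past left
    return merged


print(merge_maps([10, 20, 30], [32, 28, 15], right_start=2))
```

Folding this function along the nodes of a chosen path would give one composite map per island, under the same simplifying assumptions.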
Consensus Map Refinement (1)

The draft map may contain errors:
◦ Missing cut sites
◦ False cut sites

Hidden Markov Model (HMM) for map refinement
◦ Compares draft map to many other optical maps
◦ Statistics used to identify matching, deleted, and inserted
cut sites
[7]
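The idea of confirming, deleting, and inserting cut sites can be illustrated with a simple support count. This is a voting stand-in, not the Hidden Markov Model of [7]: it assumes the other optical maps are already aligned to draft coordinates, and the tolerance and support threshold are made-up values.

```python
def refine_cut_sites(draft_sites, aligned_maps, tolerance=500, min_support=0.5):
    """Keep supported draft sites and add well-supported missing ones.

    `draft_sites` and each entry of `aligned_maps` are lists of cut-site
    positions (bp) in the same coordinate system.
    """
    if not aligned_maps:
        raise ValueError("at least one aligned map is required")

    def support(position):
        # Fraction of maps with a cut within `tolerance` of `position`.
        hits = sum(any(abs(site - position) <= tolerance for site in one_map)
                   for one_map in aligned_maps)
        return hits / len(aligned_maps)

    # False cut sites: drop draft sites that few maps confirm.
    kept = [site for site in draft_sites if support(site) >= min_support]

    # Missing cut sites: observed sites far from every draft site.
    candidates = sorted(site for one_map in aligned_maps for site in one_map
                        if all(abs(site - d) > tolerance for d in draft_sites))
    added = []
    for site in candidates:
        if added and abs(site - added[-1]) <= tolerance:
            continue  # same cluster as a site that was already added
        if support(site) >= min_support:
            added.append(site)
    return sorted(kept + added)


draft = [10_000, 25_000, 40_000]  # made-up draft cut sites
maps = [
    [10_100, 40_050, 55_000],
    [9_900, 39_900, 55_200],
    [10_050, 25_100, 54_900],
]
print(refine_cut_sites(draft, maps))
```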
Hidden Markov Model
Consensus Map Refinement (2)
Sample HMM Path
[7]
The corrected consensus maps for the islands are
pieced back together to form a complete genomic
restriction map
Statistical correction of the draft map typically takes
13-15 iterations

Applications of Optical Mapping

Identification of genetic insertions, deletions,
inversions, and repeats

Establish genotype-phenotype correlations for
advancements in diagnosis and treatment of
genetic disorders

Reduction of the time needed and the cost to
sequence entire strands of DNA

In the future: Patient-specific whole genome
analysis
Conclusions

Optical mapping is a method of restriction map
generation for whole genome analysis

Applications range from clinical phenotype-genotype correlations to identification of
polymorphisms in a variety of diseases

In the future, optical mapping technology will help
to realize the goal of patient-specific whole
genome analysis

Optical Mapping is a modern application of
discrete mathematics with potential to change
medicine
References
1. Samad A, Huff EF, Cai W, Schwartz DC. Optical mapping: A novel, single-molecule approach to genomic analysis. Genome Res. 1995;5:1-4.
2. Ramme AJ. Personal image collection.
3. Schwartz DC, Samad A. Optical mapping approaches to molecular genomics. Curr Opin Biotechnol. 1997;8:70-74.
4. Valouev A, Schwartz DC, Zhou S, Waterman MS. An algorithm for assembly of ordered restriction maps from single DNA molecules. Proc Natl Acad Sci U S A. 2006;103:15770-15775.
5. Aston C, Mishra B, Schwartz DC. Optical mapping and its potential for large-scale sequencing projects. Trends Biotechnol. 1999;17:297-302.
6. Valouev A, Li L, Liu YC, et al. Alignment of optical maps. J Comput Biol. 2006;13:442-462.
7. Valouev A, Zhang Y, Schwartz DC, Waterman MS. Refinement of optical map assemblies. Bioinformatics. 2006;22:1217-1224.
Questions?
Further information available from:
1.) Laboratory for Molecular and Computational Genetics (http://www.lmcg.wisc.edu/)
2.) Opgen (http://www.opgen.com/)