“The World's 1st Ultra High throughput Genomics Data Platform” Copyright © 2004 Synamatix sdn bhd (538481-U)
Slide 1
“The World's 1st
Ultra High throughput Genomics Data Platform”
Copyright © 2004 Synamatix sdn bhd (538481-U)
Slide 2
Introductions
Robert George Hercus – CTO/MD
Over 30 years IT experience
Pioneered many large-scale IT projects
“Language of Biology” basis of Synamatix
Synamatix 1 of 4 companies in group: Linguamatix, Neuramatix and Viramatix
Interests: Linguistics, Genomics, Artificial Intelligence, Neuronal Networks
Slide 3
What we are NOT… what we ARE
NOT ABOUT:
Data content
Flat file, hierarchical or relational databases
Replacement of existing browser/mapping tools, e.g. EnsEMBL
Just a BLAST alternative
Applications – IP is within SynaBASE
ARE ABOUT:
NEW CONCEPT: Proprietary pattern-based approaches to construct the 1st “structured-network data system” for biological data
ITERATION, SPEED and EXTENSIBILITY: enable a single data system for a wide array of biological data, enabling enterprise-wide capability
Providing tools, know-how and technologies to assist in understanding and ultimately defining the “language” of biological data – FUNCTIONAL GENOMICS
Slide 4
Intelligence… not brute force
Slide 5
Syntax and Semantics
Slide 6
Synamatix
Slide 7
Slide 8
Basis of IP – Human Brain>SynaBASE
This is SynaBASE:
Identifies and learns patterns without human supervision or training sets.
SynaBASE maintains patterns & their relationships
Highly efficient data structures and relationships between data elements are constructed
Applications become more accurate and efficient as more data is added
Slide 9
A structural database for genomics data, SynaBASE*
*patents pending
Slide 10
4 unique features
Slide 11
1. Patterns and structures
Finds, Stores, Relates & Structures PATTERNS, not FLAT FILES
Slide 12
What makes Synamatix UNIQUE - 1
What do we know about data?
There is no evolutionary need to preserve non-functional sequence patterns
Evolution requires the conservation of patterns which are at least functionally equivalent, or functionally better
Similarity of DATA
Common PATTERNS and functionality
Significant patterns and their relationships are extended and maintained by SynaBASE
Slide 13
A conventional flat-file database
Slide 14
SynaBASE * - A structured network database
*patents pending
Slide 15
2. Significance and Frequency
SynaBASE automatically learns and maintains the significance of patterns and data
Slide 16
Significance
Fixed-length k-mers are inappropriate
The elephant and the giraffe walked up the mountain
A graph showing the frequency of “string (word)” patterns in a sentence does not reflect meaning
The elephant and the giraffe walked up the mountain
A graph showing the probabilities of predicting predecessor and successor characters (string significance) reflects true meaning
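The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SynaBASE's method, which the deck does not describe: `kmer_counts` and `successor_probabilities` are hypothetical helper names, the sentence is lower-cased for simplicity, and "significance" is approximated here by how predictable the next character is after a short context.

```python
from collections import Counter, defaultdict


def kmer_counts(text, k):
    """Count every fixed-length substring (k-mer) of the text."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def successor_probabilities(text, context_len):
    """For each context of context_len characters, estimate the
    probability of each character that follows it in the text."""
    followers = defaultdict(Counter)
    for i in range(context_len, len(text)):
        followers[text[i - context_len:i]][text[i]] += 1
    return {
        ctx: {ch: n / sum(seen.values()) for ch, n in seen.items()}
        for ctx, seen in followers.items()
    }


sentence = "the elephant and the giraffe walked up the mountain"
counts = kmer_counts(sentence, 3)
probs = successor_probabilities(sentence, 2)

print(counts["the"])  # raw frequency of one 3-mer
print(probs["th"])    # inside a word: the successor is highly predictable
print(probs["e "])    # at a word boundary: the successor is uncertain
```

In this toy sentence the successor of "th" is always the same character, while the successor of "e " (end of a word) is spread over several characters. Drops in predictability of this kind mark pattern boundaries that a fixed-length frequency count does not expose.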
Slide 17
HUMAN Placental ribonuclease inhibitor
FREQUENCY
SIGNIFICANCE
Slide 18
3. Scale and Speed
Unique method for structuring data leads to ultra-high-throughput applications becoming routinely accessible
Slide 19
What makes Synamatix UNIQUE - 3
[Diagram labels: All genomes!!; All proteomes!!; All Eukaryotes; All Prokaryotes; Human; Mouse; Virus; Multiples…; Proteomics; Protein Interactions; Array data; Array analysis; Text; Phylogenetics; Sequence data; Non-Sequence data; 3rd Party; Any data]
Slide 20
Multi-genome scalability – flat file db
[Chart: size of database (y-axis, 1 to 10) against number of Human genome copies (x-axis, 2 to 10). Entries: Genome 1, then Genome 2 to Genome 10, each labelled 99.9%.]
Slide 21
Multi-genome scalability – SynaBASE
[Chart: size of database (y-axis, 1 to 10) against number of Human genome copies (x-axis, 2 to 10). Entries: Genome 1, then Genome 2 to Genome 10, each labelled 99.9%.]
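A back-of-envelope sketch of the arithmetic behind the two scalability charts. It is an assumption-laden idealisation, not a measurement of SynaBASE: it reads the 99.9% label as each extra genome being 99.9% identical to Genome 1, takes one genome as the unit of size, and supposes a store that keeps shared content once and only the differing fraction per additional copy. The function names are hypothetical.

```python
def flat_file_size(copies):
    """Flat file: every genome copy is stored in full (1 unit each)."""
    return float(copies)


def shared_pattern_size(copies, identity=0.999):
    """Idealised shared store (assumption): the first genome costs 1 unit,
    each further copy costs only the fraction that differs from it."""
    return 1.0 + (copies - 1) * (1.0 - identity)


for n in (1, 2, 4, 6, 8, 10):
    print(n, flat_file_size(n), round(shared_pattern_size(n), 3))
```

Under these assumptions ten copies cost 10 units as flat files but barely more than 1 unit in the shared store. A real system carries index and relationship overhead, so its curve would sit above this idealised bound.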
Slide 22
Analysis speed scales as log2(n)
[Chart: speed in milliseconds (y-axis, 100 to 1000) against size of database in giga bp (x-axis, ticks at 1, 10, 100 and 1000). Labelled series: Conventional.]
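To give a feel for what log2(n) scaling implies, the sketch below counts the halving steps a binary search needs over a sorted index. Binary search is used purely as a familiar stand-in with logarithmic cost; the deck does not say how SynaBASE achieves its scaling, and `lookup_steps` is a hypothetical helper.

```python
import math


def lookup_steps(n_items):
    """Worst-case halving steps to locate one item among n_items
    in a sorted index (binary search), i.e. ceil(log2(n))."""
    return math.ceil(math.log2(n_items))


for giga_bp in (1, 10, 100, 1000):
    n = giga_bp * 10**9  # database size in base pairs
    print(giga_bp, lookup_steps(n))
```

Growing the database a thousandfold, from 1 to 1000 giga bp, adds only about ten steps (log2(1000) is roughly 10), whereas a linear scan would take a thousand times longer.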
Slide 23
What makes Synamatix UNIQUE - 4
Massively Parallel Single Molecule Sequencing analysis
Real-time Proteomics
Comparative genomics
Probe design / testing
Personalised medicine
Clinical Diagnostics
Ultra High Throughput (UHT)
Slide 24
Architecture
[Diagram labels: 3rd party Applications; SUITE]
Slide 25
Architecture
[Diagram labels: Users; Custom / Windows Applications; WWW Interface; Application SUITE; Java; C++; Servers; Linux; Linux; Itanium]
Slide 26
Summary
Unique pattern network database
Maintains patterns and their relationships
Able to derive Significance from data “a priori”
Self-learning mechanism
Accuracy
Developed world’s 1st genomics platform capable of addressing demanding new applications:
Scalable and efficient ultra-volume storage
Ultra-high-throughput genome analysis
Personalised medicine and the $1000 genome