Glimmer and GeneMark

• Glimmer is a system for finding genes in microbial DNA.
• The system works by building a variable-length
Markov model (an interpolated context model, ICM)
from a training set of known genes and then using
that model to identify genes in a given DNA sequence.
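To make the "train a Markov model on known genes, then score new sequence" idea concrete, here is a minimal sketch. It is a simplification: it uses a fixed-order model with add-one smoothing rather than Glimmer's interpolated variable-length model, and the tiny training sequences, the function names, and the background model are all made up for illustration.

```python
import math
from collections import defaultdict

def train_markov(seqs, order=2, pseudocount=1.0):
    """Count how often each base follows each length-`order` context."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        s = s.upper()
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1.0
    return {"order": order, "counts": counts, "pseudocount": pseudocount}

def log_prob(model, seq):
    """Log-probability of seq under the model (the first `order` bases are skipped)."""
    k, pc = model["order"], model["pseudocount"]
    seq = seq.upper()
    total = 0.0
    for i in range(k, len(seq)):
        ctx = model["counts"].get(seq[i - k:i], {})
        num = ctx.get(seq[i], 0.0) + pc          # smoothed count of this base
        den = sum(ctx.values()) + 4 * pc         # smoothed count of the context
        total += math.log(num / den)
    return total

# Hypothetical toy data: "genes" are GC-rich, "background" is AT-rich.
genes = ["ATGGCGGCGGCGTAA", "ATGGCGGCGGCGGCGTGA"]
background = ["ATATATATTATATATA", "TTATATATAATATAT"]
gene_model = train_markov(genes)
bg_model = train_markov(background)

def score(seq):
    """Log-odds score: positive means seq looks more gene-like than background."""
    return log_prob(gene_model, seq) - log_prob(bg_model, seq)

print(score("GCGGCGGCG"), score("ATATATAT"))
```

A gene finder built on this idea would score every open reading frame and keep those above a threshold; Glimmer's real model additionally varies the context length and interpolates between orders.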
• Local installation; all the relevant programs are in:
– /l/glimmer3.02/bin/
• I have added E. coli data in the following
directory to play with:
– /tmp/ecoli
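• Running Glimmer involves a two-step process:
1. Build the model from known genes using build-icm
– /l/glimmer3.02/bin/build-icm -r run1.icm < /tmp/ecoli/ecoli-genes.fasta
2. Make gene predictions using the glimmer3 program
– /l/glimmer3.02/bin/glimmer3 -o50 -g110 -t30 /tmp/ecoli/ecoli.fna run1.icm run1
– Options: -o50 sets the maximum overlap (bp), -g110 the minimum gene length (bp), -t30 the score threshold
– glimmer3 takes an output tag as its last argument (here run1); predictions go to run1.predict and details to run1.detail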
• For more details please refer to:
– /l/glimmer3.02/glim302notes.pdf
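The two Glimmer steps can be chained in a small script. This is only a sketch: the paths are the ones from these notes, the function names are hypothetical, and the trailing `run1` output tag is an assumption based on Glimmer3's usage message (`glimmer3 [options] <sequence-file> <icm-file> <tag>`). Nothing is executed unless the glimmer3 binary is actually present.

```python
import os
import subprocess

GLIMMER_BIN = "/l/glimmer3.02/bin"        # install location from the notes
GENES = "/tmp/ecoli/ecoli-genes.fasta"    # known genes used for training
GENOME = "/tmp/ecoli/ecoli.fna"           # sequence to annotate

def glimmer_commands(icm="run1.icm", tag="run1"):
    """Return the argument lists for step 1 (build-icm) and step 2 (glimmer3)."""
    build = [os.path.join(GLIMMER_BIN, "build-icm"), "-r", icm]
    predict = [os.path.join(GLIMMER_BIN, "glimmer3"),
               "-o50", "-g110", "-t30", GENOME, icm, tag]
    return build, predict

def run_glimmer(icm="run1.icm", tag="run1"):
    """Run both steps; build-icm reads the training genes from standard input."""
    build, predict = glimmer_commands(icm, tag)
    with open(GENES, "rb") as training:
        subprocess.run(build, stdin=training, check=True)
    subprocess.run(predict, check=True)
    return tag + ".predict"               # file holding the predicted genes

if __name__ == "__main__":
    # Only attempt a real run on a machine where Glimmer is installed.
    if os.path.exists(os.path.join(GLIMMER_BIN, "glimmer3")):
        print("predictions written to", run_glimmer())
```

Passing the commands as lists avoids shell quoting problems, and `check=True` stops the script if the model-building step fails instead of predicting with a missing model.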
• GeneMark is a suite of software tools
for predicting protein-coding genes in various
types of genomes.
• The algorithms use hidden Markov models
reflecting the "grammar" of gene organization.
• Local Installation on burrow:
– /l/gmsuite/
• You can run prokaryotic gene prediction
using the following command:
– /l/gmsuite/ --prok --format GFF /tmp/ecoli/ecoli.fna
• For more details please refer to:
– /l/gmsuite/README.GeneMarkSuite
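The hidden Markov model idea behind GeneMark can be illustrated with a toy example. The sketch below is not GeneMark's model: it assumes just two hidden states (coding and noncoding), and all probabilities are invented for illustration. It uses the standard Viterbi algorithm to find the most likely state for each base.

```python
import math

STATES = ("coding", "noncoding")
START = {"coding": 0.5, "noncoding": 0.5}
# Made-up parameters: states tend to persist, coding is GC-rich, noncoding AT-rich.
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}

def viterbi(seq):
    """Return the most likely hidden state for each base of seq (A/C/G/T only)."""
    seq = seq.upper()
    if not seq:
        return []
    # Best log-score of any path ending in each state after the first base.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # one dict of back-pointers per base after the first
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][base]))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

labels = viterbi("ATATATATAT" + "GCGCGCGCGCGCGCGC" + "ATATATATAT")
print("".join("C" if s == "coding" else "n" for s in labels))
```

Real gene finders use many more states (start and stop codons, codon positions, both strands), which is what lets the model encode the "grammar" of how genes are organized.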