Transcript [Poster title] - New Mexico State University
Genomic Region Analysis of Microsatellite Loci in
Arabidopsis thaliana
Janae Gonzales, Teresa Leslie, Efren Miranda, Mirna Rocha
NIH RISE Genome Discovery Workshop, New Mexico State University, Las Cruces, NM 88003
Abstract
Microsatellite loci occur frequently throughout a genome, but the correlation between their function and their location is not completely understood. The completed genome of Arabidopsis thaliana allowed for exploration of the relationship between microsatellite classification and the various regions of the genome. A total of 203 sample sequences containing microsatellites from the A. thaliana genome were analyzed and classified by canonical motif, motif length, and location in the genome. The analysis showed that the most common motif length in exon regions was the trinucleotide repeat.
Introduction
Microsatellite loci, or simple sequence repeats (SSRs), consist of tandemly repeated units of 1-6 base pairs and are abundantly interspersed throughout a genome. They are generally variable among individuals within a species, which makes these sequences useful for genetic analysis within a population. The great variability found in microsatellites can be attributed to their high mutation rate.
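As a rough illustration of this definition, a perfect (uninterrupted) repeat of a 1-6 bp unit can be located with a regular expression. This is only a minimal sketch and is not the method used in this study; the function name find_ssrs, the min_copies threshold, and the test sequence are assumptions made for the example.

```python
import re

def find_ssrs(sequence, min_copies=4):
    """Return (start, motif, copies) for perfect tandem repeats of 1-6 bp units.

    min_copies is an illustrative threshold, not a value taken from the study.
    """
    # The lazy quantifier {1,6}? tries the shortest unit first, so a run of
    # AT units is reported with the motif "AT" rather than "ATAT".
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_copies - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits

# Made-up sequence containing four copies of ATC starting at index 2.
print(find_ssrs("GGATCATCATCATCGG"))  # [(2, 'ATC', 4)]
```

Imperfect repeats (those with substitutions or indels between units) are not detected by this pattern; detecting them requires a dedicated tool.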
A. thaliana microsatellite loci have been shown to be highly variable (Innan et al. 1997) and abundant (Casacuberta et al. 2000). The entire genome of Arabidopsis thaliana has been sequenced and annotated, which simplifies genetic research. A. thaliana has become a model organism in plant biology because of its extensive genetic map and relatively small genome.
Methods
o Arabidopsis thaliana DNA isolated* (Alexander et al. 2006)
o Adapter addition and enrichment*
o Sequencing (Roche protocol)*
o Repeats identified with Tandem Repeats Finder software (G. Benson 1999)
o Putative loci grouped based on flanking regions
o A consensus sequence formed from each grouping
o BLAST search of the A. thaliana Genome Database
o Results categorized by location, repeat length, canonical motif, and number of copies in the genome
*The DNA information from Arabidopsis thaliana was obtained from C.D. Bailey Collection #69, Cornell University.
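The poster does not state how the canonical motif was defined. One common convention, assumed here, treats every rotation of a motif and of its reverse complement as the same repeat and reports the alphabetically first form. The sketch below follows that assumption; the function names and the REPEAT_CLASS lookup are hypothetical and are not taken from the study's pipeline.

```python
# Names for repeat classes by motif length (1-6 bp units).
REPEAT_CLASS = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def rotations(motif):
    """Return every cyclic rotation of the motif, e.g. ATC -> ATC, TCA, CAT."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]

def canonical_motif(motif):
    """Assumed convention: alphabetically first rotation on either strand."""
    motif = motif.upper()
    return min(rotations(motif) + rotations(reverse_complement(motif)))

def repeat_class(motif):
    """Name the repeat class from the motif length."""
    return REPEAT_CLASS[len(motif)] + "nucleotide"

# TCA and GAT describe the same repeat on opposite strands.
for observed in ("TCA", "GAT", "CT"):
    print(observed, canonical_motif(observed), repeat_class(observed))
```

Under this convention, TCA and GAT both reduce to ATC, so the same locus is counted once regardless of which strand or starting base was reported.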
Results
o Of the 37 dinucleotide repeats found in the A. thaliana data, the majority were found in intergenic regions and untranslated 3’ regions (Figure 1). A smaller share, 14%, was found in exons.
o There were 131 trinucleotide repeats found in the A. thaliana data. The majority, 64%, were found in exon regions of the genome (Figure 2). About 13% were found in intergenic regions.
o Of the 33 tetranucleotide repeats, almost half were found in intergenic regions (Figure 3). Around 22% were found in exons.
o Considering exons only, the most common type of repeat was the trinucleotide repeat, for example an ATC repeat. In contrast, the di-, tetra-, and pentanucleotide repeats together made up about 14% of the total nucleotide repeats in exons.
Figure 1. Percentages of Dinucleotide Repeats in Different Regions of the A. thaliana Genome
[Pie chart: Intron 11%, Exon 14%, Intergenic 27%, Pseudo 16%, Untranslated 3' 24%, Untranslated 5' 8%]
Figure 2. Percentages of Trinucleotide Repeats in Different Regions of the A. thaliana Genome
[Pie chart over the categories Intron, Exon, Intergenic, Pseudo, Untranslated 3', Untranslated 5']
Discussion
It is very common for microsatellite loci to gain or lose repeat units during DNA replication because of polymerase slippage (Eckert et al. 2002). Previous data have shown that dinucleotide microsatellites mutate at a higher rate than other microsatellites (Chakraborty et al. 1997). Only 14% of dinucleotide repeats were found in exon regions. This does not mean, however, that dinucleotide repeats have no effect on the genome. Dinucleotide repeats in untranslated 3’ regions have been associated with rheumatoid arthritis in humans (Martin-Donaire et al. 2007). In A. thaliana, 24% of the dinucleotide repeats were found in untranslated 3’ regions, but their effect on function is unknown. Although the tetranucleotide sample was similar in size to the dinucleotide sample, tetranucleotide repeats are more stable and generally do not mutate as readily.
Figure 3. Percentages of Tetranucleotide Repeats in Different Locations of the A. thaliana Genome
Trinucleotide motifs are the most common repeats in exons (Sutherland and Richards 1995). This can be beneficial to an organism because a change in repeat number adds or removes whole codons and therefore does not generally cause a frame-shift mutation. A frame-shift mutation can be detrimental to an organism because it alters the translation product downstream of the mutation. A trinucleotide repeat therefore has a higher potential to introduce positive variation into the genome.
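The frame-shift argument can be checked with a small sketch. It translates a made-up coding sequence before and after one repeat unit is added, using the standard genetic code. The sequences, the flanking codons, and the function names are assumptions for illustration only and do not come from the A. thaliana data.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate complete codons from position 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def with_repeat(motif, copies, prefix="ATG", suffix="GGTTAA"):
    """Build a toy coding sequence: start codon, repeat tract, Gly + stop."""
    return prefix + motif * copies + suffix

def shifts_frame(motif, units_changed):
    """A slippage event shifts the frame unless it changes a multiple of 3 bp."""
    return (len(motif) * units_changed) % 3 != 0

# Trinucleotide: one extra unit adds one codon; the downstream "G*" is intact.
print(translate(with_repeat("CAG", 4)), translate(with_repeat("CAG", 5)))
# Dinucleotide: one extra unit shifts the frame and the stop codon is lost.
print(translate(with_repeat("AG", 3)), translate(with_repeat("AG", 4)))
print(shifts_frame("CAG", 1), shifts_frame("AG", 1))
```

In the trinucleotide case the protein gains one residue and the rest of the product is unchanged. In the dinucleotide case every codon after the repeat tract is read in a different frame.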
Figure 4. Frequency and Percentages of Nucleotide Repeats in Exons.
Type of Nucleotide Repeat    Frequency of Repeats in Exons    Percentage (%)
Di                           5                                4
Tri                          108                              86.4
Tetra                        9                                7.2
Penta                        3                                2.4
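The Figure 4 percentages follow directly from the exon counts reported there (Di 5, Tri 108, Tetra 9, Penta 3). The short sketch below reproduces the arithmetic; the variable and function names are illustrative only.

```python
# Exon counts per repeat class, as reported in Figure 4.
exon_counts = {"Di": 5, "Tri": 108, "Tetra": 9, "Penta": 3}

def percentages(counts):
    """Convert raw counts to percentages of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

shares = percentages(exon_counts)
non_tri = sum(value for name, value in shares.items() if name != "Tri")

print(shares)
print(f"Di + Tetra + Penta share of exon repeats: {non_tri:.1f}%")
```

The three non-trinucleotide classes sum to 13.6%, which is the "about 14%" quoted in the Results.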
Acknowledgements
The authors acknowledge the National Institutes of Health for funding the Genomes Discovery Workshop. Thank you to Dr. Gong Xin Yu and Alexander Tchourbanov for sharing their knowledge in Genomics Analysis and Bioinformatics, and Dr. Donovan Bailey for providing our data source. We would especially like to thank Dr. Brook Milligan, Nabeeh Hasan, and Erin Punke for their advice and support. Also, a huge thanks to our fellow interns who helped with problem solving, insight, and constructive criticism. This program was supported by NMSU RISE to Excellence (NIH NIGMS MBRS Grant #R25GM061222), MRI: Acquisition of Genomic Sequencing Instrumentation (NSF #0821806), and CREST: Center for Research Excellence in Bioinformatics and Computational Biology (NSF #0420407).
References
• Casacuberta, E., P. Puigdomenech and A. Monfort, 2000. Distribution of microsatellites in relation to coding sequences within the Arabidopsis thaliana genome. Plant Sci. 157: 97-104.
• Eckert, K.A., A. Mowery and S.E. Hile, 2002. Misalignment-mediated DNA polymerase beta mutations: comparison of microsatellite and frame-shift error rates using a forward mutation assay. Biochemistry 41: 10490-10498.
• Gertz, E. M. "BLAST Arabidopsis Thaliana Sequences." Http://www.ncbi.nlm.nih.gov/. NCBI, 27 June 2001. Web. 30 June 2010.
• Innan, H., R. Terauchi and N.T. Miyashita, 1997. Microsatellite polymorphism in natural populations of the wild plant Arabidopsis thaliana. Genetics 146: 1441-1452.
• Marriage, T. N., Hudman, S., Mort, M. E., Orive, M. E., Shaw, R. G., & Kelly, J. K. (2009). Direct estimation of the mutation rate at dinucleotide microsatellite loci in Arabidopsis thaliana (Brassicaceae). Heredity, 103(4), 310-317. doi: 10.1038/hdy.2009.67
• Sutherland, G. R., and Richards, R. I. 1995. Simple tandem DNA repeats and human genetic disease. Proc. Natl. Acad. Sci. 92: 3636-3641.
• Symonds, V. V., & Lloyd, A. M. (2003). An analysis of microsatellite loci in Arabidopsis thaliana: Mutational dynamics and application. Genetics, 165(3), 1475-1488.