[Poster title] - New Mexico State University


Genomic Region Analysis of Microsatellite Loci in Arabidopsis thaliana

Janae Gonzales, Teresa Leslie, Efren Miranda, Mirna Rocha NIH RISE Genome Discovery Workshop, New Mexico State University, Las Cruces, NM, 88003

Abstract

Microsatellite loci occur frequently throughout a genome, in a variety of locations. The correlation between the function and the location of microsatellites is not completely understood. The completed genome of Arabidopsis thaliana allowed us to explore the relationship between microsatellite classes and the regions of the genome in which they occur. A total of 203 sample sequences containing microsatellites from the A. thaliana genome were analyzed and classified by canonical motif, motif length, and location in the genome. The analysis showed that the most common motif length in exon regions was the trinucleotide repeat.

Introduction

Microsatellite loci, or simple sequence repeats (SSRs), consist of repeated units of 1-6 base pairs and are abundantly interspersed throughout a genome. They generally vary among individuals within a species, which makes these sequences useful for genetic analysis within a population. The great variability of microsatellites can be attributed to their elevated mutation rate.

A. thaliana microsatellite loci have been shown to be highly variable (Innan et al. 1997) and abundant (Casacuberta et al. 2000). The entire genome of Arabidopsis thaliana has been sequenced and annotated, which makes genetic research on it straightforward. A. thaliana has become a model organism in plant biology because of its extensive genetic map and relatively small genome.

Methods

o Arabidopsis thaliana DNA isolated* (Alexander et al. 2006)
o Adapter addition and enrichment*
o Sequencing (Roche protocol)*
o Tandem Repeats Finder software (G. Benson 1999)
o Putative loci grouped based on flanking regions
o A consensus sequence was formed from each grouping
o BLAST search of the A. thaliana Genome Database
o Results were categorized by location, repeat length, canonical motif, and number of copies in the genome

*The DNA information from Arabidopsis thaliana was obtained from C.D. Bailey Collection #69, Cornell University.
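The classification by canonical motif and motif length can be illustrated with a short sketch. The poster does not state which canonicalization rule was used; the code below assumes one common convention, in which the canonical motif is the alphabetically smallest string among all rotations of the motif and of its reverse complement. The function names are hypothetical and are not taken from the workshop's pipeline.

```python
# Illustrative sketch only. Assumption: the canonical motif is the
# alphabetically smallest rotation of the motif or of its reverse complement.
# The poster does not specify the rule that was actually used.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Names for motif lengths of 1-6 bp, the range given in the Introduction.
LENGTH_CLASS = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """Return every cyclic rotation of the motif."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif):
    """Pick one representative for all equivalent spellings of a repeat unit."""
    motif = motif.upper()
    candidates = rotations(motif) + rotations(reverse_complement(motif))
    return min(candidates)


def classify(motif):
    """Return (canonical motif, length class) for a repeat unit."""
    return canonical_motif(motif), LENGTH_CLASS[len(motif)]


if __name__ == "__main__":
    for unit in ["TCA", "CAT", "GA", "TTTC"]:
        print(unit, classify(unit))
```

Under this assumed convention, TCA and CAT both map to the canonical motif ATC, so repeats that differ only in where the read starts, or in which strand was sequenced, are counted together.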

Results

o Of the 37 dinucleotide repeats found in the A. thaliana data, the majority were found in intergenic regions and untranslated 3’ regions (Figure 1). A smaller share, 14%, was found in exons.
o There were 131 trinucleotide repeats found in the A. thaliana data. The majority, 64%, were found in exon regions of the genome (Figure 2). About 13% were found in intergenic regions.
o Of the 33 tetranucleotide samples, almost half were found in intergenic regions (Figure 3). Around 22% were found in exons.
o Within exons, the most common type of repeat was the trinucleotide repeat, for example an ATC repeat. In contrast, the di-, tetra-, and pentanucleotide repeats together made up about 14% of the repeats found in exons (Figure 4).

Figure 1. Percentages of Dinucleotide Repeats in Different Regions of the A. thaliana Genome (chart categories: Intron, Exon, Intergenic, Pseudo, Untranslated 3', Untranslated 5')

Figure 2. Percentages of Trinucleotide Repeats in Different Regions of the A. thaliana Genome (chart categories: Intron, Exon, Intergenic, Pseudo, Untranslated 3', Untranslated 5')

Discussion

Microsatellite loci very commonly gain or lose repeat units during DNA replication because of polymerase slippage (Eckert et al. 2002). Previous studies have shown that dinucleotide microsatellites mutate at a higher rate than other microsatellites (Chakraborty et al. 1997). Only 14% of the dinucleotide repeats were found in exon regions. This does not mean, however, that dinucleotide repeats have no effect on the genome. Dinucleotide repeats in untranslated 3’ regions have been associated with rheumatoid arthritis in humans (Martin-Donaire et al. 2007). In A. thaliana, 24% of the dinucleotide repeats were found in untranslated 3’ regions, but their effect on function is unknown. Although the tetranucleotide sample was similar in size to the dinucleotide sample, tetranucleotides are more stable and do not generally mutate as readily.

Figure 3. Percentages of Tetranucleotide Repeats in Different Locations of the A. thaliana Genome

Trinucleotide motifs are the most common repeats in exons (Sutherland et al. 1995). This can be beneficial to an organism because a mutation that adds or removes whole repeat units changes the sequence length by a multiple of three and therefore does not generally cause a frame-shift mutation. A frame-shift mutation can be detrimental to an organism because it alters the translation product. A trinucleotide repeat therefore has a higher potential to introduce positive variation into the genome.
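The frame-shift argument can be checked with a small worked example. The coding sequence below is hypothetical and was chosen only for illustration; it is not taken from the A. thaliana data. The sketch inserts one extra repeat unit after a short repeat tract and compares the codons downstream of the insertion.

```python
# Illustrative sketch with a made-up coding sequence (not from the poster's data).

def insert_unit(cds, position, unit):
    """Insert one repeat unit into a coding sequence at the given index."""
    return cds[:position] + unit + cds[position:]


def codons(cds):
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical exon fragment: ATG GCT (ATC)x3 GGT TAA
original = "ATGGCTATCATCATCGGTTAA"
tract_end = 15  # index just after the (ATC)x3 tract

with_tri = insert_unit(original, tract_end, "ATC")  # length changes by 3
with_di = insert_unit(original, tract_end, "AT")    # length changes by 2

if __name__ == "__main__":
    print("original:", codons(original))
    print("+ATC    :", codons(with_tri))
    print("+AT     :", codons(with_di))
```

In this example, adding ATC inserts one extra codon and leaves the downstream codons GGT and TAA unchanged, while adding AT shifts every downstream codon and removes the in-frame stop codon.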

Type of Nucleotide Repeat    Frequency of Repeats in Exons    Percentage (%)
Di                           5                                4.0
Tri                          108                              86.4
Tetra                        9                                7.2
Penta                        3                                2.4

Figure 4. Frequency and Percentages of Nucleotide Repeats in Exons.
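The percentages in Figure 4 follow directly from the exon counts shown there. A minimal sketch of that arithmetic, using only the four counts reported in the figure (the variable names are illustrative):

```python
# Exon counts as reported in Figure 4.
exon_counts = {"Di": 5, "Tri": 108, "Tetra": 9, "Penta": 3}

total = sum(exon_counts.values())

# Share of each repeat type among all repeats found in exons, in percent.
percentages = {
    kind: round(100 * count / total, 1) for kind, count in exon_counts.items()
}

# Combined share of every repeat type other than trinucleotides.
non_tri_share = round(100 * (total - exon_counts["Tri"]) / total, 1)

if __name__ == "__main__":
    print("total repeats in exons:", total)
    print(percentages)
    print("non-trinucleotide share (%):", non_tri_share)
```

The combined non-trinucleotide share computed this way corresponds to the "about 14%" figure quoted in the Results.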

Acknowledgements

The authors acknowledge the National Institutes of Health for funding the Genome Discovery Workshop. Thank you to Dr. Gong Xin Yu and Alexander Tchourbanov for sharing their knowledge of genomic analysis and bioinformatics, and to Dr. Donovan Bailey for providing our data source. We would especially like to thank Dr. Brook Milligan, Nabeeh Hasan, and Erin Punke for their advice and support. We also thank our fellow interns, who helped with problem solving, insight, and constructive criticism. This program was supported by NMSU RISE to Excellence (NIH NIGMS MBRS Grant #R25GM061222), MRI: Acquisition of Genomic Sequencing Instrumentation (NSF #0821806), and CREST: Center for Research Excellence in Bioinformatics and Computational Biology (NSF #0420407).

References

• Casacuberta, E., P. Puigdomenech and A. Monfort, 2000. Distribution of microsatellites in relation to coding sequences within the Arabidopsis thaliana genome. Plant Sci. 157: 97-104.
• Eckert, K.A., A. Mowery and S.E. Hile, 2002. Misalignment-mediated DNA polymerase beta mutations: comparison of microsatellite and frame-shift error rates using a forward mutation assay. Biochemistry 41: 10490-10498.
• Gertz, E.M. "BLAST Arabidopsis Thaliana Sequences." http://www.ncbi.nlm.nih.gov/. NCBI, 27 June 2001. Web. 30 June 2010.
• Innan, H., R. Terauchi and N.T. Miyashita, 1997. Microsatellite polymorphism in natural populations of the wild plant Arabidopsis thaliana. Genetics 146: 1441-1452.
• Marriage, T.N., S. Hudman, M.E. Mort, M.E. Orive, R.G. Shaw and J.K. Kelly, 2009. Direct estimation of the mutation rate at dinucleotide microsatellite loci in Arabidopsis thaliana (Brassicaceae). Heredity 103(4): 310-317. doi: 10.1038/hdy.2009.67
• Sutherland, G.R. and R.I. Richards, 1995. Simple tandem DNA repeats and human genetic disease. Proc. Natl. Acad. Sci. 92: 3636-3641.
• Symonds, V.V. and A.M. Lloyd, 2003. An analysis of microsatellite loci in Arabidopsis thaliana: Mutational dynamics and application. Genetics 165(3): 1475-1488.