Sample Senior Thesis Poster (Powerpoint #1)
Assessing Genetic Diversity in the Rare Sandhill Endemic Erysimum teretifolium
Using Microsatellites and Next-Generation Sequencing
Island biogeography predicts
that populations occupying
island-like habitats near genetic
reservoirs will contain higher
levels of diversity than more
isolated populations (Vellend
2003). Genetic structure within
such islands then reflects isolation
by distance theory (Wright 1943).
Genetic diversity is also predicted
to be positively correlated with
population size (Leimu et al. 2006).
Fig. 1. The Ben Lomond Wallflower. Erysimum teretifolium occupies inland sandhills of Santa Cruz Co. (A),
which have been largely destroyed by sand quarrying (B).
The Zayante Sandhills of Santa Cruz, California, are island-like xeric habitats separated by mesic
redwoods and mixed evergreen
forests. These unique habitats are home to many endemic plant and animal species,
including the Ben Lomond Wallflower (Erysimum teretifolium; Fig. 1A). This naturally
patchy habitat is threatened by the sand quarrying industry (Fig. 1B) and residential
development. An unknown number of populations of E. teretifolium remain, several of
which contain fewer than 100 individuals.
Using two distinct methods, microsatellite analysis and Next-Generation sequencing
(NGS), this project investigates the distribution of genetic diversity within and among
eight extant populations to determine whether E. teretifolium’s island-like habitat
influences its genetic distribution and to guide future conservation priorities. Such data
will help land managers determine appropriate seed sources for establishing new
populations of E. teretifolium. In particular, this project addresses the complexity of
analyzing microsatellite data from a hexaploid plant species and discusses whether NGS
may provide a viable alternative for estimating genetic diversity in such taxa.
• 25 individuals per population were pooled into a single barcode.
• 4 populations in total were barcoded and sequenced on a single lane of Illumina HiSeq
(shared with a total of 8 barcodes/lane).
• Is there discernible population structure in E. teretifolium?
• Is the distribution of genetic diversity within and among populations consistent with
this species’ insular habitat?
• Do population size or geographic isolation impact genetic diversity within populations?
• Can NGS complement traditional microsatellite approaches for conservation genetics?
Samples were collected from 186 individuals representing 8 populations of E. teretifolium (11-32
individuals per population). DNA was extracted with a NucleoSpin Plant II kit using lysis buffer 1
(Macherey-Nagel). PCR amplification was carried out on 3 microsatellite loci (18 total alleles)
developed for the European E. mediohispanicum according to the methods of Muñoz-Pajares et al.
(2011). Alleles were separated on an ABI3730 with a LIZ600 size standard, and lengths were
determined using PeakScanner Software v1.0 (Life Technologies).
Due to hexaploidy in E. teretifolium, we could not confidently determine genotypes, so we
analyzed the data with the restriction model in Structure (Pritchard et al. 2000). A range of population
clusters (k = 1-10) was tested using location priors and allowing for admixture (ngen = 10⁶, 5 replicates
per k-value, burnin = 5×10⁵, lambda = 0.51202, determined empirically). The number of population
clusters that best fit the data was calculated using the Δk method of Evanno et al. (2005) in Structure
Harvester (Earl & von Holdt 2011). Runs with identical parameters were conducted including samples from
the closely related wallflower, E. capitatum ssp. angustatum (ERCAAN), to ensure the model could
differentiate these taxa. Average group assignments for E. teretifolium were used for later analyses.
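The Δk calculation described above can be sketched in a few lines. This is a minimal illustration, not the Structure Harvester code: it assumes the log-likelihood values Ln P(D) from each replicate run have already been collected by hand, it takes the second difference of the replicate means, and the function name `evanno_delta_k` and all numbers are made up for the example.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each tested K to a list of Ln P(D) values, one per
    replicate Structure run (at least 2 replicates per K).
    """
    mean_l = {k: mean(v) for k, v in lnp_by_k.items()}
    sd_l = {k: stdev(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        has_neighbours = (k - 1) in mean_l and (k + 1) in mean_l
        if has_neighbours and sd_l[k] > 0:
            # |L''(K)| = |L(K+1) - 2 L(K) + L(K-1)|, computed on replicate means
            second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
            delta[k] = second_diff / sd_l[k]
    return delta


# Hypothetical Ln P(D) values (3 replicates per K); NOT data from this study.
example = {
    1: [-1000, -1002, -998],
    2: [-800, -801, -799],
    3: [-790, -795, -785],
    4: [-788, -780, -796],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)
```

Δk is undefined for the smallest and largest K tested, so a k = 1-10 scan yields Δk values only for k = 2-9.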
Samples were analyzed in Arlequin v3.5 (Excoffier et al. 2005) for AMOVA and FST using
groupings predicted by Structure. The total number of differences between each pair of individuals was
calculated in PAUP v4.0 (Swofford 2002). The distribution of genetic distances within and among
populations was calculated from the resulting distance matrix. Geographic distances were determined
in Google Earth based on GPS coordinates. A Mantel nonparametric test was used to compare the
geographic and genetic distance matrices (Liedloff 1999). Population size estimates were based on
censuses of juveniles, flowering individuals, and fruiting individuals at each site. Remaining analyses
were carried out in Excel.
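For readers unfamiliar with the Mantel test, the permutation logic can be sketched as follows. This is not the Liedloff (1999) calculator used here, only an illustrative stand-in: it assumes two symmetric distance matrices over the same populations, uses a Pearson correlation of the upper triangles, and reports a two-sided permutation p-value. The function names and the example distances are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel_test(dist_a, dist_b, n_perm=10_000, seed=0):
    """Return (r, p) for two symmetric distance matrices of equal size."""
    n = len(dist_a)
    identity = list(range(n))
    a_vals = upper_triangle(dist_a, identity)
    r_obs = pearson(a_vals, upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the populations in the second matrix only
        r_perm = pearson(a_vals, upper_triangle(dist_b, perm))
        if abs(r_perm) >= abs(r_obs) - 1e-12:
            hits += 1
    # Add 1 to both counts so the observed arrangement is included
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical 4-population example; NOT the matrices from this study.
geo = [[0, 500, 1200, 3000],
       [500, 0, 900, 2600],
       [1200, 900, 0, 1800],
       [3000, 2600, 1800, 0]]
gen = [[0.0, 3.1, 3.4, 2.9],
       [3.1, 0.0, 3.0, 3.3],
       [3.4, 3.0, 0.0, 3.2],
       [2.9, 3.3, 3.2, 0.0]]
r, p = mantel_test(geo, gen, n_perm=1000)
```

Whole rows and columns are permuted together, rather than individual cells, because the pairwise distances that share a population are not independent of one another.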
Fig. 3. Average probability of group assignments. Pie diagrams depict the average group
assignment probabilities in each population for the two genetic clusters identified by Structure
for E. teretifolium.
• Two primary geographic clusters emerge based on Structure assignments:
Northwest/South (QH, BD, AZA/Hwy17), and Central (OLY, GEY, SHGW) with MTH
acting as a bridge between the Central and South groupings.
• Groupings may arise from a central versus peripheral division.
Fig. 8. De novo contigs for four populations of E. [...] a range of k-mer lengths. All four of the longest contigs are similar to known A. thaliana [...] contain SNPs and [...]
(Figure labels: De Novo Assembly; Median Depth of [...]; Contigs Blasted to A. thaliana)
Fig. 4. Analysis of Molecular Variance. Populations were assigned to groups based on average group
assignment probability from Structure (k=2 categories, without ERCAAN). 82% of the variation exists
within populations.
(Chart title: Sources of Genetic Variation)
(Plot: Average Genetic Distance vs. Geographic distance (m); linear fit y = -4E-05x + 3.2966, R² = 0.096)
Next-Generation Sequencing Approach
Artwork by Edward Rooks
Julie A. Herman, Khaaliq DeJan, Justen B. Whittall
Santa Clara University, CA
Fig. 5. Isolation by distance. Genetic distances are averages of all pairwise comparisons of individuals
for each pairwise comparison of populations. No correlation (Mantel test: 10⁴ iterations, 8x8
half matrix, randomization, r = -0.3098,
• Most of the genetic diversity exists within populations and correlates weakly with
• Continental islands such as the Zayante sandhills may not act the same as oceanic islands,
as seen in the case of E. teretifolium, which does not fit an isolation by distance model.
• 24 of 28 comparisons between populations had Fst significantly greater than 0 (p<0.05).
• Hwy17, one of the smallest, most disturbed, and isolated populations, has the highest Fst.
• AZA, one of the largest, least disturbed, and central populations, has the lowest Fst.
• Although AMOVA shows most of the variation is contained within populations, Fst
reveals that most populations are significantly different from one another.
• There is no correlation between geographic distance and genetic distance.
• These results suggest that an island-like model is inappropriate to describe these
populations, although they superficially resemble island habitats.
Team Wallflower, Summer 2012
Cindy Dick, Miranda Melen, & Devin Wakefield at SCU provided invaluable
assistance, as well as Inés Casimiro-Soriguer from Universidad Pablo de Olavide
Charles Nicolet from USC’s Epigenome Center provided critical assistance with
the NGS library preps & sequencing.
Jodi McGraw, Ingrid Parker, Val Haley & Terris Kasteen provided essential field
Funding was provided by an SCU ALZA Scholarship to JH and Section VI funds
from the California Department of Fish and Wildlife to JW.
Earl D & von Holdt B (2011) Structure harvester: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation
Evanno G, Regnaut S, & Goudet J (2005) Detecting the number of clusters of individuals using the software Structure: a simulation study. Molecular Ecology
Excoffier L, Laval G, & Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online
Leimu R, Mutikainen P, Koricheva J, & Fischer M (2006) How general are positive relationships between plant population size, fitness, and genetic variation? Journal of
Liedloff AC (1999) Mantel Nonparametric Test Calculator. Version 2.0. School of Natural Resource Sciences, Queensland University of Technology, Australia.
Muñoz-Pajares AJ, Herrador MB, Abdelaziz M, Picó FX, Sharbel TF, Gómez JM, & Perfectti F (2011) Characterization of microsatellite loci in Erysimum
mediohispanicum (Brassicaceae) and cross-amplification in related species. American Journal of Botany e287-e289.
Pritchard JK, Stephens M, & Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945-959.
Swofford DL (2002) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.
Vellend M (2003) Island Biogeography of Genes and Species. The American Naturalist 162(3):358-365.
Wright S (1943) Isolation by distance. Genetics 28(2):114.