GDR/CottonGen: Converting legacy sites to Tripal
Sook Jung, Jing Yu, Taein Lee, Chun-Huai Cheng, Stephen Ficklin, Dorrie Main
Talk Overview
• Introduction of GDR
• Moving to Chado/Tripal
o GDR
o CottonGen
• How to store data in Chado
• Tripal modules developed
• Future Directions
GDR (2002 ~)
• Integrated Data & Tools: Genomics, Genetics, Breeding, Germplasm, Diversity
• Genomics
o EST Unigenes
o WGS and annotation
• Genetics
o Markers and maps
o QTL
o Molecular diversity
• Breeding
o Genotypic data
o Phenotypic data
o Germplasm data
GDR’s Progress
CottonDB to
CottonGen
[Timeline figure]
• GDR in in-house schema (2002 ~)
• GDR in Drupal (2009)
• Tripal instance created (genomic data in Chado) (2010)
• ND schema developed (2010-2011)
• Between 2010 and 2013: genetic data and breeding data moved to Chado; Breeders Toolbox developed (not in Tripal)
• New custom/extension Tripal modules for gene, marker, QTL, genotype, publication (2013)
• GDR in Tripal
CottonDB to CottonGen
[Timeline figure]
• CottonDB on WSU servers (Oct 2011)
• Tripal instance created
• Develop ICGI website
• Add CottonDB tools; data in Chado
• Gene, marker search pages (Oct 2012)
• CottonGen released (1st year)
• Phenotype, QTL, Publication search pages added (2013)
Chado: Modular, Generic and Ontology-driven schema
• feature: feature_id, name, uniquename, type_id, organism_id, residues
o Types: gene, mRNA, marker, QTL, etc.
• feature_relationship: feature_relationship_id, subject_id, object_id, type_id
o Example: AbcmRNA part_of Abc-gene
• featureprop: featureprop_id, feature_id, type_id, value, rank
o Examples: Repeat_motif, Product_size
• cvterm: cvterm_id, name, definition, cv_id, dbxref_id
• cv: cv_id, name, definition
o Examples: Sequence Ontology, Gene Ontology, etc.
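The feature tables on this slide can be tried out with a small sketch. It is an illustration only, not the GDR or CottonGen loader: it uses an in-memory SQLite database instead of PostgreSQL, keeps just the columns named on the slide, and the helper names (`add_term`, `add_feature`, `parts_of`) and the cv labels `sequence` and `relationship` are assumptions made for the example.

```python
import sqlite3

# Simplified subset of the Chado tables named on the slide (assumption:
# real Chado runs on PostgreSQL and has more columns and constraints).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT UNIQUE, definition TEXT);
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY,
                     cv_id INTEGER REFERENCES cv(cv_id),
                     name TEXT, definition TEXT);
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT, uniquename TEXT,
                      type_id INTEGER REFERENCES cvterm(cvterm_id),
                      organism_id INTEGER, residues TEXT);
CREATE TABLE feature_relationship (
    feature_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER REFERENCES feature(feature_id),
    object_id INTEGER REFERENCES feature(feature_id),
    type_id INTEGER REFERENCES cvterm(cvterm_id));
""")


def add_term(cv_name, term_name):
    """Insert a cvterm (creating its cv if needed) and return its cvterm_id."""
    conn.execute("INSERT OR IGNORE INTO cv (name) VALUES (?)", (cv_name,))
    cv_id = conn.execute("SELECT cv_id FROM cv WHERE name = ?",
                         (cv_name,)).fetchone()[0]
    return conn.execute("INSERT INTO cvterm (cv_id, name) VALUES (?, ?)",
                        (cv_id, term_name)).lastrowid


def add_feature(uniquename, type_id):
    """Insert a feature typed by a cvterm and return its feature_id."""
    return conn.execute(
        "INSERT INTO feature (name, uniquename, type_id) VALUES (?, ?, ?)",
        (uniquename, uniquename, type_id)).lastrowid


def parts_of(object_uniquename):
    """List (uniquename, type) of features that are part_of the given feature."""
    return conn.execute("""
        SELECT s.uniquename, st.name
        FROM feature_relationship fr
        JOIN feature s ON s.feature_id = fr.subject_id
        JOIN feature o ON o.feature_id = fr.object_id
        JOIN cvterm rt ON rt.cvterm_id = fr.type_id
        JOIN cvterm st ON st.cvterm_id = s.type_id
        WHERE o.uniquename = ? AND rt.name = 'part_of'
    """, (object_uniquename,)).fetchall()


# The slide's example: AbcmRNA part_of Abc-gene.
gene = add_term("sequence", "gene")
mrna = add_term("sequence", "mRNA")
part_of = add_term("relationship", "part_of")
abc_gene = add_feature("Abc-gene", gene)
abc_mrna = add_feature("AbcmRNA", mrna)
conn.execute(
    "INSERT INTO feature_relationship (subject_id, object_id, type_id) VALUES (?, ?, ?)",
    (abc_mrna, abc_gene, part_of))

print(parts_of("Abc-gene"))
```

Because every type and relationship is a row in cvterm, a new feature type (for example a QTL) needs a new term rather than a new table, which is the sense in which the schema is generic and ontology-driven.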
Storing Stock (from samples to population; pedigree)
• stock: stock_id, name, uniquename, type_id, organism_id
o Types: population, cultivar, breeding line, clone, sample, etc.
• stock_relationship: stock_relationship_id, subject_id, object_id, type_id
o Pedigree: Gala maternal_parent_of Sonya
o Sample: Gala-001 sample_of Gala
• stockprop: stockprop_id, stock_id, type_id, value
o Examples: description, population_size
• cvterm: cvterm_id, name, definition, cv_id, dbxref_id
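A minimal sketch of the pedigree and sample examples on this slide follows. It assumes an in-memory SQLite database with a reduced column set, and the helpers `term`, `add_stock`, `relate` and `subjects_of` are hypothetical names introduced only for this illustration.

```python
import sqlite3

# Reduced versions of the Chado stock tables (assumption: columns trimmed
# to what the example needs; real Chado has more columns and constraints).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE stock (stock_id INTEGER PRIMARY KEY, uniquename TEXT UNIQUE,
                    type_id INTEGER REFERENCES cvterm(cvterm_id));
CREATE TABLE stock_relationship (
    stock_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER REFERENCES stock(stock_id),
    object_id INTEGER REFERENCES stock(stock_id),
    type_id INTEGER REFERENCES cvterm(cvterm_id));
""")


def term(name):
    """Return the cvterm_id for a term name, creating the term if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    return conn.execute("SELECT cvterm_id FROM cvterm WHERE name = ?",
                        (name,)).fetchone()[0]


def add_stock(uniquename, type_name):
    """Insert a stock of the given type and return its stock_id."""
    return conn.execute("INSERT INTO stock (uniquename, type_id) VALUES (?, ?)",
                        (uniquename, term(type_name))).lastrowid


def relate(subject_id, relationship, object_id):
    """Record 'subject relationship object' in stock_relationship."""
    conn.execute(
        "INSERT INTO stock_relationship (subject_id, object_id, type_id) VALUES (?, ?, ?)",
        (subject_id, object_id, term(relationship)))


def subjects_of(object_name, relationship):
    """Names of stocks that stand in the given relationship to object_name."""
    rows = conn.execute("""
        SELECT s.uniquename
        FROM stock_relationship sr
        JOIN stock s ON s.stock_id = sr.subject_id
        JOIN stock o ON o.stock_id = sr.object_id
        JOIN cvterm t ON t.cvterm_id = sr.type_id
        WHERE o.uniquename = ? AND t.name = ?
        ORDER BY s.uniquename
    """, (object_name, relationship)).fetchall()
    return [name for (name,) in rows]


gala = add_stock("Gala", "cultivar")
sonya = add_stock("Sonya", "cultivar")
gala_001 = add_stock("Gala-001", "sample")
relate(gala, "maternal_parent_of", sonya)   # pedigree
relate(gala_001, "sample_of", gala)         # sample to cultivar

print(subjects_of("Sonya", "maternal_parent_of"))
print(subjects_of("Gala", "sample_of"))
```

The same table holds both pedigree and sample links; only the relationship term differs.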
Storing phenotype data (from measurements to projects)
• project, project_relationship
• nd_experiment: nd_experiment_id, nd_geolocation_id, type_id
o Types: Phenotyping, Genotyping, Cross_experiment
• NE_project, NE_stock, NE_phenotype: linking tables between nd_experiment and project, stock, and phenotype
• stock: stock_id, name, uniquename, type_id, organism_id
• Property table (columns as labeled on the slide): Featureprop_id, Feature_id, Type_id, value
• phenotype: phenotype_id, uniquename, value, attr_id
• cvterm: cvterm_id, name, definition, cv_id, dbxref_id
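The path from a measurement to its project can be sketched as a chain of joins through nd_experiment. This is an illustration under assumptions: an in-memory SQLite database, trimmed columns, the full linking-table names nd_experiment_project, nd_experiment_stock and nd_experiment_phenotype (abbreviated NE_* on the slide), and made-up demo rows (`demo_project`, `fruit_weight`, the value `150`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project (project_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE stock (stock_id INTEGER PRIMARY KEY, uniquename TEXT, type_id INTEGER);
CREATE TABLE nd_experiment (nd_experiment_id INTEGER PRIMARY KEY,
                            nd_geolocation_id INTEGER, type_id INTEGER);
CREATE TABLE phenotype (phenotype_id INTEGER PRIMARY KEY, uniquename TEXT,
                        value TEXT, attr_id INTEGER);
-- Linking tables (NE_project, NE_stock, NE_phenotype on the slide).
CREATE TABLE nd_experiment_project (nd_experiment_id INTEGER, project_id INTEGER);
CREATE TABLE nd_experiment_stock (nd_experiment_id INTEGER, stock_id INTEGER);
CREATE TABLE nd_experiment_phenotype (nd_experiment_id INTEGER, phenotype_id INTEGER);

-- Made-up demo rows, for illustration only.
INSERT INTO cvterm VALUES (1, 'phenotyping'), (2, 'sample'), (3, 'fruit_weight');
INSERT INTO project VALUES (1, 'demo_project');
INSERT INTO stock VALUES (1, 'Gala-001', 2);
INSERT INTO nd_experiment VALUES (1, NULL, 1);
INSERT INTO phenotype VALUES (1, 'Gala-001_fruit_weight', '150', 3);
INSERT INTO nd_experiment_project VALUES (1, 1);
INSERT INTO nd_experiment_stock VALUES (1, 1);
INSERT INTO nd_experiment_phenotype VALUES (1, 1);
""")


def phenotypes_in_project(project_name):
    """Return (stock, trait, value) rows measured within the named project."""
    return conn.execute("""
        SELECT s.uniquename, trait.name, ph.value
        FROM project p
        JOIN nd_experiment_project nep ON nep.project_id = p.project_id
        JOIN nd_experiment e ON e.nd_experiment_id = nep.nd_experiment_id
        JOIN nd_experiment_stock nes ON nes.nd_experiment_id = e.nd_experiment_id
        JOIN stock s ON s.stock_id = nes.stock_id
        JOIN nd_experiment_phenotype nepp ON nepp.nd_experiment_id = e.nd_experiment_id
        JOIN phenotype ph ON ph.phenotype_id = nepp.phenotype_id
        JOIN cvterm trait ON trait.cvterm_id = ph.attr_id
        WHERE p.name = ?
        ORDER BY s.uniquename, trait.name
    """, (project_name,)).fetchall()


print(phenotypes_in_project("demo_project"))
```

The trait itself is a cvterm referenced by `attr_id`, so adding a new trait means adding a term, not a column.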
Genotypic data integrated with genomic/genetic data
• project, stock, nd_experiment, map
• nd_experiment: nd_experiment_id, nd_geolocation_id, type_id
• NE_genotype: links nd_experiment to genotype
• genotype: genotype_id, name, uniquename, description
o Example: uniquename: CPSCT038_190|192, description: 190:192
• feature_genotype: links genotype to feature
• feature: feature_id, name, uniquename, type_id, organism_id, residues
o Example: uniquename: CPSCT038, type: microsatellite
• Explore sequences around marker in GBrowse
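The link between a genotype call and its marker feature can be sketched as follows. The marker and genotype values come from the slide; everything else is an assumption for illustration: an in-memory SQLite database with trimmed columns, a hypothetical stock named `stock-A`, the function name `alleles_by_stock`, and the reading of the description `190:192` as two colon-separated allele sizes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, uniquename TEXT, type_id INTEGER);
CREATE TABLE genotype (genotype_id INTEGER PRIMARY KEY, name TEXT,
                       uniquename TEXT, description TEXT);
CREATE TABLE feature_genotype (feature_id INTEGER, genotype_id INTEGER);
CREATE TABLE stock (stock_id INTEGER PRIMARY KEY, uniquename TEXT);
CREATE TABLE nd_experiment (nd_experiment_id INTEGER PRIMARY KEY,
                            nd_geolocation_id INTEGER, type_id INTEGER);
-- Linking tables (NE_genotype and NE_stock in the slides' shorthand).
CREATE TABLE nd_experiment_genotype (nd_experiment_id INTEGER, genotype_id INTEGER);
CREATE TABLE nd_experiment_stock (nd_experiment_id INTEGER, stock_id INTEGER);

INSERT INTO cvterm VALUES (1, 'microsatellite'), (2, 'genotyping');
INSERT INTO feature VALUES (1, 'CPSCT038', 1);
INSERT INTO genotype VALUES (1, 'CPSCT038_190|192', 'CPSCT038_190|192', '190:192');
INSERT INTO feature_genotype VALUES (1, 1);
INSERT INTO stock VALUES (1, 'stock-A');          -- hypothetical stock
INSERT INTO nd_experiment VALUES (1, NULL, 2);
INSERT INTO nd_experiment_genotype VALUES (1, 1);
INSERT INTO nd_experiment_stock VALUES (1, 1);
""")


def alleles_by_stock(marker_uniquename):
    """Return (stock, [allele sizes]) for every genotype call of a marker."""
    rows = conn.execute("""
        SELECT s.uniquename, g.description
        FROM feature f
        JOIN feature_genotype fg ON fg.feature_id = f.feature_id
        JOIN genotype g ON g.genotype_id = fg.genotype_id
        JOIN nd_experiment_genotype neg ON neg.genotype_id = g.genotype_id
        JOIN nd_experiment_stock nes ON nes.nd_experiment_id = neg.nd_experiment_id
        JOIN stock s ON s.stock_id = nes.stock_id
        WHERE f.uniquename = ?
        ORDER BY s.uniquename
    """, (marker_uniquename,)).fetchall()
    # Assumption: description holds allele sizes separated by ':'.
    return [(stock, [int(a) for a in desc.split(":")]) for stock, desc in rows]


print(alleles_by_stock("CPSCT038"))
```

Because the marker is an ordinary row in feature, the same record that anchors genotype calls can also carry map positions and genome coordinates, which is what allows the jump from a genotype to the sequence around the marker.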
Tripal Modules Developed
(Custom modules)
• Gene/Sequence Module
• Genetic Module
o Marker Search
o QTL Search
o Genotype Search
Gene/Sequence Module
Gene Search Results
Genetic Module (Marker Search)
Link to the Genotype
(Diversity) module
Genetic Module (Diversity Search)
Genetic Module (Diversity Search)
Long Form
Wide Form
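The two output layouts can be illustrated with a small sketch. It assumes that long form means one row per germplasm and marker pair, and wide form means one row per germplasm with one column per marker; the rows, the `to_wide` function and the `-` placeholder for missing calls are made up for the example and are not taken from the module.

```python
from collections import defaultdict

# Long form: one (stock, marker, genotype) row per call. Demo values only.
long_rows = [
    ("stock-A", "CPSCT038", "190|192"),
    ("stock-A", "marker-2", "201|201"),
    ("stock-B", "CPSCT038", "190|190"),
]


def to_wide(rows, missing="-"):
    """Pivot long-form rows into a header plus one row per stock."""
    markers = sorted({marker for _, marker, _ in rows})
    calls_by_stock = defaultdict(dict)
    for stock, marker, genotype in rows:
        calls_by_stock[stock][marker] = genotype
    header = ["stock"] + markers
    table = [
        [stock] + [calls.get(marker, missing) for marker in markers]
        for stock, calls in sorted(calls_by_stock.items())
    ]
    return header, table


header, table = to_wide(long_rows)
print("\t".join(header))
for row in table:
    print("\t".join(row))
```

Long form is closer to how the calls sit in the database (one row per genotype record), while wide form is the matrix layout that many downstream analysis tools expect.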
Genetic Module (QTL/MTL Search)
QTL to map
QTL to Germplasm
QTL to Marker to Diversity data
Future Directions
• Make the current modules available
o With a set of controlled vocabularies
o Bulk loader templates
• Further refinement of the modules
o QTL: add graphic interface to view the QTLs in the genome
o Further develop diversity module (integrate with phenotypic diversity and germplasm module)
o Germplasm (search page, integrate with image module, etc.)
o Data transformation functionality
• Introduce flexibility to the modules
o Allow adding users’ own CV
o Options to display certain data according to the CV
Acknowledgements
• Main Lab team members: Dorrie Main, Taein Lee, Stephen Ficklin, Jing Yu, Chun-Huai Cheng, Ping Zheng, Anna Blenda, Sushan Ru
• GDR Project co-PIs: Dorrie Main (PI), Bert Abbott, Cameron Peace, Kate Evans, Des Layne, Nnadozie Oraguzie, Mercy Olmstead, Fred Gmitter Jr., the RosBREED teams
• CottonGen – Don Jones, Richard Percy
• Rosaceae, Cotton and Bioinformatics Community
• USDA NIFA SCRI, NSF Plant Genome Program, MARS, USDA-ARS, Washington Tree Fruit Research Commission, Cotton Inc, WSU, Clemson University, University of Florida, Boyce Thompson Institute, Texas A&M