Transcript Document
GDR/CottonGen: Converting legacy sites to Tripal
Sook Jung, Jing Yu, Taein Lee, Chun-Huai Cheng, Stephen Ficklin, Dorrie Main

Talk Overview
• Introduction of GDR
• Moving to Chado/Tripal
  o GDR
  o CottonGen
• How to store data in Chado
• Tripal modules developed
• Future Directions

GDR (2002 ~)
Integrated Data & Tools: Genomics, Genetics, Germplasm, Diversity, Breeding
• Genomics
  o EST Unigenes
  o WGS and annotation
• Genetics
  o Markers and maps
  o QTL
  o Molecular diversity
• Breeding
  o Genotypic data
  o Phenotypic data
  o Germplasm data

GDR's Progress
[Timeline figure. The box text was interleaved in the transcript; the grouping below is a best-effort reading.]
• GDR in in-house schema (2002 ~)
• GDR in Drupal (2009)
• Tripal instance created (genomic data in Chado) (2010)
• Genetic data moved to Chado (2010)
• Breeders Toolbox developed (not in Tripal) (2010-2011)
• ND schema developed & breeding data (2013)
• New custom/extension Tripal modules for gene, marker, QTL, genotype, publication (2013)
• GDR in Tripal

CottonDB to CottonGen
[Timeline figure. The box text was interleaved in the transcript; the grouping below is a best-effort reading.]
• CottonDB on WSU servers (Oct 2011)
• Tripal instance created
• Develop ICGI website
• Add CottonDB data in Chado
• Tools
• Gene, marker search pages (Oct 2012)
• CottonGen released (1st year)
• Phenotype, QTL, publication search pages added (2013)

Chado: Modular, Generic and Ontology-driven schema
• feature: feature_id, name, uniquename, type_id, organism_id, residues
  o Stores gene, mRNA, marker, QTL, etc.
• feature_relationship: feature_relationship_id, subject_id, object_id, type_id
  o Example: AbcmRNA part_of Abc-gene
• featureprop: featureprop_id, feature_id, type_id, value, rank
  o Examples: repeat_motif, product_size
• cvterm: cvterm_id, name, definition, cv_id, dbxref_id
• cv: cv_id, name, definition
  o Sequence Ontology, Gene Ontology, etc.

Storing Stock (from samples to population; pedigree)
• stock: stock_id, name, uniquename, type_id, organism_id, residues
  o Stores population, cultivar, breeding line, clone, sample, etc.
• stock_relationship: stock_relationship_id, subject_id, object_id, type_id
  o Pedigree example: Gala maternal_parent_of Sonya
  o Sample example: Gala-001 sample_of Gala
• stockprop: stockprop_id, stock_id, type_id, value
  o Examples: description, population_size
• cvterm: cvterm_id, name, definition, cv_id, dbxref_id

Storing phenotype data (from measurements to projects)
• project
• project_relationship
• nd_experiment: nd_experiment_id, nd_geolocation_id, type_id
  o Types: Phenotyping, Genotyping, Cross_experiment
• NE_project, NE_stock, NE_phenotype (linking tables nd_experiment_project, nd_experiment_stock, nd_experiment_phenotype)
• stock: stock_id, name, uniquename, type_id, organism_id, residues
• stockprop: stockprop_id, stock_id, type_id, value
• phenotype: phenotype_id, uniquename, value, attr_id
• cvterm: cvterm_id, name, definition, cv_id, dbxref_id

Genotypic data integrated with genomic/genetic data
• Linked entities: map, stock, project, nd_experiment, feature
• Explore sequences around marker in GBrowse
• nd_experiment: nd_experiment_id, nd_geolocation_id, type_id
• NE_genotype (linking table nd_experiment_genotype)
• genotype: genotype_id, name, uniquename, description
  o Example: uniquename: CPSCT038_190|192; description: 190:192
• feature_genotype
• feature: feature_id, name, uniquename, type_id, organism_id, residues
  o Example: uniquename: CPSCT038; type: microsatellite

Tripal Modules Developed (Custom modules)
• Gene/Sequence Module
• Genetic Module
  o Marker Search
  o QTL Search
  o Genotype Search

Gene/Sequence Module
• Gene Search
• Results

Genetic Module (Marker Search)
• Link to the Genotype (Diversity) module

Genetic Module (Diversity Search)
• Long Form
• Wide Form

Genetic Module (QTL/MTL Search)
• QTL to map
• QTL to Germplasm
• QTL to Marker to Diversity data

Future Directions
• Make the current modules available
  o With a set of controlled vocabularies
  o Bulk loader templates
• Further refinement of the modules
  o QTL: add graphic interface to view the QTLs in the genome
  o Further develop diversity module (integrate with phenotypic diversity and germplasm module)
  o Germplasm (search page, integrate with image module, etc.)
  o Data transformation functionality
• Introduce flexibility to the modules
  o Allow adding users' own CV
  o Options to display certain data according to the CV

Acknowledgements
• Main Lab team members: Dorrie Main, Taein Lee, Stephen Ficklin, Jing Yu, Chun-Huai Cheng, Ping Zheng, Anna Blenda, Sushan Ru
• GDR Project co-PIs: Dorrie Main (PI), Bert Abbott, Cameron Peace, Kate Evans, Des Layne, Nnadozie Oraguzie, Mercy Olmstead, Fred Gmitter Jr., the RosBREED teams
• CottonGen: Don Jones, Richard Percy
• Rosaceae, Cotton and Bioinformatics Community
• USDA NIFA SCRI, NSF Plant Genome Program, MARS, USDA-ARS, Washington Tree Fruit Research Commission, Cotton Inc, WSU, Clemson University, University of Florida, Boyce Thompson Institute, Texas A&M
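
Appendix sketch: the ontology-driven storage pattern from the "Chado: Modular, Generic and Ontology-driven schema" slide can be illustrated with a small in-memory example. This is a minimal sketch, not the GDR/CottonGen loaders: it uses SQLite rather than PostgreSQL, keeps only a subset of the columns named on the slide (organism_id and dbxref_id are omitted), and the helper names (`add_term`, `add_feature`, `build_demo`, `relationships`, `props`), the cv names, and the property value are hypothetical. Only the feature names AbcmRNA, Abc-gene, and CPSCT038 and the terms part_of, microsatellite, and repeat_motif come from the slides.

```python
import sqlite3

# Simplified stand-in for a few Chado tables. Real Chado has more columns
# and constraints (e.g. organism_id, dbxref_id), which are omitted here.
SCHEMA = """
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT UNIQUE, definition TEXT);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    cv_id INTEGER NOT NULL REFERENCES cv(cv_id),
    name TEXT NOT NULL,
    definition TEXT
);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    name TEXT,
    uniquename TEXT NOT NULL,
    type_id INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
    residues TEXT
);
CREATE TABLE feature_relationship (
    feature_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES feature(feature_id),
    object_id INTEGER NOT NULL REFERENCES feature(feature_id),
    type_id INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
CREATE TABLE featureprop (
    featureprop_id INTEGER PRIMARY KEY,
    feature_id INTEGER NOT NULL REFERENCES feature(feature_id),
    type_id INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
    value TEXT,
    rank INTEGER NOT NULL DEFAULT 0
);
"""


def add_term(conn, cv_name, term_name):
    """Create the vocabulary if needed, add a term to it, return cvterm_id."""
    conn.execute("INSERT OR IGNORE INTO cv (name) VALUES (?)", (cv_name,))
    cv_id = conn.execute(
        "SELECT cv_id FROM cv WHERE name = ?", (cv_name,)
    ).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO cvterm (cv_id, name) VALUES (?, ?)", (cv_id, term_name)
    )
    return cur.lastrowid


def add_feature(conn, uniquename, type_id):
    """Insert a feature whose kind (gene, mRNA, marker...) is a cvterm."""
    cur = conn.execute(
        "INSERT INTO feature (name, uniquename, type_id) VALUES (?, ?, ?)",
        (uniquename, uniquename, type_id),
    )
    return cur.lastrowid


def build_demo():
    """Load the slide examples into a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # cv names below are hypothetical labels for this sketch.
    gene_t = add_term(conn, "sequence", "gene")
    mrna_t = add_term(conn, "sequence", "mRNA")
    marker_t = add_term(conn, "sequence", "microsatellite")
    part_of = add_term(conn, "relationship", "part_of")
    motif_t = add_term(conn, "feature_property", "repeat_motif")

    gene = add_feature(conn, "Abc-gene", gene_t)
    mrna = add_feature(conn, "AbcmRNA", mrna_t)
    marker = add_feature(conn, "CPSCT038", marker_t)

    # Slide example: AbcmRNA part_of Abc-gene (subject, type, object).
    conn.execute(
        "INSERT INTO feature_relationship (subject_id, object_id, type_id) "
        "VALUES (?, ?, ?)",
        (mrna, gene, part_of),
    )
    # The value is a placeholder, not real marker data.
    conn.execute(
        "INSERT INTO featureprop (feature_id, type_id, value) VALUES (?, ?, ?)",
        (marker, motif_t, "example-value"),
    )
    return conn


def relationships(conn):
    """Return (subject, relationship type, object) triples by name."""
    return conn.execute(
        """
        SELECT s.uniquename, t.name, o.uniquename
        FROM feature_relationship fr
        JOIN feature s ON s.feature_id = fr.subject_id
        JOIN cvterm t ON t.cvterm_id = fr.type_id
        JOIN feature o ON o.feature_id = fr.object_id
        """
    ).fetchall()


def props(conn, uniquename):
    """Return (property name, value) pairs for one feature."""
    return conn.execute(
        """
        SELECT t.name, fp.value
        FROM featureprop fp
        JOIN feature f ON f.feature_id = fp.feature_id
        JOIN cvterm t ON t.cvterm_id = fp.type_id
        WHERE f.uniquename = ?
        ORDER BY fp.rank
        """,
        (uniquename,),
    ).fetchall()
```

Calling `relationships(build_demo())` returns the single triple `("AbcmRNA", "part_of", "Abc-gene")`. The point of the design is that a new data type or property needs only a new cvterm row, not a new table or column, which is what lets the same tables hold genes, markers, and QTL.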