IU OREChem Summary Slides
Marlon Pierce, Geoffrey Fox, Sashikiran Challa
IU’s ORE-CHEM Pipeline
• Harvest NIH PubChem for 3D structures
• Convert PubChem XML to CML
• Convert CML to Gaussian input
• Submit jobs to TeraGrid with Swarm
• Convert Gaussian output to CML
• Convert CML to RDF->OREChem
• Insert RDF into RDF triple store
Goal is to create a public, searchable triple store populated with ORE-CHEM data on drug-like molecules.
Conversions are done with the Jumbo/CML tools from Peter Murray-Rust’s group at Cambridge.
Swarm is a Web service capable of managing tens of thousands of jobs on the TeraGrid.
We are developing a Dryad version of the pipeline.
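The conversion steps named on this slide can be read as a chain in which each stage consumes the previous stage's output format. The following is a minimal sketch of that idea, not the actual IU implementation: the stage order is inferred from the data formats, and `make_stage`, `PIPELINE`, `run_pipeline`, and the record fields are hypothetical names. The stage functions only tag a record; they do not call Jumbo, Swarm, Gaussian, or a triple store.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Stage = Tuple[str, Callable[[Record], Record]]


def make_stage(output_format: str) -> Callable[[Record], Record]:
    """Build a placeholder stage that only records the format it would produce.

    Assumption: a real stage would invoke a converter or job service here.
    """
    def run(record: Record) -> Record:
        history = list(record["history"]) + [output_format]
        return {**record, "format": output_format, "history": history}
    return run


# Assumed ordering of the steps shown on the slide.
PIPELINE: List[Stage] = [
    ("Convert PubChem XML to CML", make_stage("cml")),
    ("Convert CML to Gaussian input", make_stage("gaussian-input")),
    ("Submit job to TeraGrid with Swarm", make_stage("gaussian-output")),
    ("Convert Gaussian output to CML", make_stage("cml")),
    ("Convert CML to RDF (OREChem)", make_stage("rdf")),
    ("Insert RDF into triple store", make_stage("stored")),
]


def run_pipeline(record: Record) -> Record:
    """Pass one harvested record through every stage in order."""
    for _name, stage in PIPELINE:
        record = stage(record)
    return record


# A harvested PubChem record is the starting point (identifier is made up).
harvested = {"id": "example-compound", "format": "pubchem-xml",
             "history": ["pubchem-xml"]}
result = run_pipeline(harvested)
print(result["history"])
```

Keeping the stages in a plain list makes it easy to swap the execution back end (for example, the Dryad version mentioned above) without changing the stage definitions.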
Swarm-Grid
• Swarm considers traditional Grid HPC clusters suitable for high-throughput jobs.
– Parallel jobs (e.g. MPI jobs)
– Long running jobs
• Resource Ranking Manager
– Prioritizes the resources with QBETS, INCA
• Fault Manager
– Fatal faults
– Recoverable faults
[Figure: Swarm-Grid architecture. Labels: Standard Web Service Interface; Request Manager; Resource Ranking Manager; Data Model Manager; Fault Manager; User A’s Job Board; Job Queue; Job Distributor; Local RDBMS; QBETS Web Service (hosted by UCSB); MyProxy Server (hosted by TeraGrid Project); Grid HPC/Condor pool Resource Connector; Condor (Grid/Vanilla) with Birdbath; Grid HPC Clusters; Condor Cluster.]
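The slide names a Resource Ranking Manager and a Fault Manager that separates fatal from recoverable faults, but gives no detail on either policy. The sketch below shows one way the two could interact, under stated assumptions: resources are ranked by a predicted queue wait (the kind of estimate a QBETS-like service provides) after filtering on a health flag (the kind of status an INCA-like monitor provides), a recoverable fault moves the job to the next-ranked resource, and a fatal fault is raised to the caller. All class, field, and function names are hypothetical and are not Swarm's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class RecoverableFault(Exception):
    """Assumed fault class: the job may succeed on another resource."""


class FatalFault(Exception):
    """Assumed fault class: resubmitting the job cannot help."""


@dataclass
class Resource:
    name: str
    predicted_wait_s: float  # hypothetical queue-wait estimate
    healthy: bool            # hypothetical monitoring status


def rank_resources(resources: List[Resource]) -> List[Resource]:
    """Drop unhealthy resources, then order by shortest predicted wait."""
    usable = [r for r in resources if r.healthy]
    return sorted(usable, key=lambda r: r.predicted_wait_s)


def submit_with_recovery(job: str,
                         resources: List[Resource],
                         submit: Callable[[str, Resource], None],
                         max_attempts: int = 3) -> Optional[str]:
    """Try ranked resources in order; return the name of the one that accepted.

    Recoverable faults fall through to the next resource. Fatal faults are
    not caught, so they propagate to the caller. Returns None if every
    attempt ended in a recoverable fault.
    """
    for attempt, resource in enumerate(rank_resources(resources)):
        if attempt >= max_attempts:
            break
        try:
            submit(job, resource)
            return resource.name
        except RecoverableFault:
            continue
    return None


# Illustrative data only; the cluster names and wait times are made up.
resources = [
    Resource("cluster-a", 600.0, True),
    Resource("cluster-b", 120.0, True),
    Resource("cluster-c", 30.0, False),
]


def flaky_submit(job: str, resource: Resource) -> None:
    # Simulate a temporary failure on the best-ranked resource.
    if resource.name == "cluster-b":
        raise RecoverableFault("queue temporarily unavailable")


chosen = submit_with_recovery("job-1", resources, flaky_submit)
print(chosen)  # cluster-a
```

Here `cluster-c` is skipped because it is marked unhealthy, `cluster-b` is tried first and fails with a recoverable fault, and the job lands on `cluster-a`.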