SAM-Grid Middleware - Rod Walker,ICL. SAM. JIM. RunJob. Conclusions. Rod Walker IC 13th March 2002 http://d0db.fnal.gov/sam.
SAM
• SAM stands for “Sequential Access to Data via Metadata”.
• Sequential access within files; the order of files isn’t important, e.g. HEP data.

History of SAM
• Project started in 1997 by FNAL Computing Division (not just physicists).
• Meant for FNAL experiments, and recently taken up by CDF.
• So far ~20 FTE years: a lot of effort.
• State of the art in Data Management.
• No-one else has tried to deliver TBs of user-selected data on demand.

Global file routing
• Many remote stations want files
– SAM allowed free-for-all to gridftp server.
– MSS access only from FNAL site, cache on private network, ...
• Needed control and routing
• Solution: all sites can route files, e.g.
– Get fnal files from fnal-router
– route=fnal.gov::nijmegen and nijmegen station has route=fnal.gov::fnal-router
• Janet - Geant - Esnet - FNAL, 155Mbit bottleneck.
• Janet - Geant - Surfnet - FNAL, Gbit(?)

SAM Status
• Middleware Development
• Global routing.
• Diverse deployments, e.g. private network, firewall, shared vs local disk cache.
• CDF deployment (GridPP)
• Bug fixes.
• GridFTP and Authentication (GridPP)
• Outlook
• Decreasing development. FNAL CD support for RunII.

JIM history
• Purpose: to build on SAM’s data handling, to create a real grid.
• Job definition & management
• Information & Monitoring
• Novel concepts
• Already have DH system.
• ups/upd packaging and deployment.
• rpm functionality plus multi-platform, tailoring.
• Little dependence on native installation, e.g. python v2.1f
• Hugely simplified deployment.
• Use Condor as resource broker.
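
The route settings quoted on the global file routing slide (route=fnal.gov::nijmegen, with the nijmegen station itself set to route=fnal.gov::fnal-router) read like hop-by-hop forwarding: each station only knows which station to ask next for files from a given domain. The following is a minimal Python sketch of that idea, not SAM's actual code. It assumes the setting has the form route=<domain>::<station>, that a station with no route for a domain fetches the files directly, and it invents the station name imperial and the function names parse_route, build_table and resolve_path purely for illustration.

```python
from typing import Dict, List, Tuple


def parse_route(setting: str) -> Tuple[str, str]:
    """Split 'route=<domain>::<station>' into (domain, next-hop station)."""
    key, sep, value = setting.partition("=")
    if key.strip() != "route" or not sep or "::" not in value:
        raise ValueError(f"not a route setting: {setting!r}")
    domain, station = value.split("::", 1)
    return domain.strip(), station.strip()


def build_table(settings: Dict[str, List[str]]) -> Dict[str, Dict[str, str]]:
    """Turn per-station setting strings into {station: {domain: next_hop}}."""
    return {
        station: dict(parse_route(s) for s in lines)
        for station, lines in settings.items()
    }


def resolve_path(requester: str, domain: str,
                 table: Dict[str, Dict[str, str]]) -> List[str]:
    """Follow next hops from the requester until a station has no route
    for the domain (assumed here to mean it reaches the source directly)."""
    path = [requester]
    while True:
        next_hop = table.get(path[-1], {}).get(domain)
        if next_hop is None:
            return path
        if next_hop in path:
            raise ValueError(f"routing loop: {path + [next_hop]}")
        path.append(next_hop)


# Hypothetical deployment: 'imperial' is an invented station name.
SETTINGS = {
    "imperial": ["route=fnal.gov::nijmegen"],
    "nijmegen": ["route=fnal.gov::fnal-router"],
    "fnal-router": [],  # no route entry: fetches fnal.gov files itself
}
TABLE = build_table(SETTINGS)

if __name__ == "__main__":
    # Requests travel along this chain; files flow back in reverse order.
    print(" -> ".join(resolve_path("imperial", "fnal.gov", TABLE)))
```

Keeping only a next hop per station, rather than a full path, means a site can be re-pointed at a different router (for example to avoid the 155Mbit link) by changing a single setting.
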
JIM components
• User Interface
• Job Definition language based on ClassAds
• RB reduced to making MMS ranking function
• Static & dynamic constraints: OS, code version, free CPU, …
• Plus external function to query DH system.
• Collaboration with Wisconsin.
• Choose gatekeeper, use external function, separate submission server from negotiator.

JIM components
• Information & Monitoring.
• Currently: grid sensors > ldap > MDS > PHP
• Developing: grid sensors > xml > native Db > PHP, other.
• Reliability, flexibility, persistency.
• Same model works for grid system book-keeping and user level monitoring.

[Diagram: Information Flow. Labelled components: User Interface, Parser, JDL, ClassAd, Condor Schedd, External Code, Condor-G, Condor Negotiator, Cin, Cout, GRAM, Condor Grid Manager, Gatekeeper, Batch System, Grid Sensors, Compute Resource, Execution Site, Condor Collector, Information and Monitoring]

RunJob
• Vital tool for d0 MC productions on farms.
• Chains, steers and parallelizes d0 executables. Creates metadata. Uses SAM to store to MSS.
• Now interfaced to SAM for input, and can handle real data and any d0 executables.
• Will be used for skimming, re-processing datasets, and user analysis.
• Fully automate monitoring, checking and storage.
• Work underway by UK.

RunJob status
• Maintenance & development of RunJob, and interface to SAM-Grid, entirely by UK.
• CMS using branch of RunJob for production.
• Dave Evans and Greg Graham collaborating on merging branches.
• Goal: single package with EDG and SAM-Grid interfaces.
• RunJob “server” or job-manager.
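
The JIM components slide describes the resource broker as a ranking function over static and dynamic constraints (OS, code version, free CPU) plus an external function that queries the data handling system. A rough Python sketch of that matchmaking pattern follows. It is not Condor ClassAd syntax and not JIM code: the dictionary keys, site names, code version labels, the dataset name and the cached-file counts are all invented for illustration, and the DH query is stubbed with a lookup table.

```python
from typing import Any, Callable, Dict, List, Optional

Ad = Dict[str, Any]  # stand-in for a ClassAd: a plain attribute dictionary


def meets_requirements(job: Ad, resource: Ad) -> bool:
    """Constraints: static (OS, code version) and dynamic (free CPUs)."""
    return (
        resource["os"] == job["os"]
        and job["code_version"] in resource["code_versions"]
        and resource["free_cpus"] >= job["cpus"]
    )


def make_rank(cached_files: Callable[[str, str], int]) -> Callable[[Ad, Ad], tuple]:
    """Build a rank that prefers sites already holding the job's data,
    using free CPUs only as a tie-breaker."""
    def rank(job: Ad, resource: Ad) -> tuple:
        return (cached_files(resource["site"], job["dataset"]),
                resource["free_cpus"])
    return rank


def match(job: Ad, resources: List[Ad],
          rank: Callable[[Ad, Ad], tuple]) -> Optional[Ad]:
    """Return the highest-ranked resource that satisfies the job, if any."""
    eligible = [r for r in resources if meets_requirements(job, r)]
    return max(eligible, key=lambda r: rank(job, r), default=None)


# Stub for the external data handling query (all values invented).
FAKE_CACHE = {
    ("fnal", "example-dataset"): 80,
    ("imperial", "example-dataset"): 10,
    ("nijmegen", "example-dataset"): 100,
}


def cached_files(site: str, dataset: str) -> int:
    return FAKE_CACHE.get((site, dataset), 0)


RESOURCES = [
    {"site": "fnal", "os": "linux", "code_versions": {"p13", "p14"}, "free_cpus": 4},
    {"site": "imperial", "os": "linux", "code_versions": {"p14"}, "free_cpus": 20},
    {"site": "nijmegen", "os": "linux", "code_versions": {"p13"}, "free_cpus": 50},
]
JOB = {"os": "linux", "code_version": "p14", "cpus": 2, "dataset": "example-dataset"}

if __name__ == "__main__":
    best = match(JOB, RESOURCES, make_rank(cached_files))
    print(best["site"] if best else "no match")
```

Passing the DH query in as a function keeps the matching logic independent of the data handling system, which mirrors the slide's point that the broker only needs a ranking function plus an external call.
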
SAM-Grid Logistics
[Architecture diagram. Labelled components: User Interface, Submission, Global Job Queue, Resource Selector, Grid Client, Match Making, Global DH Services, Info Gatherer, SAM Naming Server, Info Collector, SAM Log Server, Resource Optimizer, MSS, Cluster, Data Handling, Local Job Handling, SAM Station (+other servs), Grid Gateway, SAM Stager(s), Local Job Handler (CAF, RunJob, Vanilla, ...), JIM Advertise, Dist. FS, Worker Nodes, AAA, Cache, SAM DB Server, Site RC, MetaData Catalog, Bookkeeping Service, Info Manager, MDS, Web Serv, Info Providers, Grid Monitoring, XML DB server, Site Conf., Glob/Loc JID map, ..., User Tools, Site]

Conclusions
• Core SAM supported by FNAL CD
• Operational support via software shifts.
• UK currently contributes 2 experts on shift.
• JIM post-development support: bug fixing, deployment issues (like SAM).
• Will need software support shifts.
• RunJob is and will be UK supported.
• Expanding functionality: analysis, reprocessing.
• Increasing deployment: d0 sites, CMS.
• On target for end-March deliverable, and production Grid in April.

JIM V1: Package dependencies
[Dependency diagram. Packages shown: samgrid, jim_broker_client, jim_client, sam_common, xml_meta_configurator, sam_config, server_run, jim_broker, jim_info_providers, jim_advertise, galax, orbacus, jim_jobmanagers, globus, jim_www, jim_sandbox]