Grist: Grid Computing for Astronomy

Scaling NVO Services to the Teragrid
Roy Williams
Conrad Steenberg
Craig Miller
Matthew Graham
Joe Jacob
Julian Bunn
Desired Characteristics of NVO Services
• Service oriented architecture
  • Services should be easily and quickly deployable and usable on workstations or supercomputers
• Service developers/deployers are trusted users
  • Service developer acts as a broker between computing customer and computer center
  • Services deployed, managed, and upgraded by their developers
• From “clicking” to “scripting”
  • Easy to start, but great power is possible
  • Services may be accessed by clicking on a web page or with scripted client codes
• Asynchrony for compute intensive jobs
  • Jobs submitted to batch queue
  • Unique sessionID may be used to monitor job & return results
• Service users authenticated with “graduated security”
  • Authentication for web clicking comes from a certificate store or fat browser
  • Scripted access requires a certificate (strong or weak) straight from the client
• Services as workflow components
  • A service user may be another service (a computer, not a human!)
A “Graduated Security” Model
• Power user: scripted access
• Portal-based:
  • Big-iron computing: full TeraGrid account, browser access
  • More science: get NVO weak certificate; access logged, but identity not verified
  • Some science: web form, anonymous access, small jobs
Traditional Grid Security
[Diagram, dialogue with the client: “Show us your Certificate!” and “I will do exactly what you want.”]
Graduated Security
[Diagram, dialogue with the client: “May I have your Request and your Certificate?”]
Certificates
The Virtual Observatory as a Virtual Organization
This is a US driver’s licence.
In the US it proves identity strongly.
It is like a strong certificate.
This is a loyalty card where I buy food.
(You can put a false address on the application.)
It is like a weak certificate.
This is a $50 gift card at a bookstore.
It does not prove my identity in any way.
It is like an anonymous certificate.
Graduated security
• No certificate gets 15 CPU-minutes from community account
  • Just switch on Javascript
• Weak certificate gets 1 CPU-hour from community account
  • In exchange for registering name/email
• Strong (gridmapped) certificate gets infinity from own account
  • Get this one from TeraGrid HQ
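The tiers above can be sketched as a small lookup in Python. This is only an illustration, not Nesssi's own code: the numeric ceilings are taken from the nesssi_*_max_time settings on this slide, the unit is assumed to be seconds (900 s is 15 minutes and 3600 s is 1 hour, which matches the bullets), and the function names max_cpu_seconds and job_allowed are hypothetical.

```python
# Ceilings copied from the nesssi_*_max_time settings on this slide.
# Assumption: the values are in seconds (900 s = 15 min, 3600 s = 1 h).
LIMITS = {
    "anon": 900,       # nesssi_anon_max_time
    "weak": 3600,      # nesssi_weak_cert_max_time
    "strong": 216000,  # nesssi_strong_cert_max_time (60 hours)
}


def max_cpu_seconds(cert_level):
    """Return the CPU-time ceiling for 'anon', 'weak', or 'strong'."""
    try:
        return LIMITS[cert_level]
    except KeyError:
        raise ValueError("unknown certificate level: %r" % (cert_level,))


def job_allowed(cert_level, estimated_seconds):
    """Hypothetical admission check: does the estimate fit the tier?"""
    return estimated_seconds <= max_cpu_seconds(cert_level)
```

A service that can estimate its own run time could use a check like this to tell an anonymous user to register before submitting a larger job.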
"nesssi_strong_cert_max_time" : 216000,
"nesssi_weak_cert_max_time" : 3600,
"nesssi_anon_max_time" : 900,
"nesssi_anon_user" : "nvo",
"nesssi_weak_user" : "nvo",
service implementation
web forms
python API
graduated security
Certificates
multiple browsers
certificate chains
root certificates
proxy certificates
proxy certificate chains
2nd level proxy chains
secure https redirection
TeraGrid security police
Caltech security police
NCSA security police
chown directory ownership
NFS root-squashing
pubcookie
Three Interfaces
• Commandline with Python & Java
  • Cert or proxy in wacko place like .globus or /tmp/x509up_u<uid>
• Fat Browser
  • https: and browser-managed PKCS12 certs
• Thin Browser
  • Web Proxy works dynamically with cert authority
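For the commandline interface, a script first has to find the credential. The sketch below lists the places a Globus-style client conventionally looks: the X509_USER_PROXY environment variable, then /tmp/x509up_u<uid>, then ~/.globus/usercert.pem. These locations are Globus Toolkit conventions rather than anything stated on the slide, and the helper names are hypothetical.

```python
import os


def candidate_credential_paths(environ=None, uid=None, home=None):
    """List conventional Globus credential locations, most specific first."""
    environ = os.environ if environ is None else environ
    uid = os.getuid() if uid is None else uid
    home = os.path.expanduser("~") if home is None else home
    paths = []
    # An explicit override wins if the user has set one.
    if environ.get("X509_USER_PROXY"):
        paths.append(environ["X509_USER_PROXY"])
    # Default proxy file written by grid-proxy-init.
    paths.append("/tmp/x509up_u%d" % uid)
    # Long-lived user certificate.
    paths.append(os.path.join(home, ".globus", "usercert.pem"))
    return paths


def find_credential(paths):
    """Return the first path that exists as a file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None
```

Calling find_credential(candidate_credential_paths()) returns None when no credential is present, which a client could treat as a cue to fall back to anonymous access.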
Commandline Portal
[Diagram. Components: client; Certificate Authority (certificate policies, select user account); nesssi node; queue; TeraGrid cluster nodes; sandbox storage. Links: get certificate; build proxy; XML-RPC with proxy; open http.]
Fat Browser Portal
[Diagram. Components: Browser; Certificate Authority (certificate policies, select user account); nesssi node; queue; TeraGrid cluster nodes; sandbox storage. Links: load certificate; JSON-RPC with certificate; open http.]
Web Portal
[Diagram. Components: client; web form; nesssi web portal node (certificate services, certificate policies, select user account); nesssi node; queue; cluster nodes; sandbox storage. Links: fetch proxy; SOAP http; open http.]
Exercise: Running a Nesssi Service
see http://us-vo.org/nesssi
SessionID and Sandbox
• Identify which job we are talking about
  • 32 character hex string, e.g. cb28d0753a7fec9a485981f741d425ec
• Used to monitor a running job
  sessionID = remoteserver.cutout.init()
  msg = remoteserver.cutout.monitor(sessionID)
• Used to form URL where results appear, e.g.
  http://dtf-test1.sdsc.teragrid.org:8080/clarens/shell/cb/cb28d0753a7fec9a485981f741d425ec/cutouts/index.htm
• If you lose the sessionID, you lose your job
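The way a sessionID maps to a results URL can be sketched in Python. The layout is inferred from the single example URL on this slide: the path appears to be /clarens/shell/, then the first two characters of the sessionID, then the full sessionID, then a service directory. That pattern is an assumption, and results_url is a hypothetical helper, not part of the Nesssi client.

```python
import re


def results_url(base_url, session_id, service_dir="cutouts", page="index.htm"):
    """Build the sandbox URL for a session, following the example on the slide.

    Assumption: the sandbox lives under /clarens/shell/<first two hex
    characters>/<sessionID>/<service_dir>/, as in the one URL shown.
    """
    if not re.fullmatch(r"[0-9a-f]{32}", session_id):
        raise ValueError("sessionID should be a 32 character hex string")
    return "%s/clarens/shell/%s/%s/%s/%s" % (
        base_url.rstrip("/"),
        session_id[:2],
        session_id,
        service_dir,
        page,
    )
```

Writing the sessionID to a file as soon as init() returns is a cheap way to avoid losing the job.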
DPOSS Mosaic Service
import nesssi

nesssiServer = nesssi.client('https://envoy.cacr.caltech.edu:8443/clarens/', debug=0)
# Keyword-value arguments describing the mosaic to build
mosaic_loc = "-ra 49.1 -dec 60.1 -rawidth 0.5 -decwidth 0.5 -filt f -bgcorr 0"
session = nesssiServer.dpossMosaic.mosaic(mosaic_loc)
print("Your session ID is %s." % session)
# Repeat the monitoring call to follow progress
msg = nesssiServer.dpossMosaic.monitor(session)
print(msg)
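Repeating the monitoring call by hand gets tedious, so a scripted client would normally poll. The sketch below is a generic polling loop in Python; how monitor() signals completion is not stated in these slides, so the caller supplies an is_done predicate. The names poll and is_done are hypothetical.

```python
import time


def poll(monitor, session_id, is_done, interval=5.0, max_polls=100,
         sleep=time.sleep):
    """Call monitor(session_id) until is_done(message) is true.

    monitor  -- a callable such as nesssiServer.dpossMosaic.monitor
    is_done  -- caller-supplied test on the returned message (an assumption:
                the slides do not say how completion is reported)
    Returns the final message, or raises TimeoutError after max_polls tries.
    """
    for _ in range(max_polls):
        message = monitor(session_id)
        if is_done(message):
            return message
        sleep(interval)
    raise TimeoutError(
        "job %s not finished after %d polls" % (session_id, max_polls))
```

With the client above this might be called as poll(nesssiServer.dpossMosaic.monitor, session, is_done=my_test), where my_test inspects whatever text the service writes to its progress page.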
Cutout Service
import nesssi

nesssiServer = nesssi.client('https://envoy.cacr.caltech.edu:8443/clarens/', debug=0)
sessionID = nesssiServer.cutout.init()
print("Session id is %s" % sessionID)
# Upload locations file
nesssiServer.upload_file("inputfile.xml", "inputfile.xml")
# Arguments for service: surveys to use and cutout size
args = "-surveys PQ:gr,PQ:gi,PQ:z1,PQ:z2,SDSS:r,SDSS:i,SDSS:z,2MASS:k,2MASS:h "
args += "-size 64"
# Run service
nesssiServer.cutout.run(sessionID, args)
Cutout Monitoring
[Figure: cutouts from Palomar-Quest, SDSS, and 2MASS of sources from the Veron quasar catalog]
Synoptic Coaddition service
[Figure: Palomar-Quest Survey coverage map, Max=18]
Making a Service
• Developer builds script
  • Keyword-value pairs on command line [+uploaded files]
  • Sandbox location given on cmdline -- all files staged there
  • Should make index.htm in sandbox for progress
  • Make Nesssi connector for init(), upload(), run(), monitor()
• Nesssi admin installs your service
  • Interview first
  • Symlink to code
  • Code is cached, restart server after edit
  • Developer gets right to restart server (running jobs not affected)
  • Service instantiations farmed out to cluster with PBS
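To make the init(), upload(), run(), monitor() lifecycle concrete, here is a local stand-in written in Python. It is not the Nesssi connector API: the class name LocalConnector, the method signatures, and the on-disk layout are all hypothetical, and run() only records the request instead of submitting a PBS job.

```python
import html
import os
import tempfile
import uuid


class LocalConnector(object):
    """Hypothetical in-process stand-in for a service connector."""

    def __init__(self, root=None):
        # Each session gets its own sandbox directory under this root.
        self.root = root or tempfile.mkdtemp()

    def _sandbox(self, session_id):
        return os.path.join(self.root, session_id[:2], session_id)

    def init(self):
        """Create a sandbox and return its 32 character hex sessionID."""
        session_id = uuid.uuid4().hex
        os.makedirs(self._sandbox(session_id))
        return session_id

    def upload(self, session_id, name, data):
        """Stage an input file inside the session's sandbox."""
        path = os.path.join(self._sandbox(session_id), os.path.basename(name))
        with open(path, "w") as handle:
            handle.write(data)

    def run(self, session_id, args):
        """Record the request; a real service would queue a batch job here."""
        path = os.path.join(self._sandbox(session_id), "index.htm")
        with open(path, "w") as handle:
            handle.write("<html><body>Accepted: %s</body></html>"
                         % html.escape(args))

    def monitor(self, session_id):
        """Return the current contents of the progress page."""
        path = os.path.join(self._sandbox(session_id), "index.htm")
        if not os.path.exists(path):
            return "not started"
        with open(path) as handle:
            return handle.read()
```

The point of the sketch is the contract: every call after init() is keyed by the sessionID, and the progress page in the sandbox is the only state monitor() needs.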
Server side code
Application example:
dposs.py -dir sandbox \
-ra 123 -dec 22.7 \
-rawidth 0.4 -decwidth 0.4 \
-filt j -bgcorr 1
It should:
(1) Use keyword-value arguments and uploaded files
(2) Read/Write results in the given sandbox directory
(3) Write a progress file in sandbox/index.htm
(4) Estimate limits for anon/weak/strong certs
Service code will be symlinked from server code directory
Requires sudo server restart to see the service
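A service script that follows points (1) to (3) could be structured like the Python sketch below. The option names come from the dposs.py example above; everything else (the function names, the HTML written to the progress page, treating every option as required) is an assumption, and the actual image processing is left out.

```python
import argparse
import html
import os


def parse_args(argv):
    """Parse keyword-value arguments in the style of the dposs.py example."""
    parser = argparse.ArgumentParser(description="Hypothetical mosaic service")
    parser.add_argument("-dir", dest="sandbox", required=True,
                        help="sandbox directory given by the server")
    parser.add_argument("-ra", type=float, required=True)
    parser.add_argument("-dec", type=float, required=True)
    parser.add_argument("-rawidth", type=float, required=True)
    parser.add_argument("-decwidth", type=float, required=True)
    parser.add_argument("-filt", required=True)
    parser.add_argument("-bgcorr", type=int, required=True)
    return parser.parse_args(argv)


def write_progress(sandbox, message):
    """Overwrite sandbox/index.htm with the latest progress message."""
    with open(os.path.join(sandbox, "index.htm"), "w") as handle:
        handle.write("<html><body><p>%s</p></body></html>\n"
                     % html.escape(message))


def main(argv):
    args = parse_args(argv)
    os.makedirs(args.sandbox, exist_ok=True)
    write_progress(args.sandbox, "Started: ra=%s dec=%s filt=%s"
                   % (args.ra, args.dec, args.filt))
    # Real processing would go here, reading and writing only in the sandbox.
    write_progress(args.sandbox, "Finished")
    return 0
```

A launcher would call main(sys.argv[1:]) after importing sys; keeping main() free of global state makes the script easy to exercise outside the server.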
Client-side Javascript
<input type="button"
onclick="connect_nesssi('dposs')"
name="Connect"
value="Connect to Nesssi">
The argument 'dposs' is the service name.
Nesssi then expects to run remote services called:
dposs.init(), dposs.run(), dposs.monitor()
Client-side Javascript
The form for the user:
<form name="Parameters">
<input name="ra" value="202.4682">
Etc…
Developer converts the form to a string:
function getparams() {
  var params =
    "-ra " + document.Parameters.ra.value + " " +
    "-rawidth " + document.Parameters.rawidth.value + " " +
    "-dec " + document.Parameters.dec.value + " " +
    "-decwidth " + document.Parameters.decwidth.value + " " +
    "-filt " + filt + " " +
    "-bgcorr " + bgCorr;
  return params;
}
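A scripted client needs the same conversion without a browser. The Python sketch below builds the keyword-value string from a dictionary; build_params is a hypothetical helper, and the default key order simply mirrors the Javascript function above.

```python
def build_params(values,
                 order=("ra", "rawidth", "dec", "decwidth", "filt", "bgcorr")):
    """Join keyword-value pairs into the '-key value' string a service expects.

    values -- mapping from parameter name to value
    order  -- which keys to emit, and in what sequence (an assumption: the
              same order as the Javascript getparams() example)
    """
    return " ".join("-%s %s" % (key, values[key]) for key in order)
```

The resulting string can be passed straight to a service call such as the mosaic or run methods shown earlier.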
Nesssi Assets
• Graduated security
  • Anonymous, Registered, Known
• Multiple interfaces
  • Fat browser, Web proxy, Scripted
• Multiple implementations
  • cacr.caltech.edu and sdsc.teragrid.org
• Some useful services
  • Hyperatlas mosaic, Cutouts, Synoptic coaddition
• TeraGrid acceptance of security model
  • Server runs a job as somebody else
  • Anonymous access to TeraGrid!!