RLS Tier-1 Deployment
James Casey,
PPARC-LCG Fellow, CERN
[email protected]
10th GridPP Meeting, CERN, 3rd June 2004
Overview
RLS for LCG at CERN and the Tier-1s
Tools for deploying the RLS
Deployment Progress
Evolution of RLS Architecture
Summary
RLS at CERN
The initial design of the RLS system involved distributed Local Replica Catalogs (LRCs) and Replica Location Indices (RLIs)
RLIs were never deployed in a production environment
The EDG RLS was chosen as the single Grid Catalog for LCG-1/LCG-2
Since RLIs were not proven, a single RLS (LRC/RMC) was provided at CERN for each LCG Virtual Organization
A single, separate Replica Metadata Catalog (RMC)
EDG supported the RLS on both Oracle and MySQL/Tomcat
Different VOs and different sites may need different levels of Quality of Service
Oracle was chosen for the LCG deployment at CERN
CERN IT-DB has extensive experience with Oracle
The single catalog required high availability if LCG was to work
The catalog acts as the single point of access to grid files
RLS at Tier-1s (1/2)
New CERN Oracle contract in December 2002
Based on “named users”
Products can be used anywhere in the world by any of the ‘named users’
Sites do not need to buy licenses for Oracle servers running LCG applications
Many Tier-1s already have Oracle experience
RAL, NIKHEF, Taiwan, FZK, CNAF, …
Tier-1s run central services, and require the corresponding manageability and availability
A natural choice, then, to try to deploy RLS components at the Tier-1s using Oracle
RLS at Tier-1s (2/2)
The following deployment plan was devised:
Support only the platforms in use at CERN
RedHat Linux ES 2.1
Use the same tools we use to deploy these products locally at CERN
We provide the distribution toolkits for Oracle products
Binary installation kits
Configuration scripts
Support is provided for
the installation and deployment of Oracle
the deployment of the “shrink-wrapped” application
Focus on a standard environment to make things easier
Sites can do things differently if they have the knowledge, but we don’t support it
RLS Deployment Details (1/3)
Operating System
RedHat ES/AS 2.1
Oracle-specific additions/fixes (binutils, fuser symlink; sketched below)
Standard disk layout assumed (based on the Oracle Flexible Architecture)
Database
Oracle 9i 9.2.0.4 (moving to 9.2.0.5 soon)
Application Server
Oracle 9iAS 9.0.3.1 + Local Fixes
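As an illustration, a minimal sketch of the OS preparation mentioned above, assuming standard Oracle Flexible Architecture paths; the binutils package file and mount points are illustrative, not the actual contents of the CERN kits:

  # Oracle 9i on RedHat 2.1 needs a newer binutils than the stock one
  # (package file name illustrative)
  rpm -Uvh binutils-2.11.90.0.8-12.i386.rpm

  # The 9i installer looks for fuser in /bin; RedHat ships it in /sbin
  ln -s /sbin/fuser /bin/fuser

  # Standard OFA-style disk layout (mount points illustrative)
  mkdir -p /u01/app/oracle/product/9.2.0   # ORACLE_HOME under ORACLE_BASE
  mkdir -p /u02/oradata                    # database files
  chown -R oracle:dba /u01/app/oracle /u02/oradata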
RLS Deployment Details (2/3)
Binary Install Kits
Oracle 9i 9.2.0.4 single instance
Oracle 9iAS 9.0.3.1 single instance
Environment configuration (sketched below)
.bashrc/.cshrc/sysconfig configuration
init.d scripts
For both Oracle 9i/9iAS
Delivered as RPMs for RHES2.1
Database Creation
Generic tool to create a database instance, driven by per-application configuration files
Oracle config files (pfile, tnsnames, init.ora, orapw)
Instance creation SQL
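A minimal sketch of what the RPM-delivered environment configuration and init.d logic might look like; the file layout, SID and account names are assumptions, not the actual kit contents:

  # sysconfig-style environment, sourced from .bashrc/.cshrc
  export ORACLE_BASE=/u01/app/oracle
  export ORACLE_HOME=$ORACLE_BASE/product/9.2.0
  export ORACLE_SID=rls01                  # SID is an assumption
  export PATH=$ORACLE_HOME/bin:$PATH

  # init.d-style start/stop for the listener and database
  case "$1" in
    start)
      su - oracle -c "lsnrctl start"
      su - oracle -c "echo startup | sqlplus -S '/ as sysdba'"
      ;;
    stop)
      su - oracle -c "echo shutdown immediate | sqlplus -S '/ as sysdba'"
      su - oracle -c "lsnrctl stop"
      ;;
  esac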
RLS Deployment Details (3/3)
Configuration files for Database Creation specific to the RLS
Database deployment scripts (example invocations sketched below)
create-tablespaces, create-users, create-schemas
Application Server deployment scripts
deploy-webapps, undeploy-webapps, alias-webapp
CERN-specific tools that others can decide to use
DNS-alias-based application server fail-over
Application-level monitoring
Application-level statistics gathering and visualization
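To make the workflow concrete, a sketch of how a Tier-1 administrator might drive these scripts; the script names come from the slide, while the arguments, configuration file and alias names are hypothetical:

  # Database side: tablespaces, accounts and schema for one VO's catalog
  ./create-tablespaces rls-cms.conf      # config file name hypothetical
  ./create-users       rls-cms.conf
  ./create-schemas     rls-cms.conf

  # Application server side: deploy (or withdraw) the catalog webapps
  ./deploy-webapps     edg-lrc           # webapp names hypothetical
  ./deploy-webapps     edg-rmc
  ./undeploy-webapps   edg-rmc

  # DNS-alias fail-over: clients address a stable alias that can be
  # repointed at a standby application server if the primary fails
  ./alias-webapp       rlscms.cern.ch    # alias name hypothetical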
Communication with Sites
The Savannah portal will act as the main point of contact between the CERN IT-DB team and the Tier-1 administrators
http://savannah.cern.ch/projects/lcg-orat1/
CVS Repository (checkout sketched below)
http://savannah.cern.ch/cvs/?group=lcg-orat1
File Download area
http://savannah.cern.ch/files/?group=lcg-orat1
FAQ List
http://wwwdb.web.cern.ch/wwwdb/savannah-files/oralcgt1/docs/
Still being completed – much information is already there, but some is missing
Feedback welcome!
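For orientation, an anonymous checkout of the toolkits might look like the following; the CVSROOT path and module name are assumptions inferred from the Savannah group name, not verified values:

  # anonymous pserver checkout (CVSROOT and module are assumptions)
  cvs -d :pserver:[email protected]:/cvs/lcg-orat1 login
  cvs -d :pserver:[email protected]:/cvs/lcg-orat1 checkout lcg-orat1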
Deployment Progress
Summer/Autumn 2003
Tier-1s invited to participate via Grid Deployment Board
Experiments informed of plan via the LCG Applications Area
Scripts used to install the production version of the catalogs for LCG-1/LCG-2 at CERN
Academia Sinica (Taiwan) deploys RLS Catalogs
December 2003
First full version of distribution kits released to Tier-1s
CNAF deploys RLS
Used in replication tests between CERN and CNAF in conjunction with CMS
April 2004
FZK deploys RLS
May 2004
Meeting at CERN with RAL representatives
Evolution of RLS Architecture
The initial design of the RLS system does not scale well for common use cases
e.g. catalog extraction, bulk inserts, cross-catalog queries
The single LRC/RMC at CERN for the 2004 Data Challenges showed problems in the deployed architecture
A “bottleneck” – all jobs had to contact the catalog at CERN
A single point of failure
The future architecture looks to replicated LRC/RMCs at “core” sites
Targeted for the Data Challenges in 2005
CERN Tier-0 and several Tier-1s
Number of Tier-1s in the range of 4 to 6
Replication
Replicated multi-master databases will let the system grow
Most commercial-strength databases already support replication, so don’t try to invent it again! (no need for RLIs)
Tests of replication were carried out by IT-DB, CNAF and CMS during CMS DC04
Used Oracle Advanced Replication
Oracle now recommends Oracle Streams as a better way of doing replication
A project has started to test Oracle Streams-based replication (sketched below)
Use sample workloads provided by CMS, with input from DC04
Needs to be integrated with the application for best results
Conflict avoidance, not conflict resolution
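As an indication of what such a test involves, a minimal SQL*Plus sketch of the Oracle Streams building blocks on the source site; the administrator account, queue, table and database link names are all assumptions, and the apply setup at the destination site is analogous:

  sqlplus -S strmadmin/secret <<'EOF'
  -- staging queue for captured changes (names are assumptions)
  BEGIN
    DBMS_STREAMS_ADM.SET_UP_QUEUE(
      queue_table => 'rls_queue_table',
      queue_name  => 'rls_queue');
  END;
  /
  -- capture DML on one (hypothetical) LRC table
  BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
      table_name   => 'lrc.t_lfn',
      streams_type => 'capture',
      streams_name => 'rls_capture',
      queue_name   => 'rls_queue',
      include_dml  => true,
      include_ddl  => false);
  END;
  /
  -- propagate changes to the replica site's queue over a DB link
  BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_PROPAGATION_RULES(
      table_name             => 'lrc.t_lfn',
      streams_name           => 'rls_propagation',
      source_queue_name      => 'rls_queue',
      destination_queue_name => '[email protected]');
  END;
  /
  EOF

Under a multi-master scheme, conflict avoidance could mean, for example, that each site only inserts mappings for GUIDs it created, so the apply processes never see conflicting rows.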
Summary
RLS scalability requires a move from a single site (CERN) to a distributed system
This requires the involvement of the Tier-1s
Scripts have been prepared to make deployment as easy as possible for Tier-1 administrators
Oracle installation
Application installation and monitoring
Next steps involve testing the real performance and reliability of the distributed setup
This will be needed to support the 2005 Data Challenges
The architecture will need to evolve as these new requirements appear
It should build on the knowledge and expertise of the Tier-1 sites in running high-availability services under Oracle