Automated software
packaging and installation
for the ATLAS experiment
Simon George
Royal Holloway, University of London
Christian Arnault, LAL Orsay; Michael Gardner, RHUL; Roger Jones,
University of Lancaster; Saul Youssef, Boston University
[email protected]
e-Science All Hands Meeting
Nottingham
2-4 September 2003
ATLASexperiment.org
Introduction
This talk is about packaging, distribution and
installation for a large software project
It is essential because
The project computing resources are widely distributed
around 140 institutes, who all want to use the software
We want to be able to use Grid resources that do not
have locally managed installations of the software
Our working model also requires the ability to deploy
user code that is not part of an official distribution
I’ll describe the process developed and the tools
used.
Wed 03Sep03
Simon George RHUL
2
Contents
ATLAS and its software
Requirements
Tools and formats
Meta data
Naming conventions
Creating and installing the kits
Conclusions and outlook
The ATLAS Experiment
A Particle Physics experiment at
the Large Hadron Collider, CERN
1600 physicists, 140 institutes,
6 continents
Studies include
• search for the origin of mass
• excess of matter over antimatter in
the universe
• evidence for Supersymmetry
• other new physics
ATLAS software suite
Simulation, data processing and analysis
500 “packages”, 50 external, inter-dependent.
100s of developers and 1000s of users in 140 institutes
One release build is 2.5 GB of files
It takes 10 hours to build
Build types and frequencies
Production release 3-4 times per year
Developer release every 2-3 weeks
Nightly build of snapshot
Build configuration permutations
Optimised, debug and sometimes also profile builds.
Two platforms (RedHat 7.3 on Intel x86, Solaris 8 on SPARC)
One or more compilers (gcc 3.2)
Config. management, build and install handled by CMT
So not a trivial task to package, distribute and install
CMT
www.cmtsite.org
Configuration management tool
Concerned with setting up user’s environment to build
and run software
For a large project this needs the help of tools
CMT helps to define and impose conventions
For naming packages, files, directories
For describing their relationships
In other words, package metadata
This is the key feature exploited for this project.
Useful features to manage sub projects,
dependencies
A broad user base, especially in Particle Physics
and Astronomy experiments.
Packaging Requirements
Three types of kit required
Binary kit
• Pre-built executables, libraries and configuration files needed to
run the software
• Used for data challenges, production, basic users
Developer’s kit
• Binary kit plus
• Headers, libraries and configuration needed to build against it
• For developers and most users
Full source kit
• To rebuild from scratch on binary-incompatible platforms
• When local source code browsing is required
For each permutation of platform, config, compiler
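As a rough illustration of how quickly the number of kits grows, the sketch below enumerates one kit variant per combination of kit type, platform, build configuration and compiler. The tag strings are hypothetical labels based on the values quoted in this talk, not actual ATLAS or CMT tags, and the full source kit is left out on the assumption that it does not vary by platform.

```python
from itertools import product

# Hypothetical labels, derived from the values quoted in the talk.
KIT_TYPES = ["binary", "developer"]
PLATFORMS = ["redhat73-x86", "solaris8-sparc"]
CONFIGS = ["opt", "dbg", "prof"]
COMPILERS = ["gcc32"]


def kit_variants():
    """Return one (kit type, platform, config, compiler) tuple per variant."""
    return list(product(KIT_TYPES, PLATFORMS, CONFIGS, COMPILERS))


if __name__ == "__main__":
    variants = kit_variants()
    print(len(variants), "kit variants per package")
    for variant in variants[:3]:
        print("-".join(variant))
```

Multiplied by several hundred packages, even this small set of options argues for fully automated kit creation.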
Installation requirements
For large facilities: unattended, push button deployment
For normal user: relocatable, no root access
Automatic configuration
Updates, multiple versions
Avoid duplication and unnecessary downloads
Possibility to take subset of software
Self contained, apart from …
Prerequisite software: modest list and automatic check
Set up user’s environment (e.g. LD_LIBRARY_PATH)
Reversible: uninstall
Install and work disconnected from network,
e.g. install onto a laptop from CDs
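The relocation and environment requirements can be pictured with a small sketch: given whatever directory the user chose for the installation, compute the environment that a setup script would export. The function name and the bin and lib sub-directories are assumptions made for illustration; in the real system the setup is provided by CMT.

```python
import os


def runtime_environment(install_root, base_env=None):
    """Return a copy of base_env with PATH and LD_LIBRARY_PATH extended.

    install_root is wherever the kit was unpacked; nothing is hard-coded,
    which is what makes the installation relocatable.
    """
    env = dict(os.environ if base_env is None else base_env)
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        new_dir = os.path.join(install_root, subdir)
        old = env.get(var, "")
        # Prepend so the freshly installed release wins over older ones.
        env[var] = new_dir + (os.pathsep + old if old else "")
    return env
```

Because the function returns a new dictionary rather than modifying the process environment, the change is trivially reversible, in the spirit of the uninstall requirement.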
Constraints
ATLAS software is divided into sub-projects
Currently ATLAS and Gaudi
Could be more in the future, e.g. split ATLAS into
simulation and reconstruction
Each sub-project consists of many packages
External/Internal package distinction
Internal packages are developed and managed within
the ATLAS software project
External packages are the opposite, e.g. software from
the Particle Physics community, public domain software
or commercial products.
Interface packages for externals
• Pure metadata package
• The actual external software can be installed anywhere, in any way.
• Gives it the outside appearance of an internal package
Constraints, continued
Existing use of CMT
Package structure already in place
Meta data provided by packages or implied by default
policies is already enough for automated packaging.
Problems
ATLAS software is written by large communities with a
mixed level of experience
All such software projects will have small flaws
introduced in each release
These must be worked around when they impact on the
packaging.
For example, one problem of particular relevance to
packaging & installation is cyclic dependencies
Packaging: starting point
One kit per package
Follow existing granularity
Separate metadata and payload
Two parts to each kit
Performed by librarian as integral part of
release procedure
Distribution by web or distributed filesystem
(e.g. AFS)
Tools used
CMT
Define and impose conventions on packages
Query the metadata needed for packaging
Pacman
Metadata format
Tool used to manage kit installation
Tar and RPM
Payload format – the package itself
“Deployment tools” shell scripts
Construct the kits using CMT
Control location of Pacman cache and distribution
Post-installation configuration
Overview of process and tools
[Figure: overview of process and tools. The librarian uses CMT and the deployment tools to create kits, which are served from a web server or AFS; a local s/w manager or developer uses Pacman and CMT to install them on local computers and run the software.]
A package manager
The packager defines how the software should be
fetched, installed, configured and updated in a
“Pacman” file. The package itself can be in any
format, because it is kept separate from that file.
A directory of these files is known as a cache,
usually available on the web.
Pacman tool is used to install the software
Pacman’s feature list is a good match to the
requirements for installation.
Already used by several Particle Physics and
GRID projects.
Pacman: http://physics.bu.edu/~youssef/pacman
Package distribution format
Tar vs. RPM
Both can be made relocatable
Feature set
Tar has a simple feature set but is complementary to CMT and
Pacman
RPM overlaps with CMT and Pacman
• e.g. RPM also handles dependencies and prerequisites
Platforms
RPM is only widely used on Linux, while tar is standard on pretty
much any Unix
Annoyances
Default RPM database needs root access to write to it
• There are workarounds for this, but they are not pretty
Conclusion
Decided to use tar
but retained RPM as an option
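A minimal sketch of why a tar payload is easy to relocate: if members are stored with paths relative to the kit root, the archive can be unpacked under any directory without root access. This uses Python's standard tarfile module purely for illustration; the function names are hypothetical and this is not the ATLAS deployment script.

```python
import os
import tarfile


def make_payload(kit_root, tar_path):
    """Pack everything under kit_root using paths relative to kit_root."""
    with tarfile.open(tar_path, "w:gz") as tar:
        for name in sorted(os.listdir(kit_root)):
            # arcname drops the absolute prefix, so the payload is relocatable.
            tar.add(os.path.join(kit_root, name), arcname=name)


def install_payload(tar_path, prefix):
    """Unpack the payload under any user-chosen prefix."""
    os.makedirs(prefix, exist_ok=True)
    with tarfile.open(tar_path, "r:gz") as tar:
        tar.extractall(path=prefix)
```

Dependencies and prerequisites are deliberately absent here: in the scheme described in this talk those are left to CMT and Pacman, which is why the simple feature set of tar is enough.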
Meta data
For each package
Other packages it uses (dependencies)
Location of constituents
• Applications and libraries
• Header files
• Run time/config files
• CMT requirements file
External packages
Pure meta data “glue” packages
Just define paths to export
All defined in CMT requirements files
or implied by default conventions of ATLAS
Can be queried through cmt
cmt show uses
cmt show macro <package>_export_paths
Naming and structure
Package naming convention
Packages in a sub-project
• <package name>-<sub-project release id>
External packages
• <package name>-<version id>
These names are used when expressing the inter-package
dependencies
Directory structure within each kit
<sub-project>/<release-id>/InstallArea/
• Contains the sub-directories bin, lib, include, share.
<sub-project>/<release-id>/<package>/<version>/cmt/
• Contains the configuration management files
<external-package>/
• Assumed to have their own internal structure for versions & builds
This is designed to support coexistence of:
Different versions of every piece of software
Different binary versions (platform and build config)
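The naming and directory conventions above can be written down as a few small helper functions. This is only a sketch of the conventions as stated on this slide; the function names are hypothetical and are not part of CMT or the deployment tools.

```python
import posixpath


def internal_kit_name(package, release_id):
    """Kit name for a package in a sub-project: <package>-<release id>."""
    return f"{package}-{release_id}"


def external_kit_name(package, version_id):
    """Kit name for an external package: <package>-<version id>."""
    return f"{package}-{version_id}"


def install_area(sub_project, release_id):
    """Directory holding bin, lib, include and share for one release."""
    return posixpath.join(sub_project, release_id, "InstallArea")


def cmt_dir(sub_project, release_id, package, version):
    """Directory holding a package's configuration management files."""
    return posixpath.join(sub_project, release_id, package, version, "cmt")
```

Because the release id appears in every path, two releases unpacked under the same root never overwrite each other.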
Examples
CMT requirements file:
# Package name and author
package ExamplePkgA
author A. Person <[email protected]>
# Inter-package dependencies
use ExamplePkgB
use ExampleExtPkg
# Instruction to build a library from source files
library ExamplePkgA *.cxx
# Type of library to build, implies library file names
apply pattern component_library
# Default location implied
apply pattern declare_runtime
Pacman file:
description='Package ExamplePkgA-01-07-02 in release 6.5.0'
url='http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO'
source='../dist'
download = { '*':'ExamplePkgA-6.5.0.tar.gz' }
depends = [ 'ExamplePkgB-6.5.0', 'ExampleExtPkg-v1' ]
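To show how a file like the Pacman example could be generated from package metadata, here is a minimal sketch that renders the same five fields from plain Python values. The function and its parameters are hypothetical, and only the fields visible in the example are covered; it makes no claim about the rest of the Pacman file syntax or about how the ATLAS scripts actually do this.

```python
def render_pacman_file(package, tag, release, url, source, depends):
    """Return Pacman-file text with the same five fields as the example."""
    kit = f"{package}-{release}"
    lines = [
        f"description='Package {package}-{tag} in release {release}'",
        f"url='{url}'",
        f"source='{source}'",
        # One payload for all platforms, as in the example's '*' key.
        f"download = {{ '*':'{kit}.tar.gz' }}",
        f"depends = [ {', '.join(repr(d) for d in depends)} ]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_pacman_file(
        "ExamplePkgA", "01-07-02", "6.5.0",
        "http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO", "../dist",
        ["ExamplePkgB-6.5.0", "ExampleExtPkg-v1"],
    ))
```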
Creating the kits
First, build a release
Discover cycles in the dependencies
Use a feature of CMT to discover cycles in the dependencies, as
these must not be propagated to the kits. Record the output in a
file.
Then, use a feature of CMT to visit every package in a
dependency tree and apply a command there
cmt broadcast <command>
Usage of the script to create a kit:
create_kit.sh -release <release-id> -cycles <file>
<target distribution directory>
[-rpm]
Creates a pacman file and tar file, optional RPM file
Finally, there are often a few things to fix by hand specific
to each release.
Note that CMT itself is included as a kit
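The cycle discovery step is done with CMT itself. Purely to illustrate what such a check involves, the sketch below finds cycles in a "uses" graph with a standard depth-first search. The function name and the dictionary representation of the dependencies are assumptions, not CMT's implementation or output format.

```python
def find_cycles(uses):
    """Return a list of dependency cycles found by depth-first search.

    uses maps each package name to the list of packages it uses.
    Each cycle is reported as a list of names that starts and ends
    with the same package.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {pkg: WHITE for pkg in uses}
    cycles = []
    path = []

    def visit(pkg):
        colour[pkg] = GREY
        path.append(pkg)
        for dep in uses.get(pkg, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is on the current path: the loop is closed here.
                cycles.append(path[path.index(dep):] + [dep])
            elif state == WHITE:
                visit(dep)
        path.pop()
        colour[pkg] = BLACK

    for pkg in list(uses):
        if colour[pkg] == WHITE:
            visit(pkg)
    return cycles
```

A librarian could record such a list in a file and then drop one edge per cycle when writing the kit dependencies, so that the kits themselves stay acyclic.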
Installation
Performed by site software manager or end user on
desktop or laptop
Straightforward procedure:
Install Pacman, if not already done
Install prerequisite software
• Currently just RedHat 7.3 o/s, gcc-3.2 and Java SDK 1.4.1
Choose directory for the installation
• Probably the same as before
Choose which release to install
• Available releases are listed on a web page
Use Pacman to download, install and configure it, e.g.
pacman -get ATLAS:AtlasRelease-6.5.0
• Dependencies followed automatically to get everything you need
Optionally, run script to set up a user environment and run a test
User configures software in the usual way
Just choose release and private working area as normal
Run a setup script provided by CMT
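"Dependencies followed automatically" amounts to walking the depends lists until everything needed has been collected. The sketch below shows one generic way to do that, returning the kits in an order where each one follows its dependencies. It is not Pacman's actual algorithm, and the cache contents in the usage example are made up from the example names used earlier in this talk.

```python
def install_order(target, depends):
    """Return kits in an order where each kit follows its dependencies.

    depends maps a kit name to the list of kit names it depends on.
    Assumes the dependency graph is free of cycles, as required of the kits.
    """
    order = []
    seen = set()

    def visit(kit):
        if kit in seen:
            return
        seen.add(kit)
        for dep in depends.get(kit, []):
            visit(dep)
        # All dependencies are already in the list at this point.
        order.append(kit)

    visit(target)
    return order


if __name__ == "__main__":
    cache = {
        "AtlasRelease-6.5.0": ["ExamplePkgA-6.5.0"],
        "ExamplePkgA-6.5.0": ["ExamplePkgB-6.5.0", "ExampleExtPkg-v1"],
        "ExamplePkgB-6.5.0": ["ExampleExtPkg-v1"],
    }
    for kit in install_order("AtlasRelease-6.5.0", cache):
        print(kit)
```

Each kit appears once even when several packages depend on it, which matches the requirement to avoid duplication and unnecessary downloads.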
Conclusions
Procedures and tools have been developed for the
packaging, distribution and installation of ATLAS software
Based on Pacman, CMT, tar/rpm and some shell scripts
The basic principles could be applied more generally
Using some or all of the same tools
It satisfies most of the requirements for run-time and
developers’ kits and for installation.
Full source kit still to be done.
Early adopters have given useful feedback and it is now
being imported into Grid production systems
Must now move to its use as part of the standard release
procedure in ATLAS
by December 2003, for our global ‘Data Challenge 2’
Future developments
Better handling of prerequisite software and
platform compatibility checks
EDG WP4 configuration management task
Potential to work with an installation on
demand mechanism for GRID farms
LCG/EDG/iVDGL GLUE
Meta packaging proposal for Grid middleware
and applications, O. Barring et al.
Pacman version 3