Automated software packaging and installation for the ATLAS experiment
Simon George
Royal Holloway, University of London
Christian Arnault, LAL Orsay; Michael Gardner, RHUL; Roger Jones,
University of Lancaster; Saul Youssef, Boston University
[email protected]
e-Science All Hands Meeting
Nottingham
2-4 September 2003
ATLASexperiment.org
Introduction

• This talk is about packaging, distribution and installation for a large software project
• It is essential because
  • The project computing resources are widely distributed around 140 institutes, who all want to use the software
  • We want to be able to use Grid resources that do not have locally managed installations of the software
  • Our working model also requires the ability to deploy user code that is not part of an official distribution
• I’ll describe the process developed and the tools used.
Wed 03Sep03
Simon George RHUL
2
Contents
• ATLAS and its software
• Requirements
• Tools and formats
• Metadata
• Naming conventions
• Creating and installing the kits
• Conclusions and outlook
The ATLAS Experiment
• A Particle Physics experiment at the Large Hadron Collider, CERN
• 1600 physicists, 140 institutes, 6 continents
• Studies include
  • search for the origin of mass
  • excess of matter over antimatter in the universe
  • evidence for Supersymmetry
  • other new physics
ATLAS software suite
• Simulation, data processing and analysis
• 500 “packages”, 50 external, inter-dependent.
• 100s of developers and 1000s of users in 140 institutes
• One release build is 2.5 GB of files
• It takes 10 hours to build
• Build types and frequencies
  • Production release 3-4 times per year
  • Developer release every 2-3 weeks
  • Nightly build of snapshot
• Build configuration permutations
  • Optimised, debug and sometimes also profile builds.
  • Two platforms (RedHat 7.3 on Intel x86, Solaris 8 on SPARC)
  • One or more compilers (gcc 3.2)
• Config. management, build and install handled by CMT
• So not a trivial task to package, distribute and install
CMT
www.cmtsite.org
• Configuration management tool
  • Concerned with setting up user’s environment to build and run software
  • Needs help of tools for a large project
• CMT helps to define and impose conventions
  • For naming packages, files, directories
  • For describing their relationships
  • In other words, package metadata
  • This is the key feature exploited for this project.
• Useful features to manage sub-projects, dependencies
• A broad user base, especially in Particle Physics and Astronomy experiments.
Packaging Requirements
• Three types of kit required
  • Binary kit
    • Pre-built executables, libraries and configuration files needed to run the software
    • Used for data challenges, production, basic users
  • Developer’s kit
    • Binary kit plus
    • Headers, libraries and configuration needed to build against it
    • For developers and most users
  • Full source kit
    • To rebuild from scratch on binary-incompatible platforms
    • When local source code browsing is required
• For each permutation of platform, config, compiler
Installation requirements
• For large facilities: unattended, push-button deployment
• For normal user: relocatable, no root access
• Automatic configuration
• Updates, multiple versions
• Avoid duplication and unnecessary downloads
• Possibility to take subset of software
• Self contained, apart from …
• Prerequisite software: modest list and automatic check
• Set up user’s environment (e.g. LD_LIBRARY_PATH)
• Reversible: uninstall
• Install and work disconnected from network, e.g. install onto a laptop from CDs
Constraints
• ATLAS software is divided into sub-projects
  • Currently ATLAS and Gaudi
  • Could be more in the future, e.g. split ATLAS into simulation and reconstruction
  • Each sub-project consists of many packages
• External/Internal package distinction
  • Internal packages are developed and managed within the ATLAS software project
  • External packages are the opposite, e.g. software from the Particle Physics community, public domain software or commercial products.
  • Interface packages for externals
    • Pure metadata package
    • Actual external sw can be installed anywhere, any way.
    • Gives it the outside appearance of an internal package
Constraints, continued
• Existing use of CMT
  • Package structure already in place
  • Metadata provided by packages or implied by default policies is already enough for automated packaging.
• Problems
  • ATLAS software is written by large communities with a mixed level of experience
  • All such software projects will have small flaws introduced in each release
  • These must be worked around when they impact on the packaging.
  • For example, one problem of particular relevance to packaging & installation is cyclic dependencies
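To make the cyclic-dependency problem concrete, here is a minimal sketch of how a cycle can be found in a table of "package uses package" relationships. It is an illustration only, not the CMT feature mentioned later in the talk; the function name `find_cycle` and the package names in the usage note are hypothetical.

```python
from typing import Dict, List, Optional


def find_cycle(uses: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of package names, or None.

    `uses` maps each package to the packages it depends on.
    A depth-first search marks packages as unvisited, in progress or done;
    reaching an in-progress package again means a cycle has been closed.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {pkg: UNVISITED for pkg in uses}
    path: List[str] = []

    def visit(pkg: str) -> Optional[List[str]]:
        state[pkg] = IN_PROGRESS
        path.append(pkg)
        for dep in uses.get(pkg, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # The cycle runs from the first occurrence of dep back to dep.
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[pkg] = DONE
        return None

    for pkg in list(uses):
        if state[pkg] == UNVISITED:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None
```

For example, if the hypothetical packages ExamplePkgA, ExamplePkgB and ExamplePkgC use each other in a ring, `find_cycle` returns the ring with its first package repeated at the end; for a tree of dependencies it returns `None`.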
Packaging: starting point
• One kit per package
  • Follow existing granularity
• Separate metadata and payload
  • Two parts to each kit
• Performed by librarian as integral part of release procedure
• Distribution by web or distributed filesystem (e.g. AFS)
Tools used
• CMT
  • Define and impose conventions on packages
  • Query the metadata needed for packaging
• Pacman
  • Metadata format
  • Tool used to manage kit installation
• Tar and RPM
  • Payload format: the package itself
• “Deployment tools” shell scripts
  • Construct the kits using CMT
  • Control location of Pacman cache and distribution
  • Post-installation configuration
Overview of process and tools
[Diagram. Labels: Librarian, CMT, Deployment Tools, Create kits, Web server or AFS, Pacman, Local s/w manager, Developer, Local computers, CMT, Run software]






Pacman
http://physics.bu.edu/~youssef/pacman
• A package manager
• Packager defines how the software should be fetched, installed, configured and updated in a “Pacman” file. The package itself can be in any format, as that file is separate.
• A directory of these files is known as a cache, usually available on the web.
• Pacman tool is used to install the software
• Pacman’s feature list is a good match to the requirements for installation.
• Already used by several Particle Physics and GRID projects.
Package distribution format
• Tar vs. RPM
  • Both can be made relocatable
• Feature set
  • Tar has a simple feature set but is complementary to CMT and Pacman
  • RPM overlaps with CMT and Pacman
    • e.g. RPM also handles dependencies and prerequisites
• Platforms
  • RPM is only widely used on Linux, while tar is standard on pretty much any Unix
• Annoyances
  • Default RPM database needs root access to write to it
    • There are workarounds for this but not pretty
• Conclusion
  • Decided to use tar
  • but retained RPM as an option
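As an illustration of what "relocatable" means for a tar payload, the sketch below archives files with paths relative to the install area and unpacks them under any directory the user can write to. It uses Python's standard `tarfile` module rather than the project's shell scripts; the function names `make_payload` and `unpack_payload` are hypothetical.

```python
import tarfile
from pathlib import Path


def make_payload(install_root: Path, tar_path: Path) -> None:
    """Archive every file under install_root using paths relative to it,
    so that the archive carries no absolute location."""
    with tarfile.open(str(tar_path), "w:gz") as tar:
        for item in sorted(install_root.rglob("*")):
            if item.is_file():
                relative = item.relative_to(install_root).as_posix()
                tar.add(str(item), arcname=relative)


def unpack_payload(tar_path: Path, prefix: Path) -> None:
    """Unpack the payload under any prefix; no root access is needed."""
    prefix.mkdir(parents=True, exist_ok=True)
    base = prefix.resolve()
    with tarfile.open(str(tar_path), "r:gz") as tar:
        for member in tar.getmembers():
            # Refuse absolute paths or '..' entries that would leave the prefix.
            target = (base / member.name).resolve()
            if target != base and base not in target.parents:
                raise ValueError("unsafe path in archive: " + member.name)
        tar.extractall(str(base))
```

Because only relative paths are stored, the same kit can be unpacked into a site-wide area, a user's home directory or a laptop without modification.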
Metadata
• For each package
  • Other packages it uses (dependencies)
  • Location of constituents
    • Applications and libraries
    • Header files
    • Run time/config files
    • CMT requirements file
• External packages
  • Pure metadata “glue” packages
  • Just define paths to export
• All defined in CMT requirements files
  • or implied by default conventions of ATLAS
• Can be queried through cmt
  • cmt show uses
  • cmt show macro <package>_export_paths
Naming and structure
• Package naming convention
  • Packages in a sub-project
    • <package name>-<sub-project release id>
  • External packages
    • <package name>-<version id>
  • These names are used when expressing the inter-package dependencies
• Directory structure within each kit
  • <sub-project>/<release-id>/InstallArea/
    • Contains the sub-directories bin, lib, include, share.
  • <sub-project>/<release-id>/<package>/<version>/cmt/
    • Contains the configuration management files
  • <external-package>/
    • Assumed to have their own internal structure for versions & builds
• This is designed to support coexistence of:
  • Different versions of every piece of software
  • Different binary versions (platform and build config)
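The conventions above are simple enough to express as a few helper functions. This is a sketch only: the function names are hypothetical, and the sub-project name and version tag used in the usage note are assumed values chosen to match the style of the examples in this talk.

```python
from pathlib import PurePosixPath


def internal_kit_name(package: str, release_id: str) -> str:
    """<package name>-<sub-project release id>"""
    return f"{package}-{release_id}"


def external_kit_name(package: str, version_id: str) -> str:
    """<package name>-<version id>"""
    return f"{package}-{version_id}"


def install_area(sub_project: str, release_id: str) -> PurePosixPath:
    """<sub-project>/<release-id>/InstallArea/ (holds bin, lib, include, share)"""
    return PurePosixPath(sub_project, release_id, "InstallArea")


def cmt_dir(sub_project: str, release_id: str,
            package: str, version: str) -> PurePosixPath:
    """<sub-project>/<release-id>/<package>/<version>/cmt/"""
    return PurePosixPath(sub_project, release_id, package, version, "cmt")
```

For instance, `internal_kit_name("ExamplePkgA", "6.5.0")` gives `ExamplePkgA-6.5.0`, and `install_area("ATLAS", "6.5.0")` gives `ATLAS/6.5.0/InstallArea`. Since the release id is part of every path, two releases unpacked under the same top directory do not overwrite each other.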
Examples
CMT requirements file:
package ExamplePkgA
author A. Person <[email protected]>
use ExamplePkgB
use ExampleExtPkg
library ExamplePkgA *.cxx
apply pattern component_library
apply pattern declare_runtime
Annotations:
• package, author: package name and author
• use: inter-package dependencies
• library: instruction to build a library from source files
• apply pattern component_library: type of library to build, implies library file names
• apply pattern declare_runtime: default location implied
Pacman file:
description='Package ExamplePkgA-01-07-02 in release 6.5.0'
url='http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO'
source='../dist'
download = { '*':'ExamplePkgA-6.5.0.tar.gz' }
depends = [ 'ExamplePkgB-6.5.0', 'ExampleExtPkg-v1' ]
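The link between the two example files can be sketched as follows: read the `use` lines of a requirements file, turn them into kit names and write out the same five fields as the Pacman file shown here. This is a simplified illustration, not the project's create_kit.sh; the function names are hypothetical, the table of external versions is an assumed input, and only the fields that appear in the example are produced.

```python
from typing import Dict, List


def parse_uses(requirements_text: str) -> List[str]:
    """Collect the package named on each 'use' line of a requirements file."""
    uses = []
    for line in requirements_text.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "use":
            uses.append(words[1])
    return uses


def kit_names(uses: List[str], release: str,
              external_versions: Dict[str, str]) -> List[str]:
    """Apply the naming convention: externals carry their own version id,
    all other packages carry the sub-project release id."""
    return [f"{pkg}-{external_versions.get(pkg, release)}" for pkg in uses]


def pacman_file(package: str, version: str, release: str,
                depends: List[str], url: str, source: str = "../dist") -> str:
    """Produce text with the same fields as the example Pacman file."""
    quoted = ", ".join(f"'{dep}'" for dep in depends)
    lines = [
        f"description='Package {package}-{version} in release {release}'",
        f"url='{url}'",
        f"source='{source}'",
        f"download = {{ '*':'{package}-{release}.tar.gz' }}",
        f"depends = [ {quoted} ]",
    ]
    return "\n".join(lines) + "\n"
```

Given the requirements file above, release 6.5.0 and an external version table of `{"ExampleExtPkg": "v1"}`, the generated `depends` line lists ExamplePkgB-6.5.0 and ExampleExtPkg-v1, matching the example.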
Creating the kits
• First, build a release
• Discover cycles in the dependencies
  • Use a feature of CMT to discover cycles in the dependencies, as these must not be propagated to the kits. Record the output in a file.
• Then, use a feature of CMT to visit every package in a dependency tree and apply a command there
  • cmt broadcast <command>
• Usage of the script to create a kit:
  • create_kit.sh -release <release-id> -cycles <file> <target distribution directory> [-rpm]
  • Creates a pacman file and tar file, optional RPM file
• Finally, there are often a few things to fix by hand specific to each release.
• Note that CMT itself is included as a kit
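One way to keep recorded cycles out of the kits is to drop the offending dependency links before the dependency lists are written. The sketch below assumes a cycles file with one "package dependency" pair per line; that file format and the function names are assumptions made for illustration, not the format actually produced by CMT or read by create_kit.sh.

```python
from typing import Dict, List, Set, Tuple


def read_cycle_edges(text: str) -> Set[Tuple[str, str]]:
    """Parse lines of the assumed form '<package> <dependency>'.
    Blank lines and lines starting with '#' are ignored."""
    edges = set()
    for line in text.splitlines():
        words = line.split()
        if len(words) == 2 and not words[0].startswith("#"):
            edges.add((words[0], words[1]))
    return edges


def kit_depends(uses: Dict[str, List[str]],
                cycle_edges: Set[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Copy the dependency table, leaving out every recorded cycle link."""
    return {
        pkg: [dep for dep in deps if (pkg, dep) not in cycle_edges]
        for pkg, deps in uses.items()
    }
```

With two packages that use each other and one of the two links recorded in the cycles file, the resulting table keeps a single direction, so an installer following dependencies cannot loop.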
Installation
• Performed by site software manager or end user on desktop or laptop
• Straightforward procedure:
  • Install Pacman, if not already done
  • Install prerequisite software
    • Currently just RedHat 7.3 o/s, gcc-3.2 and Java SDK 1.4.1
  • Choose directory for the installation
    • Probably the same as before
  • Choose which release to install
    • Available releases are listed on a web page
  • Use Pacman to download, install and configure it, e.g. pacman -get ATLAS:AtlasRelease-6.5.0
    • Dependencies followed automatically to get everything you need
  • Optionally, run script to set up a user environment and run a test
• User configures software in the usual way
  • Just choose release and private working area as normal
  • Run a setup script provided by CMT
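"Dependencies followed automatically" can be pictured as walking the `depends` lists from the requested kit and installing each kit after the kits it needs. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9 or later); it is not Pacman's implementation, and the function name and the dependency table in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def install_order(target: str, depends: Dict[str, List[str]]) -> List[str]:
    """Return the target kit and everything it needs, dependencies first."""
    needed: Dict[str, List[str]] = {}
    todo = [target]
    while todo:
        kit = todo.pop()
        if kit in needed:
            continue  # already collected
        needed[kit] = list(depends.get(kit, []))
        todo.extend(needed[kit])
    # TopologicalSorter expects a mapping from each node to its predecessors,
    # which here are the kits that must be installed before it.
    return list(TopologicalSorter(needed).static_order())
```

If AtlasRelease-6.5.0 depended on ExamplePkgA-6.5.0, which in turn depended on ExamplePkgB-6.5.0 and ExampleExtPkg-v1, the returned list would end with AtlasRelease-6.5.0 and contain only those four kits, so nothing unrelated is downloaded. `TopologicalSorter` raises `CycleError` if the table contains a cycle, which is one reason cycles must not reach the kits.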
Conclusions
• Procedures and tools have been developed for the packaging, distribution and installation of ATLAS software
  • Based on Pacman, CMT, tar/rpm and some shell scripts
• The basic principles could be applied more generally
  • Using some or all of the same tools
• It satisfies most of the requirements for run-time and developers’ kits and for installation.
  • Full source kit still to be done.
• Early adopters have given useful feedback and it is now being imported into Grid production systems
• Must now move to its use as part of the standard release procedure in ATLAS
  • by December 2003, for our global “Data Challenge 2”
Future developments
• Better handling of prerequisite software and platform compatibility checks
  • EDG WP4 configuration management task
• Potential to work with an installation on demand mechanism for GRID farms
• LCG/EDG/iVDGL GLUE
  • Meta packaging proposal for Grid middleware and applications, O. Barring et al.
• Pacman version 3