Linux Status at Fermilab


A Modular Administration
Tool for Linux Computers
Dan Yocum, Connie Sieh, Dane Skow
Fermilab
Kirk Bauer, Georgia Tech
February 8, 2000
CHEP
Projects
• Linux Farms (FT and Run II)
• Level 3 trigger farms
• Tape mover nodes (Enstore)
• Prototyping systems (DAQ tests)
• Desktops
FNAL System Census 1999
[Chart: number of systems (0 to 600) by month, with series for Linux, SGI, Sun, VMS and IBM]
Farms
(See Talks E60, E70)
• 150 dual Linux machines in production now, 50 more coming
online in next 2 months.
• Expect to ramp up to 500+ for Run II production.
• CDF L3 farms will be 100’s of PCs, BTeV proposal looking at
1000’s.
• Readily cloneable systems and relatively fault tolerant
• Have gotten a lot of mileage out of $20 CDROMs
(Pre)production systems
• Linux boxes popular for test clusters to develop ideas and
software testing.
• Becoming popular platforms for dedicated server functions
(e.g. tape controllers).
• More problems finding the “sweet spot”, as each use tends to
stress some system element.
• Used by developers who are very expert and comfortable with
special optimizations.
Desktops
• Over half of all Linux boxes still are on the desktop.
• Growth continues to pace farms deployment (even with 100+
node purchases).
• Code developers are prime deployment targets.
• Physics analysis users beginning to ramp up.
• Most desktops are run in “Orange” mode.
• “Self-help” mailing list [email protected] very successful
Infrastructure
• Discussions of tools that are needed seem to break
down into 4 categories:
– system monitoring and alarm (see E173)
• Currently use simple ping tests and PATROL.
• This is the area of greatest activity in the Beowulf world.
– system installation and patch management.
• Use network install server and AutoRPM. CDROMs have a
role.
– Backup and failure recovery.
• Systracker and other ideas. Still early.
– Resource accounting and capacity planning.
• Use batch systems (see E191) for scheduling and pacct’ing
scripts for usage tracking.
Systracker
• Based on our success with AutoRPM we invited Kirk Bauer to
come work on a configuration management tool.
• Prototype of system change tracking system (logger and
replay mechanism).
• The goal is an easy method to restore changes made to the
installed configuration.
• PERL modules based on concepts of tripwire, AutoRPM and
RCS.
• Local-machine alpha version available (soon to be released
as a FermiTool at ftp.fnal.gov). Next steps would be an archive
server and the addition of other package handling methods (UPS,
etc.).
Systracker: The basis
• Presume that one can install a system to a base configuration.
Take a snapshot of this as the system baseline. This is the
fundamental assumption.
• Almost never “good enough”.
– Irresistible urge to customize “personal” computers.
– Obstinate refusal to use “standard” methods of admin.
– Frequently sufficient local (legitimate) customization that
restoring this manually takes longer than the reinstall of the base
system.
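The baseline idea can be illustrated with a short sketch. This is not Systracker's code (the slides describe PERL modules); it is a minimal Python illustration, and the names `file_md5`, `snapshot` and `changed_since` are hypothetical. It assumes a baseline is simply a map from file path to MD5 digest.

```python
import hashlib
import os


def file_md5(path):
    """Return the MD5 hex digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each regular file under root (by relative path) to its MD5 digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.isfile(full) and not os.path.islink(full):
                result[os.path.relpath(full, root)] = file_md5(full)
    return result


def changed_since(baseline, current):
    """Return the sorted paths that were added, removed, or modified."""
    paths = set(baseline) | set(current)
    return sorted(p for p in paths if baseline.get(p) != current.get(p))
```

Taking `snapshot()` once right after the base install and again later, then passing both to `changed_since()`, lists the local customizations that a reinstall would lose.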
Systracker: The method
• Use tripwire mechanisms to monitor system files and
directories for changes and check updates into a RCS
repository.
• Modified RPM to archive RPMs to a repository.
• Create a module to create a “replay” script from differences
between baseline and target.
• Working on installation scripts to run the “replay” script.
NB: NONE of this is inherently Linux specific.
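The “replay from differences” step can be sketched as follows. This is an illustration under stated assumptions, not the Systracker implementation: the state layout (a dict with `packages` and `files`), the function names `build_replay` and `render`, and the action names are all hypothetical. Actions are kept abstract rather than emitted as real `rpm` or RCS commands.

```python
def build_replay(baseline, target):
    """Return ordered actions that turn the baseline state into the target.

    Each state is a dict with two keys (a hypothetical layout):
      "packages": {package name: version}
      "files":    {path: content digest}
    """
    actions = []

    # Packages first, so that restored files can overwrite package defaults.
    base_pkgs, want_pkgs = baseline["packages"], target["packages"]
    for name in sorted(want_pkgs):
        if base_pkgs.get(name) != want_pkgs[name]:
            actions.append(("install-package", name, want_pkgs[name]))
    for name in sorted(set(base_pkgs) - set(want_pkgs)):
        actions.append(("remove-package", name, base_pkgs[name]))

    # Then files: restore anything new or changed, delete anything dropped.
    base_files, want_files = baseline["files"], target["files"]
    for path in sorted(want_files):
        if base_files.get(path) != want_files[path]:
            actions.append(("restore-file", path, want_files[path]))
    for path in sorted(set(base_files) - set(want_files)):
        actions.append(("delete-file", path, base_files[path]))
    return actions


def render(actions):
    """Format the actions as one line each, as a human-readable replay plan."""
    return "\n".join(" ".join(action) for action in actions)
```

Identical baseline and target states yield an empty plan. Ordering package actions before file actions is a design choice of this sketch, so that locally customized config files win over freshly installed defaults.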
Systracker: The picture
[Diagram with labeled components: Config Files, System Dirs, RPMs, UPS, Systracker, RCS repository, Difference engine, Replay engine]
Systracker: The glossary
• Systracker
The main program
• ConfigTrack
Parse/Track systrack config files
• StandardTrack
Tracks file changes
• RPMTrack
Tracks RPM package installs
• MD5Track
Maintains MD5 signature for changes
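One way to picture the modular layout is a main program that runs a set of tracker modules and merges their reports. The sketch below is only an assumption about how such modules could fit together; the classes `Tracker`, `FileTracker`, `PackageTracker` and the function `run_trackers` are hypothetical and do not reproduce the PERL modules named above.

```python
import hashlib


class Tracker:
    """Base class: each tracker reports the state it is responsible for."""

    name = "tracker"

    def collect(self):
        raise NotImplementedError


class FileTracker(Tracker):
    """Hypothetical file tracker: reports an MD5 signature per file."""

    name = "files"

    def __init__(self, contents):
        # {path: text}; contents are passed in directly to keep the sketch small.
        self.contents = contents

    def collect(self):
        return {path: hashlib.md5(text.encode("utf-8")).hexdigest()
                for path, text in self.contents.items()}


class PackageTracker(Tracker):
    """Hypothetical package tracker: reports installed packages and versions."""

    name = "packages"

    def __init__(self, installed):
        self.installed = installed  # {package name: version}

    def collect(self):
        return dict(self.installed)


def run_trackers(trackers):
    """Main-program role: run every tracker and merge the reports by name."""
    return {tracker.name: tracker.collect() for tracker in trackers}
```

Adding support for another package handling method would then mean adding one more `Tracker` subclass, without touching the main loop.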
Systracker: The shopping list
• PERL
The lingua franca of admin tools
• BOOTP
Useful for turnkey startup (maybe)
• AutoRPM
Useful patch distribution method
• RPM
Red Hat Package Manager (or others)
• RCS
Revision Control System archive tool
• cfengine
cluster wide management tool (enforcer?)
• prsh
parallel command execution (mass deploy ?)
• cfm
competition. Not ready for prime time
• rdist/rsync
quick sync. Useful for central repository
Summary
• At FNAL, Linux installation infrastructure better than most OS
flavors.
• Users are “violently” in favor of an “Orange” configuration but
not diligent in carrying out admin duties.
• Linux growth not yet maxed out. Likely to completely
dominate the Unix desktop.
• Serious use by amateurs just beginning.
• Desired applications for Linux continue to rise. Expect to see
videoconferencing, etc. coming.