Implementing Control-M in a
Distributed Environment under
MC/ServiceGuard
Presentation Overview
Quick note on the installation environment
Business pressures on IT infrastructure and
the need for automation
Selection process (taken from the case study)
and getting the right tool for the job without
losing track of the needs of the business
Presentation Overview
Installation and Configuration;
Mainframe standards on Unix?
How it works in the real world
Future development (Control-M
takes over!)
Any questions/suggestions?
Notes on the Installation
Environment
One Control-M Server version 2.2.4 running on
HP-UX 11.0 (currently moving up to 11i)
Enterprise Controlstation Release 5.0.0 running
on HP-UX 11.0 (Exceed/Motif Version)
CONTROL-M Agents v 2.2.4, running on
NT 4.0 Servers (3 servers)
Windows 2000 Servers (15 servers)
HP-UX 10.20 and 11.0 (approximately 20)
HP-UX 11i (4 servers)
Installation Environment
Part One - IT Infrastructure
Changes; A Typical Story
Traditional Mainframe site
Nightly batch
Attended operations
Mainframe declines as part of the move
to Distributed Computing
Perceived as cheaper & more flexible
'User-centric'
Mainframe (and associated staff) are
Decommissioned
However ...
The Distributed Enterprise still requires
'Unseen' operational tasks to be completed
Some method of centralised batch control/reporting needed
Specified tasks to execute reliably (i.e. more reliably than
'cron' table entries and in-house written scripts)
Additional Developments
Management commit to run services for Far East Office
(+7 hours) on existing platforms (over Citrix MetaFrame)
Increasing need for integrated cross-platform tasks
(especially as M/F applications are migrated)
Staff headcount under pressure
The Scenario
The Company's Golden IT Rules
No risks taken with core IT systems
Therefore, avoid single points of failure and
build redundancy into hardware infrastructure
Extremely high priority given to achieving
rigid security standards
Reconciling the Pressures and the
Required Standards
Accept that the Distributed Computing model is
established and needs to be embraced
Business requires reliably integrated cross platform
tasks – the Enterprise needs to supply this service
“Give me something of Mainframe quality, but on
Unix instead”
Consider a job scheduler
Part Two – The Selection Process
Selection Process – Initial Steps
After establishing the “Enterprise-Wide” need,
ask for input from Sys Administrators,
Application Support & major users
Build matrix of technical requirements and
“preferred features”
Define product “pot-holes” to avoid
Draw up shortlist of potential software
Product “Wish List”
Compatible with Unix
NT and Windows 2000
NetWare
MC/ServiceGuard
VantagePoint Operations (ITO OpenView)
PeopleSoft
Standard Command Line Interface
Preferred Features
Including “Easy use” colour coded GUIs
Extensive batch administration options
Plug-ins for widest possible range of systems
SNMP
Email messaging
Batch modelling
Avoid the Pot Holes!
Hold a Contest
Check sources (Gartner, GIGA, search web)
Invite vendors to make presentation
Rate each product (build scoring table)
Select a winner to come forward for a test
installation, possibly testing the top two onsite
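The scoring table mentioned above can be sketched as a simple weighted matrix. This is only an illustration: the requirement names, weights, product names and ratings below are all hypothetical and were not part of the original selection exercise.

```python
# Hypothetical weights (1-5) for a few requirements from the wish list.
WEIGHTS = {
    "unix_and_windows_agents": 5,
    "mc_serviceguard_support": 5,
    "command_line_interface": 3,
    "snmp_and_email_alerts": 2,
}

# Hypothetical ratings (0-10) given to each product after the vendor presentations.
RATINGS = {
    "Product A": {"unix_and_windows_agents": 9, "mc_serviceguard_support": 8,
                  "command_line_interface": 7, "snmp_and_email_alerts": 6},
    "Product B": {"unix_and_windows_agents": 7, "mc_serviceguard_support": 9,
                  "command_line_interface": 5, "snmp_and_email_alerts": 8},
    "Product C": {"unix_and_windows_agents": 8, "mc_serviceguard_support": 3,
                  "command_line_interface": 9, "snmp_and_email_alerts": 9},
}


def weighted_score(ratings, weights=WEIGHTS):
    """Sum of weight * rating; a requirement with no rating counts as 0."""
    return sum(weight * ratings.get(name, 0) for name, weight in weights.items())


def shortlist(ratings_by_product, top_n=2):
    """Return the top_n product names, highest weighted score first."""
    ranked = sorted(ratings_by_product,
                    key=lambda product: weighted_score(ratings_by_product[product]),
                    reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    for product in RATINGS:
        print(product, weighted_score(RATINGS[product]))
    print("Test onsite:", shortlist(RATINGS))
```

Weighting the "must have" items more heavily than the preferred features keeps a product with an attractive GUI but no cluster support from winning on points alone.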
And the Winner is ...
Return to the Original Issues
Does the product do the vital tasks?
How does the test installation perform and
what do the test users think?
Does the cost of the product outweigh the
benefits to be gained from installation?
Can I be sure that the product will not
become a 'White Elephant'?
Take your time before coming to a decision ...
Part Three – Installation
and Configuration
The challenge = installing the product and
obtaining the desired standards
Consider the issues behind loading all your
mission critical jobs into a single system for
execution
Consider failure scenarios
Uphold your golden rules
Implementing High Availability
Threats to system availability:
Hardware Failure or System Error = 44%
Human Error = 32%
Application Software = 14%
Viruses = 7%
Natural Disasters = 3%
Source: Contingency Planning Research, Livingston, NJ, USA
Put the job scheduler into context (make
it as redundant as the core systems)
Consider MC/ServiceGuard as the HP-UX
Tool for automatic failover for clustered
HP 9000 Enterprise Servers
What Is MC/ServiceGuard?
MC (Multi-Computer) ServiceGuard is HP's High
Availability solution, similar to HACMP for AIX/R6k
and MS Cluster Server software (aka Wolfpack)
Under ServiceGuard, applications are seen as
'packages' with their own DNS entry
Redundant/mirrored storage required
ServiceGuard Monitors for software failures or
SPU/Disk/LAN component failures & coordinates
the transfer between failing and redundant
components
MC/ServiceGuard in Action
The Important Issues for Control-M
The package names are used in Control-M job
definitions (not underlying IP addresses) and
are synonymous with the DNS entries for each
node
If the applications are packaged themselves
(possibly together with other apps) then any
outages will be minimised
Packages can be manually failed over via
operator commands, thus allowing rolling
maintenance on production platforms
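Why package names matter can be shown with a small simulation. This is a sketch under stated assumptions, not Control-M or ServiceGuard code: the JobDefinition and ClusterDirectory classes, the package name and the node names are all hypothetical, and the directory simply stands in for the name lookup that follows a package to whichever node currently hosts it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobDefinition:
    """Hypothetical job record: node_id holds a package name, never an IP address."""
    name: str
    node_id: str
    command: str


class ClusterDirectory:
    """Stand-in for name resolution: package name -> node currently hosting it."""

    def __init__(self, placement):
        self._placement = dict(placement)

    def resolve(self, package):
        return self._placement[package]

    def fail_over(self, package, adoptive_node):
        # Models either an automatic failover or a manual move for maintenance.
        self._placement[package] = adoptive_node


def dispatch(job, directory):
    """Describe where the job would run right now."""
    host = directory.resolve(job.node_id)
    return f"{job.name}: run '{job.command}' on {host}"


if __name__ == "__main__":
    directory = ClusterDirectory({"pkg-payments": "node1"})
    job = JobDefinition("NIGHTLY_EXTRACT", "pkg-payments", "/opt/app/bin/extract.sh")
    print(dispatch(job, directory))
    directory.fail_over("pkg-payments", "node2")  # rolling maintenance on node1
    print(dispatch(job, directory))               # same job definition, new host
```

The job definition is never edited when the package moves. Had it stored a physical address of node1, every affected job would need changing after each failover.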
Installation Issues
Control-M installation needs to be fully planned
Consider creating separate Sybase server for
Control-M/ECS
Underlying Sybase databases (used by Control-M)
also need to be defined as ServiceGuard packages
Only Exceed/Motif version of ECS (i.e. not NT) is
available when installing under ServiceGuard &
ECS version 500 (addressed in ECS 600)
Control-M Has Built-in Failover
Control-M server has options to internally create a
mirrored database and backup server, but
this will have to be defined separately by the Control-M
administrator
Failover is not automatic, it requires intervention
Failover is designed as short-term contingency
(i.e. get the original server or DB fixed ASAP!)
Part Four – The Real World
Clustered hardware, redundant storage, high
availability systems
Control-M fully integrated into environment
'Intelligent' scheduling deployed
Introduce naming standards and conventions
Consider how best to implement your security
policy
For Example – Before Control-M
08:00 support onsite, backups have failed
Now Under Control-M
Both backups completed by 06:30, support
Other Features Under Control-M
Control Resources & AutoEdit Variables
Definable Quantitative Resources
Shout Messages (for various situations)
From/to windows, maximum reruns & cyclical
'On' conditions for return codes & standard out
User defined calendars or set pattern of days
Critical path jobs, priority settings
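Two of the features listed above, from/to windows and maximum reruns, can be illustrated generically. The sketch below does not use Control-M syntax or any Control-M API; the function names are hypothetical and only model the behaviour: a job may start only inside its time window (which may cross midnight), and a failed job is rerun up to a fixed number of times.

```python
from datetime import time


def in_window(now, start, end):
    """True if now falls in [start, end); supports windows that cross midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


def run_with_reruns(job, max_reruns):
    """Call job() (returns an exit code); rerun on failure up to max_reruns times.

    Returns (last_exit_code, total_attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        exit_code = job()
        if exit_code == 0 or attempts > max_reruns:
            return exit_code, attempts


if __name__ == "__main__":
    # A nightly batch window from 22:00 to 06:00.
    print(in_window(time(23, 30), time(22, 0), time(6, 0)))
    print(in_window(time(7, 0), time(22, 0), time(6, 0)))

    # A job that fails twice and then succeeds, allowed three reruns.
    results = iter([1, 1, 0])
    print(run_with_reruns(lambda: next(results), max_reruns=3))
```

A scheduler would combine the two checks, releasing a rerun only while the window is still open, so that a job which keeps failing does not run on into the online day.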
Future Development Under Control-M
Roll out to large number of Agents on
W2k and HP-UX 11i
Backup strategy migrating from Legato
to Omniback
Control-M SDK to be released in 2002
and possibly used for bespoke banking
applications
ECS version 600 to be installed
The End
Questions and Suggestions
Thank you for listening
Mark Francome
Globetech AG, Basel
061 263 1360