Transcript Document

Systems and Operations
CIT Forum
February 21, 2007
Systems and Operations Organization
Rick MacDonald,
Director, S&O
Rick MacDonald
Director
Systems and Operations
Brian Messenger
Assistant Director
Systems
Mark Bodenstein
Project Leader
Mainframe Systems Programming
James Bohnsack
Mike Garcia
Jason Lowe
Rick Polcaro
Manager
Facilities & Laser Printing Services
Randy Smith
Peggy Roberts
Don Mac Leod
Assistant Director
Systems Services
Mariann Carpenter
Manager
Systems Administration
Moe Arif
Mary Cronk
Doug Flanagan
Solomon Welch
Tom Walden
David Shirk
Kent Ross
Vacant (01/07)
Laurie Collinsworth
Manager
Systems Engineering
Scott Sorrentino
Mike Heisler
Jim Yang
John Wobus
Chris Manly
Javier Streb
Paul Zarnowski
Manager
Storage Services
Bob Talda
David Beardsley
Michael Hojnowski
Manager
Special Projects
Jim Howell
Manager
Messaging Services
Jennifer Moore
Gail Shaff
George Medlar
Todd Olson
Client Services
Rick Cochran
Michelle Mogil
Lee Brink
Dan Bartholomew
Vacant
Vicky Dean
Assistant Director
Operations
Carol Uber
Manager
Technical Support Operations
Faye Cunningham
Manager
Technical Support Services
Randall Frank
Bill Kelemen
Kevin Pelletier
Peter Skura
Karen Daniels
Pat Washburn
Manager
Student Computing Operations
Juan Salomon
Team Leader
NOC First Shift
Lori Beebe
Brandon Bowers
Joanne Button
Richard Cicciarelli
Kim Nicholson
Pat Graham
Alan Heiman
Jason Barnello
Beth Sprankle
Jim Conley
John Ryan
Brian Witchey
Bryan Benning
Team Leader
NOC Second Shift
Mark Allen
Jay Howell
Lillian Isacks
Chuck Thomas
Manager
Production Operations
Ken Frost
Team Leader
NOC Third Shift
Ruth Burroughs
Linda Jaynes
Greg Marvin
Dan Miller
Carl Moravec
Theresa Norman
Maureen Quillinan
Rich Fraboni
James Reed
John Becker
Barbara Van Etten
Jenny Signor
Technical Lead
Operations
 The Network Operations Center (NOC) is a centralized
control center that provides 24/7 monitoring, testing, change
management, and network configuration support.
 Production Control and Tape/Print Operations support
systems and applications for many administrative functions
and student services.
 Technical Support Operations is responsible for providing
hardware, software, and network support for workstations,
print/file servers, and LANs throughout CIT. TSO also
supports and maintains general and instructional computing
facilities across campus.
Systems Services
 Systems Services is the organizational home for S&O’s
broadly used campus services.
 Messaging Services supports our PostOffices, IMAP/POP,
Virus and SPAM filtering; Oracle Calendar; Bulkmail
Service; E-List Services; and Usenet.
 Client Services is the home for Net-Print, EZ-Remote,
Email client support, BearAccess, and other packaging and
delivery activities.
Systems
 Systems provides systems administration and systems
programming support for projects that span the 415+ CIT-owned
servers and 80 terabytes of storage located in our
machine rooms.
 The Systems Administration team provides the hardware and
operating system support for CIT’s servers and storage assets.
Support is also provided to other departments on a
contractual basis.
 Systems Engineering provides and enhances tools for our
production servers and services, such as NetVigil and Single
Sign-On, as well as specific support for Network Quarantine
and the DNS/DHCP service and its ancillary systems.
Systems
 Mainframe Systems Programming maintains and upgrades the
mainframe operating system software and other mainframe
infrastructure software.
 Facilities & Laser Printing Services supports the infrastructure
within the CIT machine rooms in Rhodes Hall and CCC and
administers the operations of the CIT Server Farm, hosting
more than 530 servers. Xerox laser printing services provides
custom forms, fonts, and graphic programming in support of
various administrative services.
 Storage Services operates EZ-Backup and administers the CIT
Storage Farm.
Recent Activities
 Single Sign-on was implemented for CIT servers,
allowing for central management of user IDs and
groups. Each person is allowed appropriate access
to managed servers through a single password.
 CFEngine was implemented, allowing central
management and distribution of configuration files
to all managed Linux and Solaris servers.
 Systems Administration began offering full
RedHat Linux support in the server farm.
Recent Activities
 In partnership with IS Infrastructure staff, Sun
Clustering was deployed, enhancing the efficiency
and resilience of Solaris-based applications.
 Beginning in December, a monthly newsletter
highlighting Systems activities was released to
users of Systems services.
 We implemented NocDocs, a wiki that integrates
on-call lists and problem resolution documentation
with NetVigil.
Recent Activities
 Over 300 servers were patched for the upcoming Daylight
Saving Time change.
 Lyris replaced Listproc as our Email list processor. 3800
lists are currently in Lyris with 130 to go. This change has
also provided enhanced bulk email functionality.
 The CIT Storage Farm became a campus service with the
Library as the first customer.
 Last weekend, we installed a new mainframe, maintaining
support for non-PeopleSoft administrative systems.
Recent Activities
 We worked with IS on the PeopleSoft migration from AIX to
Solaris, also completed last weekend.
 Bulkmail requests have averaged 10/week for the last several
months.
 SPAM Reduction Program - On January 15, 2007, we lowered
the “SPAM probability” threshold for rejecting email from
90% or higher to 80% or higher. This has reduced spam
significantly.
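The effect of lowering the rejection threshold can be illustrated with a minimal sketch. The function name and the sample scores below are hypothetical; the slides do not describe the filter's actual scoring interface, only the 90% and 80% thresholds.

```python
# Hypothetical illustration of a spam-probability rejection threshold.
OLD_THRESHOLD = 0.90  # in effect before January 15, 2007
NEW_THRESHOLD = 0.80  # in effect from January 15, 2007


def is_rejected(spam_probability: float, threshold: float) -> bool:
    """Reject a message whose spam probability is at or above the threshold."""
    return spam_probability >= threshold


# Made-up scores, purely for illustration (not measured data).
sample_scores = [0.10, 0.79, 0.80, 0.85, 0.90, 0.99]

rejected_old = [s for s in sample_scores if is_rejected(s, OLD_THRESHOLD)]
rejected_new = [s for s in sample_scores if is_rejected(s, NEW_THRESHOLD)]
print(f"rejected at 90%: {len(rejected_old)}, rejected at 80%: {len(rejected_new)}")
```

Messages scoring between 80% and 90% are the ones newly rejected under the lower threshold.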
Recent Activities
Spam Rejected last 24 months
[Chart: Spam Rejected per month. X-axis: Month (24 monthly labels, Jan-01 through Dec-02); Y-axis: Number of Messages (0 to 50,000,000).]
Upcoming Activities
 Mail Channels, a traffic-shaping program that identifies
and throttles the processing of SPAM traffic, will be
implemented in March.
 We will increase the email message size limit in place
during business hours from 10MB to 50MB in March.
 The Cornell Optional Email Address will be made
available to academic staff in April. The QI to LDAP
gateway will be retired sometime thereafter.
 We will continue to improve our support model for mobile
messaging.
Upcoming Activities
 We are working with the Library to evaluate the
appropriate technologies to support their Large Scale
Digital Initiative.
 A new 25-seat CIT instructional lab will be opened at
Mann Library for the Fall semester.
 A 12-seat CIT multimedia lab will be opened in partnership
with the Dean of Students office in Willard Straight Hall.
 Analysis of the use of VMWare and of Solaris 10 to
provision virtualized server environments is underway.
Upcoming Activities
 We will implement the Tidal software purchased for
automated scheduling and job submission within the
distributed systems infrastructure. This will enable our
Production Control staff to operate more efficiently as
additional PeopleSoft modules are deployed.
 We will continue to support the CIT building project,
particularly in the area of machine room needs and design.
 We are working with ATSUS to develop customer-facing
documentation to assist with acquisition and deployment of
servers.
Upcoming Activities
 We are beginning a new project with NCS and Security to
improve the server farm network topology and to harden
server farm networks through use of ACLs and other
technologies.
 We are evaluating Storage Farm RFP responses. This should lead to
deployment of storage virtualization tools, iSCSI support,
and introduction of a lower-priced tier of storage.
Questions?!?
Virtual OS Hosting Project
Mike Hojnowski
Manager, Special Projects
Our Current model
 Service owners purchase servers from their
own budgets.
 S&O Sysadmins manage the servers.
 We experience continuing growth.
Server Growth
[Chart: Servers managed per year, FY02 through FY07 plus an FY08 forecast; Y-axis 0 to 450.]
Averaging 23% per year growth
Limitations – Service Owner
 Separate servers for test/dev/prod are often underutilized.
 Service owners must track and budget replacement
cycles, maintenance.
 Emergency Preparedness is the responsibility of
the service owner, and may require yet more
dedicated hardware.
 Long lag from budgeting to procurement to
implementation.
 Short term server needs must be met with a
“permanent” purchase.
Limitations – Facilities
 Server sprawl – 400 CIT servers and
growing.
 Limited machine room space – we’re
running out.
 Cooling problems – new machines are
densely packed and run hot.
 Power problems – our UPS and Generator
capacity is not unlimited.
Limitations – Sysadmins
 Servers to Admin ratio is 39/1.
 We expect more than 40 additional servers
(or virtual instances) per year.
 Our future looks like this….
Limitations – Sysadmins
Server Growth - Projected
[Chart: Number of Servers (left axis, 0 to 900) and Administrators (right axis, 0 to 25), FY02 through FY12; FY08 through FY12 are forecasts.]
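The projected growth above can be approximated with simple compound growth. This is a rough sketch under stated assumptions: a base of about 400 servers in FY07 (the "400 CIT servers" figure; other slides cite 415+), the 23% average annual growth rate, and the current 39:1 server-to-admin ratio held constant. The chart's actual data points are not in this transcript, so the output is illustrative only.

```python
import math

BASE_YEAR = 2007
BASE_SERVERS = 400      # assumption: "400 CIT servers and growing"
GROWTH_RATE = 0.23      # "Averaging 23% per year growth"
SERVERS_PER_ADMIN = 39  # "Servers to Admin ratio is 39/1"


def project(years_ahead: int) -> tuple[int, int]:
    """Return (servers, admins needed) after the given number of years."""
    servers = round(BASE_SERVERS * (1 + GROWTH_RATE) ** years_ahead)
    # Round up: a fractional administrator still means one more person.
    admins = math.ceil(servers / SERVERS_PER_ADMIN)
    return servers, admins


for n in range(6):
    servers, admins = project(n)
    label = f"FY{(BASE_YEAR + n) % 100:02d}"
    print(f"{label}: {servers} servers, {admins} admins at 39:1")
```

Holding the ratio fixed makes the staffing line grow as fast as the server line, which is the problem the proposal that follows tries to address by raising the servers-per-admin ratio.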
Virtualization
Overview
VMotion Overview
VMotion for DR
Proposed Solution
 Operating System Virtualization
Service owners “rent” virtual OS hosts.
 Revised Funding Model
S&O buys/maintains physical servers.
 Best Practices Improvements
Major increase in the servers/admin ratio.
 Emergency Preparedness
Handled at the “infrastructure” level, for an
additional cost.
Cautionary Tale
 “The cost savings from server virtualization come
almost entirely from hardware reduction…
However, administrative differences between
managing a virtual server and a physical server are
not significant. Most of the ongoing
administrative costs are tied with each OS instance
and not to physical hardware.
Action Item: Don’t count on virtualization to
reduce labor costs; for that, implement
standardization and automation.” - Gartner
Scope of Solution
 Windows on VMWare to start, effective
early FY08.
 Linux on VMWare and Solaris 10 Zones to
follow in late FY08 or early FY09.
 This solution would begin as a CIT-internal
service.
 Support for customers outside CIT, as a
Designated Service, would come in the FY09 time
frame or later.
Expected Outcomes – Service Owners
 No longer track capital, maintenance and
replacement cycles. Simply budget ongoing
service charges.
 Add and Subtract virtual OS instances to meet the
needs of the business, rather than being
constrained by the hardware on hand.
 Shorter time from project inception to availability.
 Emergency preparedness is “built in”, reducing
effort and costs to individual programs, and
providing consistency across CIT.
Expected Outcomes – Facilities
 The growth of servers in the farm decreases.
 Slower growth in power, cooling and space
consumption.
 Fewer server owners will simplify
administration of physical servers in the
farm, reducing staff effort.
Expected Outcomes – Sysadmins
 Installation, maintenance, and retirement
processes greatly streamlined, reducing effort per
server.
 More servers per admin become possible.
 Improved practices and procedures yield quicker
and more accurate ticket resolution.
 Streamlined and standardized configurations allow
for more preventative maintenance, and fewer
emergencies, outages and security incidents.
Financial Overview
4 year cost               Do Nothing   Virtualize
Server Purchases             400,000      124,000
U-charges                     58,368        5,837
Network Charges               49,920        7,488
SAN connection fees           23,040       46,080
SAN management fees           28,416       28,416
SAN storage                        0        2,611
Virtual Center Server              0        5,000
Virtual Center Licenses            0        3,000
Virtual Center Maint               0        5,000
ESX licenses                       0       27,600
ESX maint                          0       45,920
4 year Cost                  559,744      300,952
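The totals in the financial overview can be checked with a short sketch. It assumes the first figure for each line item belongs to the "Do Nothing" column and the second to "Virtualize", which is the reading under which the listed totals add up; the variable names are illustrative.

```python
# Four-year cost figures from the Financial Overview, in dollars.
# Each entry is (do_nothing, virtualize).
costs = {
    "Server Purchases":        (400_000, 124_000),
    "U-charges":               (58_368, 5_837),
    "Network Charges":         (49_920, 7_488),
    "SAN connection fees":     (23_040, 46_080),
    "SAN management fees":     (28_416, 28_416),
    "SAN storage":             (0, 2_611),
    "Virtual Center Server":   (0, 5_000),
    "Virtual Center Licenses": (0, 3_000),
    "Virtual Center Maint":    (0, 5_000),
    "ESX licenses":            (0, 27_600),
    "ESX maint":               (0, 45_920),
}

do_nothing = sum(a for a, _ in costs.values())
virtualize = sum(b for _, b in costs.values())
savings = do_nothing - virtualize

print(f"Do Nothing: {do_nothing:,}")
print(f"Virtualize: {virtualize:,}")
print(f"Difference: {savings:,} ({savings / do_nothing:.1%} of Do Nothing)")
```

Most of the difference comes from the Server Purchases line, which is consistent with the Gartner caution that virtualization savings are mainly hardware savings.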
Timelines
 Charter: Completed
 Project Plan: 3/29/07
 Phase 1 Completion: 7/1/2007
VMWare/Windows
 Ideal initial candidates would be test and
development servers.
Questions?!?
CIT Public Computing
Pat Washburn
Manager, Student Computing Operations
Overview
 Support ~420 public systems
 5 Instructional Labs
Plus Uris Library Electronic Classroom
 7 General labs
 1 laptop lab
 30 e-mail kiosks
 Cornell in Washington lab
Usage Stats – Instructional Labs
[Charts: Class Sessions per Year 2002 - 2006 and Class Hours per Year 2002 - 2006.]
Recent Accomplishments
Phillips Hall Lab
Existing ECE 60 seat general lab
Converted to CIT Lab January 2006
25 Student + 1 Instructor instructional lab
22 seat general lab
New Layout
Distributed Video System
Instructor’s Station
Upcoming Labs
 Willard Straight Hall
Opening soon!
Replaces WSH darkroom
12 Student stations : 6 Mac, 6 PC
Flexible Multimedia Creation space
 Mann Library
Opening Fall 2007
Instructional Lab and Flex-space
40 Systems supplement existing laptops
Willard Straight
Mann Instructional Lab
Questions?
Comments?
EZ-Backup Update
Bob Talda
EZ-Backup Team
Recent Activities
 Upgraded TSM Server to v5.3
Better data management & storage utilization
 Released TSM v5.3 client for Linux, Mac
 Provided interim solution for Intel Macs
EZ-Backup Usage
[Chart: # Users (axis 0 to 4,000) and Data (TB) (axis 0 to 150), 2000 through 2007.]
EZ-Backup Price History
Monthly Cost to Backup 50GB of data (compressed)
[Chart: monthly cost by year, 2000 through 2007; Y-axis $0 to $240; annotated "94%".]
New TSM Client Software
Available this month:
 v5.3.4.4 for Windows (W2K, XP, W2K3)
 v5.4.0.0 for MacOS
Requires MacOS v10.4.7+
(has libraries required for TSM)
Intel or PPC
 v5.4.0.0 for Windows Vista
 v5.4 for Unix (available from IBM site)
What’s New in TSM 5.4
 Improved Memory utilization for very large
file systems
Option to use disk cache
Memory cache is default as today
 Performance improvements
What’s New in TSM 5.4 (Mac)
 Full Support for Intel-based Macs
All command line tools are now universal
GUI is now Java based
 Switched to MacOS Installer
 No longer uses hostname for password file
encryption
Addresses issue with mobile DHCP TSM clients
What’s New in TSM 5.4 (Vista)
 TSM BA client will run on Windows Vista
and take advantage of the Windows Volume
Shadow Copy Service (VSS) to back up
system state and system services.
Notable Recently Added Features (5.3+)
 Include/Exclude Preview Utility
 Ability to Delete Individual Backup Files
 Adaptive Sub-file Backup Option
Byte or block-level incremental
Speeds backup over slower networks
 Encryption Option for Sensitive Files
AES 128-bit
 Open File Support
 Multi-session Backup / Restore
 Journal-Based Backup
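The idea behind a block-level incremental backup, as mentioned under the Adaptive Sub-file Backup option, can be sketched as follows. This is not TSM's actual algorithm: the fixed block size, the use of SHA-256, and the function names are all assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks in `new` that differ from, or are absent in, `old`."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, h in enumerate(new_hashes)
        if i >= len(old_hashes) or h != old_hashes[i]
    ]


old = bytes(16 * 1024)   # four zero-filled blocks
new = bytearray(old)
new[5000] = 1            # change a single byte, which falls in block 1
print(changed_blocks(old, bytes(new)))
```

Only the changed blocks would need to cross the network, which is why this style of backup helps on slower links.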
What’s Next
 TSM v5.4 for Windows (non-Vista)
 TSM v5.4 for Linux
 User Education Workshops
 Web-based reporting for departmental users
 Augment tape storage with inexpensive disk
e.g., Virtual Tape
Goal: speedier restores
DRP Task Force Recommendations
 Recommendations were made by the ITMC DR
task force to have a collateral site at Weill Medical
College and to alter the funding model to make use
of the service more attractive.
 Those recommendations have been presented to
the administration, and are pending approval and
funding.
Questions ?