Transcript Document
Systems and Operations
CIT Forum
February 21, 2007

Systems and Operations Organization
Rick MacDonald, Director, S&O

[Organization chart: names and titles in slide order]
Rick MacDonald Director Systems and Operations Brian Messenger Assistant Director Systems Mark Bodenstein Project Leader Mainframe Systems Programming James Bohnsack Mike Garcia Jason Lowe Rick Polcaro Manager Facilities & Laser Printing Services Randy Smith Peggy Roberts Don Mac Leod Assistant Director Systems Services Mariann Carpenter Manager Systems Administration Moe Arif Mary Cronk Doug Flanagan Solomon Welch Tom Walden David Shirk Kent Ross Vacant (01/07) Laurie Collinsworth Manager Systems Engineering Scott Sorrentino Mike Heisler Jim Yang John Wobus Chris Manly Javier Streb Paul Zarnowski Manager Storage Services Bob Talda David Beardsley Michael Hojnowski Manager Special Projects Jim Howell Manager Messaging Services Jennifer Moore Gail Shaff George Medlar Todd Olson Client Services Rick Cochran Michelle Mogil Lee Brink Dan Bartholomew Vacant Vicky Dean Assistant Director Operations Carol Uber Manager Technical Support Operations Faye Cunningham Manager Technical Support Services Randall Frank Bill Kelemen Kevin Pelletier Peter Skura Karen Daniels Pat Washburn Manager Student Computing Operations Juan Salomon Team Leader NOC First Shift Lori Beebe Brandon Bowers Joanne Button Richard Cicciarelli Kim Nicholson Pat Graham Alan Heiman Jason Barnello Beth Sprankle Jim Conley John Ryan Brian Witchey Bryan Benning Team Leader NOC Second Shift Mark Allen Jay Howell Lillian Isacks Chuck Thomas Manager Production Operations Ken Frost Team Leader NOC Third Shift Ruth Burroughs Linda Jaynes Greg Marvin Dan Miller Carl Moravec Theresa Norman Maureen Quillinan Rich Fraboni James Reed John Becker Barbara Van Etten Jenny Signor Technical Lead

Operations
The Network Operations Center (NOC) is a centralized control center that provides 24/7 monitoring, testing, change management, and network configuration support.
Production Control and Tape/Print Operations support systems and applications for many administrative functions and student services.
Technical Support Operations (TSO) is responsible for providing hardware, software, and network support for workstations, print/file servers, and LANs throughout CIT. TSO also supports and maintains general and instructional computing facilities across campus.

Systems Services
Systems Services is the organizational home for S&O’s broadly used campus services.
Messaging Services supports our PostOffices; IMAP/POP; virus and SPAM filtering; Oracle Calendar; Bulkmail Service; E-List Services; and Usenet.
Client Services is the home for Net-Print, EZ-Remote, Email client support, BearAccess, and other packaging and delivery activities.

Systems
Systems provides systems administration and systems programming support for projects that span the 415+ CIT-owned servers and 80 terabytes of storage located in our machine rooms.
The Systems Administration team provides the hardware and operating system support for CIT’s servers and storage assets. Support is also provided to other departments on a contractual basis.
Systems Engineering provides and enhances tools for our production servers and services, such as NetVigil and Single Sign-On, as well as specific support for Network Quarantine and the DNS/DHCP service and its ancillary systems.
Mainframe Systems Programming maintains and upgrades the mainframe operating system software and other mainframe infrastructure software.
Facilities & Laser Printing Services supports the infrastructure within the CIT machine rooms in Rhodes Hall and CCC and administers the operations of the CIT Server Farm, hosting more than 530 servers. Xerox laser printing services provides custom forms, fonts, and graphic programming in support of various administrative services.
Storage Services operates EZ-Backup and administers the CIT Storage Farm.
Recent Activities
Single Sign-On was implemented for CIT servers, allowing for central management of user IDs and groups. Each person is allowed appropriate access to managed servers through a single password.
CFEngine was implemented, allowing central management and distribution of configuration files to all managed Linux and Solaris servers.
Systems Administration began offering full RedHat Linux support in the server farm.
In partnership with IS Infrastructure staff, Sun Clustering was deployed, enhancing the efficiency and resilience of Solaris-based applications.
Beginning in December, a monthly newsletter highlighting Systems activities was released to users of Systems services.
We implemented NocDocs, a wiki that integrates on-call lists and problem resolution documentation with NetVigil.
Over 300 servers were patched for the upcoming Daylight Saving Time change.
Lyris replaced Listproc as our Email list processor. 3800 lists are currently in Lyris, with 130 to go. This change has also provided enhanced bulk email functionality.
The CIT Storage Farm became a campus service, with the Library as the first customer.
Last weekend, we installed a new mainframe, maintaining support for non-PeopleSoft administrative systems.
We worked with IS on the PeopleSoft migration from AIX to Solaris, also completed last weekend.
Bulkmail requests have averaged 10/week for the last several months.
SPAM Reduction Program - On January 15, 2007, we lowered the “SPAM probability” threshold for rejecting email from 90% or higher to 80% or higher. This has reduced spam significantly.
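The threshold change described in the SPAM Reduction Program can be illustrated with a minimal sketch. It assumes each message carries a spam probability between 0 and 1 and is rejected when that probability is at or above the threshold ("80% or higher"). The function names and the sample scores are hypothetical, invented for illustration; they are not taken from the CIT mail system.

```python
# Minimal sketch of threshold-based spam rejection.
# Assumption: each message has a spam probability in [0, 1], and a message
# is rejected when its probability is at or above the threshold.

OLD_THRESHOLD = 0.90  # "90% or higher" (before January 15, 2007)
NEW_THRESHOLD = 0.80  # "80% or higher" (after January 15, 2007)


def should_reject(spam_probability: float, threshold: float) -> bool:
    """Return True when the message should be rejected as spam."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    return spam_probability >= threshold


def count_rejected(scores, threshold: float) -> int:
    """Count how many of the given scores would be rejected."""
    return sum(1 for score in scores if should_reject(score, threshold))


# Hypothetical scores, made up for this example only.
sample_scores = [0.05, 0.42, 0.79, 0.80, 0.85, 0.90, 0.97]

rejected_before = count_rejected(sample_scores, OLD_THRESHOLD)
rejected_after = count_rejected(sample_scores, NEW_THRESHOLD)
print(f"Rejected at 90% threshold: {rejected_before}")
print(f"Rejected at 80% threshold: {rejected_after}")
```

With these made-up scores, lowering the threshold rejects the two additional messages scored between 80% and 90%. The same trade-off applies in general: a lower threshold catches more spam but also raises the chance of rejecting legitimate mail.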
[Chart: Spam Rejected, last 24 months. x-axis: Month (labels as extracted run from Jan-01 through Dec-02); y-axis: Number of Messages, 0 to 50,000,000; series: Spam Rejected.]

Upcoming Activities
Mail Channels, a traffic-shaping program that identifies and throttles the processing of SPAM traffic, will be implemented in March.
We will increase the email message size limit in place during business hours from 10MB to 50MB in March.
The Cornell Optional Email Address will be made available to academic staff in April. The QI to LDAP gateway will be retired sometime thereafter.
We will continue to improve our support model for mobile messaging.
We are working with the Library to evaluate the appropriate technologies to support their Large Scale Digital Initiative.
A new 25-seat CIT instructional lab will be opened at Mann Library for the Fall semester.
A 12-seat CIT multimedia lab will be opened in partnership with the Dean of Students office in Willard Straight Hall.
Analysis of the use of VMWare and of Solaris 10 to provision virtualized server environments is underway.
We will implement the Tidal software purchased for automated scheduling and job submission within the distributed systems infrastructure. This will enable our Production Control staff to operate more efficiently as additional PeopleSoft modules are deployed.
We will continue to support the CIT building project, particularly in the area of machine room needs and design.
We are working with ATSUS to develop customer-facing documentation to assist with acquisition and deployment of servers.
We are beginning a new project with NCS and Security to improve the server farm network topology and to harden server farm networks through use of ACLs and other technologies.
We are evaluating Storage Farm RFP responses. This should lead to deployment of storage virtualization tools, iSCSI support, and introduction of a lower-priced tier of storage.

Questions?!?

Virtual OS Hosting Project
Mike Hojnowski, Manager, Special Projects

Our Current Model
Service owners purchase servers from their own budgets.
S&O Sysadmins manage the servers.
We experience continuing growth.

Server Growth
[Chart: Servers managed per year, FY02 through FY08 (forecast); y-axis 0 to 450.]
Averaging 23% per year growth.

Limitations – Service Owner
Servers for test/dev/prod. Often underutilized.
Service owners must track and budget replacement cycles and maintenance.
Emergency Preparedness is the responsibility of the service owner, and may require yet more dedicated hardware.
Long lag from budgeting to procurement to implementation.
Short-term server needs must be met with a “permanent” purchase.

Limitations – Facilities
Server sprawl: 400 CIT servers and growing.
Limited machine room space: we’re running out.
Cooling problems: new machines are densely packed and run hot.
Power problems: our UPS and generator capacity is not unlimited.

Limitations – Sysadmins
Servers to Admin ratio is 39/1.
We expect more than 40 additional servers (or virtual instances) per year.
Our future looks like this:
[Chart: Server Growth - Projected, FY02 through FY12 (FY08 onward forecast); left axis: Number of Servers, 0 to 900; right axis: Administrators, 0 to 25; series: Servers, Administrators.]

Virtualization Overview
VMotion Overview
VMotion for DR

Proposed Solution
Operating System Virtualization
  Service owners “rent” virtual OS hosts.
Revised Funding Model
  S&O buys/maintains physical servers.
Best Practices Improvements
  Major increase in the servers/admin ratio.
Emergency Preparedness
  Handled at the “infrastructure” level, for an additional cost.

Cautionary Tale
“The cost savings from server virtualization come almost entirely from hardware reduction… However, administrative differences between managing a virtual server and a physical server are not significant. Most of the ongoing administrative costs are tied with each OS instance and not to physical hardware. Action Item: Don’t count on virtualization to reduce labor costs; for that, implement standardization and automation.” - Gartner

Scope of Solution
Windows on VMWare to start, effective early FY08.
Linux on VMWare and Solaris 10 Zones to follow in late FY08 or early FY09.
This solution would begin as a CIT-internal service.
Support to customers outside CIT as a Designated Service to be in the FY09 time frame or later.

Expected Outcomes – Service Owners
No longer track capital, maintenance, and replacement cycles. Simply budget ongoing service charges.
Add and subtract virtual OS instances to meet the needs of the business, rather than being constrained by the hardware on hand.
Shorter time from project inception to availability.
Emergency preparedness is “built in”, reducing effort and costs to individual programs, and providing consistency across CIT.

Expected Outcomes – Facilities
The growth of servers in the farm decreases.
Slower growth in power, cooling, and space consumption.
Fewer server owners will simplify administration of physical servers in the farm, reducing staff effort.

Expected Outcomes – Sysadmins
Installation, maintenance, and retirement processes greatly streamlined, reducing effort per server.
More servers per admin become possible.
Improved practices and procedures yield quicker and more accurate ticket resolution.
Streamlined and standardized configurations allow for more preventative maintenance, and fewer emergencies, outages, and security incidents.
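As a cross-check on the four-year cost comparison in the Financial Overview, the line items can be summed and the difference between the two scenarios computed. This is a sketch using only the figures shown on that slide. The slide does not state a currency unit, so the amounts are treated as plain numbers, and the variable and function names are chosen here for illustration.

```python
# Cross-check of the 4-year cost figures from the Financial Overview slide.
# All amounts are copied from the slide; names are illustrative only.

do_nothing = {
    "Server Purchases": 400_000,
    "U-charges": 58_368,
    "Network Charges": 49_920,
    "SAN connection fees": 23_040,
    "SAN management fees": 28_416,
    "SAN storage": 0,
    "Virtual Center Server": 0,
    "Virtual Center Licenses": 0,
    "Virtual Center Maint": 0,
    "ESX licenses": 0,
    "ESX maint": 0,
}

virtualize = {
    "Server Purchases": 124_000,
    "U-charges": 5_837,
    "Network Charges": 7_488,
    "SAN connection fees": 46_080,
    "SAN management fees": 28_416,
    "SAN storage": 2_611,
    "Virtual Center Server": 5_000,
    "Virtual Center Licenses": 3_000,
    "Virtual Center Maint": 5_000,
    "ESX licenses": 27_600,
    "ESX maint": 45_920,
}


def total(costs: dict) -> int:
    """Sum every line item in one scenario."""
    return sum(costs.values())


total_do_nothing = total(do_nothing)   # matches the slide total of 559,744
total_virtualize = total(virtualize)   # matches the slide total of 300,952
savings = total_do_nothing - total_virtualize
savings_pct = 100 * savings / total_do_nothing

print(f"Do Nothing: {total_do_nothing:,}")
print(f"Virtualize: {total_virtualize:,}")
print(f"Difference: {savings:,} ({savings_pct:.1f}% lower)")
```

The line items add up to the totals shown on the slide, and the virtualized scenario comes out 258,792 lower over four years, about 46% less than doing nothing. Most of that difference comes from the Server Purchases line, which is consistent with the Gartner quote about savings coming from hardware reduction.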
Financial Overview

4 year cost                 Do Nothing    Virtualize
Server Purchases               400,000       124,000
U-charges                       58,368         5,837
Network Charges                 49,920         7,488
SAN connection fees             23,040        46,080
SAN management fees             28,416        28,416
SAN storage                          0         2,611
Virtual Center Server                0         5,000
Virtual Center Licenses              0         3,000
Virtual Center Maint                 0         5,000
ESX licenses                         0        27,600
ESX maint                            0        45,920
4 year Cost                    559,744       300,952

Timelines
Charter: Completed
Project Plan: 3/29/07
Phase 1 Completion: 7/1/2007 (VMWare/Windows)
Ideal initial candidates would be test and development servers.

Questions?!?

CIT Public Computing
Pat Washburn, Manager, Student Computer Operations

Overview
Support ~420 public systems
5 Instructional Labs
  Plus Uris Library Electronic Classroom
7 General labs
1 laptop lab
30 e-mail kiosks
Cornell in Washington lab

Usage Stats – Instructional Labs
[Chart: Class Sessions per Year, 2002 - 2006; y-axis 0 to 1400.]
[Chart: Class Hours per Year, 2002 - 2006; y-axis 0 to 4000.]

Recent Accomplishments
Phillips Hall Lab
  Existing ECE 60-seat general lab
  Converted to CIT Lab January 2006
  25 Student + 1 Instructor instructional lab
  22-seat general lab
New Layout
Distributed Video System
Instructor’s Station

Upcoming Labs
Willard Straight Hall
  Opening soon!
  Replaces WSH darkroom
  12 Student stations: 6 Mac, 6 PC
  Flexible Multimedia Creation space
Mann Library
  Opening Fall 2007
  Instructional Lab and Flex-space
  40 Systems supplement existing laptops

Willard Straight
Mann Instructional Lab

Questions? Comments?
EZ-Backup Update
Bob Talda, EZ-Backup Team

Recent Activities
Upgraded TSM Server to v5.3
  Better data management & storage utilization
Released TSM v5.3 client for Linux, Mac
Provided interim solution for Intel Macs

EZ-Backup Usage
[Chart: # Users by year, 2000 through 2007; y-axis 0 to 4,000.]
[Chart: Data (TB) by year, 2000 through 2007; y-axis 0 to 150.]

EZ-Backup Price History
[Chart: Monthly Cost to Backup 50GB of data (compressed), 2000 through 2007; y-axis $- to $240; annotation: 94%.]

New TSM Client Software
Available this month:
v5.3.4.4 for Windows (W2K, XP, W2K3)
v5.4.0.0 for MacOS
  Requires MacOS v10.4.7+ (has libraries required for TSM)
  Intel or PPC
v5.4.0.0 for Windows Vista
v5.4 for Unix (available from IBM site)

What’s New in TSM 5.4
Improved memory utilization for very large file systems
  Option to use disk cache
  Memory cache is default, as today
Performance improvements

What’s New in TSM 5.4 (Mac)
Full support for Intel-based Macs
All command line tools are now universal
GUI is now Java based
Switched to MacOS Installer
No longer uses hostname for password file encryption
  Addresses issue with mobile DHCP TSM clients

What’s New in TSM 5.4 (Vista)
TSM BA client will run on Windows Vista and take advantage of the Windows Volume Shadow Copy Service (VSS) to back up system state and system services.
Notable Recently Added Features (5.3+)
Include/Exclude Preview Utility
Ability to Delete Individual Backup Files
Adaptive Sub-file Backup Option
  Byte or block-level incremental
  Speeds backup over slower networks
Encryption Option for Sensitive Files
  AES 128-bit
Open File Support
Multi-session Backup / Restore
Journal-Based Backup

What’s Next
TSM v5.4 for Windows (non-Vista)
TSM v5.4 for Linux
User Education Workshops
Web-based reporting for departmental users
Augment tape storage with inexpensive disk
  e.g., Virtual Tape
  Goal: speedier restores

DRP Task Force Recommendations
Recommendations were made by the ITMC DR task force to have a collateral site at Weill Medical College and to alter the funding model to make use of the service more attractive. Those recommendations have been presented to the administration, and are pending approval and funding.

Questions?