Transcript Document
SvalTech Getting Started Using Database Archiving Toronto DAMA Chapter Meeting 16 September, 2009 Jack E. Olson [email protected] www.svaltech.com “Database Archiving: How to Keep Lots of Data for a Long Time” Jack E. Olson, Morgan Kaufmann, 2008 Copyright SvalTech, Inc., 2009 SvalTech Why This Presentation • A common position of many IT shops is – We know we should be doing database archiving – We know it will be valuable to us – But we don’t know how to get started • Database archiving is an enterprise technology: it can be used in many applications • Not all database applications are suitable for database archiving • Suitable applications have widely differing return-on-investment potential Copyright SvalTech, Inc., 2009 2 SvalTech The Database Archiving Survey organize survey team application enumeration first-cut feasibility data-life-cycle analysis operational analysis risk analysis metric gathering evaluate implementation options business case development prioritization Copyright SvalTech, Inc., 2009 3 SvalTech The Survey Organization Mandate People Inputs Mandate A management directive that creates the database archiving survey task force and gives them the scope and objectives of the study. Scope: business units to include, organizational units (divisions, companies, campuses) Objectives: find best candidates for cost reduction, fixing operational problems, risk reduction Copyright SvalTech, Inc., 2009 4 SvalTech The Survey Team Mandate People Inputs Copyright SvalTech, Inc., 2009 Chair Fulltime members IT/enterprise architect storage administration records retention Subject matter members database architect data management business unit data analyst database administration Incidental members legal department IT compliance data governance security administration data analyst (BI type) 5 SvalTech Starting Materials Mandate People Inputs Copyright SvalTech, Inc., 2009 Enterprise data model Data classification results SLA’s IT storage strategy Regulations/compliance rules Data governance mandates 6 SvalTech Application Enumeration Limit search to those within mandate business unit location enterprise Identify Operational Applications classify as transactional vs. static data include those already archiving to any extent Identify Retired Applications still retaining data Identify applications about to change consolidations planned planned or recent acquisitions replacements/ conversions/ reengineering identify any strategies for application retirement Copyright SvalTech, Inc., 2009 7 SvalTech Application Enumeration For Applications with potential, Capture application data model Identify business records within the data model Connect business records to records retention and legal categories Identify database information: system/dbms/file/metadata Create a Database Topology chart Identify parallel applications within the corporation (even if out of scope) Identify operational replicates Identify backup/disaster recovery stores and strategies Identify recurring data extracts for BI, etc. Get rough idea of db size and transaction rates Copyright SvalTech, Inc., 2009 8 SvalTech Database Topology Chart create data operational replicate operational archive offline storage backup BI stores CRM backup disaster recovery Copyright SvalTech, Inc., 2009 9 SvalTech First-Cut Feasibility Factors for continuing to consider, important data lots of data lots of individual business records simple data structures relatively stable data structures (little change) long retention requirement long inactive period within retention requirement low frequency access requirement in inactive period low performance requirement in inactive period simple access requirements in inactive period Apply criteria after each subsequent step to further eliminate bad candidates Copyright SvalTech, Inc., 2009 10 SvalTech Examples Good Bank deposits and withdrawals Stock trades Credit card transactions Ticketmaster transactions Medical claim data Casualty claim data (auto, home) Retail sales inventory transactions Package tracking Passenger flight data Driver license records Sales tax records Property tax records Telephone call transactions Nuclear reactor monitoring records Auto warrantee records Copyright SvalTech, Inc., 2009 Not Good Customer master data Airplane manufacturing records HR records Felony records Home sales 11 SvalTech Data Life Cycle Analysis Create a database archiving DLCA for each business record type Data Retention Chart Business Record Process Chart to determine inactive period Business Record SLA chart by age of record Copyright SvalTech, Inc., 2009 12 SvalTech Data Retention Chart The requirement to keep data for a business object for a specified period of time. The object cannot be destroyed until after the time for all such requirements applicable to it has past. Business Requirements Regulatory Requirements The Data Retention requirement is the longest of all requirement lines. Copyright SvalTech, Inc., 2009 13 SvalTech Business Record Process Chart for a single instance of a data object Retention requirement time Create PO Update PO Create Invoice Backorder Create Financial Record Update on Ship Update on Ack Weekly Sales Report Quarterly Sales report Extract for data warehouse Extract for bus analysis Common customer queries Common bus queries Ad hoc requests Law suit e-Discovery requests Investigation data gathering operational Copyright SvalTech, Inc., 2009 reference inactive 14 SvalTech Business Record SLA Chart by Age for a single instance of a data object Retention requirement time Query response time Transaction volume create/update Security (no users) read operational Copyright SvalTech, Inc., 2009 reference inactive 15 SvalTech Operational Analysis Don’t assume there are no problems. Talk to DBAs and users. Look for trends Look for escalating operational costs. Get numbers. Copyright SvalTech, Inc., 2009 16 SvalTech Operational Analysis • • • Performance Issues – Not meeting response time SLA – Longer time to run extracts – Longer time to run backups – Longer time to run database reorganizations – Running reorganizations more frequently – More difficult to tune Risk Issues – Longer estimated time to run recovery – Longer estimated time to run disaster recovery Cost Issues – Higher annual hardware costs – Higher annual MIP-based software cost – Adding expensive DASD to support database and backups Copyright SvalTech, Inc., 2009 17 SvalTech Risk Analysis • • • • Data Loss Risk – Isolation from internet hackers – Prevent ANY updates or deletes – Preserve data through multi-site backups and periodic pings Data Quality Risk – Changing data structures and column semantics – Changing reference data Unauthorized Access Risk – Reduced (or different) user set – Audit trail of access Legal Risk – Preserve authenticity of data in archive – Reduce cost and time to produce data for discovery requests Copyright SvalTech, Inc., 2009 18 SvalTech Metric Gathering Data bytes stored per business object new transactions created per day bytes for backups, replicates growth in transactions rates any sudden expected additions past history plus future projections Storage Costs cost per byte: operational cost per byte: backup cost per byte: archive archive compression ratios System Costs mips required to process software license fees staff for operational Copyright SvalTech, Inc., 2009 19 SvalTech Metric Gathering For retired applications concentrate on displaced system cost displaced software cost displaced staff cost NOT shared Shared IBM mainframe IMS DBMS CICS DBA/SYSPROG Copyright SvalTech, Inc., 2009 LINUX server Archive software JDBC Archive admin 20 SvalTech Evaluate Implementation Options • Software – Vendor provided software – Custom built solution • Access tools – Original application – Generic report generation/ query tools – Custom built • Storage for archive – Storage subsystem – Hosted storage Copyright SvalTech, Inc., 2009 21 SvalTech Architecture of Database Archiving Operational System Application program Archive Extractor Archive Administrator Archive Designer Archive Data Manager Archive Access Manager OP DB Archive extractor Archive Server archive catalog archive storage Copyright SvalTech, Inc., 2009 22 SvalTech Estimate Implementation Time and Cost • Archiving systems required – Servers – Storage systems (hosted storage?) – Licensed software • Application Design • Implementation • Test • Deployment • Ongoing operation and administration Copyright SvalTech, Inc., 2009 23 SvalTech Business Case Development – Lower IT costs – Improved operational efficiency – Risk reduction Copyright SvalTech, Inc., 2009 24 SvalTech Lower IT Costs • Systems – Reduce size/cost of operational systems – Put off or eliminate need for system upgrades • Software – Eliminate or reduce cost of expensive system software • DBMS • Transaction system – Eliminate or reduce cost of application software • Storage costs – Switch to lower cost storage – Impact on backups/ disaster recovery stores – Reduction in byte count stored • Staff – Eliminate or reduce legacy system staff Copyright SvalTech, Inc., 2009 25 SvalTech Chart it All data in operational db Inactive data in archive db most expensive system most expensive storage most expensive software least expensive system least expensive storage least expensive software In a typical op db 60-80% of data is inactive Size Today This percentage is growing operational Copyright SvalTech, Inc., 2009 operational archive 26 SvalTech Lower IT Costs • First year impact • Time to recover project costs • Chart cost savings over time – Plot data growth over time for operational – Plot data growth over time of archive Copyright SvalTech, Inc., 2009 27 SvalTech Operational Improvements • Itemize improvements expected – Performance of operations – Reduction of utility times – Reduction of recovery times – Reduction of disaster recovery times – Reduction of DBA workload • Provide cost savings where appropriate Copyright SvalTech, Inc., 2009 28 SvalTech Risk Reduction • Itemize improvements expected – Less risk of failing e-Discovery request – Enhanced data quality of older data – Less exposure to loss of data authenticity – Better access control – Better compliance – Better data governance – Less dependence on legacy systems • Provide cost savings where appropriate Copyright SvalTech, Inc., 2009 29 SvalTech Prioritization – Determine Prioritization Criteria • Cost is most common primary factor – First archiving project may have other goals • Lower risk of failure • Faster implementation • Faster return on investment • Usually a retired application project – Risk may over-ride other factors • Preserve data authenticity Copyright SvalTech, Inc., 2009 30 SvalTech Final Thought • Always do a survey to find the best applications to start with • Always do a survey to identify those that make sense to proceed with versus those that do not: don’t waste time on apps that are too hard to implement or that will have little value. • A good database archive application can save millions of dollars per year, increase performance of operational systems and reduce risk all at the same time. The trick is identifying them and proving it. • Repeat the Database Archiving Survey from time to time in the future. Copyright SvalTech, Inc., 2009 31