Transcript Slide 1
Building Data Warehouse at Rensselaer Ora Fish Rensselaer Polytechnic Institute Best Practices in Higher Education Student Data Warehousing Forum Northwestern University October 20-21, 2003 Agenda Background Development Methodology Rollout Strategy Summary Status – where are we now Benefits Lessons Learned Demonstrations of the Financial Analysis Data Mart – Executive Information Systems Q&A Facts about Rensselaer (RPI) Founded in 1824 by Stephen Van Rensselaer “We are the first degree granting technological university in the Englishspeaking world” Research University Total Students 9,145 Graduates – 4,006 Undergraduates – 5,139 Faculty - 450 Facts about Presenter Bachelor in Math and Computer Science from Tel-Aviv University, Israel MBA from RPI 23 years in system implementation and software development in variety of technical and management positions Over 8 years in RPI Involved with the Data Warehouse for the past five – six years Fundamental Problem Operational systems are not designed for information retrieval and analytical processing First attempt to fund DW project Fails Reasons for failing funding in Fall of 1997 : Timing Expectation Lucking business sponsor Analytical culture What does it really means?? Second Attempt to Fund DW Project is Successful Spring 2000 - The following changes had occurred: Timing – Banner does not addresses reporting; Views are too slow to be used Organizational changes (New President, CIO) Performance Planning We have build a Prototype Buy-in Process Demonstrate to those who need this information desperately The word is out From the CIO to the committees to the cabinet Buy-in Process We are prepared to address: Budgets Timelines We are ready with the white paper to communicate the key components (iterative development under overall planning, business users involvement, meta data, approaches) The Fundamental Goal The fundamental goal of the Rensselaer Data Warehouse Project is to integrate administrative data into a consistent information resource that supports planning, forecasting, and decision-making processes at Rensselaer. Development methodology Phase I – Building Foundation Phase II – Iterative Process of Building Subject Oriented Data Marts On going Operations: Support and Training; Maintenance and Growth Rolling Implementation FY02 Infrastructure Planning/Staffing Software Database/Hardware Production Platform Policy Data Policy Datamarts Finance/Research Position Cntrl/Labor Human Resources Enrollment Grad Financial Aid Undergrad Fin Aid Contracts & Grants Admissions Pipeline Operations Support Software Upgrade Database Upgrade Hardware Growth Req FY03 Dev & Test FY04 Rollout FY05 Phase I – Building Foundation Organizational Structure Project framework and high level plan Building Technical Infrastructure Develop Data Policies and Procedures Hiring Project Organizational Structure Sponsorship Group Progress report Steering Committee Forming Implementation groups; Defining scope and deliverables Implementation issues Implementation groups Data Warehouse Group Bus ine s s Inte llige nce Se le ction Com m itte e Financial Analys is Im ple m e ntation Group Financial Analys is Re vie w e r Group Data Policy Gr oup Project Framework High Level Analysis Prioritization process Hire and train staff, Choose consultant Establish communication channels (web site, newsletters, kickoff event…) High Level Analysis and Prioritization process X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X Enro ll me nt M gm t Stud e nt L ife Ins ti tute Ad v. Prov o st O ffice Rese arch Fin a nce Gov ' t Re la tio ns Pres i den ts St aff HR BUSINESS PROCESSES Enrollment Analysis Student Pipeline Analysis Faculty Workload Assessment Financial Analysis Contract and Grants Analysis Proposal Pipeline Analysis Financial Analysis - Research Graduate Financial Aid Alumni Demographics and Tracking Alumni Contact Management Human Resources Facilities Management Sch o ols/ D e ans Dep' t Ch ai rs Regi strar CONSTITUENCIES X X X X X X Prioritization Process High PP AC FR FA SP CG HR Value to Rennselaer Low FW EA AD GF Feasibility FM High Building Technical Architecture DATA SOURCES transactional systems banner reporting instance (clone) AIX Banner Reporting Instance DATA ACQUISITION • extraction • transformation • modeling • loading DATA WAREHOUSE • central repository • subject-based data marts • metadata • conformed dimensions production data warehouse machine AIX 4-CPU 4GB RAM Oracle 9i DWDB (targets) Informatica PowerCenter ETL Server DATA DELIVERY • user-facing applications • report generation • subject-based data cubes • data mining Brio Portal AIX Brio ODS/JF Node DATA CONSUMPTION • business intelligence • decision-support • OLAP • querying • reporting end-user machines Microsoft Excel analysts Oracle 8i [DWDB] Brio Portal Informatica Repository Brio Insight metadata analysts Banner Oracle 9i [PROD] banner production AIX Brio Shared Metadata Brio WebClient webserver AIX Dash Boards content Viewers Technical Architecture Inventory ERP – Banner from SCT ETL – Power Center from Informatica Data Base – Oracle 8i Models – Star schemas with conformed dimensions Web Front end tools – Brio, Dash Boards Desktop Front End tools – Brio, Excel Data Security, Privacy and Access Policy Security & Privacy Access & Use Can be defined as striking the “right” balance between data security/privacy and data access Value of data is increased through widespread access and appropriate use, however, value is severely compromised by misinterpretation, misuse, or abuse This policy considers security and privacy paramount Key oversight principle: Cabinet members, as individuals, are responsible for overseeing establishment of data management policies, procedures, and accountability for data governed within their portfolio(s), subject to cabinet review and CIO approval Phase II Components of Building Subject Oriented Data Marts Defining Scope and Timelines Modeling Development Record Metadata Local Testing Core Administration Testing Design and Develop Security Core Administration live in Production Front-End development for the campus Campus Rollout JAD & RAD Approach Defining Scope Identify Constituency Detailed Requirements Definitions Analyze Data Sources / raise issues Define Scope Acceptance/Project Review Develop and approve specific security policy Modeling Subject-based data marts Star Schemas Conformed dimension Graduate Financial Aid Data Model One row per student per term per support type Graduate Support Snapshot ACADEMIC TERM DIM Academic Term Key Student Dimension Student Key Fund Dimension Fund Key Organization Dimension Org Key Student Cohort Bridge Student Cohort Key Cohort Key Student Faculty Advisor Bridge Student Faculty Advisor Key Faculty Advisor Key Graduate Support Snapshot Academic Term Key Prim Program Major Grp Key Sec Program Major Grp Key Student key Student Cohort Key Class Key Fund Key Org Key Account Key Program Key Activity Key Grant Key Student Faculty Advisor Key Account Dimension Account Key Program Dimension Program Key Class Dimension Class Key Tuition Assistance Amt Tuition Assistance Fees Amt Tuition Assistance Disb. Amt Tuition Assistance Fees Disb. Amt Tuition Assistance Expensed Amt Degree Completion Amt Degree Completion Disb. Amt Stipend Amt Activity Dimension Activity Key GFA Support Type Dimension GFA Support Type Key Grant Dimension Grant Key Academic Degree Bridge Dimension Student Degree Key Student Enrollment Model – one row per enrolled student per term Student Enrollment Snapshot Academic Term Dimension Academic Term Key Student Dimension Student Key Student Enrollment Snapshot Academic Program Bridge Dimension Student Academic Program Key Academic Term Key Prim Program Major Grp Key Sec Program major Grp Key Student key Student Cohort Key Class Key Student Faculty Advisor Key Class Dimension Class Key Student Cohort Bridge Student count Matriculated count Credit Hours Registered Credit Hours Attempted Credit Hours Earned Overall GPA Term GPA Tuition Amt Charged Tuition Fees Charged Tuition Amount Billed Tuition Fees Billed Student Faculty Advisor Bridge Student Faculty Advisor Key Faculty Advisor Key Student Cohort Key Cohort Key Summary GFA Model one row per graduate student per term Graduate Student Count Enrollment Snapshot Academic Term Dimension Academic Term Key Student Dimension Student Key Student Enrollment Snapshot Primary Program Dimension Student Academic Program Key Academic Term Key Prim Program Major Grp Key Sec Program major Grp Key Student key Student Cohort Key Class Key Student Faculty Advisor Key GFA STATUS GFA Key Primary Funding Source Student count IRA count ERA count TA count Fellowship count Scholarship count Self supported count Cumulative terms enrolled Cumulative terms affiliated Current Tuition Amt Charged Current GFA Stipend Amt Student Faculty Advisor Bridge Student Faculty Advisor Key Faculty Advisor Key Funding Key Development - ETL Data Staging Design and Development Design & Develop Aggregation Process Develop Data Quality Assurance Processes User testing testing and testing ….. Note: the Data Warehouse serves the needs for ad-hoc analysis and reporting of various groups of users Testers are: Deans, Cabinet, Financial Managers, Core Administration offices… Testing period is an opportunity to create more definitions, groupings, and transformations… Prior To User Acceptance Testing Identify Testing Candidates - key users identified in the scope Train Users in Brio Transfer of Knowledge from Developer to Testing Group Sample Reports Document Data Mart Description Document Standard Naming Conventions Document Common Uses for Each Star Schema User Set-Up Testing sessions Allocating time slots Targeting – aiming to produce results Verifying that the models do address the need Great opportunity to bridge diverse groups Defect/Enhancement Log Date Reported Priority Level (i.e. High, Medium, Low) Defect and/or Enhancement Description User Reporting Defect and/or Enhancement Defect/Enhancement Status Incoming Pending Work In Progress User Acceptance Testing Closed Focus Group Assigned To Resolution User Assigned To Test Resolution Recording Metadata User driven effort Stored in Informatica repository Accessed via Brio Development - Securing Data Marts Ensure that the subject oriented Data Policy is defined Technically feasible Approved Build Security Front End application Data Security options Securing schemas Securing facts only Securing dimensions only Securing both facts and dimensions Nuts and Bolts of the Data Base Security Data Base security applies to all individuals given either direct access to the warehoused data or given permissions to process Brio dynamic reports Organization Managers And Financial Managers will have access to the warehoused financial data based on the following criteria All financials posted against that Org All funds listing that Org as a home Org (in cases of research funds, this defines where the research is brought into) All funds listing the PIs (or the Financial Manager) associated with that Org as fund financial managers. (Resolves the Multi-disciplinary issue) All funds and orgs listing that Org as a predecessor in either one of the above three cases. Administrative role: Individuals might be granted access to additional funds and org based on their needs and their role within Rensselaer. Position Control and Labor Data Policy Overview Already have access to Labor data in Banner Completed DW training Access to Budgets and Labor data for all Funding, Employees, or Positions owned by their Organization as following: Funding: All actual and budgeted labor expenses posted against their Organization Employees: All actual and budgeted labor expenses associated with the Employees reporting to their Organization within the timeframes of the employees’ employment in the Organization. Positions: All actual and budgeted labor expenses associated with Positions owned by their Organizations. All of the above within their Organizations’ hierarchy. HR Security Policy Overview Access to Employee Information Level 1 (Fund Fin. Mgr) Level 2 (Department/ Division or School Manager) Level 3 (HR) Name Yes Yes Yes RIN Yes Yes Address No Yes Yes Yes Age No No Yes Ethnicity No No Yes Marital Status No No Yes Gender No No Yes No Yes No Yes Yes No Yes Yes Employee Effective Dates and Status No Yes History: Name Changes Yes Yes Yes History: RIN Changes Yes Yes Yes History: Demographic Changes No No Yes Benefits Eligibility Categories No Yes Yes Hire Date(s) No Yes Yes T ermination Data Employee T ype (Full/Part T ime, FLSA Status, Employee Category) No Yes Yes No Yes Yes Portfolio and Department Data (Current) Yes Yes Yes Portfolio and Department Data (History) No No Yes Category Citizenship Deceased Veteran Status Yes Yes Enrollment and Graduate Financial Aid Data Policy Overview Access to aggregate data is based on “need to know” Access to student identifiable information is restricted as following: Sponsoring graduate students Major Advisement Central administration/management of the University Access to Undergraduate Financial Aid Restricted to very few positions within: President Office Institutional Research Students Records and Financial Services Financial Aid Office Development – Front End Dash Board Design and Development – Joint effort with Core Administration Training testing groups in Brio Develop first version of Brio dynamic documents and publish via Portal – Joint effort with Core Administration Campus Rollout Defining roles and responsibilities Who will have initial access to what Develop Roll out strategy Setting expectations Designing and carrying out Training Programs Communicate Executive briefings During Training Campus orientations Wed site Any possible vehicle …. Initial Tiered Access – Who will have access to what Cabinet; Deans; Department Chairs; Center Directors Dash Board Department Financial Managers Information published In Brio documents Finance Administration Portfolio Financial Managers Data in the Warehouse Brio Products Overview (Brio Intelligence) Brio Desktop User (i.e., Brio Explorer, or Designer) Data Warehouse Connects to the DW without Portal Brio Portal Portal and Insight are also available to Desktop users via the Web Finance Data Mart Position Control Data Mart Graduate Financial Aid Student Enrollment Brio Insight User Web Connects to the DW with Insight and Portal via the Web Folders, Published Documents, Personalized Content, Dashboards, ERD Brio Insight Other Data Marts Each user will have separate Portal and database usernames and passwords. The Portal login provides the user with access to published content based on a security profile. The database login is necessary to extract data from the Data Warehouse. Brio Portal Allows users to access published documents (e.g., BQYs, Brio manuals, training documents) and personalize their content Executive Dashboard Overview Accessed via the Portal High-level, graphical views of Portfolio-specific data Designed primarily for executive use, though available to other users as well Comprised of monthly summary data, refreshed nightly Dashboard Help: http://www.rpi.edu/datawarehouse/dw-help-dashboards.html Campus Rollout Assumptions Training is mandatory at all levels. Several levels of training will be offered to campu in Brio tools, Data, and Data Policies. Joint effort between DW Group and Core Administration Portfolio Financial Managers responsibilities: Rollout within Portfolio Training within Portfolio Data Warehouse Cascaded Rollout Strategy 1. Core Administration 2. Portfolio Level (Cabinet, Deans, Portfolio Managers) 3. Department Level (Directors, Center Directors, Department Chairs, Department Financial Managers) 4. Other Training Methodology Training Required High Track 1 Primarily Portfolio Financial Managers who will build adhoc queries and reports (i.e., Brio documents) from data mart star schemas and meta topics. Brio 101 Level 1: Data Mart Basics Track 2 Medium Department Financial Managers who will work primarily with pre-built Brio documents. Brio 101 Level 1: Portfolio/Dept-Specific PreBuilt Docs Track 3 Low Designed for Executive users, this track focuses on Dashboards and the Brio Portal. Dashboard & Portal training One-on-one or small group format Level 2: Advanced Brio Documents Setting and Communicating Expectations Communicate to Institute Executives Creating an Information Revolution Changing culture Top down approach is needed Recognize Barriers Ask for commitment Recognizing Barriers People’s resistance to a new tool Expectations on information availability and usability for decision making are low Habit of relying on Central Administration to provide information, or on their own sources (many versions of the ‘truth’) People will need to acquire new job skills Job expectations will need to change How to get there …. Common Vision: One version of the truth Data Experts across campus and across organizational boundaries Data Experts: Portfolio Financial Managers or Equivalents will be expected to: access data create reports perform analysis enable/train Portfolio end-users Training approach Evaluating skill levels: Surveys before training Measuring satisfaction with training program: Overall satisfaction with program content is very high: 91% gave the highest survey rating. Partnering with HR – The DW training was included in the appropriate Performance Evaluations / Job Descriptions and course offerings Measuring access levels – generating log files Utilizing Web – self help, registration, communication Status of the Data Warehouse Initiative Development Data Access Policy Operations Campus Rollout DW Program Timeline FY02 Infrastructure Planning/Staffing Software Database/Hardware Production Platform Policy Data Policy Datamarts Finance/Research Position Cntrl/Labor Human Resources Enrollment Grad Financial Aid Undergrad Fin Aid Contracts & Grants Admissions Pipeline Operations Support Software Upgrade Database Upgrade Hardware Growth Req FY03 Dev & Test FY04 Rollout FY05 Data Warehouse Operations Support Transitioning from Development to Operations Portal Administration Dash Board maintenance Data Marts maintenance Users support Data Base Administration Brio documents development, support, and administration Informatica Administration On – Going Training Functional Training Brio training Refreshers courses: Finance/Research, Labor, HR, Enrollment, GFA, etc. Advance curses DW User Support levels Signing up for sessions Creating user profiles/security Installs Publishing requests General problems/questions 5-10 Emails Daily 2-4 Calls Daily 1-2 Major problems that need extensive work from developers/front-end technical support on a daily basis. Data Policy Administration Each Data Policy is administered by the appropriate Committee appointed at the VP level Requests outside the policy are submitted in writing to the Data Warehouse group The Committee has the discretion to either authorize/deny access or recommend access to the appropriate VP depending on the nature of the request. The respective portfolio owner are notified of access granted. Maintenance DW Maintenance - DM Review Month in Production Role 1-3 4-6 7+ Business Staff 10% 10% 10% Power user 60% 45% 20% Data Warehouse Administrator 75% 50% 30% OLAP/Reporting Tool Administrator/Developer 80% 40% 30% Data modeler 20% 10% 5% ETL Specialist 75% 50% 25% Development DBA 50% 25% 10% Operating System Administrator 20% 10% 5% Operations 25% 25% 10% Production DBA 20% 10% 10% DM Review 2003 Resource Guide RPI Resources Role Business Staff Power user IT Group Data Warehouse Management OLAP/Reporting Tool Administrator/Developer 1 0.5 Data modeler 0.25 ETL Specialist 3.5 DBA 0.5 Operating System Administrator Operations (Desktop, Security set-up, etc.) Training Customer Support Total 0.25 0.5 1 0.5 8 Benefits Gained Empowers decision-makers Redirects costly personnel hours Enhances institutional effectiveness Improves integrity and conformity of campus-wide information Promotes the “no walls” culture. Improves data quality over time. Kirsten M. Volpi, Assistant VP/Controller “...There has been analysis that we have not been able to get at before because the data was not retrievable in a fashion conducive to perform analytics. For instance, we have begun utilizing the warehouse to analyze the indirect cost yield on our research grants. This data was not readily available before.” “We are also using the warehouse not only for analytics but for reports to assist with monitoring compliance with internal policies, assisting with data gathering for external surveys, as well as assisting with automating certain processes (encumbrances for graduate financial aid).” Eileen G. McLoughlin, Director of Financial Planning & Budget “The Budget material was consolidated two weeks sooner than the previous years. Many factors contributed to the success, however a significant contributor was the data warehouse allowing the Budget Office to provide data and analysis of the data to decision makers faster than in the past.” “…reinforces the “no walls” culture – i.e. as the warehouse becomes known as the one and only data source – this will contribute towards individuals recognizing that we are one organization with one version of the truth.“ “…Improved quality over time, integrity, conformity – as data is viewed and questioned issues have and will come to the surface on processes that impact data. This has occurred in the budget office, accounting practices have been simplified so the resultant data is more easily interpreted” Diane Veros, Director Research Accounting “The Data warehouse along with the BRIO software has proven to be an extremely useful tool for providing information for reporting, monitoring and analysis. BRIO queries and pivot tables have definitely helped to make some of our work more efficient and effective. We have developed queries for monitoring reports, verifying data integrity, and analysis that before would have required days, weeks, or even months working with IACS to program and develop. Once developed, those older reports (and/or the data in them) would have allowed limited access to campus, and another user might have started from scratch to produce a similar report. The data warehouse provides a consistent data stream that allows all campus users to view and analyze the same information in many alternative ways.” Jeff Tanis, Manager of Financial Operations School of Science “The time it has taken me to gather information has been cut by at least half. I now query the warehouse--where previously I had to initiate many e-mails and phone calls to collect what I needed. Last month while doing a research expenditure analysis, it took me a matter of hours--where in the past it took days to get what I needed.” “While doing a research expenditure analysis last month I identified a substantial amount of research expenditures on other schools grants using School of Science Orgs. I could not have identified and subsequently corrected these errors without the use of the Data Warehouse.” Helen Grzymala, Associate Director Budget “As we roll the Finance Data Mart out to all Portfolio Financial Managers, the Budget Office will be providing more and more reports via the Data Warehouse. Portfolios will be able to see the various reports that are prepared on an Institutional level for the data. We will be able to have ongoing, meaningful discussions about the data, rather than how to get the data and how to manipulate it.” “The Data Warehouse will result in a change in job expectations for both the Budget Office and the Portfolio Financial Managers. The forecast and budget process will evolve to a more analytical review of history and a fact-based projection of the future. Users will move from simply ‘crunching the numbers’ because they will have more time and because more data is actually available. Once the Contracts and Grants information is available, the research units will be able to track activity right from the pre-proposal stage thru the award close out. Using this data, trending and other analysis will follow, leading to more accurate forecasts and budgets.” Benefits User Name Task Performed Pre Data Warehouse Implementation Post Data Warehouse Implementation Sandra Redemann Butcher Portfolio YTD Analysis Half Day to retrieve and compile information manually. Seconds to retrieve information from the Data Warehouse. Gina Ricci Report Analysis Multiple Truths existed across campus. Multiple information sources existed, which destroyed data integrity and conformity. One Truth exists. One information source promotes a common understanding of the data and allows users to derive at the same conclusions. Tanya Struzinsky Available Balance Report 2 Weeks to retrieve and compile information manually. 30 Minutes to build the report in the Data Warehouse, which can be refreshed daily in Seconds. Donna Tomlinson Org 3 Year Comparison By Account Group Not Readily Available Readily Available on demand. Benefits User Name Task Performed Pre Data Warehouse Implementation Post Data Warehouse Implementation Jeff Tanis Research Expenditure Analysis Multiple Days to retrieve data from multiple sources and compile information manually. Few Hours to retrieve information from the Data Warehouse. Diane Veros Data Integrity Verification Days, Weeks or Months to develop reports to ensure data integrity or to perform analysis. Few Hours to develop reports in the Data Warehouse to ensure data integrity or to perform analysis. Tanya Struzinsky Credit Card Transaction Reconciliation 2-3 Hours to compile, review and verify credit card transactions for each user. 5 Minutes to retrieve, review and verify credit card transactions for each user. Sandra Redemann Butcher Month-End Report 2 Days to retrieve and verify program and activity codes in order to ensure accurate results. 2 Hours to retrieve information from the Data Warehouse. No data manipulation required. Program metrics Web access only (not including desktop or Dash Boards users) Timeframes: January 27 – July 31 Financial Analysis Web Access Lessons Learned Picture worth thousand words – prototype Funding (time, resources, and dollars) Business Sponsorship – find the Champion and promote them Properly designed Organizational Structure helps to navigate political obstacles Partnership with the Business users – build it alone and they will never come Identify your Business ‘Stars’ as early as possible JAD and RAD approaches are best fitted for the iterative DW development Dash Boards – unless it is visible it is not there Building Data Warehouse is far more than a technical endeavor it is all about changing the culture Questions ??? Ora Fish [email protected] ?