Statistics New Zealand's Case Study "Creating a New Business Model for a National Statistical Office of the 21st Century" Craig Mitchell, Gary Dunnet, Matjaz Jug
Overview
• Introduction: organisation, programme, strategy
• The Statistical Metadata Systems and the Statistical Cycle: description of the metainformation systems, overview of the process model, description of different metadata groups
• Statistical Metadata in each phase of the Statistical Cycle: metadata produced & used
• Systems and design issues: IT architecture, tools, standards
• Organisational and cultural issues: user groups
• Lessons learned

[Organisation chart, last updated 20/06/07: Government Statistician Geoff Bascand and Deputy Government Statisticians heading groups for Macro-Economic, Environment, Regional & Geography Statistics; Social & Population Statistics; Industry & Labour Statistics; Business & Dissemination Services and the Chief Information Officer; Statistical & Methodological Services; Statistical Education & Research; Strategy & Communication; Corporate Services and the Maori Statistics Unit; Census 2011; and the Auckland and Christchurch offices.]
Business Model Transformation Strategy
1. A number of standard, generic end-to-end processes for the collection, analysis and dissemination of statistical data and information
   – Includes statistical methods
   – Covering the business process life-cycle
   – To enable statisticians to focus on data quality, and to deliver best-practice methods, greater coordination and effective resource utilisation
2. A disciplined approach to data and metadata management, using a standard information lifecycle
3. An agreed enterprise-wide technical architecture

BmTS & Metadata
The Business Model Transformation Strategy (BmTS) is designing a metadata management strategy that ensures metadata:
– fits into a metadata framework that can adequately describe all of Statistics New Zealand's data and, under the Official Statistics Strategy (OSS), the data of other agencies
– documents all the stages of the statistical life cycle, from conception to archiving and destruction
– is centrally accessible
– is automatically populated during the business process, wherever possible
– is used to drive the business process
– is easily accessible by all potential users
– is populated and maintained by data creators
– is managed centrally

A – Existing Metadata Issues
• metadata is not kept up to date
• metadata maintenance is considered a low priority
• metadata is not held in a consistent way
• relevant information is unavailable
• there is confusion about what metadata needs to be stored
• the existing metadata infrastructure is under-utilised
• the metadata needs of advanced data users are not being met
• it is difficult to find information unless you have some expertise or know it exists
• there is inconsistent use of classifications/terminology
• in some instances there is little information about data: where it came from, the processes it has been through, or even the question to which it relates

B – Target Metadata Principles
• metadata is centrally accessible
• metadata structure should be strongly linked to data
• metadata is shared between data sets
• content structure conforms to standards
• metadata is managed end-to-end in the data life cycle
• there is a registration process (workflow) associated with each metadata element
• capture metadata at source, automatically
• ensure the cost to producers is justified by the benefit to users
• metadata is considered active
• metadata is managed at as high a level as possible
• metadata is readily available and usable in the context of the client's information needs (internal or external)
• track the use of some types of metadata (e.g. classifications)

How to get from A to B?
1. Identified the key (10) components of our information model.
2. Service Oriented Architecture.
3. Developed a Generic Business Process Model.
4. Development approach moved from 'stove-pipes' to 'components' and 'core' teams.
5. Governance: Architectural Reviews & Staged Funding Model.
6. Re-use of components.

10 Components within BmTS
1. Input Data Store (raw and clean data, fed by multi-modal collection: e-form, CAI, imaging, admin. data)
2. Output Data Store ('UR' data, summary data, aggregate data)
3. Metadata Store (statistical process knowledge base)
4. Analytical Environment (RADL, CURFS)
5. Information Portal (output channels: web, INFOS, Official Statistics System & Data Archive)
6. Transformations
7. Respondent Management
8. Customer Management
9. Reference Data Stores
10. Dashboard / Workflow

Statistics New Zealand Current Information Framework
[Diagram: the generic business process (Need, Design/Build, Collect, Process, Analyse, Disseminate) runs over a range of information stores by subject area (silos: QMS, Ag, HES etc.), a Time Series Store (& INFOS), ICS Store, Web Store, Metadata Store (statistical, e.g. SIM), Reference Data Store (e.g. BF, CARS), Software Register, Document Register, and Management Information (HR & Finance) data stores.]

Statistics New Zealand Future Information Framework
[Diagram: the same generic business process runs over a single Input Data Store (raw, clean and summary data) and an Output Data Store (a confidentialised copy of the IDS, physically separated; TS, ICS, Web), with a combined Metadata Store (statistical/process/knowledge), Reference Data Store, Software Register, Document Register, and Management Information (HR & Finance) data stores.]

CMF – gBPM Mapping
CMF Lifecycle Model → Statistics NZ gBPM (sub-process level)
1 - Survey planning and design → Need (sub-processes 1.1-1.5) + Develop & Design (sub-processes 2.1-2.6)
2 - Survey preparation → Build (sub-processes 3.1-3.7) + Collect (sub-process 4.1)
3 - Data collection → Collect (sub-processes 4.2-4.4)
4 - Input processing → Collect (sub-process 4.5) + Process (sub-processes 5.1-5.3)
5 - Derivation, Estimation, Aggregation → Process (sub-processes 5.4-5.7)
6 - Analysis → Analyse (sub-processes 6.1-6.6)
7 - Dissemination → Disseminate (sub-processes 7.1-7.5)
8 - Post-survey evaluation → Not an explicit process, but seen as a vital feedback loop
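The mapping above is itself a piece of metadata, and can be held as lookup data so that tools reporting against either model can translate between them. A minimal illustrative sketch in Python follows; the phase names come from the table, but the data structure and function are assumptions for illustration, not a Statistics NZ system:

```python
# Illustrative lookup from CMF lifecycle phases to gBPM processes and
# sub-process ranges. Phase and process names follow the mapping table;
# the structure itself is hypothetical.
CMF_TO_GBPM = {
    "survey planning and design": [("Need", "1.1-1.5"), ("Develop & Design", "2.1-2.6")],
    "survey preparation": [("Build", "3.1-3.7"), ("Collect", "4.1")],
    "data collection": [("Collect", "4.2-4.4")],
    "input processing": [("Collect", "4.5"), ("Process", "5.1-5.3")],
    "derivation, estimation, aggregation": [("Process", "5.4-5.7")],
    "analysis": [("Analyse", "6.1-6.6")],
    "dissemination": [("Disseminate", "7.1-7.5")],
}

def gbpm_processes(cmf_phase: str) -> list:
    """Return the gBPM process names that cover a CMF lifecycle phase."""
    return [proc for proc, _ in CMF_TO_GBPM[cmf_phase.lower()]]

print(gbpm_processes("Input processing"))  # ['Collect', 'Process']
```

Holding the mapping as data rather than code mirrors the BmTS principle that metadata should drive the business process.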
Metadata: End-to-End
Need
– capture requirements, e.g. usage of data, quality requirements
– access existing data element concept definitions to clarify requirements
Design
– capture constraints and basic dissemination plans, e.g. products
– capture design parameters that could be used to drive automated processes, e.g. stratification
– capture descriptive metadata about the collection: methodologies used
– reuse or create required data definitions, questions, classifications
Build
– capture operational metadata about the selection process, e.g. number in each stratum
– access design metadata to drive the selection process
Collect
– capture metadata about the process
– access procedural metadata about rules used to drive processes
– capture metadata, e.g. quality metrics

Metadata: End-to-End (2)
Process
– capture metadata about the operation of processes
– access procedural metadata, e.g. edit parameters
– create and/or reuse derivation definitions and imputation parameters
Analyse
– capture metadata, e.g. quality measures
– access design parameters to drive estimation processes
– capture information about quality assurance and sign-off of products
– access definitional metadata to be used in the creation of products
Disseminate
– capture operational metadata
– access procedural metadata about customers
– needed to support Search, Acquire, Analyse (incl. integrate), Report
– capture re-use requirements, including importance of data: fitness for purpose
– Archive or Destruction: detail on the length of the data life cycle

Metadata: End-to-End – Worked Example
Question text: "Are you employed?"
Need
– Concept discussed with users
– Check international standards
– Assess existing collections & questions
Design
– Design question text, answers & methodologies
– Align with output variables (e.g. ILO classifications)
– Data model, supported through a meta-model
– Develop Business Process Model: process & data/metadata flows
Build
– Concept Library: questions, answers & methods
– 'Plug & Play' methods, with parameters (metadata) the key
– System of linkages (no hard-coding)

Metadata: End-to-End – Worked Example (2)
Question text: "Do you live in Wellington?"
Collect
– Question, answers & methods rendered to questionnaire
– Deliver the question to respondents
– Confirm quality of concept
Process
– Draw questions, answers & methods from the meta-store
– Business logic drawn from a 'rules engine'
Analyse
– Deliver question text, answers & methods to the analyst
– Search & discover data, through metadata
– Access the knowledge base (metadata)
Disseminate
– Deliver question text, answers & methods to the user
– Archive question text, answers & methods

Conceptual View of Metadata
Anything related to data, but not dependent on data, is metadata. There are four types of metadata in the model: Conceptual (including contextual), Operational, Quality and Physical, as defined by MetaNet.

Implementation: Dimensional Model
Metadata dimensions — standard classifications, standard variables, standard questions, survey instruments, survey mode, and standard data definitions — all link to a central FACT table.

Architecture
[Diagram: users access facts through an Information Portal; a service layer connects reference data (classifications) and metadata to the Input Data Environment.]

[Entity-relationship diagram of the IDE/MetaStore (version 2.0.06): a central fact table carries foreign keys to its metadata dimensions — questions and answer parts, variables, fact definitions and their classifications, versioning and life-cycle status, time periods, collections and instruments, responses and supplying units, units of interest, strata and weights — plus an operational and exceptions area mirroring the fact table.]
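The core idea of the fact/dimension design — every stored value carries direct references to the metadata that defines it — can be sketched with a toy relational schema. This is a simplified illustration using SQLite; the table and column names (question, fact_definition, fact) echo the diagram but are invented for this example, and the real IDE schema is far richer:

```python
import sqlite3

# Toy star schema: a statistical fact references its metadata dimensions
# directly, so every value can be traced back to the question and concept
# definition that produced it. Names are illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE question (q_key INTEGER PRIMARY KEY, question_text TEXT);
CREATE TABLE fact_definition (fd_key INTEGER PRIMARY KEY, desc_text TEXT);
CREATE TABLE fact (
    f_key INTEGER PRIMARY KEY,
    q_key INTEGER REFERENCES question(q_key),
    fd_key INTEGER REFERENCES fact_definition(fd_key),
    fact_value TEXT
);
""")
con.execute("INSERT INTO question VALUES (1, 'Do you live in Wellington?')")
con.execute("INSERT INTO fact_definition VALUES (1, 'Person lives in Wellington')")
con.execute("INSERT INTO fact VALUES (1, 1, 1, 'yes')")

# A value is never orphaned: joining through the dimension keys recovers
# the question text and the concept definition alongside the data itself.
row = con.execute("""
    SELECT q.question_text, fd.desc_text, f.fact_value
    FROM fact f
    JOIN question q ON q.q_key = f.q_key
    JOIN fact_definition fd ON fd.fd_key = f.fd_key
""").fetchone()
print(row)  # ('Do you live in Wellington?', 'Person lives in Wellington', 'yes')
```

This "system of linkages, no hard-coding" is what lets the same question, answer and method definitions be drawn from the meta-store at collect, process, analyse and disseminate time.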
Goal: Overall Metadata Environment
[Diagram: search & discovery and metadata/data access services sit over the data and the passive metadata store(s), with business logic and management facilities for classifications, a question library, data definitions, frames/reference stores and schemas.]

Metadata: Recent Practical Experiences
• Generic data model: federated cluster design
  – Metadata the key
  – Corporately agreed dimensions
  – Data is integrateable, rather than integrated
• Blaise to Input Data Environment
  – Exporting Blaise metadata
• 'Rules Engine'
  – Based around a spreadsheet
  – Working with a workflow engine to improve it (BPM-based)
• IDE metadata tool
  – Currently spreadsheet-based
• Audience model
  – Public, professional, technical; 'system' added

SOA
[Diagram: a service layer (message and data bus) connects channel interfaces (internet, intranet, extranet, web services), application services (analytics, execution engine, transaction management, process management/workflow, directory services, resource management, queuing, load management, scheduling), business rules (rules engine, transformations) and support functions (security, application admin, system monitoring), and, via adapters, respondent management CRM, customer management CRM, call centre, SAS, ETL tools, SQL Server, Blaise, BI cubes and data warehouse databases.]

Standards & Models – The MetaNet Reference Model™
A two-level model based on:
• Concepts = basic ideas, the core of the model
• Characteristics = elements and attributes that make concepts unique
– Terms and descriptions can be adapted; Concepts must stay the same
– Concepts should be distinct and consistent
– Concepts have hierarchy and relationships

[Diagram: a Collection (e.g. Census, frequency = 5-yearly) has Collection Instances (e.g. Census 2006), delivered through Questionnaires A and B, each carrying questions such as "Do you live in Wellington?" and "What is your age?" / "How old are you?"]
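The two-level Concept/Characteristic idea behind the Census example can be sketched as plain data structures: the concept hierarchy (Collection → Collection Instance → Question) stays fixed, while characteristics such as name and frequency distinguish individual instances. The class and field names below are assumptions for illustration; MetaNet defines the concepts, not an implementation:

```python
from dataclasses import dataclass, field

# Illustrative sketch of a MetaNet-style concept hierarchy.
# Class and field names are invented for this example.

@dataclass
class Question:
    text: str

@dataclass
class CollectionInstance:          # e.g. Census 2006
    name: str
    questions: list = field(default_factory=list)

@dataclass
class Collection:                  # e.g. Census, frequency = 5-yearly
    name: str
    frequency: str
    instances: list = field(default_factory=list)

census = Collection("Census", "5-yearly")
c2006 = CollectionInstance("Census 2006")
c2006.questions.append(Question("Do you live in Wellington?"))
c2006.questions.append(Question("What is your age?"))
census.instances.append(c2006)

# Characteristics (name, frequency) make concepts unique; the hierarchy
# and relationships between the concepts themselves do not change.
```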
[…each question then maps to a fact definition (e.g. "Person lives in Wellington", "Age of person") together with its classifications (e.g. Question: CITY, category WGTN; NZ Island, category NTH ISL) — the "Defining Metadata Concepts: Example" slide.]

How will we use MetaNet?
1. Use it to guide the development of a Stats NZ model.
2. Another model (SDMX) will be used for additional support in the gaps.
3. It provides the base for consistency across systems and frameworks.
4. It will allow for better use and understanding of data.
5. It will highlight duplications and gaps in current storage.

Metainformation systems
[Diagram: a concept-based model links SIM (data collections, variables, sample design), CARS (classifications, categories, concordance) and the IDE (domain values, facts, classifications, responses, collections) through statistical units; other metadata is stored in the Business Frame, survey systems, BmTS components, etc.]

Metadata Users – External
• Government
• Public
• External statisticians (incl. international organisations)

Metadata Users – Internal
– Statistical Analysts
– IT Personnel (business analysts, IT designers & technical leads, developers, testers etc.)
– Management
– Data Managers / Custodians / Archivists
– Statistical Methodologists
– External Statisticians (researchers etc.)
– Architects: data, process & application
– Respondent Liaison
– Survey Developers
– Metadata and Interoperability Experts
– Project Managers & Teams
– IT Management
– Product Development and Publishing
– Information Customer Services

Lessons Learnt – Metadata Concepts
• Apart from the 'basic' principles, metadata principles are quite difficult to get a good understanding of, and this makes communicating them even harder.
• Everyone has a view on what metadata they need: the list of metadata requirements/elements can be endless.
• Given the breadth of metadata, an incremental approach to the delivery of storage facilities is fundamental.
• Establish a metadata framework, upon which discussions can be based, that best fits your organisation. We have agreed on MetaNet, supplemented with SDMX.

Lessons Learnt – BPM
• To make data re-use a reality there is a need to go back to first principles, i.e. what is the concept behind the data item? Surprisingly, it might be difficult for some subject-matter areas to identify these first principles easily, particularly if the collection has been in existence for some time.
• Be prepared for survey-specific requirements: the BPM exercise is absolutely needed to define the common processes and to identify the survey-specific features that may still be required.

Lessons Learnt – Implementation
• Without significant governance it is very easy to start with a generic service concept and yet still deliver a silo solution. Ongoing upgrade of all generic services is needed to avoid this.
• Expecting delivery of generic services from input- or output-specific projects leads to significant tensions, particularly in relation to added scope elements within fixed resource schedules.
• Delivering business services at the same time as developing and delivering the underlying architecture services adds significant complexity to implementation.

Lessons Learnt – Implementation (2)
• A well-defined relationship between data and metadata is very important. The approach of directly connecting a data element, defined as a statistical fact, to metadata dimensions proved successful because we were able to test and utilise the concept before the (costly) development of metadata management systems.

Lessons Learnt – SOA
• The adoption and implementation of SOA as a Statistical Information Architecture requires a significant mind-shift: from data processing to enabling enterprise business processes through the delivery of enterprise services.
• Skilled resources, familiar with SOA concepts and their application, are very difficult to recruit, and equally difficult to grow.
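Several slides above describe driving processes from procedural metadata — a 'rules engine' currently based around a spreadsheet, supplying edit parameters at the Process stage. The idea of holding edit rules as data rather than code can be sketched as follows; the rule format, field names and thresholds are invented for illustration:

```python
# Edit rules held as data, as a spreadsheet or metadata store would hold
# them, rather than hard-coded. All names and bounds are hypothetical.
EDIT_RULES = [
    {"field": "age",   "min": 0, "max": 120},
    {"field": "hours", "min": 0, "max": 168},
]

def apply_edits(record: dict) -> list:
    """Return the names of fields that fail their range edits."""
    failures = []
    for rule in EDIT_RULES:
        value = record.get(rule["field"])
        if value is None or not (rule["min"] <= value <= rule["max"]):
            failures.append(rule["field"])
    return failures

print(apply_edits({"age": 200, "hours": 40}))  # ['age']
```

Because the rules live outside the code, a methodologist can change an edit parameter without redeploying the process — the same property that makes the 'plug & play' methods and the workflow-engine upgrade path possible.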
Lessons Learnt – Governance
• The move from 'silo systems' to a BmTS-type model is a major challenge that should not be under-estimated.
• An active Standards Governance Committee, made up of senior representatives from across the organisation (ours has the three DGSs on it), is a very useful thing to have in place. This forum provides an environment in which standards can be discussed and agreed, and the Committee can take on the role of the 'authority to answer to' if need be.

Lessons Learnt – Other
• There is a need to consider the audience of the metadata.
• Some metadata is better than no metadata, as long as it is of good quality.
• Do not expect to get it 100% right the very first time.

Questions?