Transcript Slide 1

Kick-off meeting Friday, May 1, 2020 Anders Östman Imad Abugessaisa

Purpose

• The methodology relates to the data specifications for the themes specified in Annex I-III.
• The main objective is to ensure that INSPIRE specifications are balanced in terms of cost (CBC) and solving user needs.
• To make transformation from local data sets into INSPIRE-compliant data sets feasible.
• Get feedback and discussion about the methodology.

Annex I test

• GeoTest focuses on transformation testing.
• The aim is to test how to transform data in local schemas to data in INSPIRE schemas.

[Figure labels: local UML models; Conceptual Level; Local models; T. Process; INSPIRE GML encoding; Implementation Level; Application schemas]

Conceptual level

• UML class diagram for each theme
– Application schema
– Cross-theme relationships
– Constraints
– Feature types
– Data types
– Enumerations and code lists

[Figure: UML class diagram "Addresses"; classes and attributes as recovered from the diagram]

«dataType» Base Types::Identifier
+ localId: CharacterString
+ namespace: CharacterString
«lifeCycleInfo, voidable»
+ versionId: CharacterString [0..1]

«dataType» GeometryOrigin
+ base: GeometryBasis
+ method: GeometryMethod

«featureType» AdminUnitName
+ name: GeographicalName [1..*]
+ level: AdministrativeHierarchyLevel

«featureType» PostalDescriptor
+ postName: GeographicalName [0..*]
+ postCode: CharacterString [0..1]
constraints: {PostNameEmpty} {PostCodeEmpty}

«featureType» AddressAreaName
+ name: GeographicalName [1..*]

«featureType» Gazetteer::LocationInstance (Generic Conceptual Model)
+ geographicIdentifier: PT_FreeText
+ alternativeGeographicIdentifier: PT_FreeText [0..*]
+ geographicExtent: GM_Object
+ admin: CI_ResponsibleParty
«lifeCycleInfo»
+ dateOfCreation: Date [0..1]

«featureType» Address
+ identifier: Identifier
+ geographicPosition: GM_Point
+ geometryOrigin: GeometryOrigin
«voidable»
+ status: Status [0..1]
+ validFrom: DateTime
+ validTo: DateTime [0..1]
+ lastChange: DateTime
«voidable, lifeCycleInfo»
+ beginLifespanVersion: DateTime
+ endLifespanVersion: DateTime [0..1]
constraints: {Locator} {ThoroughfareName} {AddressAreaName} {AdminUnitName} {AddressAdminUnit} {AddressComponentParent} {AddressLocator} {AddressCountry}

«featureType» AddressComponent
«voidable»
+ identifier: Identifier [0..1]
+ alternativeIdentifier: CharacterString [0..1]
+ status: Status [0..1]
+ validFrom: DateTime
+ validTo: DateTime [0..1]
+ lastChange: DateTime
«voidable, lifeCycleInfo»
+ beginLifespanVersion: DateTime
+ endLifespanVersion: DateTime [0..1]

«featureType» ThoroughfareName
+ name: GeographicalName [1..*]

«dataType» LocatorDesignator
+ designatorValue: CharacterString
+ type: LocatorDesignatorType

«featureType» Locator
+ designator: LocatorDesignator [0..*] {ordered}
+ name: LocatorName [0..*] {ordered}
+ level: LocatorLevel

«dataType» LocatorName
+ nameValue: GeographicalName [1..*]
+ type: LocatorNameType

Association ends and multiplicities shown: +parent 0..*, +gazetteerInstance «voidable», +address 0..*, +component 2..* {ordered}, +instance, 0..1, 0..1, +child 0..*, +parent «voidable» 0..*, +child «voidable» 0..*

Implementation level

• XML-encoded schemas for Annex I are offered by INSPIRE, one XSD per theme.
• SIS offers application schemas for:
– Väg- och järnvägsnät (road and railway networks) ( only)
– Ytvattensystem (surface water systems)
– Belägenhetsadresser (location addresses)
• Distributed over a multitude of XSD documents.

Cost-benefit considerations CBC

During transformation testing, quantitative information will be collected about:

– Effort needed, e.g. as person-hours per dataset, and the initial investments necessary to implement the data transformation.
– How existing tools and "know-how" have been exploited.
– Resources needed for maintaining an operational transformation service.
– Time efficiency of the on-the-fly transformation services.
– Demand for the data being tested.
– To what extent INSPIRE specifications can be used in the future within organisations.
– How participation in testing helps stakeholders to identify corresponding or missing data and the processes necessary for implementing INSPIRE within the Member State.
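
As an illustration of how the effort figures could be recorded and summed per dataset, here is a minimal Python sketch. The `EffortRecord` structure, the activity names and all numbers are assumptions made for the example, not part of the methodology or measured results.

```python
from dataclasses import dataclass


@dataclass
class EffortRecord:
    """One cost observation from a transformation test (hypothetical structure)."""
    dataset: str          # name of the tested data set
    activity: str         # e.g. "schema matching", "GML export"
    person_hours: float   # effort spent on the activity


def hours_per_dataset(records):
    """Sum person-hours per data set."""
    totals = {}
    for rec in records:
        totals[rec.dataset] = totals.get(rec.dataset, 0.0) + rec.person_hours
    return totals


# Illustrative values only, not measured results.
records = [
    EffortRecord("addresses", "schema matching", 6.0),
    EffortRecord("addresses", "GML export", 3.5),
    EffortRecord("hydrography", "schema matching", 8.0),
]
print(hours_per_dataset(records))
```

Recording the activity alongside the hours keeps it possible to report later which step of the test was the costly one.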

Strategy for transformation testing

• During the transformation test, the following are to be tested per theme:
– Data content and structure
– Delivery
– Data quality and metadata
– Portrayal
– Reference systems (might be needed!)

Testing strategy

• The main strategy in this testing is to use the ETL (extract-transform-load) approach. A software license is available.
• In the future, when INSPIRE services are to be implemented, a new approach might be introduced.
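
To make the three ETL stages concrete, the sketch below chains them in plain Python. It is not the licensed ETL tool referred to above. The CSV export, the Swedish source field names and the JSON output are assumptions chosen only to keep the example self-contained.

```python
import csv
import io
import json


def extract(csv_text):
    """Extract: read rows from a CSV export of the source data set."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows, field_map):
    """Transform: rename source fields to target fields, dropping unmapped ones."""
    return [
        {target: row[source] for source, target in field_map.items() if source in row}
        for row in rows
    ]


def load(rows):
    """Load: serialise the transformed rows (JSON here, as a stand-in target)."""
    return json.dumps(rows, ensure_ascii=False)


# Hypothetical source export and field mapping.
SOURCE_CSV = "gatunamn,postnr\nStorgatan,80320\n"
FIELD_MAP = {"gatunamn": "thoroughfareName", "postnr": "postCode"}

result = load(transform(extract(SOURCE_CSV), FIELD_MAP))
print(result)
```

In a real test the load stage would write INSPIRE GML rather than JSON. The point here is only the separation into three stages.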

Preliminary desk study

• The objective is to gather some basic information about the data theme being tested. The study includes the following:
– Preliminary schema matching
– Identification of source schema
– Metadata survey
– Investigation of any OGC compliance or services using the source data set (DS)

PDS steps

Schema matching ("semantic correspondences")
• Identify corresponding concepts in the source schema and target schema.
• For each INSPIRE feature type, identify the corresponding attributes in the source DS.

Identification of source schema
• Identify the schemas that are used for the source DS.
• Investigate non-formal descriptions of the schema.
• Interviews can be used to collect more information about the DS schema.

Metadata survey
• Investigate if the metadata as specified by INSPIRE are present.
• Specify the spatial reference systems being used.

OGC compliance
• Specify if there are any OGC services that are using the source DS.

Generation of Source GML data

• The objective of the extraction process is to identify costly procedures when generating GML data that conforms to the source schema.
• This problem is expected to increase when:
– the source schema is not available
– data are loosely coupled to the source schema

Steps to Generate GML data

• Specify source schema in XML and constraints.
• Extract sample data sets.
• Convert sample data sets to GML/XML.
• Quantification of data inconsistencies.

Generation of GML data .1.

• Specify source schema in XML and geometric constraints.

[Flowchart: Specify source schema. Schema specified? Yes: convert to XML. No: schema has to be created.]

• In addition to the schemas, some geometric constraints might also need to be specified, if not specified in the schema.

Generation of GML data .2.

• Extract sample data sets
– Extract sample data sets from the entire source database
– The sample data sets shall be representative for the quantification of problems that may occur
– The extraction should be based on random sampling
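
A minimal sketch of the random extraction, assuming the features of the source database can be addressed by identifier. The identifier range, the sample size and the seed are hypothetical.

```python
import random


def draw_sample(feature_ids, sample_size, seed=None):
    """Return a simple random sample of feature identifiers, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the extraction repeatable
    size = min(sample_size, len(feature_ids))
    return rng.sample(feature_ids, size)


# Hypothetical identifiers standing in for the rows of the source database.
all_ids = list(range(1, 10001))
sample = draw_sample(all_ids, 500, seed=42)
print(len(sample), len(set(sample)))
```

Sampling over the whole identifier range, rather than taking the first rows of an export, is what keeps the sample representative of the entire database.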

Generation of GML data .3.

• Convert sample data sets to GML/XML.
– In case the extraction in stage 2 is not based on XML/GML, a transformation from the export format to GML/XML is required.
• Quantification of data inconsistencies
– Study the consistency between the source data and the schema
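
The two steps above can be sketched with Python's standard `xml.etree.ElementTree` module. The `Address` feature type, its field names and the sample values are assumptions. The output is a simplified GML-like encoding, not a schema-valid GML document, and the consistency check only looks for missing required values.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"  # GML 3.2 namespace
ET.register_namespace("gml", GML_NS)


def record_to_xml(record):
    """Encode one exported record as an XML feature with a gml:Point position."""
    feature = ET.Element("Address")  # hypothetical source feature type
    ET.SubElement(feature, "streetName").text = record["street"]
    position = ET.SubElement(feature, "position")
    point = ET.SubElement(position, f"{{{GML_NS}}}Point")
    pos = ET.SubElement(point, f"{{{GML_NS}}}pos")
    pos.text = f"{record['x']} {record['y']}"
    return feature


def count_inconsistencies(records, required_fields):
    """Count records that lack a value for any field the source schema requires."""
    return sum(
        1 for rec in records
        if any(rec.get(field) in (None, "") for field in required_fields)
    )


# Illustrative sample records; the second one has an empty street name.
records = [
    {"street": "Storgatan", "x": 618500.0, "y": 7045200.0},
    {"street": "", "x": 618510.0, "y": 7045210.0},
]
REQUIRED = ["street", "x", "y"]

xml_text = ET.tostring(record_to_xml(records[0]), encoding="unicode")
inconsistent = count_inconsistencies(records, REQUIRED)
print(xml_text)
print(f"{inconsistent} of {len(records)} sample records are inconsistent")
```

A fuller study would also validate the generated documents against the source XSD. That step needs a schema-aware validator and is left out here.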

Transformation to INSPIRE GML schema

• The objective of the transformation test is to estimate the occurrence of costly procedures in a schema translation process. This will be performed as:
1. Schema matching
2. Schema mapping
3. Schema transformation

Schema matching

• Based on the preliminary schema matching study, a final matching is established.
– Performed manually
– Fragment-based matching
– Partial automation using a schema matcher
• Schema-level or instance-level
• Element-level or structure-level
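
As an illustration of partial automation at the element level, the sketch below proposes matches by comparing element names with `difflib.SequenceMatcher` from the Python standard library. The element names and the 0.6 threshold are assumptions. Name similarity is only one possible matching criterion, and every proposal still needs manual review.

```python
from difflib import SequenceMatcher


def match_elements(source_names, target_names, threshold=0.6):
    """Propose, for each target element, the most similar source element name."""
    proposals = {}
    for target in target_names:
        best_name, best_score = None, 0.0
        for source in source_names:
            score = SequenceMatcher(None, source.lower(), target.lower()).ratio()
            if score > best_score:
                best_name, best_score = source, score
        # Below the threshold the element is left for manual matching.
        proposals[target] = best_name if best_score >= threshold else None
    return proposals


# Hypothetical local element names and INSPIRE-style target names.
source_elements = ["post_code", "street_name", "house_no"]
target_elements = ["postCode", "thoroughfareName", "designator"]
proposals = match_elements(source_elements, target_elements)
print(proposals)
```

Target elements mapped to `None` are the ones where name similarity gives no usable candidate, which is where the manual, semantic matching effort is concentrated.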

Schema mapping

• The goal is to specify the transformation rules when transforming from the source schema to the INSPIRE schema.
• Data transformation tools come either from HUMBOLDT experiences or from other resources, such as other software license providers:
– SDIC/LMO: Safe Software
– SDIC/LMO: Snowflake Software (limited to the Oracle engine: GO Loader and GO Publisher)
– The HUMBOLDT editor (supports formal description of application schemas) and the main transformation services in a Beta release
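
Transformation rules of this kind can be written down as a table from target attribute to source attribute plus a conversion. The sketch below is a generic Python illustration and does not reproduce the rule format of any of the tools listed above. The Swedish source attribute names and the conversions are hypothetical.

```python
# Each rule: target attribute -> (source attribute, conversion function).
MAPPING_RULES = {
    "postCode": ("postnr", lambda value: value.replace(" ", "")),
    "thoroughfareName": ("gatunamn", str.strip),
    "designator": ("nummer", str),
}


def apply_rules(source_record, rules):
    """Build a target record; collect target attributes that cannot be filled."""
    target, unmapped = {}, []
    for target_attr, (source_attr, convert) in rules.items():
        if source_record.get(source_attr) is not None:
            target[target_attr] = convert(source_record[source_attr])
        else:
            unmapped.append(target_attr)
    return target, unmapped


# Illustrative source record without a house number.
source_record = {"postnr": "803 20", "gatunamn": " Storgatan "}
target_record, unmapped = apply_rules(source_record, MAPPING_RULES)
print(target_record)
print(unmapped)
```

Keeping the list of unmapped target attributes gives a direct count of where the source data do not cover the target schema.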

Schema transformation

• The goal here is not to do the actual transformation but to estimate the occurrence of costly procedures.
• Main issues to be reported to INSPIRE:
– Can local data be mapped to the INSPIRE schema?
– Do existing data cover the content required?
– Can the local CRS be mapped to the target CRS, and is there a loss in precision?
– What are the technical challenges for transformations?

Thank you for your participation