Dutch Semantic Web Get-together Vrije Universiteit Amsterdam, March 16th http://esw.w3.org/topic/DutchSemanticWebGettogether Organization: Antoine Isaac, Eyal Oren Sponsor: the Network Institute.
Agenda
• 09:30-09:45: welcome
• 09:45-10:30: big talk (Ivan Herman)
• 10:30-10:45: coffee + cookies
• 10:30-11:30: speeddating
• 11:30-12:30: lightning talks + discussion
• 12:30-13:15: lunch
• 13:15-14:15: lightning talks + discussion
• 14:15-14:30: coffee
• 14:30-15:30: lightning talks + discussion
• 15:30-16:30: speeddating and drinks

Lightning talks - 1
• Marshall
• Hoekstra
• Deursen
• Top
• Ossenbruggen
• Amin
• Hildebrand
• Wang Yiwen
• Brugman

Scott Marshall

Rinke Hoekstra (VU/UvA) and Saskia van de Ven (UvA)
http://www.leibnizcenter.org
Law and Semantic Web
Representation of regulations using Semantic Web languages
Legal reasoning using standard (DL) reasoners
Issue: “What if two sources say different things?”
Specificity (…) Lex Superior Temporal Validity (overwrite) Lex Posterior Applicability to old cases Authority (implicit) Lex Superior Jurisdiction Location (…) Jurisdiction Scope (deeming provision) Import References to definitions, not documents (documents)

Semantic Web vs. Multimedia Annotation
Davy Van Deursen, Sam Coppens, Erik Mannens (ELIS – Multimedia Lab)
Dutch Semantic Web – 16.03.2008
[Diagram: Feature extraction, Find the best match, Feature DB]
• feature extraction results in low-level concepts
• matching algorithms use a number of rules to propose a high-level concept
• use SW technologies for this purpose
  – formally described feature DB
  – formally described rules
Metadata modeling
http://dbpedia.org/resource/Barack_Obama

Jan Top
e-Science for Food Research
• Semi-open innovation in food
• RDF/OWL model of the scientific workflow: research question, preparation, experiment, data analysis, reporting, …
• Food Thesaurus
• Web application Tiffany – Sesame plus .NET
• Openness or ‘stimulated-disclosure’?
• Flexibility of the model, but how flexible is the user?

The new Luxaflex® powerpoint template
Who are the users?
Why would they use the cloud?
What tasks can be supported?
How will the semantics help?
Jacco van Ossenbruggen

Alia Amin
Comparison Search
Who: CH conservators, researchers, students
Why: Important Information Gathering task (JCDL’08)
What: Compare sets using multiple thesauri, heterogeneous dataset alignment between properties and values
How: semantic search & visualization

Subject Annotation
Who: Professional annotators
Why: Subject matter annotation of 700.000 prints
What: Search in multiple thesauri for annotation terms
How: Autocompletion on who/what/where/when
Michiel Hildebrand

Patterns of Semantic Relations in Content-based Recommender Systems
Accuracy, Frequency, Serendipity
teachOf/studentOf
Yiwen Wang, CHIP Project, www.chip-project.org, 16/03/2009

Hennie Brugman
[Diagram: texts, annotation service (GATE/Apolda), annotation repository, annotations, ranking service, term suggestions, thesaurus (SKOS), catalog, conversion, enrichment, vocabulary, video]
Semantic annotations
Annotation (based on GATE)
Recommendation & Ranking

Lightning talks - 2
• Brickley
• Omelayenko
• Cimiano
• Cornet
• Willems
• Koenderink
• Rijgersberg
• Rutledge
• Nederbragt
• Bocconi

Dan Brickley

Borys Omelayenko
AnnoCultor: porting collections and vocabularies to the Semantic Web
[Diagram: Museums with various models (Louvre, Tropenmuseum, RKD, Rijksmuseum Volkenkunde, etc.) converted by AnnoCultor to e-culture: DC / SKOS (Concept, Work, Image)]
Converter in Java or XML*
100s properties and 100.000 concepts per institution
Structural conversion: from simple to very complex
Semantic enrichment: term lookup, disambiguation
Up to 80% terms found in vocabulary lookup
annocultor.sourceforge.net
CATCH day, 28.02.2008

Philipp Cimiano
Web Information Systems (WIS) - EWI, TU Delft
Dutch SW Day @ VU, Amsterdam, 16th March 2009
Towards Linguistically Grounded Ontologies
(joint work w. P. Buitelaar, P. Haase and M. Sintek)

The Need
Ontologies do not need labels "per se". We need labels for:
• human consumption
• linking textual data to ontologies (ontology population)
• generating descriptions from ontologies through NL
• etc. etc.

Related Work
Does SKOS do the job? No, SKOS was defined for totally different purposes. It provides a datamodel (hijacking RDF/OWL) to represent relations between terms, e.g., classification schemas:
    ex:animals rdf:type skos:Concept;
        skos:prefLabel "animals"@en;
        skos:altLabel "creatures"@en;
        skos:prefLabel "animaux"@fr;
        skos:altLabel "créatures"@fr.
There are other models which are more in line with our work, e.g. the LIR Model from UPM/Madrid.

The Goal
The goal of this research is to yield a principled and generic model that allows one to declaratively specify a lexicon for an ontology. The main goal is to avoid that all applications have to re-specify the connection between language and an ontology in an "ad hoc" fashion. The vision is one where we can also publish lexica for ontologies (in addition to the ontologies themselves) and people can search and reuse these "ontology lexica".

The Model: Requirements
We need a general and principled model to associate linguistic information to ontologies. The model should:
1. capture morphological relations between terms, e.g., inflection (animal, animals), separately from the domain ontology;
2. represent the morphological or syntactic decomposition of composite terms and the linking of the components to the ontology;
3. model complex linguistic patterns, such as subcategorization frames for specific verbs, together with their mapping to arbitrary ontological structures;
4. specify the meaning of linguistic constructions with respect to an arbitrary (domain) ontology, and
5. clearly separate the linguistic and semantic (ontological) representation levels.

Future Work
Spread the model and make people use it (first version of an API is available)
Develop techniques that automatically instantiate the model
Investigate relation to other models (e.g. LIR)

Ronald Cornet - Department of Medical Informatics, Academic Medical Center – University of Amsterdam
[Diagram centered on Terminological Systems / SNOMED CT]
Understanding & Evaluation
Implementation
• GUI Design
• Functionality
• Classifications (rules)
• Information models
• Large-scale reasoning
Collaborations: VU, IHTSDO, NEN/CEN/ISO
Development
• Formalization
• Standardization
• Architecture
Auditing & Maintenance
• (DL-based) Quality Assurance
Domains: Intensive Care, Anesthesiology, Nephrology

Don Willems, WUR/IM
ERDSS
Emerging Risks, Holistic Ontology, Forward chaining, Risk assessment, Uncertainty

Nicole Koenderink, WUR -- IM
ROC
[Workflow diagram: Specify scope, Identify sources, Extract triples (ROC Tool, Sesame repository), Protoontology]
Interviews cost time for KE and DE → DE mature role
Difficult for DE to provide knowledge → prompting by tool
Models created from scratch → reuse existing sources
Task-specific knowledge required → monitor scope
[Diagram: Protoontology Scope with items A to H; Subject layer 1, Subject layer 2, ..., Subject layer n, Subject layer x; Domain perspective, Discipline perspective, Application perspective; axes from generic to specific]

Hajo Rijgersberg
Design and use of a quantitative research vocabulary for e-science
Problem
Approach
Lessons learnt
Vocabulary
Web services and web apps
Evaluate use
Support simple, recurring actions
Focus on those who actually need support
Integrate in popular tools
Excel add-in

Semantic Friendly Forms
- RDFS/OWL functionality in form-based wiki
Now
- Semantic MediaWiki enables crowd semantics (and displays)
- Semantic Forms facilitates crowd entry (at ~RDF level)
Semantic Friendly Forms: RDFS&OWL-based menus/autocompletion
- Entry is quicker and with fewer errors (?)
- Process RDFS&OWL for form-based input
- Input form value selection
- Property selection for class instance input & infobox
- Domain, range, cardinality, restrictions, symmetry, …
Questions
- Do RDFS&OWL-based menus accelerate crowd entry?
- Can crowds engagingly and effectively design ontologies?
- What is an effective pattern and scenario for use?
JWS special issue on Interaction: deadline April 20th!
Lloyd Rutledge

RNA infrastructure / Sterna project
Hans Nederbragt
[Architecture diagram: web interfaces and APIs; RNA toolset; repository connectors; rdf-stores (rdf-records, metadata, reference structures); data connectors and conversion; collections (records, unstructured files); legacy reference structures; local applications; content and metadata; reference structures]

Sterna project / RNA infrastructure
Launched in 2008, the Sterna project is an eContentplus best practice network that aims to contribute to the further development of the European Digital Library initiative. Sterna’s participants, mostly European institutions concerned with collecting and managing content on biodiversity, wildlife and nature in general, join forces to explore new ways of providing their content to the public. The project was initiated by the Dutch natural history museum Naturalis and major technical contributor Trezorix. Sterna is short for Semantic web-based Thematic European Reference Network Application. Sterna is also the scientific name of a genus of terns. This is no coincidence: birds are the central theme of the content that project partners will make accessible via the semantic information network, which is a genuine RNA environment. This content can be of any type, from scientific articles and imagery to MP3 files of bird sounds, field recordings and artefacts with bird feathers in them.
Multimodeling: The RNA environment is very flexible in the creation and use of different data models. Harmonisation of data modeling focuses on the use of common properties, rather than on trying to end up with one common data model.
Heterogeneous reference structures: In the RNA environment, reference structures can accommodate both reference items (SKOS concepts) and content items (XML and RDF structures). Content items can be based on different data models; they can even combine different data models.
Inferencing: In the RNA environment, inferencing is used to create mappings based on schemas, rather than mapping the data itself. Inferencing can also be used on the heterogeneous reference structures to realise interesting modes of findability, but this raises some difficult questions as well.
www.sterna-net.eu
www.trezorix.nl

Stefano Bocconi
Entity-based data integration
The concept of identity for entities: is identity between two entities a matter of context?
Entity-based data integration, i.e. how to integrate different knowledge sources about the same entity. Need for:
• handling inconsistencies?
• a quality mechanism to discard less trustworthy information in case of conflicts?
Identifier lifecycle: guarantee persistency of identifiers (e.g. duplicate cases)
The domain is scientific publishing (particularly in Biology) and news publishing (event detection).
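The conflict-handling question raised on the entity-based data integration slide can be made concrete with a small sketch. It is not taken from the talk: the statement format, the source names and the trust scores are all hypothetical, and "keep the value from the most trusted source" is only one possible quality mechanism.

```python
from collections import defaultdict

# Hypothetical trust scores per source (illustrative values only).
TRUST = {"publisher_db": 0.9, "news_feed": 0.6, "web_scrape": 0.3}


def integrate(statements, trust=TRUST):
    """Merge (entity, property, value, source) statements per entity.

    When sources disagree on a property, keep the value from the most
    trusted source and report the property as disputed.
    """
    best = {}                  # (entity, property) -> (score, value)
    seen = defaultdict(set)    # (entity, property) -> all values encountered
    for entity, prop, value, source in statements:
        key = (entity, prop)
        seen[key].add(value)
        score = trust.get(source, 0.0)  # unknown sources get the lowest trust
        if key not in best or score > best[key][0]:
            best[key] = (score, value)

    merged = defaultdict(dict)
    for (entity, prop), (_, value) in best.items():
        merged[entity][prop] = value

    # Properties for which at least two different values were asserted.
    disputed = {key: values for key, values in seen.items() if len(values) > 1}
    return dict(merged), disputed
```

Calling `integrate` on statements from several sources returns one merged record per entity plus the set of disputed properties, so discarded values stay visible for review instead of vanishing silently.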
Lightning talks - 3
• Jellema
• Wang, Shenghui
• Tordai
• Groenouwe
• Groth
• Siebes
• Brussee
• Hollink
• Oren
• Jijkoun
• Schreiber

STITCH @ CATCH
SemanTic Interoperability To Cultural Heritage
Thesaurus alignment techniques (lexical, structural, extensional and using background knowledge)
Alignment deployment and evaluation in real-world scenarios (book reindexing and search, thesaurus merging and collection navigation, etc.)
Challenges: Heterogeneity, Scalability, Multilingualism
Shenghui Wang

Towards a Methodology for Vocabulary Alignment
E-Culture project: Semantic search engine for CH collections and vocabularies
We do not want to create new techniques. We want to use existing techniques and their combinations.
• Select multiple alignment techniques
• Combine for higher recall
• Evaluate
• Apply disambiguation techniques to improve precision
Anna Tordai

Game “SWiFT”: Semantic Web in Fast Translation
Chide Groenouwe, Jan Top, Mark van Assem
• Goal: Translate all information in high quality SW representations.
• Problem: Not enough knowledge engineers, A.I. is too “stupid”.
• Towards Solution: Fostering capability in information creators.
• Means: multi-player online game SWiFT?
• Case studies: TIFN scientific collaboration (Jan Top et al), wikipedia translation.
• Background information: Towards a Constitution Based Game for Fostering Fluency in “Semantic Web Writing”
• http://km.aifb.uni-karlsruhe.de/ws/insemtive2008/
Sign up to play! [email protected]

Paul Groth
[Images: from pipes.deri.org; from Chris Bizer]
Triples
Who’s responsible?
How were they produced?
Which ones should I trust?

Ronald Siebes
The web is not about angle brackets
RDF’s layering on top of XML is the single largest obstacle for the adoption of Semantic Web technology.
Mismatch in datatypes: Trees vs. Graphs
It stimulates bad practices: e.g. URIs far too hard to read for human beings, NEED tools
XML causes real practical problems
No unique way to represent an RDF graph as an XML tree
If you need an XML parser anyway, RDF becomes extra burden
Parsing is seriously inefficient or (worse) chokes your XML parser
XML makes RDF unnecessarily hard to understand
Try to find a tutorial RDF/XML example which is not a tree
Obscures triple model
Where does XML stop and RDF begin?
Better alternatives exist: please make Turtle the official preferred RDF serialisation.

Rogier Brussee

MuNCH
Multimedia Analysis, Semantic Technologies, Language Technology
Enrich thesaurus structure
sidewalk – pavement
sand – concrete
pearls – jewelry
fjords – seas
barbecues – picnics
queens – aristocrats
acupuncture – negotiation
User Behavior In Audiovisual Archive
- Does the thesaurus represent the user queries?
- Popular programs over time.
- Can we use queries to automatically annotate shots?

Eyal Oren
Large-scale distributed RDF(S) reasoning
http://larkc.eu/marvin

Valentin Jijkoun
• Web service for text information processing:
  – Extraction (terms, names, reported speech, …)
  – Cross-document name normalization and linking
  – Analysis (compare, track dynamic changes)
• Protocols and standards
  – REST (HTTP POST/GET) + XML
  – SOAP
  – RDF/XML (on demand)
• Basic application loop:
  – Upload your documents (text, html, pdf, …)
  – Specify processing type
  – Access the results of processing
• Example: what themes played in Dutch news in the past month around Ikea?
Questions/contacts UvA:
• Maarten de Rijke
• Valentin Jijkoun
• mdr,[email protected]
• http://ilps.science.uva.nl

Guus Schreiber
http://www.europeana.eu/portal/thought-lab.html