SWIB 2012 Linked Open Library Data in Practice: Lessons Learned and Opportunities for data.bnf.fr

Romain Wenz, Bibliothèque nationale de France, Curator, Département de l’information bibliographique et numérique

What it looks like

- Web pages about authors, works, subjects
- Gathering information:
  - Library records (12 million at BnF)
  - Archive materials
  - Digital objects (2 million at BnF: Gallica)

Part I

The purpose and difficulties

- Build Web pages
- About writers, books, subjects
- Linking to all resources in the library
- Completely automatic

Example

- Information about Cicero, http://data.bnf.fr/11885977/ciceron/
- Most studied books, editions of these books
- Digitized books
- Activities, such as translations by Cicero

Grouping by « Works »

http://data.bnf.fr/11952658/dante_alighieri_la_divine_comedie/

- Manuscripts
- Editions
- Digital books

About a « theme »

- Books about swimming: http://data.bnf.fr/12647518/natation/

Several formats

- MARC catalogues
- XML-EAD archives and manuscripts
- Dublin Core digital library
- Authorities:
  - Persons and organisations
  - Works (uniform titles)
  - Subject headings

Several structures

- Library records: flat structure
- Archival fonds with hierarchical structure and heritage
- Digital content that can be processed: tables of contents, OCR

Purpose: info about concepts

- Pages for humans
- Structure for machines

Links and authorities

- ARK identifiers from authorities
- Materials to make the matchings:
  - Dates
  - Preferred and alternative labels
  - Graph of links: relations, roles
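As an illustration of how labels and dates can serve as matching material, here is a minimal sketch. It is not the data.bnf.fr implementation: the `Authority` class, the `match_authority` function and the ARK value are hypothetical, and the rule used (exact match on a normalized label, dates must not conflict) is an assumption chosen for simplicity.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Authority:
    ark: str                                  # identifier (placeholder value below)
    preferred_label: str
    alt_labels: List[str] = field(default_factory=list)
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def normalize(label: str) -> str:
    """Lowercase, strip accents and punctuation so label variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(kept.split())


def dates_compatible(a: Optional[int], b: Optional[int]) -> bool:
    """A missing date never blocks a match; two known dates must agree."""
    return a is None or b is None or a == b


def match_authority(name, birth_year, death_year, authorities):
    """Return the authorities whose labels match and whose dates do not conflict."""
    wanted = normalize(name)
    hits = []
    for auth in authorities:
        labels = {normalize(auth.preferred_label)}
        labels.update(normalize(alt) for alt in auth.alt_labels)
        if (wanted in labels
                and dates_compatible(birth_year, auth.birth_year)
                and dates_compatible(death_year, auth.death_year)):
            hits.append(auth)
    return hits


# Hypothetical authority record; the ARK is a placeholder, not a real identifier.
cicero = Authority("ark:/00000/example-cicero", "Cicéron",
                   ["Cicero, Marcus Tullius"], birth_year=-106, death_year=-43)
print(match_authority("CICERON", -106, None, [cicero]))
```

Treating a missing date as compatible keeps sparse records matchable, at the cost of more false positives; a stricter rule would be an equally reasonable choice.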

Workflow

- Sources: digital documents, archives and manuscripts, library catalogue records
- Matchings and alignments
- Results: Web pages for humans, data for computers

Data model

Complex ontology

Part II

 Feedback on activities

How?

- FRBR principles
- Things that work

FRBR principles

- Functional Requirements for Bibliographic Records
- Uses:
  - Dates
  - Labels
  - Related roles
- Which roles:
  - creation of a work
  - production of a version: language, type
  - material production: publication, life of an item

Why FRBR?

Linking writers and works with a useful type of links:
- Writer of a work
- Contributor of an edition: translator, preface, …
- Producer: physical copy with a printer, distributor
- Associated with a unique item: owner, annotator
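The four kinds of links above can be sketched as a small lookup from role to FRBR level. This is an illustrative reading of the slide, not the data.bnf.fr data model: the `Level` enum, the `ROLE_LEVEL` table and the sample names are assumptions, and "contributor of an edition" is mapped here to the FRBR expression level.

```python
from enum import Enum


class Level(Enum):
    WORK = "work"
    EXPRESSION = "expression"
    MANIFESTATION = "manifestation"
    ITEM = "item"


# Assumed mapping from a role to the FRBR level it attaches to.
ROLE_LEVEL = {
    "author": Level.WORK,
    "translator": Level.EXPRESSION,
    "preface writer": Level.EXPRESSION,
    "printer": Level.MANIFESTATION,
    "distributor": Level.MANIFESTATION,
    "owner": Level.ITEM,
    "annotator": Level.ITEM,
}


def group_links_by_level(links):
    """Group (person, role) pairs by FRBR level; unknown roles go under None."""
    grouped = {}
    for person, role in links:
        level = ROLE_LEVEL.get(role)
        grouped.setdefault(level, []).append((person, role))
    return grouped


# Made-up sample links for illustration only.
sample = [("Cicero", "author"), ("Jane Doe", "translator"),
          ("John Roe", "owner"), ("Someone", "binder")]
for level, pairs in group_links_by_level(sample).items():
    print(level, pairs)
```

Keeping unknown roles under `None` rather than dropping them makes gaps in the role table visible during review.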

From a bibliographic record

- Make the link towards a work
- Common properties
- Possible « expressions »
- Record elements: author (dates, name, role), type of document, language, date, title

Matching (« Aligning »)

Using a « prediction function » to predict which Work a bibliographic resource is associated with:
- Words of all titles
- Groups of words
- Give a threshold
- Stopwords and improvements
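A prediction function of this kind can be sketched as follows. This is only a minimal illustration under stated assumptions, not the function used for data.bnf.fr: the stopword list, the use of two-word groups, the Jaccard overlap score and the default threshold of 0.5 are all choices made for the example.

```python
import re
import unicodedata

# Assumed stopword list; a real one would be larger and language-aware.
STOPWORDS = {"the", "a", "an", "of", "la", "le", "les", "de", "du", "des", "l", "d"}


def title_tokens(title):
    """Lowercased words of a title, stopwords removed."""
    text = unicodedata.normalize("NFC", title).lower()
    return [w for w in re.findall(r"\w+", text) if w not in STOPWORDS]


def features(title):
    """Single words plus groups of two adjacent words."""
    words = title_tokens(title)
    pairs = [" ".join(pair) for pair in zip(words, words[1:])]
    return set(words) | set(pairs)


def score(record_title, work_titles):
    """Best Jaccard overlap between the record title and any title of the work."""
    rec = features(record_title)
    best = 0.0
    for title in work_titles:
        cand = features(title)
        union = rec | cand
        if union:
            best = max(best, len(rec & cand) / len(union))
    return best


def predict_work(record_title, works, threshold=0.5):
    """works maps a work id to its known titles; return the best id or None."""
    best_id, best_score = None, 0.0
    for work_id, titles in works.items():
        s = score(record_title, titles)
        if s > best_score:
            best_id, best_score = work_id, s
    return best_id if best_score >= threshold else None


# Hypothetical work ids and titles.
works = {"w1": ["La Divine Comédie", "Divina Commedia"], "w2": ["De oratore"]}
print(predict_work("Divine comédie : Enfer", works))
```

Raising the threshold trades recall for precision: the record title above, which adds a volume name, is accepted at 0.5 but rejected at 0.7.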

Clustering

- From the manifestations that are not matched
- If there are enough common points
- What it looks like in theory… and in practice
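One way to picture clustering on « enough common points » is the sketch below. It is an assumption-laden illustration, not the actual clustering used at BnF: records are plain dictionaries, a common point is a field with the same non-empty value in both records, and the `min_common` parameter is hypothetical.

```python
from itertools import combinations


def common_points(a, b):
    """Count the fields on which two records carry the same non-empty value."""
    return sum(1 for key in a.keys() & b.keys() if a[key] and a[key] == b[key])


def cluster(records, min_common=2):
    """Group record indexes; two records are linked if they share enough points."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if common_points(records[i], records[j]) >= min_common:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up unmatched manifestations.
unmatched = [
    {"author": "Dante", "title": "inferno", "language": "ita"},
    {"author": "Dante", "title": "inferno", "language": "fre"},
    {"author": "Cicero", "title": "de oratore", "language": "lat"},
]
print(cluster(unmatched))
```

Because links are transitive here, one loose pairing can merge two otherwise distinct groups, which is one reason results in practice can differ from the theory.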

The purpose

- Gather data
- Make them useful on the Web
- Upgrade the catalogs

Part III

 « Linked Open Library »

Open: technical and legal

- With the “Open data” initiatives led by the French government, it is possible to use an Open Licence.
- Currently a strong state incentive around open data and formats

Once data is linked and open, what comes next?

- First, changes in general use, since people can now find BnF’s resources directly on the Web.
  - Mailing address: lots of mail, « new publics »
  - Usage statistics: 80%+ of users come from search engines
  - R and D: improvements to integrate into the main catalogues and archives

- Secondly, the data is being used by broader communities.
  - In small public libraries, new procedures are being explored for re-use of the dataset in local catalogues. Example of « OpenCat » with Fresnes
  - Use in other contexts: example of IF verso (translations), Institut français, http://ifverso.com/
  - Specific catalogues (bindings)

In the long term?

Semantic Web technologies could set a standard for library data, if we keep them linked and open.

Library missions

Strengths or weaknesses?

Descriptive information: trust

produced to handle a collection and not for marketing purposes 

Describing local « concepts » : local use

For documents, not encyclopaedically 

Use of standards: long-term perspective

MARC catalogues, EAD archives, DC digital collection 

Already « machine-readable »

But not with Web standards yet

Thanks