Digital Author Identification

UKSG, 17 – 18 April 2007
Daniel van Spanje
DAI in DARE
• DARE: Digital Academic REpositories
– Universities + KNAW + NWO + KB
– Infrastructure for linking the IR
– Stimulate production of digital scientific output
– 2003 – 2006
• 2007 – 2010: SURFshare
Main issues in DAI
• Unique identifying number for researchers / authors
• National scale
• Benefits:
– Improve searching for electronic publications
– Integrate searching for electronic and non-electronic publications
– Link library (catalogue) and research environment (Metis)
Two projects
• Pilot in 2005 – 2006
– one university: Groningen
• Roll-out 2006 - 2007
– 13 academic research organizations
• Project leader: Anneloes Degenaar
• DAI website at University of Groningen:
– http://dai.weblog.ub.rug.nl/
– http://dai-uitrol.ub.rug.nl/
Organizations involved in DAI
• 13 universities + CWI + KNAW
• SURF
• UCI
• OCLC PICA
Systems involved
• Institutional repository / DAREnet
• Metis
• Dutch Union Catalogue (NCC/PiCarta)
Institutional Repositories / DAREnet
METIS
National Union Catalogue
Shared Cataloguing System (GGC)
Names and other issues
• Authors with the same name
• Use of one or more initials
• Changing names
• Spelling variants
• Diacritics
• Pseudonyms
• Name in religion
• Nicknames
• Collective names
• Different structure of names in other languages and cultures
• …
• Discussions on standardization and unification started in the Netherlands in the Orion project (2003-2004)
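The issues listed above can be illustrated with a minimal sketch. It assumes a crude matching key built from the surname without diacritics plus the initials; the function name and the example names are hypothetical and not taken from the DAI project. The sketch shows that such a key brings spelling variants together, but cannot separate two different authors with the same surname and initial.

```python
import unicodedata


def normalize_name(surname: str, given_names: str) -> str:
    """Reduce a personal name to a crude matching key: ASCII surname + initials."""

    def strip_diacritics(text: str) -> str:
        # Decompose accented characters and drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", text)
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # "Jan Peter" and "J.P." both yield the initials "JP".
    initials = "".join(part[0] for part in given_names.replace(".", " ").split())
    return f"{strip_diacritics(surname).lower()}, {strip_diacritics(initials).lower()}"


# Two spellings of one (hypothetical) author collapse to the same key ...
assert normalize_name("Müller", "Jan Peter") == normalize_name("Muller", "J.P.")
# ... but so do two different people, which is the case a unique number solves.
assert normalize_name("Jansen", "Pieter") == normalize_name("Jansen", "Paula")
```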
Proposed solution
• Need established
• “External” requirements:
– use existing mechanisms
– local management
– national function
• Solution: use “collocation” mechanism of libraries and Metis as source
Cataloguing and Metis
(Diagram: GGC, NTA, Cataloguing, Repository, Metis)
Use authority records (NTA) in Metis
(Diagram: NTA, Repository, GGC, Cataloguing, Metis, CWI)
How did we link
• Mechanisms
– Initial load per organization
– Online input buttons (web templates)
– XML output
– Synchronization mechanisms
• Requirements
– No overwrite of library data!
– Deduplication (Matching/merging)
Data model developed
• Data model copied from bibliographic model: three levels
• Metis name information added to library data; no overwrite
• Affiliations and other fields added
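The "added, no overwrite" rule can be sketched as follows. This is only an illustration under stated assumptions: the record is modelled as a plain dictionary, Metis data is kept under a separate `metis` key per organization, and the function and field names are hypothetical rather than the actual GGC or Metis format.

```python
def add_metis_data(authority_record: dict, organisation: str, metis_fields: dict) -> dict:
    """Return a copy of the record with Metis fields attached for one organisation.

    Library-maintained fields are copied as they are; Metis data only ever
    lands under the separate "metis" key, so it cannot overwrite them.
    """
    merged = dict(authority_record)
    per_organisation = dict(merged.get("metis", {}))
    per_organisation[organisation] = dict(metis_fields)
    merged["metis"] = per_organisation
    return merged


# Hypothetical library record and Metis input.
library_record = {"name": "Jansen, Pieter", "variant_names": ["Jansen, P."]}
enriched = add_metis_data(library_record, "Groningen", {"metis_name": "Jansen, P."})

assert enriched["name"] == "Jansen, Pieter"   # library data untouched
assert "metis" not in library_record          # original record not modified
```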
Structure of bibliographic data
• General level: bibliographic metadata (YoP / LoP / Title / Author / Imprint / LCSH / DDC), with a linked authority record
• Local level: bibliographic data per library (e.g. Groningen bibdat, Amsterdam bibdat), each with its own subject headings
• Copy level: one or more copies per library, each with location, holding and shelf number
Structure of authority data
• Library record (thesaurus record / linked authority record): name of author, variant names
• Metis level: data per organization (e.g. Groningen data, Amsterdam data), each with its Metis name
• Affiliation level: one or more affiliations per organization, each with begin and end
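The three levels of the authority data can be sketched as nested data classes. This is a minimal sketch, not the actual record format: the class and attribute names are hypothetical, and treating a missing end date as "still affiliated" is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Affiliation:
    begin: date
    end: Optional[date] = None  # assumption: None means the affiliation is still open


@dataclass
class LocalData:
    organisation: str           # e.g. "Groningen" or "Amsterdam"
    metis_name: str
    affiliations: list[Affiliation] = field(default_factory=list)


@dataclass
class AuthorityRecord:
    name: str
    variant_names: list[str] = field(default_factory=list)
    local_data: list[LocalData] = field(default_factory=list)

    def organisations_on(self, day: date) -> list[str]:
        """Organisations with an affiliation covering the given day."""
        return [
            local.organisation
            for local in self.local_data
            if any(a.begin <= day and (a.end is None or day <= a.end)
                   for a in local.affiliations)
        ]


# Hypothetical author with data from two organisations.
record = AuthorityRecord(
    name="Jansen, Pieter",
    variant_names=["Jansen, P."],
    local_data=[
        LocalData("Groningen", "Jansen, P.", [Affiliation(date(2005, 1, 1), date(2006, 1, 1))]),
        LocalData("Amsterdam", "Jansen, P.", [Affiliation(date(2006, 2, 1))]),
    ],
)
assert record.organisations_on(date(2005, 6, 1)) == ["Groningen"]
```

Keeping the affiliations under each organization mirrors the copy level of the bibliographic model, where holdings hang under each library's local data.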
Example authority record + added fields
(Figure labels: Library data; Metis; Researcher name; Affiliation data)
Data model: fields
Authority file
• Nationality
• Language
• Name (best known)
• Name (most complete)
• Maiden name
• Name variants
• Date of birth
• Date of death
• Profession / subject
• Link to pseudonyms
• Notes
• Entry date
• Update date
Note: proper name field includes subfields for first name, middle name, last name, prefix, suffix
Added fields
• Local researcher number
• Metis name (preferred)
• Metis name
• Sex
• Code organisation
• Name organisation
• Start date employment
• End date employment
• Code function
• Description of function
• Code of employment
• Notes
• Entry date
• Update date
Initial load
(Flowchart; box labels:)
• Metis makes list of names
• Format conversion
• Load DAI in Metis
• Match names with auth file
• Manual dedup of list
• Merge names with names found
• Dedup in Metis
• Load new names (not found)
• Load B-records (? Duplicates?)
• Make Metis export
• Manual dedup by library staff
• Export DAIs to Metis
Initial load
• Data enrichment in Metis
• Export from Metis
• Conversion to cataloguing system
• Matching
• Merging: merge / new / B-record
• Results depend on quality of metadata
– 95% automatic / 5% manual
– 70% automatic / 30% manual
– 50% automatic / 50% manual
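The matching and merging step can be sketched as a simple classification. This is a hedged illustration only: it assumes names have already been reduced to a matching key, and it interprets a B-record as a provisional record for ambiguous matches that library staff resolve by hand, which the slides suggest but do not state. The function name, keys and record identifiers are hypothetical.

```python
from collections import Counter


def classify(candidate_key: str, authority_index: dict[str, list[str]]) -> str:
    """Decide what to do with one Metis name, given an index of authority records."""
    matches = authority_index.get(candidate_key, [])
    if len(matches) == 1:
        return "merge"      # exactly one authority record: add Metis data to it
    if not matches:
        return "new"        # nothing found: load as a new record
    return "b-record"       # several candidates (assumed meaning): manual dedup


# Hypothetical authority index: matching key -> authority record identifiers.
index = {"jansen, p": ["A1"], "vries, j": ["A2", "A3"]}
names = ["jansen, p", "vries, j", "bakker, m", "smit, a"]

outcome = Counter(classify(name, index) for name in names)
automatic = (outcome["merge"] + outcome["new"]) / len(names)
assert automatic == 0.75    # 75% automatic / 25% manual for this toy input
```

The automatic share in this sketch depends entirely on how many keys are ambiguous, which is consistent with the range of results reported per organization.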
Online process
• DAI-button in Metis to create DAI-number
• Export DAI-button in NTA / cataloguing system to Metis
• DAI-button in IR to create DAI-number
• Separate DAI-http-request for online input
• Online input via current cataloguing tool
• + Offline synchronization mechanisms between Metis and NTA
DAI-button in Metis
URL link instead of button
• http://www.pica.nl/dai/dai_redirect.php?action=maak_dai&user=<usernumber>&metis_export_url=http://oras.service.rug.nl:1111/metisdad&p_onderzoekernummer=00033&p_naam_medewerker=Rotteveel&p_voorletter=R&p_voorvoegsel=&p_titulatuur=&p_voorkeur=J&p_geslacht=M&p_geboortedatum=01-07-1974&p_code_functie=20&p_functie=Universitair%20hoofddocent&p_code_organisatie=22020200&p_organisatie_a=Medical%20Microbiology&p_begin_aanstelling=01-01-2005&p_einde_aanstelling=01-01-2006
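A request URL of this shape could be assembled as in the sketch below. The endpoint and the parameter names are taken from the example URL on the slide; the helper function, the user number value and the subset of fields are hypothetical. Note one difference: the sketch percent-encodes the value of `metis_export_url`, while the slide shows it unescaped, and whether the service accepts either form is an assumption.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

DAI_REDIRECT = "http://www.pica.nl/dai/dai_redirect.php"  # endpoint as shown on the slide


def build_dai_request(user: str, metis_export_url: str, researcher: dict[str, str]) -> str:
    """Build a 'maak_dai' (create DAI) request URL from Metis researcher fields."""
    params = {"action": "maak_dai", "user": user, "metis_export_url": metis_export_url}
    # Researcher fields are sent with a "p_" prefix, as in the slide example.
    params.update({f"p_{key}": value for key, value in researcher.items()})
    # quote_via=quote encodes spaces as %20 (the default would produce "+").
    return DAI_REDIRECT + "?" + urlencode(params, quote_via=quote)


url = build_dai_request(
    user="12345",  # hypothetical user number
    metis_export_url="http://oras.service.rug.nl:1111/metisdad",
    researcher={
        "onderzoekernummer": "00033",          # researcher number
        "naam_medewerker": "Rotteveel",        # employee surname
        "voorletter": "R",                     # initial
        "functie": "Universitair hoofddocent", # function (associate professor)
        "organisatie_a": "Medical Microbiology",
        "begin_aanstelling": "01-01-2005",     # start of appointment
        "einde_aanstelling": "01-01-2006",     # end of appointment
    },
)

# Decoding the query string returns the original values.
fields = parse_qs(urlsplit(url).query)
assert fields["p_functie"] == ["Universitair hoofddocent"]
```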
Input form for Metis fields
Results of the DAI project
• Now:
– 50% of the researchers have a DAI
– Procedure for initial load in place
– Start with online procedure
– Privacy statement
• Autumn 2007
– Online procedure in place
– Procedure for synchronization in place
– 100% of the researchers will have a DAI in 2007 (ca. 40,000)
Things to do
• Finalize the roll-out, develop services (passport …) and implement a user group
• Add DAI in metadata standards (DCX, MODS)
• International standardisation: ISPI
• Involve authors for control and updating
Concluding remark
Thanks