Collecting, sharing and improving data: Changing roles for librarians and users 1st International Seminar of the Library of Galicia: Digital Libraries Santiago de Compostela.


Collecting, sharing and improving data:
Changing roles for librarians and users
1st International Seminar of the Library of Galicia:
Digital Libraries
Santiago de Compostela 7-9 April 2011
Rose Holley
Manager - Trove
National Library of Australia
[email protected]
Overview
• Changes in librarianship 1907-2010.
• New strategies for 2011.
National Library of Australia
• Australian Newspapers Digitisation Program.
• Trove single discovery service.
Women librarians 1907
Cite: http://nla.gov.au/nla.news-article14849217
Reference Librarian 1985
Arrival of the Internet 1998
Photo courtesy Genevieve Bell. Location: near Morgan, South Australia
Digitisation 2001
• Millions of items digitised by cultural heritage institutions
• Maps, photos, artworks, architectural plans, journals, archives, documents, books, newspapers, music.
Collaborative Delivery 2002
Single search vision (2002)
Search and Navigation Interface
• Image Collections
• Websites
• Databases
• E-Journals
• Library Catalogues
Mass digitisation 2008
Digital Librarian 2010
• Digitising resources.
• Collecting/creating born digital objects.
• Making resources accessible online.
• Giving users online tools to interact with data, each other and support research.
• Encouraging addition of knowledge to resources and creation of new resources.
• Preserving digital objects.
The scope 2011….
• Digital AND non digital
• Galleries, libraries, archives, museums (GLAM)
• Full-text (books, newspapers): Google
• User-generated content: Flickr, YouTube, Wikipedia
Changing roles……
Technology has turned librarianship on its head:
• Content can be created by anyone
• Content can be described by anyone
Why libraries still matter
• Long term preservation and access
• No commercial motives
• Universal access
• “Free for all”
ALWAYS and FOREVER….
Libraries have:
Librarians who can open doors with technology
+ Vast amounts of data
+ Information expertise
Who are the researchers?
• Once content is liberated anyone can become a ‘researcher’.
• The ‘ivory tower’ of gated and protected knowledge is gone.
• ‘Formal’ scholars are replaced by the crowd in the cloud.
• Today’s public are educated and engaged, demonstrated by their participation in citizen science projects.
What are their expectations?
“Self service, satisfaction and seamlessness are definitive of information seekers expectations. Ease of use, convenience and availability are equally as important to information seekers as information quality and trustworthiness.”
2003 OCLC Environmental Scan
To interact with content, other users and the organisation (web 2.0)
To be able to annotate content and contribute their own
Important Things
• Connections
• Linkages
• Related
• Context
• Sharing
• Re-purposing
• Mashing
• Adding
Giving users
• Access to resources
• Tools to do stuff
• Freedom and choices
• Ways to work collaboratively together
Where are the walls?
There are no walls, only bridges:
People outside your building are accessing information within it.
People inside your building are accessing information from outside.
Changing use of spaces.
Mobilisation of services.
Change institutional thinking
“Freedom is actually a bigger game than power.
Power is about what you can control.
Freedom is about what you can unleash.”
Harriet Rubin
Librarians are gatekeepers who need to focus on opening rather than closing doors….
New ways of developing services
Learning the ‘art of with’ Charles Leadbeater
Not to people
Not for people
WITH PEOPLE (USERS)
Public feedback should drive development of services:
CRITICAL, RELEVANT, INTERESTING, FUN
“Libraries need to think they are leading a mass movement, not just serving a clientele.”
NLA Strategic Directions 2009-2011
“We will explore new models for creating and sharing information and for collecting materials, including supporting the creation of knowledge by our users.”
(not just NLA resources… all Australian content)
“The changing expectations of users that they will not be passive receivers of information, but rather contributors and participants in information services.”
2007 http://www.nla.gov.au/ndp
National Program and Content
• Initial focus on major titles from each state and territory
• ‘Regional’ titles being contributed by libraries 2010 onwards
• Coverage: published between 1803 – 1954 (out of copyright)
• Start with 4 million pages
Titles shown: Northern Territory Times, Courier Mail, West Australian, Sydney Morning Herald, Advertiser, Sydney Gazette, Canberra Times, Argus, Mercury
Aims
• Increase access to historic Australian newspapers
• Key Features
– Online Access
– Freely available
– Full Text searchable
The Argus 12 October 1951
1803 to 1954
http://www.nla.gov.au/anplan/
Sydney Morning Herald
$1 million donation 1831-1954
Finding missing pages not on microfilm…
Australian Women’s Weekly 1932-1982
Building National Infrastructure
• Storage
• Newspaper Content Management system (digitisation workflow)
• Public delivery system
• Panel of digitisation contractors (mass digitisation)
• Quality assurance processes and team
Microfilm scanned into digital images
Checking Pages
Page sequence
Metadata creation
Missing page targets
Tapes with digital images sent to India
Article zoning and categorising, Optical Character Recognition (OCR)
150 data operators in Chennai
Final quality assurance checks
Articles go into public beta system
Text correction
- testing user engagement
Greatest fears!
• No one will do it
OR
• People will deliberately vandalise the text.
Questions?
• Moderation?
• Login?
• Integration of data?
Interaction at article level
Add a tag ‘titanic sinking’
Add a comment
Fix text – power edit mode
After enhancements
Text Correction Activity 2008-2010
[Chart: lines corrected (millions), August 2008 to February 2010]
30 million lines January 2011
Public feedback on the feature
‘OCR text correction is great! I think I just found my new hobby!’
‘It’s looking like it will be very cool and the text fixing and tagging is quite addictive.’
‘An interesting way of using interested readers “labour”! I really like it.’
‘A wonderful tool - the amount of user control is very surprising but refreshing.’
‘I applaud the capability for readers to correct the text.’
Why do it?
• I love it
• It’s interesting and fun
• It is a worthy cause
• It’s addictive
• I am helping with something important e.g. recording history, finding new things
• I want to do some voluntary work
• I want to help non-profit making organisations like libraries
• I want to learn something
• It’s a challenge
• I want to give something back to the community
• You trust me to do it so I’ll do it
Achievements
March 2011 (2.5 yrs since release)
 30,000+ volunteer text correctors
 32 million lines of text corrected
in 1.3 million articles
 811, 000 tags added
 18,800 comments added
 3 million users
 46 million articles
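To put the figures above in proportion, here is a minimal arithmetic sketch. It uses only the numbers stated on the slide; the variable names are illustrative, and "30,000+" is treated as a lower bound, so the per-volunteer average is an upper bound rather than a measured value.

```python
# Figures as stated on the slide (March 2011).
# "30,000+" correctors is a lower bound, so the per-corrector
# average computed from it is an upper bound, not a measurement.
correctors = 30_000
lines_corrected = 32_000_000
articles_corrected = 1_300_000
articles_total = 46_000_000

# Average number of corrected lines in each article that received corrections.
lines_per_article = lines_corrected / articles_corrected

# Average number of lines corrected per volunteer (upper bound).
lines_per_corrector = lines_corrected / correctors

# Fraction of all articles that have had some text corrected.
share_corrected = articles_corrected / articles_total

print(f"Lines per corrected article: {lines_per_article:.1f}")
print(f"Lines per volunteer (at most): {lines_per_corrector:.0f}")
print(f"Share of articles with corrections: {share_corrected:.1%}")
```

On these figures, roughly 25 lines were corrected in each article touched, and under 3% of the 46 million articles had received any correction.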
Significant newspaper research
• Climate change
• Influenza in Australia
• Australian words and first usage e.g. ‘jumbuck’
• Dating early colonial music
• Building of railways and tramways
• Convicts and outlaws
Trove – single search 2009
Migrate NLA discovery services into Trove:
• Australian Newspapers
• Picture Australia
• Australian Research Online
• Libraries Australia
• Register of Archives and Manuscripts
• Australia Dancing
• Music Australia
• PANDORA
Single search vision (2002)
Search and Navigation Interface
• Image Collections
• Websites
• Databases
• E-Journals
• Library Catalogues
Single search
Restrict search
Browse zones
Refine/limit search results
Groups results in zones
Use APIs for Wikipedia, Amazon, Google video…
Get item
Features: tag, comment, list, send link to, cite, check copyright
Trove Strategy 2010 -2011
1. Grow
2. Develop
3. Engage
4. Promote
1. Grow – existing contributors
Content Collectors
1100 organisations:
• Libraries
• Museums
• Galleries
• Archives
Content Creators
Australian Broadcasting Commission
120 million items
Open sources
• Open Library (Internet Archive)
• HathiTrust
• OAIster
Targets – websites
• Amazon
• Wikipedia
• Google Books
• YouTube
Grow – new contributors 2011-2012
• Large aggregators e.g. Atlas of Living Australia, Biodiversity Heritage Library – Australian node
• Large Australian cultural institutions especially museums and archives
• National Libraries with Australian content e.g. UK, New Zealand.
• Collection specific e.g. Australian sport
2. Develop
• Agile development based on user feedback
• In 2010 - 17 new releases v1-v3
• Usability testing
• IT team of 5
– 2 Programmers
– Business analyst
– Web developer
– IT Manager
Version 4: April 2011
New homepage
Access to subscription e-journal content
‘Contribute’ has greater prominence
3. Engage: with content and each other
User generated content via Flickr: objects
User generated content: photos
http://trove.nla.gov.au/work/37255844 By Nomad Tales
Family photos – identify people
http://trove.nla.gov.au/work/37288101 Flexigel
Context – Tools - Lists
Personal List to record your finds and add notes
Institutional list for virtual exhibition
Educators List – Teaching aid
Alerting to new content
Text correctors - Hall of Fame
Profile - overall ranking and history
Wikipedia citation style
Lionel Logue – The King’s Speech
Wikipedia links to Trove sources
Wikipedia links
Feedback Christmas Day 2010
3000 comments and feedback received in 2010
User Forum
Trove Blog
Trove Tweets
New Year’s Eve 2010
Public raise money for digitisation
Rockhampton ‘Trovers’
http://climatehistory.com.au
This landmark project, spanning the sciences and the humanities, draws together a team of leading climate scientists, water managers and historians to better understand southeastern Australian climate history over the past 200–500 years. It is the first study of its kind in Australia.
Re-purposing information and sharing
Blog using newspaper articles
http://lynnwalsh.wordpress.com
http://themcwhirtersproject.blogspot.com
4. Promote use
Spikes caused by media
http://www.abc.net.au/news/video/2010/04/29/2885984.htm
Trove screencasting on YouTube
Trove promotional video
Incoming Trove traffic
January 2011
• Google 70%
• Referrals (Bing, Yahoo, Wikipedia, NLA sites) 16%
• Direct 14%
Trove dependent on…
Collaboration across cultural heritage institutions (digitisation, storage, service delivery, crowdsourcing, standards).
Data sharing
Being ‘open’ e.g. OAI, APIs
Changing institutional strategic thinking from power/control to freedom
New ideas and revisiting old ideas
Rose
The site you manage is a nightmare! It’s addictive. Keeps me awake at night. Congratulations!
Mary
Trove finds the pieces and puts them together for you
[email protected]
[email protected]