Newspaper Digitisation Project
Warwick Cathro
Assistant Director-General,
Innovation
Strategic context

Enhance access to Australian content

Provide information infrastructure for the
researcher and the general public

Digitise public domain content

Give the Library experience with “industrial
scale” digitisation
The project concept [1]

Cover the period 1803-1954

Cover every state and territory

Text-searchable newspaper database

Freely available online
The project concept [2]

Convert microform to digital images

Process digital images to produce enhanced,
zoned, OCR content

Build a search and delivery system to use
this enhanced content

Provide a user feedback and annotation
capability

Provide enhanced content to researchers to
support data mining
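
The search side of this concept can be illustrated with a minimal sketch. It assumes each zoned article ends up as a record holding its OCR text, and it builds a simple inverted index over those records. The `Article` fields, the function names, and the sample articles are all hypothetical and invented for illustration; they are not the project's schema or its delivery system.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """One zoned article. Field names are assumptions, not the project's schema."""
    article_id: str
    newspaper: str
    date: str       # issue date, ISO format
    page: int
    ocr_text: str   # OCR output for this article's zone


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(articles):
    """Map each word to the set of article ids whose OCR text contains it."""
    index = defaultdict(set)
    for article in articles:
        for word in tokenize(article.ocr_text):
            index[word].add(article.article_id)
    return index


def search(index, query):
    """Return the sorted ids of articles that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Invented sample records, for illustration only.
ARTICLES = [
    Article("a1", "Example Gazette", "1850-01-01", 2,
            "The river rose and the flood reached the town."),
    Article("a2", "Example Gazette", "1850-01-08", 3,
            "The town council met to discuss the new bridge."),
]

if __name__ == "__main__":
    index = build_index(ARTICLES)
    print(search(index, "flood town"))
    print(search(index, "town"))
```

Indexing at the article level rather than the page level is what the zoning step makes possible: a query returns individual articles, and the same records could be handed to researchers for data mining.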
The project concept [3]

State libraries would ensure availability of
acceptable microfilm versions

NLA would fund creation of digital content for
one newspaper from each state/territory

NLA would fund development and support
for a search and delivery system

State libraries would fund creation of digital
content for additional newspapers
Search example
Challenges

Microfilm quality

OCR accuracy

Zoning, article categorisation, article linking

Quality checking procedures

Costs
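
One common way to put a number on OCR accuracy is character accuracy: one minus the edit distance between the OCR output and a hand-corrected transcription, divided by the length of the transcription. The sketch below assumes that measure; the slides do not say which metric the project uses, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(ocr_text, ground_truth):
    """Share of ground-truth characters recovered, clamped to the range 0..1."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    errors = edit_distance(ocr_text, ground_truth)
    return max(0.0, 1.0 - errors / len(ground_truth))


if __name__ == "__main__":
    # Invented sample: a single misread character in a nine-character string.
    print(character_accuracy("Tbe Argus", "The Argus"))
```

A measure like this needs ground truth, so in practice it would be computed on a sample of pages, which ties it to the quality checking procedures listed above.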
Project status

Creation of TIFF images is underway

Request for Tender for production of enhanced
content was issued and tenders evaluated

Post-tender negotiations are well advanced

Contract approval process is not yet complete

Search and delivery system will be developed in the
first half of 2007
First 500,000 pages (indicative)

Sydney Gazette (1803-1842)

Maitland Mercury (1843-1893)

Argus (1846-1899)

Courier Mail (1846-1899)

Hobart Town Gazette, Courier (1816-1859)

Adelaide Advertiser (1858-1889)

West Australian (1833-1879)

Northern Territory Times (1873-1899)
Linking to other services

Searching in conjunction with other services

Obituaries and biographical articles – People
Australia

Citability of articles – persistent identifiers
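
The idea behind persistent identifiers can be sketched as a small registry: an article is given an identifier once, and that identifier keeps resolving even if the article's storage location changes. The class, the identifier prefix, and the example URLs below are hypothetical and chosen for illustration; the slides do not describe the identifier scheme the NLA would use.

```python
import itertools


class IdentifierRegistry:
    """Toy registry: stable identifiers that resolve to a changeable location."""

    def __init__(self, prefix="example-article-"):
        self._prefix = prefix              # hypothetical identifier prefix
        self._counter = itertools.count(1)
        self._id_by_key = {}               # source key -> identifier
        self._location_by_id = {}          # identifier -> current location

    def mint(self, source_key, location):
        """Return the identifier for source_key, creating it on first use."""
        if source_key not in self._id_by_key:
            pid = f"{self._prefix}{next(self._counter)}"
            self._id_by_key[source_key] = pid
            self._location_by_id[pid] = location
        return self._id_by_key[source_key]

    def relocate(self, pid, new_location):
        """Change where an identifier points without changing the identifier."""
        if pid not in self._location_by_id:
            raise KeyError(pid)
        self._location_by_id[pid] = new_location

    def resolve(self, pid):
        """Return the current location for an identifier."""
        return self._location_by_id[pid]


if __name__ == "__main__":
    registry = IdentifierRegistry()
    key = ("Example Gazette", "1850-01-01", 2, 1)  # invented article key
    pid = registry.mint(key, "https://example.org/store-a/0001")
    registry.relocate(pid, "https://example.org/store-b/0001")
    print(pid, registry.resolve(pid))
```

A citation that records the identifier rather than the location stays valid after the move, which is what makes an article citable.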
Conclusion

Project has significant potential to enhance
access to Australian historical content

Will provide a benefit to researchers and the
general public

Will provide a platform for the NLA to explore
options for wide-scale digitisation of other
text-based collections