Key terms and concepts: introducing first principles

Overview
digital imaging
resolution and bit depth
file types
colour management
metadata
digital libraries
digital preservation
Reminder: what can be delivered digitally
Born digital content
Paper
Text content
Bound volumes or manuscripts
Photographs – prints, slides, transparencies
Microfilm, microfiche and aperture cards
Video and audio
Maps, drawings and large paper formats
Original art works, textiles etc.
Physical 3-dimensional objects or views
Digitization is?
Digitization is the process of converting
analogue originals to computer-readable form
Digital imaging and scanning are mechanisms
for capturing a digital picture
A digital image is sampled and mapped as a grid
of squares known as picture elements (pixels)
Digital files use a binary format with a series of
‘1’ and ‘0’, ‘on’ and ‘off’, to represent data
like a light switch ...
Tech joke: there are 10 types of people, those who understand binary
and those who don’t
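A minimal sketch of the two ideas above: an image as a grid of pixels, and each pixel stored as binary data. The 8 x 8 bitonal grid and the function names are hypothetical, chosen only for illustration.

```python
# A hypothetical 8 x 8 bitonal (1-bit) image: '1' = pixel on, '0' = pixel off.
GRID = [
    "00111100",
    "01000010",
    "10100101",
    "10000001",
    "10100101",
    "10011001",
    "01000010",
    "00111100",
]


def pack_row(row: str) -> int:
    """Read one row of '0'/'1' characters as a binary number."""
    return int(row, 2)


def render(grid: list[str]) -> str:
    """Draw the grid as text, with '#' for on pixels and '.' for off pixels."""
    return "\n".join(row.replace("1", "#").replace("0", ".") for row in grid)


# Each row of eight 1-bit pixels fits in exactly one byte.
packed = bytes(pack_row(row) for row in GRID)

print(render(GRID))
print(f"{len(packed)} bytes: {packed.hex()}")
```

The same `int(row, 2)` call explains the joke: `int("10", 2)` is 2.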
Digitization processes: Scanning
Capturing lines of pixels moving across the
object
Used in flatbed scanners, slide scanners and
scanning back cameras for instance
Digitization processes: Digital Photography
Digital photography or direct digital capture:
captures all the pixels in a single matrix
Used in digital cameras and bookscanners for
instance
Digital imaging = digital pictures
All we get is a digital picture, not digital text
Digital text
Digital text requires additional processes:
Optical Character Recognition (OCR)
Rekeying
Mark-up:
XML, SGML
Pixels redux
Pixels are picture elements
They are usually square
They are the smallest component of the digital
image
By combining pixels in different orientations and
density we get shapes and content
By changing the tonal values of pixels we get
colour
i.e. resolution and bit depth
Resolution
describes the density of spatial detail
is usually expressed as dots-per-inch (dpi) or
pixels-per-inch (ppi)
these terms are often used interchangeably, but dpi
usually refers to printed images and ppi to screen
images
remember the spatial detail is in relation to the
original item imaged
Resolution
It is often more useful to use absolute terms for
resolution
actual pixel dimensions are given
2480 x 3508 for example
Equals the pixel dimensions of an A4 sheet of paper
scanned at 300 dpi
But also approximately equals* the dimensions of
* within about 6%
A5 page at 425 dpi
A3 page at 200 dpi
A2 page at 150 dpi
8.7 megapixel digital camera image of a landscape at
96 dpi
Resolution is spatial density
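A minimal sketch of the arithmetic behind these pixel dimensions, assuming the standard ISO 216 paper sizes in millimetres. The function name and table are illustrative, not from the slides, and the exact results can differ slightly from rounded figures.

```python
MM_PER_INCH = 25.4

# ISO 216 paper sizes in millimetres (width, height).
PAPER_MM = {
    "A5": (148, 210),
    "A4": (210, 297),
    "A3": (297, 420),
    "A2": (420, 594),
}


def pixel_dimensions(paper: str, dpi: int) -> tuple[int, int]:
    """Pixel width and height of a sheet scanned at the given resolution."""
    width_mm, height_mm = PAPER_MM[paper]
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


for paper, dpi in [("A4", 300), ("A5", 425), ("A3", 200), ("A2", 150)]:
    width, height = pixel_dimensions(paper, dpi)
    megapixels = width * height / 1_000_000
    print(f"{paper} at {dpi} dpi: {width} x {height} pixels ({megapixels:.1f} MP)")
```

The smaller the original, the higher the dpi needed to reach the same pixel dimensions.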
Bit depth
Defines the colour space for each image and
pixel
this is the number of bits (binary digits) used
to define each pixel's tonal value
Black and white (bitonal) = 1-bit per pixel
Greyscale = 8-bit (256 shades of grey)
RGB Colour = 24-bit (16.7 million colour tones)
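The figures above follow from the number of values that n bits can represent, which is 2 to the power n. A short sketch (the function name is illustrative):

```python
def tonal_values(bit_depth: int) -> int:
    """Number of distinct values a pixel can take at a given bit depth."""
    return 2 ** bit_depth


for name, bits in [("Bitonal", 1), ("Greyscale", 8), ("RGB colour", 24)]:
    print(f"{name}: {bits}-bit = {tonal_values(bits):,} possible values per pixel")
```

For 24-bit RGB, each of the three channels (red, green, blue) gets 8 bits, so 256 x 256 x 256 combinations.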
Some rules of thumb
Resolution:
capture the smallest significant detail
the smaller the original the higher the resolution
double the resolution - quadruple the filesize
Bit depth:
1 bit = Black and white
8 bit = greyscale (x8 filesize)
24 bit = full RGB colour (x24 filesize)
CMYK: avoid using for scanning or storage
Select the right colour space for your original
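The file size rules of thumb can be checked with a small calculation. This is a sketch for uncompressed image data only, ignoring file headers and compression; the 8 x 10 inch original is a hypothetical example.

```python
def uncompressed_bytes(width_in: float, height_in: float, dpi: int, bit_depth: int) -> int:
    """Approximate size of raw image data: pixels x bits per pixel, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bit_depth
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical original: an 8 x 10 inch print.
base = uncompressed_bytes(8, 10, 300, 8)
doubled = uncompressed_bytes(8, 10, 600, 8)
print(f"300 dpi greyscale: {base:,} bytes")
print(f"600 dpi greyscale: {doubled:,} bytes ({doubled // base}x)")

for bits in (1, 8, 24):
    size = uncompressed_bytes(8, 10, 300, bits)
    print(f"{bits}-bit at 300 dpi: {size:,} bytes")
```

Doubling the resolution doubles both width and height in pixels, which is why the size quadruples.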
Digitization Basics: Tutorials
Cornell Digital Imaging Tutorial
www.library.cornell.edu/preservation/tutorial/contents.html
Digital files
Can use compression to reduce file sizes. There
are 2 main types:
Lossy
there is irrecoverable loss of data with inevitable
worsening of quality, but can achieve considerable
size reductions
JPEG
Lossless
no loss of data, but not such great size reductions
LZW, ITU-T T.6 (formerly CCITT Group 4)
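A rough sketch of the difference between the two types, using Python's standard `zlib` module. Note the assumptions: `zlib` uses DEFLATE, which stands in here for lossless schemes such as LZW, and the "lossy" step simply discards the low bits of each pixel, which is far cruder than what JPEG does. The sample data and function names are hypothetical.

```python
import zlib


def make_scanline(length: int = 4096) -> bytes:
    """Hypothetical 8-bit greyscale data: a gentle gradient with slight variation."""
    return bytes((x // 16 + x % 3) % 256 for x in range(length))


def lossless_round_trip(data: bytes) -> tuple[int, bytes]:
    """Compress and decompress; return the compressed size and the restored data."""
    compressed = zlib.compress(data)
    return len(compressed), zlib.decompress(compressed)


def quantise(data: bytes, keep_bits: int = 4) -> bytes:
    """Lossy step: keep only the top bits of each pixel, discarding fine tonal detail."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(p & mask for p in data)


original = make_scanline()

size, restored = lossless_round_trip(original)
print(f"lossless: {len(original)} -> {size} bytes, identical: {restored == original}")

reduced = quantise(original)
lossy_size, _ = lossless_round_trip(reduced)
print(f"lossy:    {len(original)} -> {lossy_size} bytes, identical: {reduced == original}")
```

The lossless round trip returns exactly the original bytes; after the lossy step the discarded detail cannot be recovered.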
Some common file formats
There are many, many file formats
The commonest you will meet are probably:
TIFF
GIF
JPEG
PDF
TIFF: Tagged Image File Format
De facto standard
Needs plug-in or external application for web
display although some browsers now accept it
Can be tagged with basic metadata
Can be used for files up to a bit depth of 64
The format of choice for long-term archiving
JPEG:
Joint Photographic Experts Group/JFIF (JPEG File
Interchange Format)
De facto standard for web display
Native to web browsers (i.e. no plug-ins needed)
Has free-text comment field for metadata
Can be used for files up to 24 bit
Commonly used for web display images
JPEG2000 – enables zooming and more
metadata
GIF: Graphics Interchange Format
De facto standard for web display
Native to web browsers (i.e. no plug-ins needed)
Has free-text comment field for metadata
Can be used for files up to 8 bit
Commonly used for web display images
Likely to be replaced by PNG (Portable Network
Graphics)
PDF: Portable Document Format
Proprietary (Adobe) format, but now a de facto
standard for document delivery
Needs plug-in or external application for web
display
Can be used for files up to 64 bit
Used for printing and viewing multipage
documents
Comes in 3 versions:
Image only
Image and text
Full text
Colour Management: What is it?
Colour is device dependent and looks different
when:
printed on different printers
viewed on different monitors
printed on a printer and viewed on a monitor
viewed in a light booth and under office lighting
Colour Management Systems (CMS) maintain the
consistent and accurate "appearance" of a colour
on different devices (e.g. scanners, monitors,
printers, etc.) throughout an imaging workflow
"Colour" Workflow
[Diagram: Original → RGB Scanner → App (displays scanner RGBs on the RGB Display) → Driver (sends RGBs or CMYKs to printer) → CMYK Printer]
Colour Management: components
Use a consistent colour space
Apply an independent colour profile
International Color Consortium
www.color.org
Monitor calibration
Colour targets
GretagMacbeth
www.gretagmacbeth.com
Metadata
What is metadata?
What is metadata for?
What is metadata?
Tony Gill – ARTstor
Metadata refers to structured descriptions,
stored as computer data, that attempt to
describe the essential properties of other
discrete computer data objects.
Big picture definition:
the sum total of what can be said about any
information object at any level of aggregation
What is metadata for?
The World Wide Web Consortium says metadata is:
to provide a means to discover that the data
set exists and how it might be obtained or
accessed
to document the content, quality, and
features of a data set, indicating its fitness for
use.
Therefore we need to think:
content, context and structure
What characterises a digital library?
1. A digital library is a managed collection of digital
objects
2. The digital objects are created or collected according
to principles of collection development
3. The digital objects are made available in a cohesive
manner, supported by services necessary to allow
users to retrieve and exploit the resources just as
they would any other library materials
4. The digital objects are treated as long-term stable
resources and appropriate processes are applied to
them to ensure their quality and survivability.
What is collection development?
American Library Association's definition:
"A term which encompasses a number of activities
related to the development and determination of the
collection, including the determination and
coordination of selection policy, assessment of
needs of users and potential users, collection
evaluation, identification of collection needs,
selection of materials, planning for resource sharing,
collection maintenance, and weeding."
(ALA Glossary of Library & Information Science)
Digital Preservation: digital lifecycle approach
‘The major implications for lifecycle management of digital
resources, whatever their form or function, is the need to actively
manage the resource at each stage of its lifecycle and to recognise
the interdependencies between each stage and commence
preservation activities as early as practicable. This represents a
major difference with traditional preservation, where management
is largely passive until detailed conservation work is required,
typically many years after creation and rarely, if ever, involving the
creator. There is an active and interlinked lifecycle to digital
resources which has prompted many to promote the term
'continuum' to distinguish it from the more traditional and linear
flow of the lifecycle for traditional analogue materials.’
Preservation Management of Digital Materials: A Handbook - Neil Beagrie &
Maggie Jones www.jisc.ac.uk/dner/preservation/dpc/