Getting to Disk-based Lossless
Digital Video Preservation –
An Introduction
Paul Theerman, Walter Cybulski, Glenn Pearson
National Library of Medicine
NIH/HHS
Historical Audiovisuals at the
National Library of Medicine
Paul Theerman, Ph.D.
Head, Images and Archives
History of Medicine Division, NLM
Historical Audiovisuals at NLM
• Origins as the National Medical
Audiovisual Collection
• A clearinghouse for these materials
• Variously held here and at CDC, Atlanta
• Only relatively recently transferred to
the History of Medicine Division
Historical Audiovisuals at NLM
• Current definition of the collection
• All audiovisuals before 1970
• Films and videos of historical interest
dating after 1970—that is, of interest for
historical value, not informational value
Historical Audiovisuals at NLM
• The collection ranges from the first decade
of the 20th century through the 1990s
• Content:
– Early films on “how to go to the doctor” and
other public service and public information
films
– Films on the U.S. Public Health Service
Historical Audiovisuals at NLM
• Content
– Dental films due to an ADA donation
– Training films for surgical procedures
– Military: battlefield surgical films
– Large recent donations from NIMH and FDA
– “home movies”
– Research footage
– Films promoting usage of films in medicine
Historical Audiovisuals at NLM
• Size: the largest such collection in the U.S.
• Total number of titles: ~9650
• Number cataloged: 4300
• Number inventoried: 3550
• Number to be inventoried: ~1800
• Number preserved: 2250+
Historical Audiovisuals at NLM
• The ability to collect is dependent on the
ability to preserve and to catalog, and, in
the short run, to stabilize in order to
preserve and to catalog in the future
• Controlled environments
– On-site cool vault for new accessions,
masters
– Off-site cool and cold vaults for new
accessions, originals
Historical Audiovisuals at NLM
• The decision to preserve and to catalog is
not made lightly, because of the
investment of resources
• Based on condition and content
assessments
Historical Audiovisuals at NLM
• Condition assessment
– Age of medium
– Obsolescence of format
– Possible or actual deterioration of medium
• Nitrate
• Acetate
– Generation
Historical Audiovisuals at NLM
• Content assessment
– Ownership and restrictions
– Uniqueness
– Age, especially pre-1950
– Then a sliding scale, based on collection
development guidelines
Historical Audiovisuals at NLM
• When both condition and content indicate,
then:
– Preservation copying, to three copies (in
some cases two)
– Cataloging, either to full or core records
Historical Audiovisuals at NLM
• Currently we are on the cusp of moving to
digital formats, but our originals are chiefly
analog, and our duplication and viewing
copies are as well
– Betacam SP for duplication copies
– VHS for use copies
• This also matches patron needs for
Interlibrary Loan and production
Historical Audiovisuals at NLM
• The Preservation and Collection
Management Section enters the picture:
– Determining formats
– Technical specifications
– Managing vendor copying
– Managing on-site and off-site cool and cold
vaults
– Managing shelving for use copies
Historical Audiovisuals at NLM
• New Ventures with Center for Information
Technology (CIT) at NIH
– Videocasting service of “history in the making”
– Possible collaboration with NLM
– Interlocking systems for preservation and
cataloging
– New venture for NLM in a large cache of
digital materials
Historical Audiovisuals at NLM
• New Library Research at NLM
• NLM’s Lister Hill Center is looking at
means of digital preservation
• The origin of this conference; we are excited
about what it will bring
Analog Motion Picture and Tape
Preservation at NLM –
Duplication & Offsite Storage
Walter Cybulski, Preservation Librarian
Preservation & Collection Management Section,
Public Services Division, NLM
Examples of Film and Tape Media
in the NLM Collections
8mm
16mm
35mm
2” Quadruplex
1” Type C
¾” U-Matic
½” Beta
Deterioration
Nitrate added spice to the idea of deterioration –
unfortunately, nothing but hot pepper
(There are no nitrate film materials at NLM)
“250 TEASPOONFULS
OF VINEGAR FOR A
1,000 FOOT CAN OF
35mm FILM”
Main Objectives of Preservation
• Identify content that merits
preserving
• Mitigate against known risks
• Extend useful life of content
Mitigate against risk.
Storage condition                      Temperature (°F)   Relative Humidity   Years for Acidity to Double
Room temperature                       70                 50%                 5
NLM Cool Vault (magnetic tape)         55                 30%                 50
NLM Cold Vault (acetate movie film)    35                 25%                 200
SECURE, CLIMATE-CONTROLLED
STORAGE AT IRON MOUNTAIN
Extend useful life:
copy onto new media
For libraries and archives, obtaining new copies may not
be possible, and copying content on deteriorated media
to the same media (e.g. 35mm to 35mm film transfer) can
be prohibitively expensive
At this point, the most widely used AV
preservation media are Betacam SP and
Digital Betacam
But the clock is ticking even as we
copy content onto these formats…
Rapidly changing technology takes its toll
with each technological advance,
the storage picture changes …
WE ARE TRANSITIONING FROM FILMS AND TAPES TO
DATA, BUT THE QUESTION REMAINS:
HOW TO EXTEND THE USEFUL LIFE OF THE CONTENT
Getting to Disk-based Lossless
Digital Video Preservation –
Which Way Forward?
Glenn Pearson, Ph.D.
Senior Software Developer
Communications Engineering Branch,
Lister Hill National Center for Biomedical Communications
NLM
Generational Loss Once Digital
• Migration as preservation strategy
– To cope with obsolescence of digital formats, gear
• If using lossy image compression algorithms
– No degradation when making exact copy
Master → Master
– Degradation when migrating (or editing)
Master → uncompress → recompress → Master
– Examples: M-JPEGs, DVs, MPEG-1, -2, most -4
• Mathematically-lossless algorithms
– Avoid this problem
– Don’t compress as well (2x – 4x) as “virtually
lossless” (5x – 9x) or obviously lossy (web streaming)
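The difference between copying and migrating can be illustrated with a toy model. The sketch below is not a real video codec: `lossy_roundtrip` is a hypothetical stand-in that simply quantizes 8-bit samples, with the step size playing the role of the format, and `zlib` stands in for a mathematically lossless algorithm. All names and the step sizes 6 and 10 are assumptions chosen for illustration.

```python
import zlib


def lossy_roundtrip(samples, step):
    """Toy lossy codec (assumption): round each 8-bit sample to a multiple of step."""
    return [min(255, ((s + step // 2) // step) * step) for s in samples]


def lossless_roundtrip(samples):
    """Compress and decompress with zlib, a general-purpose lossless compressor."""
    return list(zlib.decompress(zlib.compress(bytes(samples))))


def max_error(reference, candidate):
    """Largest absolute difference between two equal-length sample lists."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


# Synthetic 8-bit samples covering every value from 0 to 255.
original = [(i * 37) % 256 for i in range(1000)]

master_a = lossy_roundtrip(original, step=6)    # first lossy master
exact_copy = list(master_a)                     # Master -> Master, bit for bit
master_b = lossy_roundtrip(master_a, step=10)   # migrate to a different lossy format

print("lossy master vs original:   ", max_error(original, master_a))
print("exact copy vs lossy master: ", max_error(master_a, exact_copy))
print("migrated master vs original:", max_error(original, master_b))
print("lossless round trip intact: ", lossless_roundtrip(original) == original)
```

In this model the exact copy adds no error, the migration through a second lossy step increases the error relative to the original, and the lossless round trip returns the original samples unchanged.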
Lossless Video Storage
• Uncompressed video
– Can be stored with general binary file compressors (RLE, LZW
[zip] ), typically 1.6:1 - 2:1 compression
• Lossless video codecs
– Standardized, open (but patents may apply)
• HuffYUV – original, uses Huffman “entropy” encoding
• Apple Quicktime “None” codec [documented, not standard]
• JPEG 2000 Lossless (within, say, Motion JPEG 2000)
• MPEG4/AVC Lossless
– Proprietary
• Matrox DigiSuite: Lossless = entropy-only portion of M-JPEG
• New - MatrixView’s “Adaptive Binary Optimization”, from patented
“Repetition Coded Compression” (boolean grids + Huffman)
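Entropy coders such as the Huffman stage in HuffYUV do best when the symbols they see are highly repetitive, so lossless video codecs typically predict each sample from its neighbors and encode only the prediction error. The sketch below illustrates that idea on one synthetic scan line; the left-neighbor predictor, the function names, and the ramp data are assumptions for illustration, not a description of any particular codec's bitstream.

```python
import math
from collections import Counter


def entropy_bits_per_sample(values):
    """Zeroth-order entropy: a lower bound on bits/sample for a symbol-wise entropy coder."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def left_residuals(row):
    """Replace each sample with its difference (mod 256) from the previous sample."""
    prev = 0
    out = []
    for s in row:
        out.append((s - prev) % 256)
        prev = s
    return out


def reconstruct(residuals):
    """Invert left_residuals exactly, which is what makes the scheme lossless."""
    prev = 0
    out = []
    for r in residuals:
        prev = (prev + r) % 256
        out.append(prev)
    return out


# Synthetic smooth scan line (a slow ramp), standing in for real luma data.
row = [i // 4 for i in range(1024)]
residuals = left_residuals(row)

print("raw entropy (bits/sample):     ", round(entropy_bits_per_sample(row), 2))
print("residual entropy (bits/sample):", round(entropy_bits_per_sample(residuals), 2))
print("exact reconstruction:          ", reconstruct(residuals) == row)
```

Real footage is far less regular than this ramp, so the gain is much smaller in practice, which is consistent with the modest compression ratios quoted for lossless coding.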
Economics of Digital Storage
[Chart: cost in $ per gigabyte (log scale, 0.01 to 10000) versus year (1998 to 2020) for DRAM/Flash, HDD storage systems, 2.5" hard disk drives, 3.5" hard disk drives, and tape media]
Sources:
E. Grochowski & R. Halem, IBM Sys J, 42(2), 2003 (Disk, Flash)
R. Harada, Comp Tech Rev, June 2004 (Tape)
Data is for computer tape, but digital video tape uses the same technology, which drives media price
The Twilight of Tape
[Chart: cost in $ per gigabyte (log scale, 0.01 to 10000) versus year (1998 to 2020) for DRAM/Flash, HDD storage systems, 2.5" hard disk drives, 3.5" hard disk drives, and tape media]
Hierarchical storage yesterday: Hard Disks, Tapes
Hierarchical storage tomorrow: Flash Disks, Hard Disks*
*Powered on-demand
Economics of Subsampling and
Lossless Compression
• Gold Standard for digital video: 4:4:4 uncompressed
• Not so affordable today for archives
Relative size (4:4:4 uncompressed = 1):
                4:4:4    4:2:2
Uncompressed    1        2/3
Lossless        ~1/3     ~1/4
[Diagram labels: Luma, Chroma]
In YUV colorspace:
Y is luma (B&W intensity)
U, V are blue, red color differences respectively
4:4:4 = full sample/pixel
4:2:2 = sample for Y at full pixel resolution, for U, V at half horizontal resolution
• 4:2:2 lossless
– will be affordable 2 years before 4:4:4 uncompressed
– stay ¼ the cost
• When is 4:2:2 good enough for preservation?
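The ratios on this slide follow from counting samples. The sketch below assumes the usual J:a:b reading of the subsampling notation (over a region J pixels wide and two rows high, each chroma channel contributes a samples in the first row and b in the second) and, for the timing claim, a hypothetical storage price that halves every year. That halving period is an assumption chosen because it reproduces the two-year figure, not a value taken from the chart.

```python
import math
from fractions import Fraction


def samples_per_pixel(j, a, b):
    """Average samples per pixel for J:a:b subsampling over a J-wide, 2-row region."""
    luma = 2 * j            # one Y sample for every pixel in the region
    chroma = 2 * (a + b)    # U and V each contribute a + b samples
    return Fraction(luma + chroma, 2 * j)


def years_of_head_start(relative_cost, halving_years):
    """Years sooner a cheaper format fits a fixed budget if prices halve every halving_years."""
    return math.log2(1 / relative_cost) * halving_years


full = samples_per_pixel(4, 4, 4)
half_chroma = samples_per_pixel(4, 2, 2)
print("4:4:4 samples per pixel:", full)
print("4:2:2 samples per pixel:", half_chroma)
print("4:2:2 size relative to 4:4:4:", half_chroma / full)

# Slide figure: 4:2:2 lossless at about 1/4 the cost of 4:4:4 uncompressed.
HALVING_YEARS = 1.0  # assumption, not from the source
print("head start in years:", years_of_head_start(0.25, HALVING_YEARS))
```

With a slower price decline the head start grows proportionally, so the two-year figure is best read as an estimate tied to the price trend of the time.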
Film Master → Digital Master
• Traditional good advice: Film → Film
• Can Film → Digital be
– as good as/better than Film → Film
– as affordable?
• Quality of source
– 8mm, 16mm, 35mm, 65/70mm
– B/W vs color
– camera original, intermediate print, distribution print
• Versus quality of target
• HD video has 1920x1080 (“e-Cinema”)
– Variety matching film best: progressive-scan 24 fps (1080p24)
– But video has only 8-10 bits linear/component, less than film’s range
– Good enough for archiving some 16mm B&W distribution prints?
– HD 16:9 aspect matches some sources, not others
Film Master → Digital Master
- Hollywood Style
• Better than HD but $$
• 12 bit linear/component (36 bits/pixel)
• Or 10-bit log/component
• No subsampling
• 2K @ 24 fps = most practical res. & rate
– 2K = 2048 x 1080
– That’s outer bounds for various aspect ratios
3 Steps, 3 Types of File Formats
• Sources (Production)
• Digital Intermediate
• Package for Theatrical Release
Sources
• Computer Graphics
• New cinema digital cameras
– Viper, Dalsa Origin 4K, Arri D-20, Kinetta
• Film Scanners
– Kodak Genesis, Northlight, Arriscan, Imagina
• “Datacines” (data telecines)
– Thomson Spirit, Cintel DSX, Millennium
• Raw, Unwrapped Frame-per-file Formats
– Flexible resolution, aspect ratio
– But sound, most metadata in separate files
• Awkward: per-shot info
– Examples
• Kodak Cineon scanner .CIN (10-bit log rgb)
• SMPTE std DPX (derived from Cineon)
• Others: TIFF, SGI, EXR, JP2
• “Digital Negative” from 1-CCD camera
with Bayer-pattern color filters atop pixels
[Image caption: Magazine has 12 40 GB iPod drives]
Digital Intermediate Process
• Creates Digital Masters
– May include “Digital Source Master” from which multiple
masters come: DVD master, TV master, DCDM
• Typical Steps
– Color grading, compositing, editing, finishing
– Projects moved along in vendor formats or AAF
– End products archived in vendor formats or MXF
• Such unencrypted masters closely held by
studios, but archivists could make their own
Theatrical Distribution
• DCI Distribution Master (DCDM)
– MXF wrapper + JPEG2000 frames
– But lossy due to real-time bandwidth
constraints (250 Mb/s peak)
– Something Similar for Archivists?
• a lossless variety of this
• or MJ2 instead of MXF
Roadblocks in Getting to a Disk-Based Lossless Archive Master
• Rapid digital-technology change
• High current costs
– Top quality needs massive storage, high-speed pipelines
– An uncompressed color movie (2K @ 24 fps, 12-bit)
• Would consume ~2 Gigabits per second bandwidth if realtime
• Needs 0.8 TB storage per hour of length
– Plus $$$ for color grading/restoration services & software
• Analog tape → SD digital is more affordable now
• A proliferation of standards
– File Formats
• Essence representation/codecs/color spaces
• Wrappers
– Metadata & Rights Management
• Can we help find a way forward?
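The bandwidth and storage figures quoted above for an uncompressed 2K movie can be checked with simple arithmetic. The sketch below assumes the 2048 x 1080 frame, three 12-bit components per pixel, and 24 fps given on earlier slides, and decimal units (1 TB = 10^12 bytes); the function and constant names are illustrative.

```python
def bits_per_second(width, height, components, bits_per_component, fps):
    """Raw video data rate in bits per second, with no subsampling or compression."""
    return width * height * components * bits_per_component * fps


def terabytes_per_hour(bps):
    """Storage for one hour of material, in decimal terabytes."""
    return bps * 3600 / 8 / 1e12


rate = bits_per_second(2048, 1080, 3, 12, 24)
print(f"bandwidth: {rate / 1e9:.2f} Gb/s")
print(f"storage:   {terabytes_per_hour(rate):.2f} TB per hour")

# Peak rate quoted on the Theatrical Distribution slide.
DCI_PEAK_BPS = 250e6
print(f"compression needed to fit 250 Mb/s: {rate / DCI_PEAK_BPS:.1f}:1")
```

Under these assumptions the result is about 1.9 Gb/s and 0.86 TB per hour, consistent with the rounded figures on the slide. Fitting the same stream into the 250 Mb/s theatrical peak would take roughly 7.6:1 compression, which falls within the 5x – 9x range cited earlier for “virtually lossless” coding.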