Challenges of Digital Media Preservation
Karen Cariani, Director, Media Library and Archives
Dave MacCarn, Chief Technologist
Who we are: WGBH Media Library and Archives

Transition challenges (Analog to Digital)
- Preservation needs are more complicated
  - New and changing content formats
  - Network connections
  - Software
  - Storage media
  - Hardware
- Access expectations challenging
  - Faster access
  - Anywhere, anytime

Content formats

HD Acquisition Codecs
[Table. Columns: 720p, 1080p, 1080i (horizontal samples 1920, 1440, 1280, 960), Mbps, Video Sample, 10 Bit, Audio #, Bits, Container.]
Codecs listed, by compression family:
- DCT: HDCam [1], DVCProHD@24p, DVCProHD-720p, DVCProHD-1080i, Avid DNxHD, Apple ProRes, Apple ProRes HQ
- Wavelet: Red
- MPEG2 I-frame: GFCam
- H.264: AVCHD PS, AVCHD, AVCHD@24, Canon 5DMKII, Nikon D800
- H.264 I-frame: AVCIntra
- MPEG2 Long GOP: HDV, XDCamHD, XDCam422, XDCamEX, GFCam, Canon C300
- MPEG4 Studio Profile [2]: HDCamSR, HDCamSR-HQ [2]
Containers listed: DV-AVI, DV-DIF, MXF, QuickTime, REDCODE, M2T, MP4, MTS, DPX, Tape
* Sony FS100 HDMI output
** 4:2:2 HDMI output
[1] Tape format for comparison
[2] Tape with DPX file out
D. MacCarn, WGBH

Storage and retrieval
How do we:
- Capture the audio and video generated by myriad cameras
- Store the project information to allow potential re-edit
- Store files with rich, meaningful metadata
- Store born-digital materials
- Display and retrieve born-digital materials

Access: Organizational Issues
- Metadata
- Descriptive metadata
  - Need description for video to be useful, findable
  - How to capture that
  - How to make sure it is linked to video files

Folder Structure

Create folders by card
- Assign unique number
- Continue numbers
- Add description
- Place ENTIRE card contents into this folder!!
Original footage © 2011 WGBH

Proposed tapeless workflow
- Create a mapping document between FileMaker and DAM
- Used to generate an XML stylesheet
- Video is ingested simultaneously with the metadata from FileMaker using the XML stylesheet
- Technical metadata is ingested simultaneously with the video and production data using the XML generated by the source digital files

Challenges - again
- Access issues
  - File size
  - Formats – to playback
  - Useable
  - Search/findable
    - Metadata
    - Organize files
- Preservation issues
  - Copies
  - Formats – for migration
  - Being able to play again later
  - Speed of access (big file size) – to use/process
  - Migration ease

Software / Network
- File management
  - Where are the files? Needed for access to files
  - Large preservation files
  - Smaller access, proxy files
- Network speed
  - Larger files, need faster network to meet speed expectations

Issues with current file mgmt systems/software
- Preservation not a priority
- Interface issues
  - Access vs. Preservation
- IT relationship
  - Tech support
  - Vendor reliance issues
  - Need library based system for Archivist needs rather than traditional IT company needs
- Expense
  - License cost
  - Development
  - Customizations

Access
- Can find
- Can view
- Can select
- Can get out of system
- Can reuse in editing system

Preservation Needs
- Multiple Copies
- Validity
- Bit quality checks
- Long lasting storage
- Regular migration
- Persistence

Challenges of preservation and access
- For preservation
  - Want to capture as close to original as possible
  - Originals may be many different formats
  - Will need to make sure you can export and use different formats in future
  - File format issues
  - Fixity check big files
- For access
  - Want one consistent format for playback/access
  - Needs to be easy to migrate, use

What makes video different?
- Preservation files are large
  - Uncompressed
  - Slow to move around
- Need proxy files for viewing
  - Smaller size for quick transport over network
- Complicated formats
  - Not just one file type
  - Codecs, wrappers, frame speed, etc.

Technology Mix:

Hydra project
- Combine preservation system with access system
- Better interface
- Flexible design
- Easy to evolve

[Graphic: Blacklight, Hydra heads, Hydra mgmt layer, Fedora repository, HSM storage system]

Fundamental Assumption #1
No single system can provide the full range of repository-based solutions for a given institution’s needs, …yet sustainable solutions require a common repository infrastructure.

Fundamental Assumption #2
No single institution can resource the development of a full range of solutions on its own, …yet each needs the flexibility to tailor solutions to local demands and workflows.

Hydra Philosophy -- Community
- An open architecture, with many contributors to a common core
- Collaboratively built “solution bundles” that can be adapted and modified to suit local needs
- A community of developers and adopters extending and enhancing the core
- “If you want to go fast, go alone. If you want to go far, go together.”

CRUD in Repositories
- Create/Submit/Edit (CUD)
- Search/View (R)
- Repository / Persistent Storage

Major Hydra Components
- hydra-head Rails Plugin (CUD)
- Blacklight (Read (R) Only)
- Solrizer
- Fedora
- Solr

Hardware/Storage media: HSM
- Access
  - Online XX bytes (spinning disk)
  - Offline
  - Nearline
- Preservation (offline)
  - Robotic tape library system
  - LTO-4 data tapes
  - 2 copies
  - One stored off site
- Migration needs 3-5 years
  - Both tape migration to newer formats
  - Technology migration

New Storage Types and Costs
- Need hierarchical storage (HSM)
  - Video files are large
  - Spinning disks are expensive
  - Tape can help save cost
  - Tape copies/migration can be automated

New Storage Types and Costs
- But HSM has licensing issues
  - Some systems cost by gigabyte managed
  - Need Open source alternative

Q&A
Karen: [email protected]
Dave: [email protected]
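
Note on "Fixity check big files": the preservation slides list fixity checks on large files as a challenge. A minimal sketch of one common approach follows, assuming a SHA-256 digest was recorded at ingest. The function names, the 8 MiB chunk size, and the choice of SHA-256 are illustrative assumptions, not something the slides specify. Reading in chunks keeps memory use flat regardless of file size.

```python
import hashlib

CHUNK_SIZE = 8 * 1024 * 1024  # assumed read size: 8 MiB per chunk


def file_checksum(path, algorithm="sha256", chunk_size=CHUNK_SIZE):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_digest, algorithm="sha256"):
    """True when the current digest matches the digest recorded at ingest."""
    return file_checksum(path, algorithm) == expected_digest.lower()
```

Running `verify_fixity` against each stored copy on a schedule is one way to cover the "bit quality checks" item on the Preservation Needs slide.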
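
Note on "Create folders by card": the folder rules (assign a unique number, continue the numbering, add a description, copy the ENTIRE card) can be sketched as a small script. The `NNNN_description` naming scheme and the function names below are hypothetical; the slides do not say how WGBH names its folders.

```python
import re
import shutil
from pathlib import Path

# Hypothetical naming scheme: four-digit card number, underscore, description.
FOLDER_PATTERN = re.compile(r"^(\d{4})_")


def next_card_number(project_root):
    """Continue the numbering from the highest card folder already present."""
    highest = 0
    for entry in Path(project_root).iterdir():
        match = FOLDER_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            highest = max(highest, int(match.group(1)))
    return highest + 1


def slugify(description):
    """Reduce a free-text description to a filesystem-safe fragment."""
    return re.sub(r"[^A-Za-z0-9]+", "-", description).strip("-").lower()


def ingest_card(card_path, project_root, description):
    """Copy the entire card tree into a new, uniquely numbered folder."""
    project_root = Path(project_root)
    project_root.mkdir(parents=True, exist_ok=True)
    number = next_card_number(project_root)
    target = project_root / f"{number:04d}_{slugify(description)}"
    shutil.copytree(card_path, target)  # raises if the target already exists
    return target
```

Copying the whole tree rather than only the clip files keeps the card's own sidecar and index files next to the footage.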
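
Note on the proposed tapeless workflow: the slides describe a mapping document between FileMaker and the DAM that drives an XML stylesheet. The sketch below shows only the mapping idea, applied directly in Python rather than through a stylesheet. Every field and element name in it is a hypothetical placeholder, since the slides do not list the real ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping document: FileMaker field name -> DAM element name.
FIELD_MAP = {
    "Card Number": "identifier",
    "Description": "description",
    "Shoot Date": "dateCreated",
}


def record_to_xml(record, field_map=FIELD_MAP, root_tag="asset"):
    """Build a DAM-style XML record from one exported FileMaker row (a dict)."""
    root = ET.Element(root_tag)
    for source_field, target_element in field_map.items():
        value = record.get(source_field)
        if value:  # skip fields the export left empty
            ET.SubElement(root, target_element).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Keeping the mapping in one table means a renamed field is changed in one place, which is the same benefit the slides expect from the mapping document.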
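
Note on file size and network speed: the slides state that larger files need a faster network to meet access expectations. The arithmetic can be made concrete with a rough estimate. The sketch assumes a constant video bit rate, ignores audio and container overhead, and uses an assumed default link efficiency of 0.7. The function names are illustrative.

```python
def file_size_bytes(bitrate_mbps, duration_seconds):
    """Approximate file size for a constant-bit-rate stream (video only)."""
    return bitrate_mbps * 1_000_000 * duration_seconds / 8


def transfer_seconds(size_bytes, link_mbps, efficiency=0.7):
    """Approximate transfer time; efficiency is the usable share of the link."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_mbps * 1_000_000 * efficiency)


# One hour at 145 Mbps, a rate that appears in the codec table.
hour_of_video = file_size_bytes(145, 3600)
print(f"{hour_of_video / 1e9:.2f} GB")
print(f"{transfer_seconds(hour_of_video, 1000) / 60:.1f} min on a 1 Gbps link")
```

At 145 Mbps, one hour comes to 65.25 GB. On a fully used 1 Gbps link that is 522 seconds, and proportionally longer at lower efficiency, which is why the slides pair large preservation files with smaller proxy files for access.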