Transcript Slide 1

Public Records in the Digital Age
Salvador Barragan
Curator of Government Records
Nebraska State Historical Society
What is a record?
• A ‘record’ is the complete set of
documentation required to provide
evidence of a business transaction.
Shifting Media
• Before paper, we stored valuable historical data on stone
and papyrus.
• In the modern period, records of permanent historical value
were stored on paper and kept in filing cabinets
– When the cabinet was full, records were sent to the file
room.
• Now records are stored electronically on computers
– When the computer is ‘full’ – add more hard drives or
servers.
Basic skills to manage and maintain records have been
lost, replaced by infinite storage
Electronic Records
Management Goals
1. Bring the record to the forefront of system
design activities.
2. Identify electronic records functionality as
part of system design.
3. Create electronic records that support
legal, fiscal and evidentiary needs.
4. Create long term archival storage for both
retention schedule and historical purposes.
Goals, cont’d
5. Create electronic records that are accessible
and usable over time (non-proprietary
formats).
6. Integrate diverse document forms and
formats into records.
7. Identify need for internal and external
primary and secondary access to records.
Three Functional
Requirements for Electronic
Records Management &
Preservation
1. Records Capture – Records are created or
captured and identified to support the
business process and meet all records
management requirements related to the
process.
2. Records Maintenance and Accessibility –
Electronic records are maintained so that
they are accessible and retain their integrity
for as long as they are needed.
3. System Reliability – A system is
administered in accordance with best
practices in the information resource
management field to ensure the reliability of
the records it produces.
What happens when you do not have an
RM system?
Higher Standards
• As electronic records become more
integrated into society, producers of those
records will be held to higher standards of
conduct
– HIPAA
– Sarbanes-Oxley
– Federal and State Mandates
– Case Law
NE Public Records Laws
002.01 Record. The Records Management Act (Revised Statutes of Nebraska, Chapter
84, Article 12) defines a record as: "any book, document, paper, photograph, microfilm,
sound recording, magnetic storage medium, optical storage medium, or other material
regardless of physical form or characteristics created or received pursuant to law, charter,
or ordinance or in connection with any other activity relating to or having an effect upon
the transaction of public business." A record is information that is inscribed on a tangible
medium or that is stored in an electronic or other medium and is retrievable in
perceivable form.
http://statutes.unicam.state.ne.us/Corpus/statutes/chap84/R8412013.html
Records Retention
The foundation of democracy in
America is government accountability
to the people and permanency of our
culture and heritage.
So the question becomes…
who takes care of the
records, and do they have
the knowledge and
understanding of the new
technology?
Caretakers of Information
• Historically, records were sent to the file room; staff
maintained access to records and
managed their lifecycle based on need and
legal requirements
• Now records are managed by users and IT
staff, based on capacity and cost.
Taking into account the goals of
records management and the
function of records, what are we to
do?
Or what is the solution?
•Best Practice Models
•Standards
•Systems
•Digital Archive?
Best Practice Models
•OAIS Model
•Washington State Archives
http://www.digitalarchives.wa.gov/default.aspx
OAIS Model
www.digitalarchives.wa.gov
Standards
Whenever possible, follow the prevailing best
practices and standards.
Standards for E-Records….
•Hardware
•Software
•Formats
•Management
•Authenticity
Hardware
• File Room of the 21st century
• Capacity and Speed double every 18 months
• Many choices
– Tape
– Optical
– Spinning Disc
First Immutable Law of Digital Archiving
“What hardware you use today will be obsolete
within four years”
Washington State Digital Archives Network Configuration, May 2, 2005
[Network diagram. Labels shown: Citizen Internet User; State/Local Office;
Firewall; DMZ; Internet Send/Receive over http/https (80/443) and Secure FTP (22);
Hardware Load Balanced IIS web servers (DA-WEB1, DA-WEB2) behind 2 Coyote HW
Loadbalancers; DA-Tectia1 (Secure FTP); Services Tier (Search Services,
DA-SE1 to DA-SE5); Processing Tier with BizTalk 2004 Receive/Send Locations
(DA-BIZ-RS1, DA-BIZ-RS2), DA-BIZ-INBOX1 (RAW Data “Temp” Storage, Image
Conversion, XML “Temp” Storage) and DA-Media1 & 2 (Images & Streaming Media);
Data Tier with the Digital Archives Asset Metadata Cluster and Database Cluster
(MS SQL Server 2000, MS Clustering Active/Passive and Active/Active);
Domain Controllers (DA-DC1, DA-DC2, DA-DMZ-DC1, DA-DMZ-DC2);
EMC Clariion CX700 SAN (1TB 15K FC, 4TB 7200 SATA);
ADIC iScalar 2000 Tape Library (10 LTO-2 drives, 500 tape slots).
Servers: HP DL380 (2 × 3GHz HT CPU, 2GB RAM), HP DL580 (4 × 3GHz CPU, 4GB RAM)
and HP DL740 (8 × 3GHz HT CPU, 8GB RAM), running MS WIN 2003 std or ent.]
Digital Archives Hardware
• Network – Cisco Backbone end to end
– LAN and SAN
• EMC – SAN storage
– 5TB now, 20TB by end of year
• HP – Servers and desktops
• ADIC – Tape Library for offsite, disaster
recovery (nightly or weekly backup,
remember Katrina and 9/11)
• Microsoft – Software and Development
Archival Software and File Format
Standards
• Native
• ASCII
• TIF
• PDF/A (Used by the Federal Courts)
• XML http://www.thexmltoolkit.org/guides.asp (metadata
and interoperability)
• DoD 5015.2-STD compliant system
• Nebraska State Records Guidelines
Whenever possible seek the
Open, documented solution!
Remember WordStar and DBase II ???
Metadata & Interoperability
• Cross cultural and contextual boundaries
• Interoperability
• Interoperability & Metadata schema
Interoperability and XML
Content Management
• Essential to maintain control of the information
explosion
• Allows hard-coded rules and information
exchange
• BUT still requires strong knowledge,
understanding and implementation of basic
records management
Second Immutable Law of Digital Archiving:
“Data is Data, a Record is a Record. It is the
content that drives retention, not the media”
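One way to read the Second Law in practice is that a retention lookup should depend on the record series and never on the storage medium. The following is a minimal sketch under that reading. The series names and retention periods are hypothetical placeholders, not values from any Nebraska or Washington retention schedule.

```python
from datetime import date

# Hypothetical schedule: series names and periods are illustrative only,
# NOT taken from any real retention schedule.
RETENTION_YEARS = {
    "meeting_minutes": None,        # None means permanent retention
    "purchase_orders": 5,
    "routine_correspondence": 2,
}


def disposition_date(series, closed_on):
    """Return the date a record becomes eligible for disposition.

    The function deliberately takes no 'medium' argument: paper, e-mail
    and database rows in the same series get the same answer.
    Returns None for permanent records.
    """
    years = RETENTION_YEARS[series]
    if years is None:
        return None
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # A Feb 29 closing date that lands in a non-leap year.
        return closed_on.replace(year=closed_on.year + years, day=28)
```

For example, `disposition_date("purchase_orders", date(2005, 5, 2))` gives `date(2010, 5, 2)` whether the purchase order was filed on paper or stored on a server.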
‘Content Management’
• DoD 5015.2-STD compliant system
http://www.dtic.mil/whs/directives/corres/pdf/50152std_061902/p50152s.pdf
• Wrap original file in native format
• Wrap XML copy
• Apply metadata & XML for indexing,
searching & retrieval
• Provide chain of custody & authenticity
‘Content Management’
• Microsoft Solution
• SQL Server back end
• BizTalk translation utility
• SSH Tectia for secure transport
http://www.ssh.com/products/
Washington State Archives Case Study
Authenticity
• Maintain Chain of Custody
• In the care of trusted 3rd party
• Received from trusted, known source
Data Security
• Encrypted SSH FTP transmission
• Issue Digital Certificate
• Verify IP and computer information
• MD5 Hash on all original files
• Copy of FTP on tape prior to ingestion
• DB backups on tape
• Record Level Security for confidential Info
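The MD5 hash step in the list above can be illustrated with Python's standard `hashlib` module. This is a minimal sketch, not the Washington State Archives implementation. The function names `md5_of_file` and `verify_fixity` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large transfers do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest):
    """Return True when the file still matches the digest that was
    recorded when the file was received."""
    return md5_of_file(path) == recorded_digest.lower()
```

The digest is computed once on receipt and stored with the record. Recomputing it later and comparing shows whether the file has changed. `hashlib.sha256` can be substituted for `hashlib.md5` without other changes if a stronger hash is wanted.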
Record Level Security
• Restrict records at item, field or series
level
• Restrict to individual, dept, office or global
• Uses authenticated login to reveal fields
• Anonymous users see ‘Restricted’
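Field-level restriction as described above might look like the following sketch. It covers only the field-level, per-user case. The function name, the field names and the user list are assumptions made for illustration, not part of the system described in the slides.

```python
RESTRICTED = "Restricted"


def visible_record(record, restricted_fields, user=None, allowed_users=frozenset()):
    """Return a copy of `record` as a given user should see it.

    Authenticated users listed in `allowed_users` see every field.
    Anyone else, including anonymous users (user=None), sees the
    placeholder 'Restricted' in place of each restricted field.
    """
    if user is not None and user in allowed_users:
        return dict(record)
    return {
        key: (RESTRICTED if key in restricted_fields else value)
        for key, value in record.items()
    }
```

Calling `visible_record({"name": "Jane Doe", "hospital": "General"}, {"hospital"})` with no user returns the record with `hospital` replaced by `'Restricted'`, while the same call for an allowed user returns it unchanged.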
Deep Storage XML
Deep Storage XML Schema
Record Common
• Who
• What
• When
• Where
• Original File
• ‘web’ file
• Security
• Fixity
Vital Records
• Type
Birth
• Date of
• Father, Mother
• Hospital
Ingestion Process
• MUST be flexible
• Microsoft BizTalk 2004
• Transforms, adds metadata based on
business rules
• Creates ‘deep storage’ copy wrapping
original file in XML, with Hash
• Creates ‘web’ version of original file
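The 'deep storage' copy described above, an XML wrapper carrying the original file, its metadata and a hash, can be sketched with Python's standard library. The slides attribute this step to BizTalk 2004. The code below is only an illustration of the idea, and the element names (`DeepStorageRecord`, `Metadata`, `Field`, `Fixity`, `OriginalFile`) are hypothetical, not the actual Deep Storage XML Schema.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap_for_deep_storage(original_bytes, filename, metadata):
    """Build an XML document holding metadata, an MD5 digest and the
    original file bytes (base64-encoded so any native format fits)."""
    root = ET.Element("DeepStorageRecord")  # hypothetical element name
    meta = ET.SubElement(root, "Metadata")
    for key, value in metadata.items():
        ET.SubElement(meta, "Field", name=key).text = str(value)
    fixity = ET.SubElement(root, "Fixity", algorithm="MD5")
    fixity.text = hashlib.md5(original_bytes).hexdigest()
    original = ET.SubElement(root, "OriginalFile", name=filename, encoding="base64")
    original.text = base64.b64encode(original_bytes).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unwrap(xml_text):
    """Recover the original bytes and check them against the stored digest."""
    root = ET.fromstring(xml_text)
    data = base64.b64decode(root.findtext("OriginalFile"))
    if hashlib.md5(data).hexdigest() != root.findtext("Fixity"):
        raise ValueError("fixity check failed")
    return data
```

Because the original bytes travel inside the wrapper unchanged, `unwrap(wrap_for_deep_storage(data, name, meta))` returns `data` exactly, and any later corruption of the wrapped copy is caught by the digest comparison.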
Archive Database
• Designed around latest industry standards
• Open source, non-proprietary file storage
• Applies metadata ‘tags’ to save information
about record
– creator, date, agency, subject, etc.
• Provides chain of custody & authenticity of
record
• Allows search and retrieval of archival records
through a web page
Risks
• Distributed, non-standardized environment
• Limited technology expertise in some
agencies
• Unpredictable data growth rate
• Few business models
• Emerging technologies
• Limited internal expertise
Management Issues
• Authenticity of record
• Metadata
• File naming conventions
• Corporate Culture
• Start small with e-mail, web page
• Use existing retention schedules
• Educate
• Shift AWAY from desktops…
• …And move to central servers
• Management Software is a must!
• Privacy of sensitive data
Third Immutable Law
“Anything that you do today will need a major
overhaul in two years or sooner”
Technology and industry are changing at
unprecedented rates… but more records
are ‘lost’ every day!
– The key is to be flexible and to address change with
systematic forethought
How to handle Records over
the Web.
Open Record
Restricted Record
Confidential
E-Commerce
Add to Shopping Cart
• Ecommerce Functionality
– Add to Shopping cart
Shopping Cart
Billing Information
View and Submit Order
Why a Digital Archives?
• Comply with statutory & regulatory mandates.
– The law requires preservation of certain public records – it
doesn’t specify whether those records are paper or electronic. All
records must be given the same care.
• Avoid loss of legal & historical records
– As technology changes, the older media (5 ¼” floppy disks, for
instance) become harder to read.
• Centralize Records
– Centralization means uniformity in maintenance
– ‘Trained professionals’ serve as caretakers
• Preserve rare and ‘at-risk’ paper records
• Improve access for citizens
– By centralizing historical electronic records in one location, ‘one-stop
shopping’ will provide the information more quickly and easily
The Digital Archives will:
• Preserve electronic records with long-term
legal, historical and/or fiscal significance
• Assure platform-neutral retrieval 50, 100, or
more years from now
• Provide security back-up of certain
permanent electronic legal records (courts,
vital records, land records, etc.)
Acknowledgements
• Adam Jansen, Digital Archivist for the
Washington State Archives.
• Dr. Ed Papenfuse, State Archivist for the
Maryland State Archives.
• Andrea Falling, State Archivist for the
Nebraska State Historical Society.
• Cathy Danahy, Assistant Director of the
Nebraska Records Management Division.