Accessing Your Electronic Archives in 5, 50, 500 Years


Email, IM, Wikis, and Blogs –
Oh MY!
ARMA Bismarck/Mandan
Spring Seminar
Jesse Wilkins
April 10, 2008
Seminar Agenda
1. Active Email Management
2. Instant Records: Managing Your Instant
Messaging
Lunch: “Oh, The Places You’ll Go!”
3. Digital Preservation
4. Blogs, Wikis, RSS, Oh MY!
Active Email Management
Session 1
Agenda
• Email management drivers
• Email management today
• Email management technologies
• Elements of an email policy
EMAIL MANAGEMENT DRIVERS
Email – defining the issue
• First email was sent in 1971
• Today more email is sent every day
than the USPS delivers in a year
– 11 billion emails a day in the US alone
– More than 57 billion a day world-wide
– NOT including spam
• 60% or more of business-critical
information is stored within
messaging systems
Why are we sending so much email?
• It’s easy
• It’s asynchronous
• It’s convenient
• It’s less formal
• It’s ubiquitous and platform-neutral
• There’s a written record of communication
Business issues
• Email storage costs
– Up to 200 GB email per month for
1,000-user company
– Costs to add and manage storage
– Costs to back up to tape
– Costs to restore
• Productivity costs
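As a rough back-of-the-envelope illustration of the storage figure above, the sketch below projects how an archive grows if the 200 GB per month volume stays flat and nothing is deleted. The function name and the flat-growth assumption are illustrative only and do not come from the slide.

```python
def projected_email_storage_gb(gb_per_month, months):
    """Cumulative storage if nothing is ever deleted (flat monthly volume assumed)."""
    return gb_per_month * months


# Figures from the slide: up to 200 GB per month for a 1,000-user company.
GB_PER_MONTH = 200
USERS = 1_000

# Per-user share, using 1 GB = 1000 MB for simplicity.
per_user_mb_per_month = GB_PER_MONTH * 1000 / USERS
one_year_gb = projected_email_storage_gb(GB_PER_MONTH, 12)
five_year_gb = projected_email_storage_gb(GB_PER_MONTH, 60)

print(per_user_mb_per_month)  # 200.0 (MB per user per month)
print(one_year_gb)            # 2400 (GB after one year)
print(five_year_gb)           # 12000 (GB after five years)
```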
Business issues cont’d
• Email retrieval costs
– It takes more than 11 hours to recover
an email more than 1 year old from an
archive
– Typically have to restore the entire
tape to a spare (!) server to find the
desired message
– 29% of organizations would not be
able to restore an email message
over 6 months old
Legal issues
• Electronic discovery for a Fortune 500
company averages $750,000 per case
• 75% of demands for discovery are for
email
• Courts want discovery in native
format…
• …but may also require that it be
provided in an accessible format
Legal considerations for messages
• Messages are discoverable –
whether they are records or not
• Message archives are discoverable,
regardless of the format or storage
medium
• The “deleted messages box” is
discoverable
• Personal copies are discoverable
When is an email a record?
• When statutorily defined
• When it documents a business transaction
• When it memorializes a business decision
• When the attachment is a record
• When it is the only written record of something
EMAIL MANAGEMENT TODAY
Email management defined
According to AIIM, The ECM Association, the
essence of email management is this:
“As the de facto standard for business
communication, removing emails from the
server and saving them to a repository isn't
enough. Email must be classified, stored,
and destroyed consistent with business
standards – just as any other document or
record.”
Approaches to managing email today
Policy approaches to retention:
1. Do nothing
2. Let users manage their own email
3. Keep everything forever
4. Delete all messages older than X
5. Limit mailbox size to X
6. Declare and manage email as records
Approaches to managing email today
Technology approaches to retention:
1. Outsource it!
2. Server-based rules
3. Client-based rules
4. Decentralized – employees do it
• Messages on the server
• Messages in .PST/.NSF files
Email management is NOT:
• Saving all email messages forever
• Saving all email messages in the
messaging application
• Setting mailbox time limits
• Setting mailbox size limits
• Declaring “email” as a record
series
– Or as simply “correspondence”
• Doing nothing
General principles
• Email management is part of
time management
• Email is a medium, not an action
• Email should not be used for
everything
• Email should be kept as long as
needed – and no longer
Who captures the message?
• YOU have to capture an email:
– You receive from outside the
organization
– You send, either internally or to
someone outside the organization
• Designate someone to
capture messages sent to
groups/lists
Emails that are not captured
• Transitory messages that are not timely
• Personal messages unrelated to business
• “Me-too” messages
• Messages already captured by someone else
EMAIL MANAGEMENT TECHNOLOGIES
Messaging system
• Not built to store massive amounts of messages
– And attachments
– And manage as records
• Difficult to search across
inboxes
– Discovery, auditing
Print & file
• Common approach
• Challenges:
– Loss of metadata
– Attachments
– Volume to print and to file
– Authenticity (phishing)
Backup tapes
• Backups store data, not files or
messages
• Designed for the “smoke & rubble”
scenario
• Multiple copies of data
• Readability of older tapes
– Format, media, hardware
Email management applications
• Move messages out of the messaging
application
• Typically use a rules engine
• May provide simple retention management
• Single instance storage
• Many different capabilities available
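To make the "rules engine" idea above concrete, here is a minimal sketch in Python. It assumes a first-match-wins model; the `Message` and `Rule` types, the rule names, and the action labels are all hypothetical, and real email management products define their own rule syntax. The reference date is fixed so the output is repeatable.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass
class Message:
    sender: str
    subject: str
    received: date
    size_kb: int


@dataclass
class Rule:
    name: str
    matches: Callable[[Message], bool]
    action: str  # hypothetical labels: "archive", "declare_record", "leave"


def apply_rules(message: Message, rules: List[Rule]) -> str:
    """Return the action of the first matching rule, or "leave" if none match."""
    for rule in rules:
        if rule.matches(message):
            return rule.action
    return "leave"


# Fixed reference date (assumption) so the age rule is deterministic.
TODAY = date(2008, 4, 10)

# Hypothetical rules, evaluated in order.
RULES = [
    Rule("contracts are records",
         lambda m: "contract" in m.subject.lower(), "declare_record"),
    Rule("large messages",
         lambda m: m.size_kb > 5_000, "archive"),
    Rule("older than 90 days",
         lambda m: (TODAY - m.received).days > 90, "archive"),
]

msg = Message("vendor@example.com", "Signed contract attached", date(2008, 4, 1), 120)
print(apply_rules(msg, RULES))  # declare_record
```

Ordering the rules matters in this design: a contract that is also large is declared a record rather than simply archived, because the records rule is listed first.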
Email management technologies
• Email archiving
• Personal archive file management
• Email encryption and digital signatures
• Email compliance
• Email discovery
• Email security
• Policy management
ECRM solutions
• Most systems support email management
• May run at server or client
• Many support single-instance storage
• May allow declaration, management of messages as records
• Varying support for attachment management, metadata management
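Single-instance storage, mentioned above, can be sketched as a content-addressed store: each distinct payload is kept once and every mailbox holds only a reference to it. The class below is an illustrative assumption using a SHA-256 digest as the key, not a description of how any particular product implements it.

```python
import hashlib
from typing import Dict


class SingleInstanceStore:
    """Keeps one copy of each distinct payload, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._ref_counts: Dict[str, int] = {}

    def put(self, payload: bytes) -> str:
        """Store the payload if it is new; always return its reference key."""
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = payload
        self._ref_counts[digest] = self._ref_counts.get(digest, 0) + 1
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def stored_copies(self) -> int:
        return len(self._blobs)

    def references(self, digest: str) -> int:
        return self._ref_counts.get(digest, 0)


store = SingleInstanceStore()
attachment = b"Q1 budget spreadsheet bytes"
# The same attachment is sent to 50 mailboxes.
keys = [store.put(attachment) for _ in range(50)]
print(store.stored_copies())      # 1
print(store.references(keys[0]))  # 50
```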
ELEMENTS OF AN EMAIL POLICY
Email policy principles
• Email belongs to the organization, not the
individual
• Email is not a records series unto itself
• Email management program must comply with
appropriate regulatory requirements
• Policy has to be followed and enforced!
Email policy elements
• Acceptable/appropriate usage
• Personal usage
• Access to external messaging systems
• Effective email usage
• Ownership of email
• Retention and disposition
• Legal issues
– Holds
– Discovery and production
Elements of an email policy
• Mobile and web-based email
• Backups
• Archival
• Privacy
• Security
• Retention and disposition
• Training
• Audit and compliance
Questions?
Conclusion
• We have to manage messaging
technologies better
• Start with policies and
procedures
• Technology can help
• Communicate, communicate,
communicate
• Enforce the program
Instant Records: Managing Your
Instant Messaging
Session 2
Agenda
• Instant messaging today
• How IM works
• Approaches to managing IM
• IM policies
• Better IM through technology
What is instant messaging?
• Communication between users in real time over
the Internet
• Most often one-to-one; some clients support
group chat
• Indicate presence and status
• Send and receive messages
• Manage contacts (“buddy lists”)
The IM client
Origins of instant messaging
• 1980s: BBSs allowed some person-to-person
chat in real time
• Early 1990s: “On-line messages”
• 1996: ICQ debuts
• 1998: Introduction of enterprise IM
– Lotus Sametime
• 2000: Open source-based Jabber debuts
Where is IM today?
• 12+ billion instant messages sent per day in the
U.S.
• More than 46.5 billion per day worldwide
by 2009
• 1.2 billion users worldwide by 2009
• 96% of organizations use IM today
• Up to 75% of usage is commercial clients
Where is IM today?
• 34% of current traffic is business-related
• Most IM networks support audio, video
• Most IM networks support file transfer
• Most IM networks are not interoperable
• Most IM networks are not managed
The four stages of IM
• Unfamiliarity
– “We don’t use IM – that’s for my kids!”
• Prohibition
– “Use of IM is grounds for dismissal”
• Acceptance
– “Don’t do evil”
• Optimization
– Compliance, efficiency key goals
IM issues 1 - informality
• IM sessions are casual and employ cryptic
shorthand
– IMHO, AFAIK, TTYL, LMAO
• IM sessions are free-flowing
• User names not standard (and not under
organization’s control)
– SilentSmurf, 2Hot2Handle (!)
• 31% of organizations have a policy regarding IM
usage
IM issues 2 - retention
• Sessions typically not saved on a central server
– May require users to “turn on archiving”
– Archives are retained on individual PCs
– Archives often saved as plaintext or XML
• IM is still subject to retention requirements
– According to content, not as own series
• 13% of organizations retain IM effectively
Retention
Retention cont’d
IM issues 3 - functionality
• Threads stored by users/dates, not by subject
– No subject line to index!
• Conference/group chat capabilities
• File transfer capabilities
– Which may also bypass other filters such as email
size limits and compliance filters
• Active URL transmission
• Audio and video capabilities
IM issue 4 - interoperability
• Commercial IM networks originally proprietary
• More standardization today
– Session Initiation Protocol (SIP) for Instant Messaging
and Presence Leveraging Extensions (SIMPLE)
– eXtensible Messaging and Presence Protocol
(XMPP)
• Different applications use SIMPLE vs. XMPP
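For a sense of what a standardized message looks like on the wire, XMPP carries each instant message as an XML "message" stanza with "from", "to", and "type" attributes and a "body" child. The sketch below pulls those fields out of a sample stanza so they could be handed to an archive. The sample addresses and the `capture_fields` helper are hypothetical, and the sample omits the `jabber:client` namespace that a live client stream would normally carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample stanza (namespace omitted for brevity).
STANZA = """
<message from="juliet@example.com/balcony" to="romeo@example.net" type="chat">
  <body>Are we still on for the 2pm review?</body>
</message>
"""


def capture_fields(stanza_xml: str) -> dict:
    """Extract the sender, recipient, type, and text of one message stanza."""
    root = ET.fromstring(stanza_xml.strip())
    body = root.find("body")
    return {
        "from": root.get("from"),
        "to": root.get("to"),
        "type": root.get("type"),
        "body": body.text if body is not None else "",
    }


fields = capture_fields(STANZA)
print(fields["from"])  # juliet@example.com/balcony
print(fields["body"])  # Are we still on for the 2pm review?
```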
APPROACHES TO MANAGING IM
First step for handling IM
Prohibit it!
Prohibition and technology
• Easy install
• Can't block "server" URLs, IP addresses
• Port-seeking behavior
• Simulate TCP connection to IM service using HTTP and polling
• Web-based IM clients: MSN Web Messenger, Yahoo Web Messenger, Google Talk, meebo, many others
Web-based IM
meebo
Meebome on a blog
Prohibition and culture
• Employees use it for legitimate reasons
– Informal and real-time
– Presencing
– Email overload
• Customers want it!
– See above
Top 5 steps for handling IM
1. Update policies to address proper usage
2. Train users on the policies
3. Audit and review adherence to the policy and
address gaps
4. Implement IM gateway or enterprise instant
messaging
5. Export IM traffic to archival or records
management application
INSTANT MESSAGING POLICIES
Acceptable usage policy
• Whether personal usage is allowed
• How personal usage may be constrained
• How business usage may be constrained
• Commercial vs. enterprise IM
• Disclaimers
Proper “netiquette”
• Same as email, e.g.
– No off-color jokes
– No disparaging remarks
– Proper business tone
– Nothing that wouldn’t be appropriate for the front page of the newspaper
• Proper naming, if using consumer IM
Content restriction policy
• What is allowed to be transmitted?
– Attachments
– Sensitive information
– URLs and hyperlinks
• To whom may it be transmitted?
– Internal vs. external
– Public IM vs. federated EIM
– Certain groups or users
Retention policy
• *That* it will be done
• How it will be done
• A note on wiretaps
Training
• Contents of the policies
– Proper usage
– Content transmission
– Archival
• How to identify potential records
• IM ownership and privacy
• Retention and archival
• Security
BETTER IM THROUGH TECHNOLOGY
Enterprise IM options
• Gateways:
– Provide retention and auditing capabilities for
commercial IM such as AIM, ICQ, YIM, MSN
– May provide some interoperability
– Audit usage, compliance with usage policies
• Enterprise instant messaging (EIM):
– Everyone on the same (corporate) client
– Tighter integration into directory services
– Much more granular control over
functionality and usage
Enterprise IM solutions
• Gateways:
– Akonix L7
– Symantec IMLogic
– Facetime IMAuditor
– CipherTrust IronIM
Enterprise IM solutions
• EIM solutions:
– IBM Lotus Sametime
– Microsoft Live Communications Server
– IMiN
– JabberNow
Minimal reqts for IM solutions
• Provide full-text search capability across all
messages
• Audit content
– Keyword/content-based
– Context-based (users, time, etc.)
• Capture and store all messages
• Export to controlled repository
• Review/markup capability (e.g. for auditors)
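The keyword-based and context-based audit requirements above can be illustrated with a small in-memory search over captured messages. This is only a sketch: the `IMRecord` type, the sample users, and the `audit` function are hypothetical, and a production system would use a real full-text index rather than a linear scan.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class IMRecord:
    sender: str
    recipient: str
    sent_at: datetime
    text: str


def audit(records: Iterable[IMRecord], keywords: Iterable[str],
          sender: Optional[str] = None,
          since: Optional[datetime] = None) -> List[IMRecord]:
    """Return records containing any keyword, optionally filtered by context."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for record in records:
        # Context-based filters: who sent it and when.
        if sender is not None and record.sender != sender:
            continue
        if since is not None and record.sent_at < since:
            continue
        # Keyword filter: case-insensitive whole-word match.
        words = set(re.findall(r"[a-z0-9']+", record.text.lower()))
        if wanted & words:
            hits.append(record)
    return hits


ARCHIVE = [
    IMRecord("asmith", "bjones", datetime(2008, 4, 1, 9, 15), "Can you send the merger draft?"),
    IMRecord("bjones", "asmith", datetime(2008, 4, 1, 9, 16), "TTYL, in a meeting"),
    IMRecord("cdoe", "asmith", datetime(2008, 4, 2, 14, 0), "Merger call moved to 3pm"),
]

hits = audit(ARCHIVE, ["merger"])
by_cdoe = audit(ARCHIVE, ["merger"], sender="cdoe")
print(len(hits))     # 2
print(len(by_cdoe))  # 1
```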
Minimal reqts for IM solutions
• Encryption of external communications
• Route internal messaging inside firewall
• Attachment blocking and notification
– Virus scanning of attachments if allowed
– Storage of attachments if allowed
• URL blocking/filtering
• Insert disclaimers into message stream
Minimal reqts for IM solutions
• Federation
– Commercial, enterprise
• Provide identity management
– Integration with directory services/LDAP
– Enforce corporate naming conventions
• Enforce communication restrictions
– Ethical walls
– External vs. internal communications
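The naming-convention and ethical-wall requirements above could be enforced with checks along the following lines. Everything here is an assumption for illustration: the user-to-group mapping stands in for a directory lookup, the wall between the two groups is invented, and the naming pattern (lowercase letters only) is just one possible convention.

```python
import re
from typing import Dict, FrozenSet, Set

# Hypothetical directory data; in practice this would come from LDAP.
USER_GROUPS: Dict[str, str] = {
    "asmith": "research",
    "bjones": "investment_banking",
    "cdoe": "compliance",
}

# Pairs of groups that must not exchange messages (the "ethical wall").
WALLS: Set[FrozenSet[str]] = {frozenset({"research", "investment_banking"})}

# Hypothetical corporate naming convention: lowercase letters only.
NAME_PATTERN = re.compile(r"[a-z]+")


def name_is_compliant(screen_name: str) -> bool:
    return NAME_PATTERN.fullmatch(screen_name) is not None


def may_communicate(sender: str, recipient: str) -> bool:
    """Block unknown users and any pair of groups separated by a wall."""
    sender_group = USER_GROUPS.get(sender)
    recipient_group = USER_GROUPS.get(recipient)
    if sender_group is None or recipient_group is None:
        return False
    return frozenset({sender_group, recipient_group}) not in WALLS


print(may_communicate("asmith", "bjones"))  # False
print(may_communicate("asmith", "cdoe"))    # True
print(name_is_compliant("SilentSmurf"))     # False
```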
Questions?
Conclusions
• IM is a communications medium
• IM has to be managed according to
content
• It is difficult to prohibit it for a number of
reasons
• It can be managed with policies and
procedures
• Technology can help
LUNCH!
Digital Preservation
Session 3
Agenda
• The problem with digital information
• Approaches to digital preservation
• Strategies for long-term access
THE PROBLEM WITH DIGITAL INFORMATION
The problem with digital information
Digital documents last forever – or
five years, whichever comes first.
--Jeff Rothenberg, RAND Corp.
The problem with digital information
• Explosion of information
• Documents and files are
increasingly “born digital”
• Digital formats support more
complex information objects
• Digital preservation does not just
happen – it must be actively
pursued
– And IT can’t do it alone
Media
• There are no archival-class
media for storing digital
information
– Media can be damaged,
scratched, stretched
• And if there were –
it wouldn’t matter!
Hardware compatibility
• Technical obsolescence
– 8” floppy disks, laser video discs
• Generational changes
– Floppy disks, CDs
• Non-standard formats
– ZIP drives, LS-120
• Rapid rate of change
Software compatibility
• Between applications
– Microsoft Word, Corel WordPerfect
• Between platforms
– Word, Word for Mac
• Between versions
– Word 1.0, Word 2007
Security and encryption
• Passwords can be lost
• Some applications don’t play nicely with
encrypted or protected files
• Some applications don’t recognize security features – and ignore them
A note about standards
• Formal standards are agreed to by users,
vendors, industry experts, and managed by
standards organizations.
– XML, PDF
• Ad hoc standards are controlled by vendors or
smaller groups and are considered standards
because they are in widespread use
– Microsoft Word
• Standards protect the organization!
APPROACHES TO DIGITAL PRESERVATION
Digital preservation strategies
• Analog storage
• System archival
• Emulation
• Conversion
• Migration
• Each has its own strengths & weaknesses
Analog storage
• Analog storage suffers from
a number of issues:
• Search and retrieval issues
• Storage requirements and
costs
• Data loss, particularly
for rich media formats
System archival
• Maintain copy of original hardware, software,
operating system, and information objects
• Still run into issues with media and hardware
lifespan
• Centralizes access to locations with older
systems
• Increasing number of systems required to
ensure access to everything
• Difficult to ensure everything is taken into
account
Emulation
• Virtual recreation of original environment
• Does not require any conversion
• Requires periodic refreshing of the emulation
environment
• Still have issues around
media and, maybe,
hardware to read it
Conversion
• Move from proprietary to standard
– HTML to XML
– Windows bitmap to JPEG or TIFF
– Excel to ASCII text
• Can be labor-intensive
• Often results in some loss of data
– Proprietary formatting
– Rich objects, images, formulas, etc.
Migration
• Digital media doesn’t last forever…
• …and neither does the hardware
• Media must be refreshed while it’s still readable
• Very labor intensive
• Often results in loss of some information
– Migration over generations often more reliable than migration through generations
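One common safeguard when refreshing media is to checksum each file before and after the copy, so that silent corruption is caught while the source is still readable. The sketch below assumes plain files on mounted volumes; the function names and the temporary directories standing in for "old" and "new" media are illustrative.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(source: Path, destination: Path) -> str:
    """Copy source to destination and verify it; raise if the copy differs."""
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"fixity check failed for {source}")
    return after


# Demonstration with temporary directories standing in for old and new media.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_media" / "report.txt"
    old.parent.mkdir()
    old.write_bytes(b"minutes of the 1998 board meeting")
    new = Path(tmp) / "new_media" / "report.txt"
    checksum = refresh_copy(old, new)
    copied_ok = new.read_bytes() == old.read_bytes()

print(copied_ok)  # True
```

Storing the returned checksum alongside the file gives the next migration something to verify against.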
Migration cont’d
The Domesday Project
• Domesday book written in 1086
• In 1986, BBC created interactive
presentation using LaserVision
LV-ROM
• By 2002 the discs were
unreadable
• Through significant effort and
the use of migration and
emulation, the Domesday
presentation remains available
STRATEGIES FOR LONG-TERM ACCESS
Recommendations – 5 years
• Capture information using no compression or
lossless compression
• Use standard file and media formats
• Select high-quality media that will last 5-10 years
• Capture relevant metadata
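"Lossless" in the first recommendation means the original bytes can be recovered exactly after decompression. The short sketch below demonstrates that property with Python's standard `zlib` module; the sample text is invented for the illustration.

```python
import zlib

# Hypothetical sample content, repeated so there is something to compress.
original = b"Retention schedule, series 100-05: keep 7 years. " * 200

compressed = zlib.compress(original, level=9)
restored = zlib.decompress(compressed)

print(restored == original)             # True: nothing was lost
print(len(compressed) < len(original))  # True: but less space is used
```

Lossy formats offer no such guarantee, which is why the recommendations steer away from them for preservation copies.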
Recommendations – 50 years
• Capture information using no compression or
lossless compression
• Capture information in standard formats or
formal descriptions
• Select high-quality media and plan for migration
• Capture relevant metadata
• Do not use encryption or passwords on
individual documents
Recommendations – 500 years
• Capture information in standard formats or
formal descriptions
• Select high-quality media and plan for migration
• Capture and embed relevant metadata
• Consider converting to analog
• Do not use encryption or passwords on the
individual documents
Questions?
Conclusions
• Digital preservation requires work
• Ultimately a question of tradeoffs
– Cost to preserve
– Cost of not preserving
– Exactly what must be preserved
• Pursue multiple preservation strategies
• Standards can help preservation efforts
• TANSTAAMB
Blogs, Wikis, and RSS, Oh MY!
Session 4
Agenda
• Blog This!
• Wiki-Wiki
• Really Simple RSS
BLOG THIS!
Blogs in Plain English
Source: Common Craft
What’s a blog?
• Started as online diaries
• Today used more as lightweight CMS
• Hides complexity of Web publishing
• Generally arranged in chronological order, most recent at top
The ARMA blog
Informata
Technorati
Blogging basics
• Centralized - one person or group posts, others
can only read the posts
• Comments and trackbacks
• Easy to link to other pages
• Easy to blog using toolbars
• Important to keep current!
Getting started
• Sign up for a free hosted service
• Start posting
• Keep posting!
• Make it relevant if you want it to be read…
• Consider commercial solutions
– More control over content
– Finer-grained control over access, updates
Blog Records Management
• If the CEO is blogging, is it a record?
– Maybe…
• Most blogging systems support basic content
management capabilities
• Review comments periodically
– Or consider turning them off
• Track changes to postings, comments
– Document reason for changes
Blog solutions - hosted
• Wordpress
• Typepad
• Blogger
• LiveJournal
• Myspace.com
• Blog.com
• MSN Spaces
• Yahoo 360°
Blog solutions - internal
• Movable Type Enterprise
• Traction Teampage
• Blogtronix Enterprise
• Sharepoint 2007
• Drupal
• Telligent Community Server
• UserLand Manila and Radio UserLand
WIKI-WIKI
Wikis in Plain English
Source: Common Craft
The wiki basics
• Collaborative website
• Organized as linked articles
• Hides complexity of HTML from users
• Easy to add articles
• Easy to link articles
• Easy to correct mistakes
Wiki-Wiki
• Wikipedia: 2,100,000+ articles in English
– More than 7 million in 250+ languages
• Wiktionary: 598,000+ definitions in English
• WikiQuote: 13,400+ quotations
Wikipedia
Wikipedia RM article
Wiki business cases
• Project management
• Collaborative authoring and review
• Knowledge management
Wikis and RM
• Excellent for collaboration on records
management policies, procedures, RRS, etc.
• Changes tracked automatically
– Need to save logs of changes
• Periodically may need to review/clean up
– “Spam” comments/articles
– Outdated materials
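As an illustration of the kind of change log worth saving, the sketch below produces a unified diff between two versions of a page using Python's standard `difflib` module. The page name and the policy text are invented, and real wiki engines keep their revision history in their own formats.

```python
import difflib


def change_log(old_text: str, new_text: str, page: str) -> str:
    """Return a unified diff describing what changed on the page."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{page} (previous)",
        tofile=f"{page} (current)",
        lineterm="",
    )
    return "\n".join(diff)


# Hypothetical before-and-after text of a retention schedule page.
old = "Retain invoices for 5 years.\nDestroy drafts after 90 days."
new = "Retain invoices for 7 years.\nDestroy drafts after 90 days."

log = change_log(old, new, "RetentionSchedule")
print(log)
```

Saving this output with the date and the editor's name gives a record of what changed, which can then be paired with a documented reason for the change.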
Change tracking
Implementing a wiki
• Sign up for a free hosted service
• Start writing
• Invite others to write
• Moderate…or not
• Consider a commercial wiki
– MUCH more control over look & feel, access rights/security, content
Wikis - hosted
• Atlassian Confluence Hosted
• Central Desktop
• Cyn.in (“bliki”)
• EditMe
• pbWiki
• Socialtext
• Wikia (uses MediaWiki)
• Wikispaces
• Zoho Wiki
Wikis - internal
• Atlassian Confluence
• MediaWiki
• Sharepoint 2007
• Socialtext Managed Service Appliance
• TWiki
REALLY SIMPLE RSS
Really Simple Syndication
• XML-based content syndication language
• Makes it easy for users to find your content
• Push instead of pull
• Most blogs and wikis support RSS natively
RSS in Plain English
Source: Common Craft
How RSS works
• Find a website with a feed
• Subscribe to the feed using a reader
• Reader polls the website periodically and
downloads updated feed items
• Read the feeds in the reader!
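The polling step above can be sketched as follows: the reader parses the feed's XML, compares each item's identifier with those it has already seen, and keeps only the new ones. The sample feed, its URLs, and the `new_items` helper are hypothetical, and a real reader would fetch the XML over HTTP on a schedule rather than use an embedded string.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Set

# Hypothetical RSS 2.0 feed as a reader might download it.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Records Blog</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>Email policy checklist</title>
      <link>http://example.com/email-policy</link>
      <guid>http://example.com/email-policy</guid>
    </item>
    <item>
      <title>IM retention basics</title>
      <link>http://example.com/im-retention</link>
      <guid>http://example.com/im-retention</guid>
    </item>
  </channel>
</rss>
"""


def new_items(feed_xml: str, seen: Set[str]) -> List[Dict[str, str]]:
    """Return items not seen on earlier polls, and remember them in `seen`."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when an item has no guid element.
        guid = item.findtext("guid") or item.findtext("link") or ""
        if guid and guid not in seen:
            seen.add(guid)
            fresh.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
            })
    return fresh


seen_guids: Set[str] = set()
first_poll = new_items(SAMPLE_FEED, seen_guids)
second_poll = new_items(SAMPLE_FEED, seen_guids)
print(len(first_poll))   # 2
print(len(second_poll))  # 0
```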
The “river of news”
Single article
Google Reader on iGoogle
RSS in Outlook 2007
Feed readers
• Lots of them available
• Many of them free
• Google Reader
• My Yahoo!
• WizzRSS
• Newsgator
• Attensa
• Many others…
Questions?
For more information
Jesse Wilkins
CDIA+, LIT, edp, ICP, ermm, ecmm, bpms
Access Sciences Corporation
[email protected]
http://www.accesssciences.com
Blog: http://informata.blogspot.com
(303) 574-1455 direct