Controlled Migration in Digital Archives Thomas Triebsees University of the Federal Armed Forces Munich Department of Computer Science [email protected].


Slide 1

Controlled Migration in Digital Archives
Thomas Triebsees
University of the Federal Armed Forces Munich
Department of Computer Science
[email protected]


Slide 2

The burden of long-term preservation
What is the
challenge?

Technologies change
new file formats
new storage media
new computer technologies

And the digital objects?

Thomas Triebsees, Department of
Computer Science

2


Slide 3

The digital objects

can have very complex structures
can have various dependencies

are only interpretable in a context

But must be accessible, viewable and understandable for future generations



Slide 4

Crucial points about migration
And when it comes to migration?

The objects are many
The tools are many
The dependencies are many and can induce migration obligations
Full content can hardly be preserved

Automation
A formal model that makes it possible to qualify information and object dependencies



Slide 5

The information model (1)

Hierarchical structure of “information” through a type system with inheritance

How do we
manage it?

[Figure: type hierarchy over the types Obj, ID, Image, Identifiable, Newspaper, NewsRec, Locator, NewsDescr, ArticleRec, Article and ArtDescr]



Slide 6

The information model (2)

Type description fully formal
Article
{ Valid XML file according to Article.dtd }
- id : ID
- headline : String
- content : String
- refs : List
<> Article(id:ID, h:String, c:String, refs:List):Article
{ <<pre>> id.isValid(),
  <<post>> self.id=id and self.headline=h and
  self.content=c and self.refs=refs }
+ getID():ID
{ <<pre>> True, <<post>> result=id }
+ getHeadline():String
{ <<pre>> True, <<post>> result=headline }
+ getContent():String
{ <<pre>> True, <<post>> result=content }
+ getPicRefs():List
{ <<pre>> True, <<post>> result=refs }
+ toString():String
{ <<pre>> True, <<post>> result=id.concat(headline.concat(content)) }

Formal inheritance mechanism
Automatic dependency generation
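
To make the contract style of the Article description concrete, the following is a minimal Python sketch that checks the precondition and postconditions at run time. It is not the formalism used in the talk. The `ID` class, its validity rule (non-empty string) and the snake_case method names are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ID:
    """Hypothetical stand-in for the slide's ID type."""
    value: str

    def is_valid(self) -> bool:
        # Assumption: an ID counts as valid when it is non-empty.
        return bool(self.value)

    def concat(self, other: str) -> str:
        return self.value + other


class Article:
    """Article type with its pre- and postconditions checked at run time."""

    def __init__(self, id: ID, h: str, c: str, refs: List[ID]) -> None:
        # Precondition: id.isValid()
        if not id.is_valid():
            raise ValueError("precondition violated: id is not valid")
        self.id = id
        self.headline = h
        self.content = c
        self.refs = list(refs)
        # Postcondition: every attribute equals the matching argument.
        assert self.id == id and self.headline == h
        assert self.content == c and self.refs == list(refs)

    def get_id(self) -> ID:
        return self.id

    def get_headline(self) -> str:
        return self.headline

    def get_content(self) -> str:
        return self.content

    def get_pic_refs(self) -> List[ID]:
        return list(self.refs)

    def to_string(self) -> str:
        # Postcondition: result = id.concat(headline.concat(content))
        return self.id.concat(self.headline + self.content)


article = Article(ID("a-1"), "Headline", "Body", [ID("img-7")])
print(article.to_string())  # prints "a-1HeadlineBody"
```

Constructing an Article with an invalid ID raises an error before any attribute is set, which mirrors the role of the precondition in the type description.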


Slide 7

The information model (3)

Dependencies described fully formally


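[Figure: type diagram with ten numbered dependencies between the types Obj, ID, Image, Identifiable, Newspaper, NewsRec, Locator, ArticleRec, Article, NewsDescr and ArtDescr. Dependencies 1, 3 and 10 are labelled "references"; dependencies 5, 6, 7 and 8 are labelled "describes"; dependencies 2, 4 and 9 carry no recoverable label. Multiplicities shown: *, 0..1 and 1..*]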



Slide 8

Transformations

Described fully formally
Black-box view, i.e. characterized only by
• source type
• target type
• type of preserved contents

But:

What about other migration obligations?
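
The black-box view and the open question about further obligations can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the `Transformation` record holds exactly the three characteristics listed above, while the dependency triples, the target type name `NewspaperV2`, the choice of `Set(Image)` as preserved content and the function `induced_obligations` are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Transformation:
    """Black-box view: nothing is known beyond these three characteristics."""
    source_type: str
    target_type: str
    preserved: FrozenSet[str]  # types of the contents that are preserved


# Hypothetical dependency triple: (dependent type, relation, type it depends on).
Dependency = Tuple[str, str, str]


def induced_obligations(migrated: str, dependencies: List[Dependency]) -> Set[str]:
    """Collect every type that depends, directly or transitively, on `migrated`."""
    pending = [migrated]
    affected: Set[str] = set()
    while pending:
        current = pending.pop()
        for dependent, _relation, target in dependencies:
            if target == current and dependent != migrated and dependent not in affected:
                affected.add(dependent)
                pending.append(dependent)
    return affected


# Assumed example: migrating Newspaper while preserving its images.
delta = Transformation("Newspaper", "NewspaperV2", frozenset({"Set(Image)"}))
deps = [
    ("NewsDescr", "describes", "Newspaper"),
    ("NewsRec", "references", "NewsDescr"),
]
print(sorted(induced_obligations(delta.source_type, deps)))
# prints ['NewsDescr', 'NewsRec']
```

In this reading, the transformation itself says nothing about NewsDescr or NewsRec; the obligations to revisit them follow only from the declared dependencies.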

[Figure: transformations δ and δ2 shown in a diagram with Set(Image), Newspaper, NewsRec, NewsDescr and Locator, connected by "describes" dependencies with multiplicity 1]


Slide 9

Future Research
What still has to be done?

Choice of the best transformation functions according to defined user preferences
Further automation according to type definitions
Automatic derivation of a migration algorithm that respects object dependencies
...
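
As one conceivable reading of the last item, and not a result of the talk, a migration order that respects object dependencies could be obtained by topological sorting. The sketch below uses Python's standard `graphlib` module. The function name `migration_order`, the policy of migrating an object only after everything it depends on, and the example objects are assumptions.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def migration_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Order objects so that each one comes after everything it depends on."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        # Cyclic dependencies have no valid order under this policy.
        raise ValueError("cyclic dependencies cannot be ordered") from err


# Assumed example: NewsRec depends on NewsDescr, which depends on Newspaper.
depends_on = {
    "NewsRec": {"NewsDescr"},
    "NewsDescr": {"Newspaper"},
    "Newspaper": set(),
}
print(migration_order(depends_on))  # prints ['Newspaper', 'NewsDescr', 'NewsRec']
```

Whether dependencies should be migrated before or after their dependents is itself a design decision; the opposite policy is obtained by reversing the returned list.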



Slide 10

Questions

Thank you for your patience!

Questions?

Thomas Triebsees
University of the Federal Armed Forces Munich
Department of Computer Science
[email protected]