Transcript Slide 1

Fedora
New Features, New Collaborations, Bright Future
Fedora Users Conference
Copenhagen, Denmark
September 28, 2005
Sandy Payette
Co-Director Fedora Project
Cornell University
Fedora Brief History
• Cornell Research (1997-present)
– DARPA and NSF-funded research
– First reference implementation developed
– Interoperable Repositories (experiments with CNRI)
– Policy Enforcement
• First Application (1999-2001)
– University of Virginia digital library prototype
– Technical implementation: adapted to web; RDBMS storage
– Scale/stress testing for 10,000,000 objects
• Open Source Software (2002-present)
– Andrew W. Mellon Foundation grants
– Technical implementation: XML and web services
– Fedora 1.0 (May 2003)
– Fedora 2.0 (Jan 2005)
– Fedora 2.1 (coming soon!)
Fedora Development Team
Cornell University and University of Virginia
• Sandy Payette (co-director)
• Chris Wilper
• Carl Lagoze
• Eddie Shin
• Thorny Staples (co-director)
• Ross Wayland
• Ronda Grizzle
• Bill Niebel
• Bob Haschart
• Tim Sigmon
“Fedora Inside”
Known Use Cases
• Digital Library Collections
• Institutional Repository
• Educational Software
• Information Network Overlay
• Digital Archives and Records Management
• Digital Asset Management
• File Cabinet / Document Management
• Scholarly publishing
Fedora Repository and Web Services
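[Diagram: client apps, batch programs, web browsers, and other services reach the repository through its web services exposure. Interfaces shown: Manage, Access, Basic Search, RDF Search, OAI Provider. Protocol labels in the figure: REST SOAP (three times) and REST (twice).]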
Fedora Repository Modules
[Diagram: repository modules Manage, Access, AuthN, AuthZ, Validation, ResourceIndex, Storage, Dissemination, and Registry, with backing stores labeled RDF, files, and rdbms.]
The Basics: Fedora Digital Object Model
Container View
• Persistent ID (PID): digital object identifier
• Reserved Datastreams: key object metadata
– Relations (RELS-EXT)
– Dublin Core (DC)
– Audit Trail (AUDIT)
• Datastreams: aggregate content or metadata items
• Disseminators: pointers to service definitions to provide service-mediated views
– Default Disseminator
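The container view can be read as a small data structure. The following is a minimal sketch under stated assumptions, not Fedora's actual classes: the class names, field names, the PID "demo:1", and the service definition "demo:bdef-1" are all hypothetical. Only the reserved datastream IDs (RELS-EXT, DC, AUDIT) come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Reserved datastream IDs named on the slide.
RESERVED_DATASTREAM_IDS = ("RELS-EXT", "DC", "AUDIT")


@dataclass
class Datastream:
    """One content or metadata item aggregated by the object (hypothetical shape)."""
    ds_id: str
    mime_type: str
    content: bytes = b""


@dataclass
class Disseminator:
    """Pointer to a service definition that provides a service-mediated view."""
    diss_id: str
    service_definition: str


@dataclass
class DigitalObject:
    """Container: a PID plus datastreams and disseminators."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    disseminators: List[Disseminator] = field(default_factory=list)

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def reserved_ids(self) -> List[str]:
        # Reserved datastreams present on this object, in the slide's order.
        return [i for i in RESERVED_DATASTREAM_IDS if i in self.datastreams]

    def content_ids(self) -> List[str]:
        # Everything that is not a reserved datastream.
        return [i for i in self.datastreams if i not in RESERVED_DATASTREAM_IDS]


def build_example() -> DigitalObject:
    # "demo:1" and "demo:bdef-1" are made-up identifiers for illustration.
    obj = DigitalObject(pid="demo:1")
    obj.add_datastream(Datastream("DC", "text/xml", b"<dc/>"))
    obj.add_datastream(Datastream("RELS-EXT", "text/xml", b"<rdf/>"))
    obj.add_datastream(Datastream("IMAGE", "image/jpeg"))
    obj.disseminators.append(Disseminator("DISS1", "demo:bdef-1"))
    return obj
```

Calling `build_example()` yields an object whose reserved datastreams are kept apart from its content datastreams, which mirrors the split shown in the container view.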
Fedora – Object Model XML
• FOXML (Fedora Object XML)
– Simple XML format directly expresses Fedora object model
– Easily adapts to new and planned Fedora features
– Easily translated to other well-known formats
• Enhanced Ingest/Export of objects
– FOXML, METS (Fedora extension)
– Extensible to accommodate new XML formats
– Planned: METS 1.4, MPEG21 DIDL
Fedora 2.1
“Release Notes”
Fedora Service Framework (Fedora 2.1)
[Diagram: the Fedora Repository Service at the center, with the PROAI OAI Provider Service, the Directory Ingest Service (ZIP or JAR input), other services, and future services around it. Apps: Administrator, DirIngest Client.]
2.1 Release Notes
• Authentication plug-ins
– HTTP Basic auth
– Tomcat realms and login modules
• Plug-in #1 : Tomcat user/password file or database
• Plug-in #2 : LDAP tie-in
• Plug-in #3 : Radius Authentication
• Support for SSL
• Authorization module
– XML-based policies using XACML
– Repository-wide policies
– Object-specific policies
– Fine-grained policy enforcement
• API actions X subject attributes X object attributes
XACML Policy Examples
• Repository-wide Policy
– [xacml-1] Deny access to DC datastream to specific user group
• Object-specific Policy
– Deny all access to the object “cornell:cs100” if the user is not a Cornellian.
• Genre-oriented Policy
– [xacml-2] For objects with content model of “uva-image”, permit
students access to disseminations but deny them access to raw
datastreams; allow professors access to both.
• Time-oriented Policy
– Permit students access to “answers” datastream of learning object
cs:125 after May 15, 2005
• Backend Service Security Policy
– Deny callback by the external MRSID service identified as “bmech:10”
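The genre-oriented example can be sketched as a decision over API action, subject attributes, and object attributes. This is a toy evaluator and not XACML itself: the `Rule` class, the attribute keys "role" and "contentModel", and the action names "getDissemination" and "getDatastream" are assumptions chosen for illustration. Rules are checked in order and the first match wins, loosely modeled on a first-applicable combining strategy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Attrs = Dict[str, str]


@dataclass
class Rule:
    """One rule: an effect plus a predicate over (action, subject, object)."""
    effect: str  # "Permit" or "Deny"
    matches: Callable[[str, Attrs, Attrs], bool]


def evaluate(rules: List[Rule], action: str, subject: Attrs, obj: Attrs,
             default: str = "Deny") -> str:
    # First matching rule decides; fall back to the default effect.
    for rule in rules:
        if rule.matches(action, subject, obj):
            return rule.effect
    return default


def is_uva_image(obj: Attrs) -> bool:
    return obj.get("contentModel") == "uva-image"


# Hypothetical encoding of the genre-oriented policy from the slide.
GENRE_POLICY = [
    # Professors may use both disseminations and raw datastreams.
    Rule("Permit", lambda a, s, o: is_uva_image(o)
         and s.get("role") == "professor"
         and a in ("getDissemination", "getDatastream")),
    # Students may access disseminations ...
    Rule("Permit", lambda a, s, o: is_uva_image(o)
         and s.get("role") == "student"
         and a == "getDissemination"),
    # ... but not raw datastreams.
    Rule("Deny", lambda a, s, o: is_uva_image(o)
         and s.get("role") == "student"
         and a == "getDatastream"),
]

IMAGE = {"contentModel": "uva-image"}
STUDENT = {"role": "student"}
PROFESSOR = {"role": "professor"}
```

For instance, `evaluate(GENRE_POLICY, "getDatastream", STUDENT, IMAGE)` is denied while the same call with `PROFESSOR` is permitted. A real deployment would express this in XACML policy documents rather than Python predicates.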
2.1 Release Notes
• Review of RDF-based Resource Index
– “Relationships” Datastream
• Ontology of common relationships (RDF schema)
• RDF stored in datastream identified by “RELS-EXT”
– Resource Index (RI)
• RDF-based index of repository (automatic indexing into Kowari triple-store)
• Graph-based index includes: object properties and Dublin Core; object-to-object relationships; datastream disseminations (and properties)
– RI Search (search the repository as a graph)
• Powerful querying of graph of inter-related objects
• REST-based query interface (using RDQL or iTQL)
• Results in different formats (triples, tuples, SPARQL)
2.1 Release Notes
• New in Fedora 2.1 for Resource Index
– Resource Index corruption problems diagnosed and fixed (Kowari memory bug)
– Minor RI model changes (may require modification of existing static queries by users)
– Relaxation of validation rules on RELS-EXT: now accepts (objectURI --- relation/property ---> URI/literal)
– Method Disseminations (and properties), with option for method X parmVal permutations
– Scale and Performance Testing (NSDL 2M objects, >100M triples)
– Sesame support for triplestore
RI: Fedora Objects
RDF Graph view
[Figure: RDF graph of one Collection Object (info:fedora/collection:1) and two Member Objects (info:fedora/image:11, info:fedora/image:12). Edge labels: hasMember, hasRep, dc:creator, lastModDate. Representation nodes: info:fedora/collection:1/bdef:1/MEMBERS, info:fedora/image:11/BLDG, info:fedora/image:11/bdef:2/getRelatedLetter, info:fedora/image:12/BLDG, info:fedora/image:12/bdef:2/getHIGH. Literal values: "Eddie Shin", "2005-01-10:11:02", "Chris Wilper", "2005-02-01:12:05", "Elly Cramer", "2005-01-01:10:00".]
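Searching the repository as a graph can be illustrated with a tiny in-memory triple list built from the URIs on this slide. This is a sketch under assumptions, not the Resource Index or its query languages: the `match` and `member_representations` helpers are hypothetical, the predicate names are abbreviated as they appear in the figure, and the hasRep pairings are inferred from the URI prefixes. The dc:creator and lastModDate literals are left out because the transcript does not show which object each one belongs to.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Triples read off the figure; pairings are inferred from the URI prefixes.
TRIPLES: List[Triple] = [
    ("info:fedora/collection:1", "hasMember", "info:fedora/image:11"),
    ("info:fedora/collection:1", "hasMember", "info:fedora/image:12"),
    ("info:fedora/collection:1", "hasRep", "info:fedora/collection:1/bdef:1/MEMBERS"),
    ("info:fedora/image:11", "hasRep", "info:fedora/image:11/BLDG"),
    ("info:fedora/image:11", "hasRep", "info:fedora/image:11/bdef:2/getRelatedLetter"),
    ("info:fedora/image:12", "hasRep", "info:fedora/image:12/BLDG"),
    ("info:fedora/image:12", "hasRep", "info:fedora/image:12/bdef:2/getHIGH"),
]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def member_representations(triples: List[Triple], collection: str) -> List[str]:
    """Two-hop traversal: collection -> members -> their representations."""
    reps: List[str] = []
    for _, _, member in match(triples, s=collection, p="hasMember"):
        reps.extend(rep for _, _, rep in match(triples, s=member, p="hasRep"))
    return reps
```

Calling `member_representations(TRIPLES, "info:fedora/collection:1")` walks from the collection to its members and collects their representations, the kind of question a graph query against the index is meant to answer.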
Fedora 2.1 Release Notes
• PROAI Server (Advanced OAI Provider)
– Harvest multiple metadata formats
– Harvest datastreams and disseminations
– Support for incremental harvest by modified date
– Support for OAI sets
– Highly configurable via queries against Resource Index
• Directory Ingest Service
– Facilitate ingest of hierarchical directories of files
– Submit files as .zip or .jar (with a METS manifest)
– Automatically asserts parent-child relationships in RELS-EXT
– Stages content and ingests as FOXML objects into repository
• Directory Ingest Client
– Web client (signed applet)
– Browse directory trees, select dir/files, add metadata, add relations
– Auto-generates METS manifest for entire collection
– Packages as zip/jar and ingests into Fedora repository
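The parent-child assertion made during directory ingest can be sketched by deriving (parent, child) pairs from the entry names in a ZIP archive. This is an illustration under assumptions, not the Directory Ingest Service's code: the function names and the sample paths are made up, and the METS manifest and the RELS-EXT serialization are left out.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import List, Tuple


def parent_child_pairs(names: List[str]) -> List[Tuple[str, str]]:
    """Return sorted (parent, child) pairs implied by archive entry names."""
    pairs = set()
    for name in names:
        path = PurePosixPath(name)
        # Walk upward from the entry until the top-level directory is reached.
        while len(path.parts) > 1:
            pairs.add((str(path.parent), str(path)))
            path = path.parent
    return sorted(pairs)


def build_sample_zip() -> bytes:
    """Create a small in-memory ZIP with a hypothetical directory tree."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("collection/letters/a.txt", "first letter")
        zf.writestr("collection/letters/b.txt", "second letter")
        zf.writestr("collection/images/c.txt", "image notes")
    return buf.getvalue()


# Read the entry names back and derive the relationships.
with zipfile.ZipFile(io.BytesIO(build_sample_zip())) as zf:
    PAIRS = parent_child_pairs(zf.namelist())
```

Each pair in `PAIRS` corresponds to one relationship that an ingest step could record for the child object, for example ("collection", "collection/letters").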
2.1 Release Notes
• Rebuild Utility for Repository Indices
• Improved logging using log4j
– Trippi.log
– Kowari.log
– Repository log
• Handle System Plug-in for PID Generation
• Command-line utility syntax changes
• New Command-line utilities
– fedora-reload-policies
– validate-policy
– fedora-rebuild
• FedoraClient utility class for building new clients
Fedora Future
2006-2007
You asked…
• “We wish for an out-of-box end-user client for
Fedora.”
• “Can’t you put the DSpace interface on top of a Fedora
repository?”
• “We need something to show people Fedora right away
(before we get $$ for development resources).”
• “We love Fedora. It would be really great if you
distributed a default end-user client.”
The Answer: FIRE Client
• Web-based client for “institutional repository”
• End-user content submission
• Object creation template for “content models”
• Configurable Workflows
• XACML policies coordinated with workflow
• Search/Browse collections
Development in progress!
Fedora Service Framework (2005-07)
[Diagram: the Fedora Repository Service surrounded by Fedora services and external systems. Fedora services shown: PROAI (OAI Provider), Directory Ingest, Fedora Search, Fedora Workflow, PID Resolution, Federation, Event Notification, Preservation Monitoring, Preservation Integrity, InterDisseminator, and other services. External labels: aDORe, arXiv, DSpace, Pathways, JHOVE, GDFR, External Workflow. Several components carry an OpenURL or OpenURL Access Point label. Apps: Administrator, PolicyBuilder, FIRE Client (web-based submission and basic workflow).]
Fedora Development Priorities
2006-2007
• Fedora Framework Services
• Federated Repositories
– “Fedorations” with name service
– Federation with other repositories (DSpace, aDORe, arXiv)
• Cornell/LANL NSF Pathways project
• InterDisseminator
• “Content Model” Specification Language
• Advanced Object Creation Workbenches
• Tools for RDF browse and graph traversal
• Scalability/Performance: very large repositories
• Web services security and Shibboleth
• Code Refactoring
• Fedora as web app (.war)
• Fedora Showcase and News (on new website)
• Community Coordination and Co-Development
Collaboration:
Fedora Community Working Groups
• Preservation Working Group (Ron Jantz, Rutgers)
– Requirements for preservation services
– Define service APIs and technical integration with Fedora 2.1+
– Preservation metadata recommendations for Fedora
– Prototyping of new services
– Development plan for deployment of new services
Collaboration:
Fedora Community Working Groups
• Workflow Working Group (Peter Murray, OhioLink)
– Sep 05: WORKFLOW WG chartered and begins work
– Oct 05: Submit "terminology and problem statement" document to fedora-users
for review
– Nov 05: Submit modeling diagrams, workflow process descriptions, and
recommendation for workflow engine to fedora-users for review
– Feb 06: Release alpha-quality version of ingestion workflow engine
– Apr 06: Release beta-quality version of ingestion workflow engine
– Aug 06: Release production-quality version of ingestion workflow engine
– Nov 06: Revise documents based upon implementation experience
– Feb 07: Release alpha-quality version 2.0 of ingestion workflow engine
– Apr 07: Release beta-quality version 2.0 of ingestion workflow engine
– Aug 07: Release production-quality version 2.0 of ingestion workflow engine
– Sep 07: Close or recharter the WG
Sample Workflows
[Diagram: three sample processes. Step labels are listed as recovered from the figure; their exact order is not preserved in the transcript.]
• Ingest-oriented process: SIP, Validate bytestreams, Ingest to Repo, Link to Simulation Service, Assign Access Policy, Index and Register
• Review-oriented process: Submit thesis, Review, Edit, Review, Assign Policy, Publish, Ingest To Archive
• Preservation-oriented process: Digital Object In Repo, Diagnose Problems, Format Migration, Object Versioning, Make Copies, Ingest To Archive
Collaboration:
Fedora Community Working Groups
• Outreach Working Group (Linda Langschied, Rutgers)
– Improve content of Fedora web site
– More user-oriented information (currently technical focus)
– Community Showcase – demos, graphics
– Survey database with simple web form to profile users
– Collaboration Environment
– Wiki, Confluence, other?
• Content Model Working Group (under charter)
– Formalization of notion of Fedora content model
– XML schema to define content models
– Investigate ontology-based content model definition
– Round up existing content models and publish to promote reuse
Fedora Community
• Fedora Advisory Board
– Vision
– Commission Working Groups
– Prioritize Development
– Define Sustainability Model
• Collaborative Development Opportunities
• Share Tools via www.fedora.info
– User-contributed Tools, Apps, Services
Fedora Community (a sampling)
• General questions
• Hot topics
– Workflow
– Digital object typing
– RDF and relationships
– Search and indexing
– Collaboration models
– Other
• Demos
– Encyclopedia of Chicago
– NSDL
New Fedora Web Site!
www.fedora.info