The NSDL Program Stephen Griffin National Science Foundation The NSDL Program Vision articulated by NSF's Division of Undergraduate Education National Research Council workshop Preliminary grants through.


1

The NSDL Program

Stephen Griffin National Science Foundation

2

The NSDL Program

1996 Vision articulated by NSF's Division of Undergraduate Education
1997 National Research Council workshop
1998 Preliminary grants through Digital Libraries Initiative 2
1998 SMETE-Lib workshop
1999 NSDL Solicitation
2000 6 Core Integration demonstration projects + 23 others funded
2001 1 large Core Integration System project funded
2002 More than 60 independent projects funded

3

NSF-funded Research Programs

NSF Solicitation

New ideas

Proposals

New ideas Research

4

The NSDL Program

NSF's objective

Build a comprehensive digital library for all aspects of science education

NSF's approach

Solicitation encouraged wide diversity of proposals divided into general categories
Best 60+ proposals funded -- more to follow
Grants allow projects flexibility

Result

A splendid set of projects

A challenge in interoperability!

5

The NSDL: The Challenge of Scale

William Y. Arms Cornell University

6

Core Integration Philosophy

Scientific and technical information
Materials used in education
Materials tailored to education

7

Core Integration Philosophy

It is possible to build a

very large

digital library with a small staff.

But ...

Every aspect of the library must be planned with scalability in mind.

Some compromises will be made.

8

How Big might the NSDL be?

All branches of science, all levels of education, very broadly defined:

Five year targets

1,000,000 different users
10,000,000 digital objects
10,000 to 100,000 independent sites

9

Resources for Core Integration

Core Integration

Budget: $4-6 million
Staff: 25 - 30
Management: Diffuse

How can a small team, without direct management control, create a very large-scale digital library?

10

Collections: the Basic Assumption

The Core Integration team will not manage any collections

11

Collections

The NSDL program funds only a fraction of the relevant collections.

12

Every Collection is Different

13

The Core Integration Task ...

... to provide a coherent set of collections and services across great diversity.

14

Interoperability

The Problem

Conventional approaches to interoperability require partners to support agreements (technical, content, and business).

But NSDL needs thousands of very different partners ... most of whom are not directly part of the NSDL program.

The Approach

A spectrum of interoperability

15

Levels of interoperability

Level: Federation
Agreements: Strict use of standards (syntax, semantic, and business)
Example: AACR, MARC, Z39.50

Level: Harvesting
Agreements: Digital libraries expose metadata; simple protocol and registry
Example: Open Archives metadata harvesting

Level: Gathering
Agreements: Digital libraries do not cooperate; services must seek out information
Example: Web crawlers and search engines
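
As an illustration of the harvesting level, the sketch below parses a response in the shape returned by an Open Archives (OAI-PMH 2.0) ListRecords request carrying unqualified Dublin Core. The sample response, the identifier, and the function name parse_list_records are made up for this example; only the XML namespaces come from the protocol itself. It is a minimal sketch, not the NSDL implementation.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response, trimmed to a single record.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item-1</identifier>
        <datestamp>2002-06-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Introduction to Plate Tectonics</dc:title>
          <dc:subject>Earth science</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_list_records(xml_text):
    """Return one dict per record: identifier, datestamp, Dublin Core fields."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        header = rec.find("oai:header", NS)
        fields = {}
        dc_root = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc_root is not None:
            for el in dc_root:
                # Drop the "{namespace}" prefix to get the bare element name.
                name = el.tag.split("}", 1)[-1]
                fields.setdefault(name, []).append((el.text or "").strip())
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "dc": fields,
        })
    return records


records = parse_list_records(SAMPLE_RESPONSE)
```

The collection only has to expose records like this; the harvesting service does the rest, which is what keeps the agreement between partners small.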

16

Searching

What to Index?

When possible, full text indexing is excellent, but full text indexing is not possible for all materials (non-textual, no access for indexing).

Comprehensive metadata is an alternative, but available for very few of the materials.

What Architecture to Use?

Few collections support an established search protocol (e.g., Z39.50)

17

Broadcast Searching does not Scale

[Diagram: User; User interface server; Collections]
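
One way to see the scaling problem is a toy model that is not in the slides: assume each collection answers a broadcast query independently with the same probability. The 0.99 availability figure and the function name are assumptions chosen only for illustration; the site counts echo the five year targets.

```python
def all_respond_probability(per_collection_availability, n_collections):
    """Chance that every collection answers one broadcast query.

    Assumes collections fail independently with equal availability,
    which is a simplification made for this sketch.
    """
    return per_collection_availability ** n_collections


# Assumed 99% availability per collection, at increasing scale.
for n in (10, 100, 10_000):
    print(n, all_respond_probability(0.99, n))
```

Under these assumptions the chance of a complete answer is already below one half at 100 collections and is effectively zero at 10,000, before response time is even considered.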

18

The Metadata Repository

Services Users

Metadata repository

The metadata repository is a resource for service providers. It holds information about every collection and item known to the NSDL.

Collections

19

Search Architecture

[Diagram: three Portals; SDLIP; Search and Discovery Services; OAI; http; Metadata repository; Collections]

James Allan, Bruce Croft (University of Massachusetts, Amherst)

20

Support for Service Providers

The Metadata Repository as a Resource

Records are exposed through the Open Archives Initiative harvesting protocol.

The Core Integration team will provide some services based on the metadata repository.

The architecture encourages others to build services.

21

Metadata Strategy

Metadata is expensive

The NSDL cannot afford to create it manually

22

Metadata Strategy

• Support eight standard formats
• Collect all existing metadata in these formats
• Provide crosswalks to Dublin Core
• Expose records in the metadata repository for others to harvest
• Concentrate on collection-level metadata
• Use automatic generation to augment item-level metadata
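
A crosswalk to Dublin Core can be sketched as a field-name mapping. The native field names below (name, author, keywords, abstract, url) are hypothetical and do not correspond to any of the eight formats; the target names are elements of the Dublin Core element set. Fields with no mapping are dropped, which is one of the compromises a crosswalk implies.

```python
# Hypothetical native field names mapped to Dublin Core elements.
CROSSWALK = {
    "name": "title",
    "author": "creator",
    "keywords": "subject",
    "abstract": "description",
    "url": "identifier",
}


def to_dublin_core(native_record, crosswalk=CROSSWALK):
    """Translate one native metadata record into Dublin Core fields."""
    dc = {}
    for field, value in native_record.items():
        element = crosswalk.get(field)
        if element is None:
            continue  # no Dublin Core equivalent in this mapping; dropped
        values = value if isinstance(value, list) else [value]
        # Dublin Core elements are repeatable, so collect values in lists.
        dc.setdefault(element, []).extend(values)
    return dc


example = to_dublin_core({
    "name": "Cell Biology Lab Notes",
    "keywords": ["biology", "cells"],
    "grade_level": "9-12",  # unmapped field, lost in the crosswalk
})
```

Here example holds a title and two subject values, and the grade_level field does not survive the translation.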

23

Collection-level Operations

Material in the NSDL is selected and managed as collections:
• Alexandria Digital Library
• Cornell course web sites
• JISC Resource Discovery Network
• Joe's web page

Human effort will be used to select and integrate major collections.

Automated methods (e.g., web crawling) will be used to identify and integrate additional collections.

24

Quality Control

The Problem

Material in the NSDL should be relevant.

But we cannot select each item individually.

The Approach

Most selection and quality control decisions are made at a collection level, not at an item level. Information about quality will be maintained in a collection-level metadata record, which is stored in a central metadata repository.

This metadata is made available to NSDL service providers.

User interfaces can display quality information.
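
The collection-level approach can be sketched as a lookup: an item carries no quality information of its own and inherits whatever is recorded for its collection. The record classes, field names, and sample values are assumptions made for this example and do not describe the actual repository schema.

```python
from dataclasses import dataclass


@dataclass
class CollectionRecord:
    collection_id: str
    title: str
    selected_by: str   # hypothetical values: "human" or "crawler"
    quality_note: str


@dataclass
class ItemRecord:
    item_id: str
    collection_id: str
    title: str


def quality_for_item(item, collections):
    """Return the quality statement a user interface could show for an item."""
    collection = collections.get(item.collection_id)
    if collection is None:
        return "No collection-level quality information available"
    return f"{collection.title}: {collection.quality_note}"


# Hypothetical repository contents, keyed by collection identifier.
collections = {
    "c1": CollectionRecord("c1", "Example Reviewed Collection", "human",
                           "selected and reviewed by library staff"),
}
item = ItemRecord("i42", "c1", "Sample lesson plan")
label = quality_for_item(item, collections)
```

One quality decision recorded on the collection then covers every item in it, which is what makes the approach workable at millions of items.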

25

User Interfaces

The Problem

Cannot handcraft every web page

Must be usable on a very wide range of equipment and with a very diverse group of users

The Solution

Data driven portals using a channel architecture.

Interfaces guide the user to understand the library. One library, many portals.

26

27

28

The Mortal behind the Portal

[This space left intentionally blank.]

29

Where is the Center of the Universe?

Alexandria Library of Congress Elsevier

NSDL

Informedia Math DL Joe's Pictures

30

Where is the Center of the Universe?

British Library Internet Archive Elsevier Library of Congress OCLC Harvard

NSDL

31

Where is the Center of the Universe?

Google

email Office Course web sites Bill Arms Directories News and weather Technical documentation

NSDL

32

Acknowledgement and Disclaimer

The NSDL is a program of the National Science Foundation's Directorate for Education and Human Resources, Division of Undergraduate Education.

The NSDL Core Integration is a collaboration between the University Corporation for Atmospheric Research (Dave Fulker), Columbia University (Kate Wittenberg) and Cornell University (Bill Arms). The Technical Director is Carl Lagoze (Cornell University).

33

The NSDL: The Challenge of Scale

William Y. Arms Cornell University