MARC Machine Readable Cataloging


MARC
Machine Readable Cataloging
& MARC family
Reference: Rebecca Guenther (2004) New and traditional descriptive
formats in the library environment
Timeline comparing creation of MARC to major developments in software, networking, and
data representation between 1960 and 1980
Jason Thomale. 2010. Interpreting MARC: Where’s the Bibliographic Data? Code {4} Lib Journal. Issue 11, 2010-09-21
http://journal.code4lib.org/articles/3832
1. MARC21 Bibliographic Format
MARC - 1960s --> USMARC, CANMARC, UKMARC, etc.
UNIMARC - 1977
MARC21 – 1997, Harmonization of USMARC and CAN/MARC
Translations in several languages
MARC21 Concise Format for Bibliographic Data is available at:
http://www.loc.gov/marc/bibliographic/ecbdhome.html
MARC21 Formats are available at:
http://www.loc.gov/marc/
Good introduction to the use of MARC: Understanding MARC Bibliographic, http://www.loc.gov/marc/umb/
MARC 21
Parts of a MARC record
Leader: identifies the beginning of a new record and the type of record
Directory: think of it as the index to the record; it identifies the position and length of each field
Control fields: coded information about the resource described, such as standard/control numbers, dates, and language. Some are called fixed fields because of their fixed length
Variable fields: more detailed description of the resource; these fields have variable length
MARC 21
Content designators
Types of codes used to indicate content of a record:
tags: 3-digit numbers (001-999) to encode fields
e.g. 100 = personal name main entry
indicators: 2 possible positions for each field, special information about that field
e.g. 100 1_ = surname as the entry element
subfield codes: combination of a delimiter and a lower case letter or number, to encode subfields
e.g. 100 1_ |a = name
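The three kinds of content designators can be made concrete with a small sketch. It assumes the display convention used on this slide: the tag, a space, two indicator characters (with `_` for a blank), a space, then subfields introduced by `|`. The function name `parse_field` is hypothetical, and the sample line is assembled from the name field of the example record shown later in these slides.

```python
def parse_field(line, delimiter="|"):
    """Split a display-form MARC field into tag, indicators, and subfields.

    Assumes the layout 'TAG I1I2 |a...|d...' used on the slide, where '_'
    stands for a blank indicator. This is a display convention, not the
    ISO 2709 exchange format.
    """
    tag = line[0:3]
    ind1, ind2 = line[4], line[5]
    subfields = []
    # Text before the first delimiter is ignored; empty chunks are skipped.
    for chunk in line[6:].split(delimiter)[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return tag, (ind1, ind2), subfields


tag, indicators, subfields = parse_field("100 1_ |aSlavinski, Nadine,|d1968-")
print(tag)         # 100
print(indicators)  # ('1', '_')
print(subfields)   # [('a', 'Slavinski, Nadine,'), ('d', '1968-')]
```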
MARC21
groups of fields (by hundreds)
Bibliographic format
0XX  Control information, numbers, codes
1XX  Main entry
2XX  Titles, edition, imprint/publication
3XX  Physical description, etc.
4XX  Series statements (as shown in the book)
5XX  Notes
6XX  Subject added entries
7XX  Added entries other than subject or series
8XX  Series added entries (other authoritative forms)
Parallels in MARC formats
X00  Personal names
X10  Corporate names
X11  Meeting names
X30  Uniform titles
X40  Bibliographic titles
X50  Topical terms
X51  Geographic names
1XX Main entry
4XX Series statement
6XX Subject heading
7XX Added entry
8XX Series added entries
Exercise: If Steve Jobs is the subject of a book, what field number
should you use to indicate that "Jobs, Steve, -- 1955-2011" is the
'subject' of the book?
Exercise: If "Apple Computer, Inc." is the subject of a book, what
field number should you use to indicate that?
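The parallel structure above amounts to a simple rule: the first digit says what role the heading plays, and the last two digits say what kind of heading it is. The sketch below only illustrates that rule with the values listed on this slide. The names `BLOCKS`, `HEADING_TYPES` and `build_tag` are hypothetical, and not every combination it can produce is a defined MARC 21 field.

```python
# Last two digits shared across the blocks, as listed on the slide.
HEADING_TYPES = {
    "personal name": "00",
    "corporate name": "10",
    "meeting name": "11",
    "uniform title": "30",
    "bibliographic title": "40",
    "topical term": "50",
    "geographic name": "51",
}

# First digit of each block that takes part in the pattern, as listed on the slide.
BLOCKS = {
    "main entry": "1",
    "series statement": "4",
    "subject heading": "6",
    "added entry": "7",
    "series added entry": "8",
}


def build_tag(block, heading_type):
    """Combine a block digit and a heading-type suffix into a 3-digit tag.

    Illustrative only: the result is not checked against the MARC 21 format.
    """
    return BLOCKS[block] + HEADING_TYPES[heading_type]


print(build_tag("main entry", "personal name"))         # 100
print(build_tag("subject heading", "geographic name"))  # 651
```

The same lookup can be used to check answers to the two exercises above.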
2. MARC Family
2.1 MARC XML
2.2 MODS, Metadata Object Description Schema
2.3 MADS, Metadata Authority Description Schema
…
New needs
Need to take advantage of XML
  • Establish standard MARC 21 in an XML structure
Need simpler (but compatible) alternatives
  • Development of MODS
Need interoperability with different schemas
  • Assemble coordinated set of tools
Need continuity with current data
  • Provide flexible transition options
2.1 MARC 21 evolution to XML
MARC 21 in XML – MARCXML
MARCXML record
  • XML exact equivalent of MARC (2709) record
  • Lossless/roundtrip conversion to/from MARC 21 record
  • Simple, flexible XML schema; no need to change when MARC 21 changes
  • Presentations using XML stylesheets
  • LC provides converters (open source)
  • Adopted by OAI to replace oai_marc
http://www.loc.gov/standards/marcxml
Uses of MARCXML and related tools
Standardize MARC 21 across community for XML communication and manipulation
Open MARC 21 to XML programming tools and presentation style sheets
Standardize MARC 21 for OAI harvesting
Standardize transformations to and from other standard formats (DC, ONIX, …)
Basis for evolution while maintaining standardization
MARC 21 (2709) record
(machine view)
00967cam 2200277 a 4500
001000800000005001700008008004100025020005300229040
001800282050002400312082002100336100003000357245007
400387260004400461300003500505440001200540500002000
552650004200572651002500614
347139419990429094819.1931129s1994 wauab
001 0
eng a 93047676 a0898863872 (acid-free, recycled paper)
:c$14.95 aDLCcDLCcDLC 00aGV1046.G3bG47
199400a796.6/4/09432201 aSlavinski, Nadine,d196810aGermany by bike :b20 tours geared for discovery /cNadine
Slavinski. aSeattle, Wash. :bMountaineers,cc1994. a238 p.
:bill., maps ;c22 cm. 0aBy bike aIncludes index. 0aBicycle
touringzGermanyxGuidebooks.
Exercise: Can you explain how a machine can tell where the 245 field, which carries the title and statement of responsibility, is located?
(Hint: textbook page 24). Can you explain another chunk of the digits?
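As a hint for the exercise, the long run of digits after the leader is the directory. In an ISO 2709 record each directory entry is 12 characters: a 3-character tag, a 4-digit field length, and a 5-digit starting position counted from the base address of data given in the leader. The sketch below splits two consecutive entries copied from the record above; the function name `parse_directory` is hypothetical.

```python
def parse_directory(directory):
    """Split an ISO 2709 directory into (tag, field_length, start) entries.

    Each entry is 12 characters: 3 for the tag, 4 for the field length,
    and 5 for the starting position relative to the base address of data.
    Any trailing partial entry is ignored.
    """
    entries = []
    usable = len(directory) - len(directory) % 12
    for i in range(0, usable, 12):
        entry = directory[i:i + 12]
        entries.append((entry[0:3], int(entry[3:7]), int(entry[7:12])))
    return entries


# Two consecutive entries copied from the directory on the slide.
fragment = "100003000357245007400387"
entries = parse_directory(fragment)
for tag, length, start in entries:
    print(tag, length, start)
# 100 30 357
# 245 74 387
```

Note that the 100 field starts at 357 and is 30 characters long, so the 245 field starts at 387.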
MARC21 (2709) to MARCXML
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00967cam 2200277 a 4500</leader>
<controlfield tag="001">3471394</controlfield>
<controlfield tag="005">19990429094819.1</controlfield>
<controlfield tag="008">931129s1994 wauab
001 0 eng </controlfield>
<datafield tag="020" ind1=" " ind2=" ">
<subfield code="a">0898863872 (acid-free, recycled paper) :</subfield>
<subfield code="c">$14.95</subfield>
</datafield>
<datafield tag="040" ind1=" " ind2=" ">
<subfield code="a">DLC</subfield>
<subfield code="c">DLC</subfield>
<subfield code="d">DLC</subfield>
</datafield>
<datafield tag="050" ind1="0" ind2="0">
<subfield code="a">GV1046.G3</subfield>
<subfield code="b">G47 1994</subfield>
</datafield>
<datafield tag="082" ind1="0" ind2="0">
<subfield code="a">796.6/4/0943</subfield>
<subfield code="2">20</subfield>
</datafield>
<datafield tag="100" ind1="1" ind2=" ">
<subfield code="a">Slavinski, Nadine,</subfield>
<subfield code="d">1968-</subfield>
</datafield>
MARCXML record (continued)
What does this set tell you?
<datafield tag="245" ind1="1" ind2="0">
<subfield code="a">Germany by bike :</subfield>
<subfield code="b">20 tours geared for discovery /</subfield>
<subfield code="c">Nadine Slavinski.</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="a">Seattle, Wash. :</subfield>
<subfield code="b">Mountaineers,</subfield>
<subfield code="c">c1994.</subfield>
</datafield>
<datafield tag="300" ind1=" " ind2=" ">
<subfield code="a">238 p. :</subfield>
<subfield code="b">ill., maps ;</subfield>
<subfield code="c">22 cm.</subfield>
</datafield>
<datafield tag="440" ind1=" " ind2="0">
<subfield code="a">By bike</subfield>
</datafield>
<datafield tag="500" ind1=" " ind2=" ">
<subfield code="a">Includes index.</subfield>
</datafield>
What does this set tell you?
<datafield tag="650" ind1=" " ind2="0">
<subfield code="a">Bicycle touring</subfield>
<subfield code="z">Germany</subfield>
<subfield code="x">Guidebooks.</subfield>
</datafield>
</record>
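Because MARCXML is ordinary XML, general-purpose XML tools can read it. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` on two fields copied from the record above; the helper name `subfields` is hypothetical and only the first field with a given tag is returned.

```python
import xml.etree.ElementTree as ET

# Two datafields copied from the MARCXML record on the slides.
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Germany by bike :</subfield>
    <subfield code="b">20 tours geared for discovery /</subfield>
    <subfield code="c">Nadine Slavinski.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Bicycle touring</subfield>
    <subfield code="z">Germany</subfield>
    <subfield code="x">Guidebooks.</subfield>
  </datafield>
</record>"""

NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfields(record, tag):
    """Return (code, text) pairs for the first datafield with this tag."""
    field = record.find(f"marc:datafield[@tag='{tag}']", NS)
    if field is None:
        return []
    return [(sf.get("code"), sf.text) for sf in field.findall("marc:subfield", NS)]


record = ET.fromstring(MARCXML)
title = " ".join(text for _, text in subfields(record, "245"))
print(title)  # Germany by bike : 20 tours geared for discovery / Nadine Slavinski.
```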
MARCXML to DC
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Germany by bike : 20 tours geared for discovery </dc:title>
<dc:creator>Slavinski, Nadine, 1968-</dc:creator>
<dc:type>text</dc:type>
<dc:publisher>Seattle, Wash. : Mountaineers,</dc:publisher>
<dc:date>c1994.</dc:date>
<dc:language>eng</dc:language>
<dc:subject>Bicycle touring</dc:subject>
</rdf:Description>
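The conversion to Dublin Core can be thought of as a lookup from MARC tags and subfields to DC elements. The sketch below is not the Library of Congress stylesheet; the `MARC_TO_DC` table is an assumption read off the output shown on this slide, and the record is reduced to a plain dictionary for illustration.

```python
# A record reduced to {tag: {subfield code: value}}, values copied from the slides.
record = {
    "100": {"a": "Slavinski, Nadine,", "d": "1968-"},
    "245": {"a": "Germany by bike :", "b": "20 tours geared for discovery /"},
    "260": {"a": "Seattle, Wash. :", "b": "Mountaineers,", "c": "c1994."},
    "650": {"a": "Bicycle touring", "z": "Germany", "x": "Guidebooks."},
}

# Assumed mapping, read off the slide's output: DC element -> (tag, subfield codes).
MARC_TO_DC = {
    "title": ("245", "ab"),
    "creator": ("100", "ad"),
    "publisher": ("260", "ab"),
    "date": ("260", "c"),
    "subject": ("650", "a"),
}


def to_dublin_core(record):
    """Build a flat {element: string} dict by joining the mapped subfields."""
    dc = {}
    for element, (tag, codes) in MARC_TO_DC.items():
        field = record.get(tag, {})
        parts = [field[code] for code in codes if code in field]
        if parts:
            dc[element] = " ".join(parts)
    return dc


dc = to_dublin_core(record)
print(dc["creator"])  # Slavinski, Nadine, 1968-
print(dc["subject"])  # Bicycle touring
```

As on the slide, the geographic and form subdivisions of the subject (Germany, Guidebooks) do not survive the conversion, which is one reason a conversion to DC is not round-trippable.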
2.2 MODS
Metadata Object Description Schema
Bibliographic element set
Initiative of the Network Development and MARC Standards Office, Library of Congress
Uses XML Schema
Specifically for library applications, although could be used more widely
A derivative (and subset) of MARC elements
Why MODS?
XML based, web friendly, transportable, processible, configurable, sufficiently descriptive without being too complex, extensible
Benefits over MARC: MARC isn’t XML based and can’t easily be output from web forms. It requires special “cataloging” knowledge and systems to implement
Investigating XML as a new, more flexible syntax for the MARC element set
Why MODS? (cont.)
Need for rich hierarchical descriptive metadata in XML but simpler than full MARC, especially for complex digital library objects
Benefits over Dublin Core: DC doesn’t have sufficient specificity. DC doesn’t specify a syntax and is inconsistently applied. DC isn’t extensible
Need compatibility with existing library descriptions
Features of MODS
Uses language-based tags
Elements generally inherit semantics of MARC
MODS does not assume the use of any specific cataloging code
Reuse element descriptions throughout schema
Not intended to be round-trippable
Not intended to be a MARC replacement
MODS high-level elements
titleInfo
name
typeOfResource
genre
originInfo
language
physicalDescription
abstract
tableOfContents
targetAudience
note
subject
classification
relatedItem
identifier
location
accessConditions
part
extension
recordInfo
MARCXML to MODS
What does this set tell you?
<mods xmlns="http://www.loc.gov/mods/">
<titleInfo>
<title>Germany by bike : 20 tours geared for discovery /</title>
</titleInfo>
<name type="personal">
<namePart>Slavinski, Nadine,</namePart>
<namePart type="date">1968-</namePart>
<role><roleTerm type="text">creator</roleTerm></role>
</name>
<typeOfResource>text</typeOfResource>
<originInfo>
<place><placeTerm type="code" authority="marc">wau</placeTerm>
<placeTerm type="text"> Seattle, Wash. :</placeTerm>
</place>
<publisher>Mountaineers,</publisher>
<dateIssued>c1994</dateIssued>
<issuance>monographic</issuance>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription>
<extent>238 p. : ill., maps ; 22 cm.</extent>
</physicalDescription>
<note type="statement of responsibility">Nadine Slavinski.</note>
<note>Includes index.</note>
MODS
(continued)
What does this <subject> set tell you?
What does authority='lcsh' mean?
<subject authority="lcsh">
<topic>Bicycle touring</topic>
<geographic>Germany</geographic>
<topic>Guidebooks.</topic>
</subject>
<classification authority="lcc">GV1046.G3 G47 1994</classification>
<classification authority="ddc" edition="20">796.6/4/0943</classification>
<relatedItem type="series">
<titleInfo><title>By bike</title></titleInfo>
</relatedItem>
<identifier type="isbn">0898863872 (acid-free, recycled paper) :</identifier>
<identifier type="lccn">93047676</identifier>
<recordInfo>
<recordContentSource>DLC</recordContentSource>
<recordCreationDate encoding="marc">931129</recordCreationDate>
<recordChangeDate encoding="iso8601">19990429094819.1</recordChangeDate>
<recordIdentifier>3471394</recordIdentifier>
</recordInfo>
</mods>
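MODS element names say what each piece of data is, so a program can pick values out by name instead of by numeric tag and subfield code. The sketch below reads a fragment copied from the record above with Python's standard `xml.etree.ElementTree`. It uses the namespace exactly as written on the slide; records in MODS version 3 use `http://www.loc.gov/mods/v3` instead, so the `NS` value is an assumption to adjust for real data.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the MODS record on the slides.
MODS = """<mods xmlns="http://www.loc.gov/mods/">
  <subject authority="lcsh">
    <topic>Bicycle touring</topic>
    <geographic>Germany</geographic>
    <topic>Guidebooks.</topic>
  </subject>
  <classification authority="lcc">GV1046.G3 G47 1994</classification>
  <classification authority="ddc" edition="20">796.6/4/0943</classification>
</mods>"""

NS = {"m": "http://www.loc.gov/mods/"}  # namespace as written on the slide

root = ET.fromstring(MODS)

# Each child of <subject> names its own role, unlike MARC subfield codes a, z, x.
subject = root.find("m:subject", NS)
parts = [(child.tag.split("}")[1], child.text) for child in subject]
print(subject.get("authority"), parts)

# Classification numbers keyed by the scheme named in the authority attribute.
classes = {c.get("authority"): c.text for c in root.findall("m:classification", NS)}
print(classes["ddc"])  # 796.6/4/0943
```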
http://lcweb4.loc.gov/
•MODS descriptions for each web site (but not each capture)
•Transformation from XML to HTML display
•Links to web archive
Could you pair the displayed info with the MODS statements?
Differences between MODS and Dublin Core
MODS has structure
  • Names
  • Related item
  • Subject
MODS is more MARC-like so more compatibility with existing descriptions
  • Semantics
  • Conversions
  • Relationships between elements
MODS includes record management information
Choosing MODS for descriptive metadata
MODS is particularly useful for
  • compatibility with existing bibliographic data
  • embedded descriptions in related item
  • rich, hierarchical descriptions that work well with METS structural map
  • “out of the box” schema; can use <extension> for local elements and to bring in external elements from other schemas
3 Transformation tools
MARC toolkit
  • Converter from MARC 21 to MARCXML
  • Transformations between metadata formats
    • MODS
    • Dublin Core
    • ONIX
http://www.loc.gov/marcxml
More development
Changes from version 3.4
http://www.loc.gov/standards/mods/mods.xsd
MADS ontology developed
http://www.loc.gov/standards/mads/ (Metadata Authority Description Schema)
MODS User Guidelines (Version 3) [updated 04/06/2010]
Bibliographic Framework Initiative
http://bibframe.org/
http://www.loc.gov/bibframe/