Batch changes in Voyager an overview of the possibilities


Batch changes in Voyager: an overview of the possibilities, processes and tools
GUGM 2014
Adam Kubik, Clayton State University
Susan Wynne, Georgia State University
Selected Gary Strawn programs
Caveat emptor
● The Strawn tools are very powerful and do lots of things
● Require a working ODBC connection to your database
● Operate directly on records in your database -- There is
no undo button!
● Some tools are well documented, some not so well
● We don’t pretend to have tested, used or understood
every possible option and setting
● All of these tools can be run in preview or test mode.
Use it and start small
VgerSelect
● An alternative or complement to Access
Reports
● Can output results as either:
o Tab-delimited files of selected elements, or
o Records in MARC or XML format (to work with in MarcEdit, etc.)
Creating tab-delimited output file
● Limit to sound recording format based on record type (LDR/06)
● Export full MARC records
● Or choose fields, subfields and/or fixed field byte positions to export
● Here the bib ID number, 007, 008, 047 and 048 are chosen
Output can be filtered in Excel
Filtering with VgerSelect
● You can also ask VgerSelect to
filter its own output, but...
o Jobs might run for a while
o Fewer options than Excel
● If you want to play with the data
in multiple ways it is faster to
output all the data once and
use Excel to manipulate it
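For anyone who would rather script the filtering than do it in Excel, the same idea can be sketched in Python. This is only an illustration: it assumes a tab-delimited export with a header row and columns named bib_id, 007, 008, 047 and 048. Those column names and the sample rows are made up, and real VgerSelect output may be laid out differently.

```python
import csv
import io

def rows_missing_field(tsv_text, field):
    """Return the rows of a tab-delimited export whose `field` column is empty."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if not (row.get(field) or "").strip()]

# Hypothetical sample export: header row plus three records.
SAMPLE = (
    "bib_id\t007\t008\t047\t048\n"
    "101\tsd fungnnmmned\t(008 data)\t\tka01\n"
    "102\t\t(008 data)\t\t\n"
    "103\tsd fungnnmmned\t(008 data)\tsn\t\n"
)

# Bib IDs of records that lack an 007.
missing_007 = [row["bib_id"] for row in rows_missing_field(SAMPLE, "007")]
```

Because the data is exported once and filtered afterwards, the same file can be re-queried for any other column without rerunning the job.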
Bib Delete
Bib Delete Reports
● Problems--any records the program could not
delete (usually will be suppressed)
● List of records modified and what was done
(suppressed, unsuppressed, deleted)
● Report--total number of records read
● List of OCLC numbers of deleted records, in
case you need to delete holdings in WorldCat
Recent uses of Bib Delete at GSU
● Suppressed most print gov docs records for a weeding
project
o Needed Location Changer for another part of this
project
● Deleted obsolete Duke ebrary records due to change of
platform
● Deleted a group of ETD records that had been reharvested with better mapping & new OCLC control nos.
Recent uses of Bib Delete at Clayton State
● Suppressed Films on Demand titles from OPAC using
vendor supplied sets of MARC records representing
titles that had been removed from the collection
● Removed sets of locally bulk imported records when
something went awry with the load (e.g., MFHD records
not created as expected)
● For these jobs, we just used the set of bib records as
the input file and the option to match the 001 in the
vendor set against the 035 in Voyager
Location Changer
● Like Bib Delete, can (un)suppress or delete
MFHDs, items and bibs
● Can also automate multiple changes to other
data elements in MFHD and item records
● Can work on records one-by-one or in
batches
Selecting records to change
● Select records to change by using criteria in the
records:
○ MFHD or item location (perm or temp)
○ bib record type
○ item status
○ call number range
● Or, input a text file of barcodes, item IDs, MFHD IDs, or
bib IDs
Selecting records to change
Use record criteria
Use an input file
Use in a weeding project at Clayton State
● Specify records with an input file of scanned
barcodes generated by student assistant as
items are boxed
● Run job to change the permanent location (in
MFHD) to withdrawn, suppress holdings,
and reset the item status to withdrawn
Screenshot callouts: File of barcodes; Change location; Change item status
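A scanned barcode file often contains blank lines and repeat scans, so it can help to tidy it before handing it to Location Changer. The sketch below assumes one barcode per line; the function name and the sample barcodes are hypothetical, and the input format Location Changer actually expects should be checked against its documentation.

```python
def clean_barcodes(lines):
    """Strip whitespace, drop blank lines, and remove repeat scans, keeping scan order."""
    seen = set()
    cleaned = []
    for line in lines:
        barcode = line.strip()
        if barcode and barcode not in seen:
            seen.add(barcode)
            cleaned.append(barcode)
    return cleaned

# Hypothetical scans, including a blank line and a duplicate.
scanned = ["31000001\n", "\n", " 31000002 \n", "31000001\n"]
barcodes = clean_barcodes(scanned)
```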
Use of Location Changer in weeding project
● If the last item attached to a MFHD is
suppressed, the MFHD is also suppressed
● If the last MFHD attached to a bib is
suppressed, the bib is also suppressed
● Location Changer also generates a report of
OCLC numbers from suppressed bib records
so that holdings can be batch deleted from
OCLC as well
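The cascading behavior described above can be pictured with a toy model. This is not Location Changer's code, just a sketch of the rule as stated on the slide: the data structure, the IDs, and the choice to leave an MFHD with no items alone are all assumptions made for illustration.

```python
def cascade_suppression(bibs, withdrawn_items):
    """Return (suppressed_mfhds, suppressed_bibs) for a set of withdrawn items.

    `bibs` maps bib ID -> {MFHD ID -> list of item IDs}.
    An MFHD is suppressed once every item on it is withdrawn;
    a bib is suppressed once every MFHD on it is suppressed.
    """
    withdrawn = set(withdrawn_items)
    suppressed_mfhds = set()
    suppressed_bibs = set()
    for bib_id, mfhds in bibs.items():
        for mfhd_id, items in mfhds.items():
            if items and all(item in withdrawn for item in items):
                suppressed_mfhds.add(mfhd_id)
        if mfhds and all(mfhd_id in suppressed_mfhds for mfhd_id in mfhds):
            suppressed_bibs.add(bib_id)
    return suppressed_mfhds, suppressed_bibs

# Hypothetical catalog: b1 has one MFHD, b2 has two.
catalog = {
    "b1": {"m1": ["i1", "i2"]},
    "b2": {"m2": ["i3"], "m3": ["i4"]},
}
mfhds, bib_ids = cascade_suppression(catalog, ["i1", "i2", "i3"])
```

In this sample, b2 stays visible because one of its MFHDs still has an item that was not withdrawn.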
Use of Location Changer in weeding project
● Since Location Changer can’t adjust MFHD
holdings statements (866) -- or create new
MFHDs -- some cleanup is still required
● Titles with complex holdings will need to be
adjusted manually if volumes on a MFHD
were only partially withdrawn
Recent uses of Location Changer at GSU
● Flipped a subset of the obsolete Rare location
to a new Treasures location
● Flipped the rest of the Rare items to general
Special Collections location
● Applied a temp location to a group of items
being digitized
● Suppressed print MFHDs only for gov docs bibs
with multiple locations/formats attached
URL Changer
● Change URL stems
● Change or delete 856 $x, $y, $z and/or $3
● Works on URLs in both bibs and MFHDs!
Recent uses of URL Changer at GSU
● Changing proxy prefix
● Changing URL stem when the name of our
institutional repository changed
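Both kinds of change come down to rewriting the start of a URL. The sketch below shows the string logic only; it is not how URL Changer is implemented, and the example.edu addresses and function names are invented for illustration.

```python
def change_url_stem(url, old_stem, new_stem):
    """Swap old_stem for new_stem when the URL starts with old_stem."""
    if url.startswith(old_stem):
        return new_stem + url[len(old_stem):]
    return url

def add_proxy_prefix(url, prefix):
    """Put a proxy prefix in front of the URL unless it is already there."""
    return url if url.startswith(prefix) else prefix + url

# Hypothetical stems and prefix.
OLD_STEM = "http://oldrepo.example.edu/"
NEW_STEM = "http://newrepo.example.edu/"
PROXY = "http://proxy.example.edu/login?url="

moved = change_url_stem("http://oldrepo.example.edu/diss/37", OLD_STEM, NEW_STEM)
proxied = add_proxy_prefix("http://vendor.example.com/book/1", PROXY)
```

Matching only at the start of the URL keeps a stem that happens to appear later in a query string from being rewritten by accident.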
Recent uses of URL Changer at Clayton State?
● Adding proxy prefixes and other URL
changes made using a different method:
validation rules in Cataloger’s Toolkit
● URL Changer is less flexible but much more
user friendly
Strawn’s RDA Conversion tool
● Use to make batch changes to pre-RDA
access points in bib (or authority) records
o Changes are the same as those made during the “Phase 2” changes to the LC authority file
● Can also generate reports on other access
points that may require review
Changes that can be made to access points
● Expanding abbreviations (“arr.” “acc.” “unacc.” “Dept.”)
● Expanding or deleting “O.T.” and “N.T.”, as required
● Replacing “violoncello” with “cello”
● Replacing “Koran” with “Qur’an”
● Changing “Selections” in $a or $t to “Works. $k Selections”
● Shifting $k “Selections” before $l and $f
● Adding or removing parentheses for certain instances of $c in personal name access points
● Expanding abbreviations of months
● Replacing “fl.” and “ca.” with “active” and “approximately”
● Replacing “b.” or “d.” with open hyphenated dates
● Expanding dates (e.g., changing “1090 or 91” to “1090 or 1091”)
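To give a feel for what a few of these changes involve, here is a rough sketch of three of them as plain string substitutions. It is not the RDA Conversion tool's logic: the real program works on specific subfields and handles many more cases, and the function name and sample strings here are invented.

```python
import re

def modernize_access_point(text):
    """Apply three of the listed changes to a heading string."""
    # "fl." -> "active", "ca." -> "approximately"
    text = re.sub(r"\bfl\.\s*", "active ", text)
    text = re.sub(r"\bca\.\s*", "approximately ", text)
    # Expand a shortened second date: "1090 or 91" -> "1090 or 1091"
    text = re.sub(r"\b(\d{2})(\d{2}) or (\d{2})\b", r"\1\2 or \1\3", text)
    # "violoncello" -> "cello"
    text = re.sub(r"\bvioloncello\b", "cello", text)
    return text

example_dates = modernize_access_point("fl. 1090 or 91")
example_medium = modernize_access_point("violoncello, piano")
```

Running the function on text that has already been converted leaves it unchanged, which makes it safer to rerun on a partly cleaned file.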
Changes that can be made to access points
● The RDA Conversion tool allows some
“options” for access points…
● But you will probably want to follow standard
practice
Didn’t want to change dates in subject access points.
Reports generated for review
● Access points that might not be RDA compliant, or that
need human review to appropriately correct
o Musical ensemble terms in $m
o “Polyglot” or “&” in $l
o “Libretto” or “Text” in $s
o Treaties
o Conferences
o Personal name access point $c terms that might be improper
● Access points flagged due to possible typos, or errors in
subfield coding
Record Reloader
● An alternative to bulk import with webadmin
for overlaying existing Voyager records
● Can use VgerSelect to output a MARC file,
make changes, and then overlay with
Record Reloader
Record Reloader main screen
Considerations for options & settings in
Strawn programs
● Do you need to limit to one or more owning
libraries?
o Select the owning library/ies you’re interested in
● When making changes to the database, do
you want the changes to be reflected in the
Universal Catalog?
o Choose the appropriate “happening”/“cataloging” location
Selected options in URL Changer
Global Data Change
GSU’s experience with Global Data Change
● GSU and GSU Law ran a total of 3 successful data change jobs in 2013
o change 049 GLLM to GLLO
o add 007 fields to DVD records lacking them
o delete an obsolete 590 note from selected microform records
● GDC is not currently working for us!
o “record export failed”--error logs
o have tickets in with USG and Ex Libris
Basic steps for GDC
● Create a record set
● Create one or more data change rule sets
● Create a data change rule set group with one or
more rule sets
o keep it simple!
● Preview
● Run data change job
o everything works until we run jobs--they fail
My Add DVD 007 rule set
GDC preview of a “replace string with string” example
Cool things about GDC
● Save record sets, rule sets, rule set groups for reuse
● Multiple ways to create a record set:
o From a list of record IDs previously identified
o From a search of the same indexes as in the
cataloging client
o Scan an existing record set or the entire database
for desired criteria (set up scan rules & run scan jobs
within GDC)
● Preview
Not so cool things about GDC
● IT DOESN’T WORK (for us *right now*)
● Creating rules is not intuitive
● Takes some trial and error (this is why you preview)
● Mysterious “validation errors”
o Missing 005?
o Corrupted or obsolete data in LDR/008?
o These errors don’t cause the job to fail--these records are just skipped in a successful job
Validation rules in Cataloger’s Toolkit
Validation rules in Cataloger’s Toolkit
● Validation rules are extremely versatile at
making batch changes to bibs and MFHDs
● Like GDC, validation rules will make
changes directly to your database
● Like GDC, this method can be unintuitive...
Using custom Validation Rules
● Rules are contained in plain text files that
can be edited in a text editor
● Rules have their own peculiar syntax
● Cataloger’s Toolkit comes with a default set
of canned validation rules
● You can add new rules to the file or create a
supplemental file
Sample Validation Rules
● A rule to add the specified 007 to every bib
record that lacks an 007:
#4=BDFMPSU 007! F <37:007,,cr_cn_||||||||>
● A rule to perform a find and replace in every
MFHD 856 $u:
#4=H 856/u F <43:856/u,http://xxxxxx/,http://zzzzzz/>
Defining the records to be changed
● Rules are normally run on sets of records
specified by bib ID or MFHD ID in a text file
● These files can be generated with some
other tool, such as VgerSelect
Examples of simple Toolkit validation changes
● Batch changes to e-resource vendor records
o Adding and/or fixing 006-008 fields in bibs
o Adding and/or fixing GMDs
o Adding genre headings
o Adding and/or fixing 007-008 fields in MFHDs
o Standardizing SMD in 300 $a and deleting 300 $c
● Added proxy prefixes to our NetLibrary URLs
More complex projects using Toolkit validation rules
● Copied classification portion of call number
from bib to MFHD for our ebooks
● Added local call numbers in the form
“Streaming Video (Title)” to our streaming
video bibs and MFHDs
● Adding genre headings to our feature films
based on data compiled from IMDB by a
student assistant
IMDB feature film genre heading project
● Used VgerSelect to extract bib IDs and all
655 fields from our VHS and DVD records
● Used Excel to filter by “Feature films”
heading (1246 resulting bibs)
● Student assistant searched for titles in IMDB
and added genre information to spreadsheet
IMDB feature film genre heading project
● Used Excel to filter for particular genres
● Then filtered out records that already had
the corresponding genre heading
● Ran a validation rule on the remaining bib
IDs to add the corresponding heading, such
as:
#3=BDFMPSU 245 F <37:655,_7,|aWar_films.|2lcgft>
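The two filtering steps were done in Excel, but the selection logic can be sketched in Python as well. The row layout below (a bib ID, the 655 headings already on the record, and the genres found in IMDB) is an assumed shape for the spreadsheet, and the sample rows are invented.

```python
def bibs_needing_heading(rows, imdb_genre, heading):
    """Return bib IDs that have the IMDB genre but lack the matching 655 heading."""
    return [
        row["bib_id"]
        for row in rows
        if imdb_genre in row["imdb_genres"] and heading not in row["headings"]
    ]

# Hypothetical spreadsheet rows.
rows = [
    {"bib_id": "201", "headings": ["Feature films."], "imdb_genres": ["War", "Drama"]},
    {"bib_id": "202", "headings": ["Feature films.", "War films."], "imdb_genres": ["War"]},
    {"bib_id": "203", "headings": ["Feature films."], "imdb_genres": ["Comedy"]},
]

# One bib ID per line, ready to use as the input file for the validation rule.
war_ids = bibs_needing_heading(rows, "War", "War films.")
id_file_text = "\n".join(war_ids)
```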
MarcEdit by Terry Reese
Possible uses of MarcEdit
● Edit a file received from a vendor or other source
before batchloading
● Use the Connexion Bib File Reader plug-in to work
with a group of WorldCat records
● Use the WorldCat APIs to search, update holdings,
create or replace records, and more
● Extract a MARC file from Voyager with VgerSelect,
make changes, and overlay
Recent uses of MarcEdit at GSU
● Clean up ETD records harvested from the IR into
WorldCat before loading into Voyager
o I have 15+ MarcEdit steps, but can pre-set up and save most of them
in Task Lists!
● Add 9XX field(s) to most vendor batchloads to identify
different e-resource sets, e.g.,
o SFX MARCit
o YBPeBookApproval
ETDs & MarcEdit
● Create a Connexion local save file of harvested records
since last Voyager load
● Use Connexion Bib File Reader plug-in to bring into
MarcEdit
● Basic cleanup, including:
o add 007
o delete unwanted fields generated during harvesting
o add a 970
o RDA Helper to add 33X
o create local headings for the discipline using Swap Field
Harvested ETD record “before” MarcEdit
=LDR 00000cam 22000003u 4500
=001 868162197
=005 20140226122043.0
=008 140214s2012 xx om 000 0 und d
=040 \\$aGSU$cGSU$dGSU
=042 \\$adc
=049 \\$aGSUU
=100 1\$aBrien, Spencer T.
=245 10$aThree Essays on the Formation and Finance of Local Governments$h[electronic resource].
=260 \\$bScholarWorks @ Georgia State University$c2012-01-06T08:00:00Z
=500 \\$aapplication/pdf
=502 \\$aThesis / Dissertation ETD
=653 \\$aproperty tax
=653 \\$acontract cities
=653 \\$alocal government
=653 \\$atax relief
=653 \\$agovernment outsourcing
=653 \\$atax exemption
=655 \4$atext
=786 08$nPublic Management and Policy Dissertations
=856 40$uhttp://scholarworks.gsu.edu/pmap_diss/37$3Item Resolution URL$xThis 856 field was generated using the WorldCat Digital Collection Gateway$yView online$iPut this Resolution URL in a web browser to view this item.
=856 40$uhttp://scholarworks.gsu.edu/cgi/viewcontent.cgi?article=1036&context=pmap_diss
=029 0\$aGSU$boai:scholarworks.gsu.edu:pmap_diss-1036$chttp://scholarworks.gsu.edu/do/oai/ publication:pmap_diss$tDGCNT
Harvested ETD record “after” MarcEdit
=LDR 00000cam 22000003i 4500
=001 868162197
=005 20140226122043.0
=007 cr
Added basic 007 using Add/Delete Field
=008 140214s2012 gau om 000 0 eng d
=035 \\$a(OCoLC)ocn868162197
Created 035 from 001 using Swap Field & Edit Subfield
=040 \\$aGSU$cGSU$dGSU$beng
=042 \\$adc
=049 \\$aGSU1
=100 1\$aBrien, Spencer T.
=245 10$aThree Essays on the Formation and Finance of Local Governments.
=260 \\$aAtlanta, Ga. :$bGeorgia State University,$c2012.
Added 260 $a, edited $b with Edit Subfield, stripped extra characters from $c with a regex in Find/Replace
=300 \\$a1 online resource.
=336 \\$atext$btxt$2rdacontent
=337 \\$acomputer$bc$2rdamedia
=338 \\$aonline resource$bcr$2rdacarrier
Added 33X fields with RDA Helper (requires correct 007)
=500 \\$aapplication/pdf
=502 \\$aThesis / Dissertation ETD
=653 \\$aproperty tax
=653 \\$acontract cities
=653 \\$alocal government
=653 \\$atax relief
=653 \\$agovernment outsourcing
=653 \\$atax exemption
=690 \\$aDissertations $xPublic Management and Policy.
Created a 690 from the 786 using Swap Field & Edit Subfield
=786 08$nPublic Management and Policy Dissertations
=856 40$uhttp://scholarworks.gsu.edu/pmap_diss/37
Deleted unwanted 856 subfields with Edit Subfield
=029 0\$aGSU$boai:scholarworks.gsu.edu:pmap_diss-1036$chttp://scholarworks.gsu.edu/do/oai/ publication:pmap_diss$tDGCNT
=970 \\$aETDBulkMarch2014
Added 970 with Add/Delete Field (so I can identify records if things go wrong!)
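Three of the steps annotated above (add a basic 007, build an 035 from the 001, add a 970 batch label) can be sketched as operations on the text lines of a record. This is an illustration of the logic, not a replacement for MarcEdit's Add/Delete Field and Swap Field tools. It assumes each line looks like the listing above, with "=" plus a three-character tag, one space, then the data. The helper names are invented, and the spacing may need adjusting to match your actual files.

```python
def tag_of(line):
    """Return the 3-character tag of a line such as '=001 868162197'."""
    return line[1:4]

def insert_field(lines, new_line):
    """Insert new_line before the first field whose tag sorts after it."""
    new_tag = tag_of(new_line)
    for pos, line in enumerate(lines):
        tag = tag_of(line)
        if tag != "LDR" and tag > new_tag:
            return lines[:pos] + [new_line] + lines[pos:]
    return lines + [new_line]

def clean_record(lines, batch_label):
    """Add a basic 007, an 035 built from the 001, and a 970 batch label."""
    tags = {tag_of(line) for line in lines}
    result = list(lines)
    if "007" not in tags:
        result = insert_field(result, "=007 cr")
    if "035" not in tags and "001" in tags:
        control_number = next(l[4:].strip() for l in lines if tag_of(l) == "001")
        result = insert_field(result, r"=035 \\$a(OCoLC)ocn" + control_number)
    return insert_field(result, r"=970 \\$a" + batch_label)

# Shortened sample record based on the listing above.
record = [
    "=LDR 00000cam 22000003i 4500",
    "=001 868162197",
    "=005 20140226122043.0",
    "=008 (fixed field data)",
    "=245 10$aThree Essays on the Formation and Finance of Local Governments.",
]
cleaned = clean_record(record, "ETDBulkMarch2014")
```

The 970 label plays the same role as in the slides: it makes the whole batch easy to find again if a load goes wrong.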
Don’t fear the RegEx
● Ask the MarcEdit list for help (or search the
archives)
● My First RegEx!
o I had a bunch of dates in 260 $c like this:
▪ $c2013-05-01T07:00:00Z
▪ $c2011-05-07
o Total length varied, but I just wanted to keep the first four characters
My First RegEx!
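The regular expression used in MarcEdit is not reproduced in this transcript, so the pattern below is a stand-in written in Python to show the idea: keep the four-digit year that follows $c and drop the rest of the subfield. MarcEdit's Find/Replace has its own regex dialect, so the pattern and replacement syntax would likely need adjusting there.

```python
import re

# "$c", then four digits to keep, then everything up to the next subfield.
DATE_IN_260C = re.compile(r"(\$c)(\d{4})[^$]*")

def keep_year(field):
    """Reduce the date in $c to its first four characters."""
    return DATE_IN_260C.sub(r"\1\2", field)

long_form = keep_year("$c2013-05-01T07:00:00Z")
short_form = keep_year("$c2011-05-07")
```

Because the pattern stops at the next "$", any subfields before or after $c are left alone.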
MarcEdit recommendations
● Ask for help on the MarcEdit listserv
● Save your original file before making changes
● Save frequently after *successful* edits
● There is an Undo button, but it will only undo your last change
● MarcEdit is great when you need to apply multiple
changes to a group of records
● Just. Try. It. (Trust me!)
An alternative to GDC
● Output a MARC file from Voyager with
VgerSelect
● Make changes with MarcEdit
● Overlay the records with Record Reloader
● (VgerSelect & Record Reloader require
ODBC drivers)
Putting it all together: a Special Collections project at GSU
Splitting the Rare location
● Special Collections staff separated the *really* rare
items from the existing Rare location for a new
Treasures location
● Most items in Rare were integrated into the
existing general Special Collections (Spec.)
location
● ~1000 records to Treasures
● ~1200 records to Spec.
Lots of reports on the Rare location
● VgerSelect or Access Reports to generate bib IDs for all records with a
Rare MFHD attached
● VgerSelect to retrieve the existing 590 notes and more
● Access Reports to find *all* the attached MFHDs
o Some had existing Spec. copies or multiple Rare, requiring some
extra attention to 590 notes
o I created a new table of the bib IDs & ran a query to retrieve MFHDs
● Also needed MFHD IDs & suppression status
o need Access Reports to get suppression status
Location Changer to update MFHDs & items
VgerSelect + MarcEdit + Record Reloader to update bibs
● Output a MARC file with VgerSelect using
the appropriate list of bib IDs
● MarcEdit
o Edit Subfield to change location codes in 049
o For Rare to Spec lists, Add/Delete Field to add a standard 590:
▪ Special Collections copy: Rare Book Collection.
● Record Reloader to overlay bibs
Bibliographic maintenance projects
● Use VgerSelect and Excel to locate invalid, obsolete,
contradictory or missing data, such as:
o “N/A” (or other invalid codes/typos) in 008/Lang
o Sound recordings, videorecordings, maps, microforms lacking 007s
o records with a 240 $l and no 041 (or an 041 0X)
o incompatible 008/DtSt, 008/Date, 260/264 $c combinations
● List of possible maintenance projects at:
http://www.carli.illinois.edu/products-services/ishare/cat/Cat-maintpriority
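As a sketch of the first check in the list, the language code sits at bytes 35-37 of the 008, so an exported 008 string can be tested against a list of accepted codes. The code list here is a tiny illustrative subset rather than the full MARC language code list, and the helper names and sample 008 values are invented. VgerSelect can also export individual fixed field byte positions, which would make the slicing step unnecessary.

```python
# Illustrative subset only; a real check would use the full MARC code list.
VALID_LANG_CODES = {"eng", "fre", "ger", "spa", "und", "zxx"}

def lang_code(field_008):
    """Return 008/35-37 (language), or None if the field is too short."""
    return field_008[35:38] if len(field_008) >= 38 else None

def has_invalid_lang(field_008):
    """True when the language bytes are missing or not an accepted code."""
    return lang_code(field_008) not in VALID_LANG_CODES

def make_008(lang):
    """Build a hypothetical 40-character 008 with the given language code."""
    return "140214s2012    gau" + " " * 17 + lang + " d"

good = make_008("eng")
bad = make_008("N/A")
```

This only works if the export keeps the blanks inside the 008, since the check depends on byte positions.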
Bibliographic maintenance projects
● RDA conversion of descriptive fields
o Gary’s RDA conversion program
o MarcEdit RDA Helper
● Upcoming changes to music subject headings
o Musical forms will be moved from LCSH to LCGFT
o LCMPT terms in 382?
● Changes needed to facilitate data migration?
Strawn’s RDA Conversion tool
● Can be used to make RDA style batch changes to data
other than access points, as well
o Generate 336, 337 and 338 fields from data in record -- or -- force
336, 337 and 338 to certain values for a given set of records
o Expand certain abbreviations in certain descriptive fields (see Toolkit
documentation)
o Remove GMDs
o Reformat a 502 into RDA style
● Tool reports when selected data cannot be converted
o For example, if data such as 007 and SMD are missing, ambiguous or
contradictory
MarcEdit RDA Helper
● Similar capabilities to Gary’s RDA Conversion tool
● Will also generate the new 34X fields
● Option to generate GMDs on RDA records if
desired
● GSU uses it to add 33X fields to ETDs
o limited testing on other files
o explore using on selected or all vendor batchloads?
RDA Helper default options
Recommendations
● Start with easy programs, like Bib Delete or Location
Changer
● Test on small groups of 5-10 records
● Get comfy with reports! (Access Reports and/or
VgerSelect)
o Generate lists of record numbers to work with
o Check your results
Recommendations
● *Always* preview/test
● Give yourself time to experiment and learn
● Ask questions…
o MarcEdit-L
o Voyager-L for Strawn programs
Coming attractions in Voyager 9
● Some Location Changer functionality incorporated into
Pick and Scan
● Options to delete, suppress or unsuppress added to GDC
● Option to export record sets and files added to GDC
● See Voyager product update from ELUNA and Voyager 9
release notes (requires login to Ex Libris Documentation
Center)
Links
● Gary Strawn’s programs
o His excellent documentation is also available here
o RDA Conversion program
● Voyager-L
● ELUNA archives (login required)
o good source of presentations on GDC and Strawn programs
● MarcEdit download
● MARCEDIT-L discussion list
● Terry Reese’s YouTube videos on MarcEdit
● Fear No Longer Regular Expressions
Contact
Adam Kubik
Susan Wynne
https://www.facebook.com/susan.wynne.94
https://gsu.academia.edu/SusanWynne