Acquisitions/Bibliographic Control, Chester Fritz Library         

ALA Annual -- Chicago, IL -- July 5-12, 2000


WORKSHOP

Metadata: Libraries and the Web - Retooling AACR and MARC21 for Cataloging in the Twenty-first Century. July 6-7.

This was an excellent workshop. A very large program book with pages for notes was produced for the workshop. My notes are recorded there. Libraries are taking steps to create data and operate in the digital environment of the Web. Many projects were described. Most were able to address repeatable data by using computer operations, but the intellectual work, particularly for title, author, and subject access, still rests with the librarian. I missed the last part to attend the WWPUG meeting. General contents:

Keynote: Metadata and access to information resources

Session 1. Methods of providing access to web resources

The Delsey report, the Cardinal Principle, and (ER) Harmonization: AACR complexities, necessary and otherwise

MARC 21 as a metadata standard: a practical and strategic look at current practices and future opportunities

Seriality: Seriality and AACR2; ISSN: link and cross-link for data and metadata; Struggling toward retrieval: can alternatives to standard operating procedures help?

O brave old world: using AACR2R to catalog Web resources

Session 2. Methods of providing access to web resources

CORC and Dublin Core: new methods of minimal-level cataloging

Visionary or lunatic: one CORC participant's psychiatric evaluation

Working toward a standard TEI header for libraries

Resource Description Framework (RDF) and its markup language, eXtensible Markup Language (XML), Libraries and future of the Semantic Web: RDF, XML and alphabet soup

ISO standards development for metadata

Cooperative development of subject-oriented web pages and web pages created from catalog records (and vice versa): Anticipating the deluge: the INFOMINE project and its approach to metadata; Abracadabra! From a web site to a MARC record

Session 3. Growing your own digital library at home

Navigating LC's cartographic treasures

VARIATIONS: creating a digital music library at Indiana University

Dublin Core for Digital Video clips

Interpretive encoding of electronic text using TEI-lite

Session 4. Metadata farther afield

Developing the using of metadata at the National Library of Medicine: from decision-making to implementation

Integrating bio-collection databases: metadata for database semantics

A picture is worth a thousand words: metadata for art objects and their visual surrogates

DDI, the Data Documentation Initiative - an introduction to the standard and its role in social science data access

Archival finding aids as metadata: Encoded Archival Description

Government data, state efforts - Colorado Digitization Project

Instructional management system (IMS) standard: solutions for interactive instructional software and dynamic learning

Session 5. Looking toward the future: what's ideal? What's next?

Vision for the future: accessing webpages

Metadata, a view from the trenches

Metadata: hype and glory

 

FRIDAY, July 7, 2000

WWPUG Minutes

http://www.und.nodak.edu/dept/library/Departments/abc/ala.htm

ALCTS. SAC Subcommittee to Study Metadata and Subject Analysis

I attended the SAC Executive Committee with Lori Robare to help explain the program proposal. Hence I missed much of the Subcommittee meeting.

FAST. OCLC is making progress, only a little more slowly than they expected. They had to identify forms that were in subdivisions. They are still working on building cross-references for geographic headings and creating authority records. They reviewed 7 million 650's and 1 million 651's. They are now validating and trying to split headings into facets; then they will move to authority work. The idea is that everything would be an established heading - no patterns.

OLAC meeting.

Came in the middle of the meeting (after the SAC meeting).

Report from the Subcommittee on Source of Title Notes for Internet Resources.

We need to use the journal title from the first journal issue represented on the site, not the "home page". There are problems like, first screen and initial screen mean the same thing. "Menu" is a fuzzy term because it is a level up but the object has no title; e.g. WebSPIRS has a list you can pick titles from but when you do, you get a search screen. The rule for title source terms will probably be: pick the best title from the whole resource (just like videos) and say which one you picked. A "page" is what you see AND can scroll up and down with.

Discussion of CC:DA Task Force on an Appendix of Major and Minor changes

( http://www.ala.org/alcts/organization/ccs/ccda/tf-appx1.html )

The appendix will contain guidelines for when to make a new bibliographic record. You will ask: is this the same thing? Minor - when the differences don't make a difference, either ignore or just mention in a note. Major - significant, make a new record. John Attig wondered why area 7 and 8 are not mentioned and when in the actual text will you refer from the text to this appendix. Musical presentation statement should be a major change. We need to focus on rule-based or cataloger wording vs. actual text on the item. E.g. 516 can be worded several ways but the document is the same, but the creator's name on the item IS his name. A difference in distributor (AV) is a major change.

Discussion of: Revising AACR to Accommodate Seriality: Rule Revision Proposals

( http://www.nlc-bnc.ca/jsc/ch12.htm )

Ch. 12 and Ch. 9. Extending the scope of serials to continuing (i.e. integrating) resources is intended to address loose-leafs in print and databases and web sites (ch. 9). A big thing to address is how you define how it changes. On the web, the record will describe the most recent and the old stuff will move to notes. Whereas, with a looseleaf, you used the original and added notes about new stuff; it was pointed out the latest manual now says to use new title. Jean Hirons included many serials changes incorporating CONSER practice and some international practice and policy. If something is electronic, by definition you have to look at ch. 9. Chief source is the biggest issue - for a journal it is important to identify the title; it is not so important for a web site. You have to decide if something is integrating - there has to be an intent to change, not just because the web is a changeable entity. LC has proposed getting rid of area 3 (file characteristics) and also removing the footnote about description, which would take away the place to say if it has sound and color.

MARBI. #u may be just fine, but sometimes a URL includes authentication info, hence #y has been proposed (DP 119)

SATURDAY, July 8, 2000

Ex Libris Breakfast - Key to Successful Digital Libraries

See http://escholarship.cdlib.org/rtennant/presentations/ala2000/exlibris

Roy Tennant's presentation.

Interoperability. There is a lot of information on the Web being put out by separate libraries, but users don't know where to go. The ideal would be a search that would go to all kinds of places, de-dup all the responses, and present the results to the user. See http://searchlight.cdlib.org as a small attempt.

Facets of interoperability.

Technical: TCP/IP, HTTP, XML, SFX for linking

Semantics: agreement on meaning (we often use multiple labels for one meaning, e.g. author or creator); possible solution is thesauri with crosswalks

Political / Human: will require lots of communication

Intercommunity: e.g. libraries, museums, etc.; different descriptions, different terminology; need to focus on common problems and provide for customization

International interoperability: there are different ways of looking at things; language and script differences make it difficult to match meanings

U.K. interoperability, see: http://www.ukoln.ac.uk/interop-focus/

Union catalogue projects, see e.g. http://www.nlc-bnc.ca/resource/vcuc

Methods of achieving interoperability. Have to adhere to some standards; protocols; guidelines.

    1. Standards - hard to get agreement and commitment to follow
    2. Guidelines

Our users deserve our efforts to bring things together

SFX presentation by Herbert Van de Sompel

We face:

SFX tries to solve:

The SFX framework provides interoperable linking between parties and context-sensitive linking

You could look at it like a new module or linking server in your automated system. It is a hub. The user always goes to it first; the user is identified as are the resources he can go to.

All the parties (OPAC, full-text, etc.) have to communicate; proposes Open URL to solve this. If the user has the right to access a resource, it will create a URL with metadata about the object. It is a just-in-time linking service. The user clicks the URL and gets the information they wanted. Because the URL contains metadata, the hub can obtain information about use.

Why use this hub, SFX? It is a neutral resource you in your library can control vs. the vendor defining access. The only thing you ask of the vendor is to offer the Open URL, which should be created following a standard. You get consistency in your linking service. You get control. You can switch providers and your users will never know. It is on-demand transportation of metadata from a resource to a context-sensitive service. See http://www.sfxit.com/OpenURL . SilverPlatter is an Open URL provider. E.g. DOI proxy server -- link dx.doi.org/---- VS. Open URL http://sfx.aaa.edu/menu?id=doi:----
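As a rough illustration of the Open URL idea in these notes, the Python sketch below packs citation metadata into the query string of a link to a linking server and reads it back on the resolver side. The resolver address is modeled on the sfx.aaa.edu example above; the key names (genre, issn, volume, spage, date) and the citation values are placeholders assumed for illustration, not taken from the presentation and not checked against the Open URL specification.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver address, modeled on the example in the notes.
RESOLVER_BASE = "http://sfx.aaa.edu/menu"


def build_openurl(base, metadata):
    """Return a link that carries citation metadata to the linking server."""
    # urlencode percent-escapes the values so they survive inside a URL.
    return base + "?" + urlencode(metadata)


def read_openurl(url):
    """Recover the metadata on the resolver side (first value per key)."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


# Placeholder citation; key names and values are illustrative assumptions.
citation = {
    "genre": "article",
    "issn": "1234-5678",
    "volume": "12",
    "spage": "34",
    "date": "2000",
}

link = build_openurl(RESOLVER_BASE, citation)
print(link)
print(read_openurl(link))
```

Because the link targets the library's own hub and not a vendor, only RESOLVER_BASE would change if the library switched providers.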

Features of Open URL:

MARBI

MARBI web site has link to UK harmonization discussion.

Jean Hirons on seriality

This is a three-step process: 1) change rules; 2) encode it; 3) make it work with CONSER

The proposal is to use the 008 for serials, but use a different code

Main concern is what is "integrating" and what is finite or what is continuing. With this proposal it says you just have to decide if it is integrating - don't need to decide if it is finite or not. "s" would be "seriality" rather than "serial"; "I" would be "integrating"

Propose new Freq: k for continuously updated.

For type of serial, propose adding "l" for loose-leaf. Should we add "w" for web site, "d" for database; could use "n" for none of the above; blank is already annual serial

Successive / latest 0/1. Propose adding "2" for integrating and then use 1 record for those that the web site loses the older titles; you keep one record and change the code and rearrange or add new title.

Integrating resource - means what the web is doing

Integrating entity - means how you're encoding your bib and what fields you choose

246 or 247 ?? at issue

For integrating, latest is 246

Looking at an internationally accepted uniform title - what field?

So if we catalog a journal as integrating, then it would have 245 with newest and 246 or 247 with earlier title.

Publishing - multiple 260's with indicators or publishing info in 550 - area 4 vs. area 7. Maybe 260 # original, 3 current 4 intervening

Do not pay attention to change of place; pay attention to change in publisher

Propose to say 245 is newest and also 260 and add "note previous" whether 246/7 or 550; reason for multiple 260's would be for union listing

How to connect 260 and 362 - could use #3 to say which number or year you got something from the new publisher

SAC Subcommittee on Metadata and Classification Forum

( http://www.ala.org/alcts/organization/ccs/sac/metaclassfinal.pdf )

See handouts.

The Formation of the Subcommittee on Metadata and Classification / Mary Dabney Wilson, Texas A&M University Libraries

Developing a Questionnaire to Evaluate the Use of Classification to Organize Web Resources / Julianne Beall, Library of Congress.

Speakers discussed the formation of the subcommittee, the rationale behind its charge, and the methodology used to develop the metadata and classification site questionnaire and subsequent analysis. In addition to discussing the major findings of the committee, the presentations discussed the successes, problems, and lessons learned in the evaluation process.

They asked what subject rich elements were being used and found very few full descriptions about sites. They found that sites used a variety of ways to do hierarchies and a variety of levels of hierarchy and some did nothing. A chief benefit of classification is that it is language independent, except that captions were usually in one language. Classification is usually thought of for shelving, but when used as a browsing mechanism it puts sites that are related together. Should have an arrangement that puts previous/next AND up (broader) and down (narrower) arrows on the screen to navigate.
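The navigation idea above (previous/next plus broader/narrower) can be sketched in a few lines of Python. This is only an assumption-laden illustration: it treats three-digit class numbers as a hierarchy by their trailing zeros, and the numbers and shortened captions are sample data, not an authoritative schedule or any site's actual implementation.

```python
# Sample three-digit class numbers with shortened captions (illustrative only).
CAPTIONS = {
    "500": "Science",
    "510": "Mathematics",
    "520": "Astronomy",
    "530": "Physics",
    "531": "Classical mechanics",
    "532": "Fluid mechanics",
}


def broader(number):
    """Return the parent class (531 -> 530, 530 -> 500); main classes have none."""
    if number[2] != "0":
        return number[:2] + "0"
    if number[1] != "0":
        return number[0] + "00"
    return None


def narrower(number, captions):
    """Return the classes directly below the given class, in shelf order."""
    return sorted(n for n in captions if broader(n) == number)


def previous_next(number, captions):
    """Return the neighbouring classes at the same level under the same parent."""
    group = sorted(n for n in captions if broader(n) == broader(number))
    i = group.index(number)
    before = group[i - 1] if i > 0 else None
    after = group[i + 1] if i + 1 < len(group) else None
    return before, after


print(broader("531"), narrower("530", CAPTIONS), previous_next("520", CAPTIONS))
```

The four results map directly onto the up, down, previous, and next arrows suggested for the browse screen.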

Recommendations: need more specificity in assignment, should display the number of resources included in each main class, provide ability to move between resources in the same class, display captions, make caption terms searchable, provide means of limiting large retrieval sets; resource surrogates (i.e. bib record data) are desirable.

The Application of the DDC classes in NetFirst and in the Cooperative Online Resource Catalog (CORC) Project / Diane Vizine-Goetz, OCLC

Diane discussed the evolution of the use of Dewey to provide subject access to electronic resources in NetFirst and CORC. NetFirst records all have DDC in them; some have several. They were created by OCLC staff. They used a table to correlate LCSH subjects with DDC. NetFirst will have all CORC and Intercat records. Dewey subjects are at 3rd level Dewey categories (captions). Dewey facilitates browsing. The WebDewey product is a browser-based interface to the full version of DDC. LCSH is linked intellectually and statistically to DDC. It is a powerful tool for well-developed hierarchies, rich network relationships, and finding well-defined categories. OCLC is working on an XML presentation of DDC. It is not user-friendly yet, but it can encode it with language, display, printing and authentication.

 

Future Applications of Metadata Encoding Schemes For Classification / Frank Cervone, DePaul University

( http://www.depaul.edu/~fcervone )

XML is important for data interchange and application interchange.

Metadata provides descriptive information; it is not meant as a transformative technology. There are lots of kinds of metadata (MARC, DC, TEI, EAD, GILS, PICS, GEM, IMS). How do they relate? XML can help us do the relating. XML is a transformative technology. It enables a common methodology for constructing crosswalks. Z39.50 is designed just for cross-system searching; XML is much more versatile, especially using RDF, which defines the relationships between elements in different metadata. Because of its nature, XML enables hierarchical arrangement: browseable hierarchy, cross referencing related fields, advanced limiting. It enables the use of multiple classification systems (and/or those linked to subjects). It blurs the lines for the user because it can cross metadata types. It can operate in the background. Sometimes classification or subjects just don't map somewhere else - we shouldn't get carried away with the technology when the result might yield nothing.

MAGERT - Acquisitions of geospatial data

Issues in acquiring data: costs of both governmental and commercial international data, multiplicity of formats, licensing and use, integration with extant data.

Issues with the data: metadata, compatibility, convertibility, attribute data, permissions for modification, updates, relevance to need, longevity, space requirements, vendor support

The US government drops stuff in your lap; the rest is hard to get. It needs aggressive and knowledgeable collection development.

E.g. http://www.ecai.org Electronic Cultural Atlas Initiative. It has a focus on metadata with temporal and spatial attributes for all data. It has a time map which is hard to include in GIS, built to work with ArcView shapefiles.

John H. Long from the Newberry in Chicago talked about their project, Atlas of Historical County Boundaries. County boundaries have changed 4-5 times for many counties. They decided to use current state boundaries to discuss counties. Parts of Minnesota were under 9 different entities and historical boundaries often include North or South Dakota.

Local data is usually done in more detail and serves immediate needs. It is often created for the agency's purpose or need and needs to be manipulated to library users' needs. We often have to wade into political arenas to obtain GIS data. Local materials are often not free. Contractors may retain right to sell the data; orthos are more likely to cost than remote photos. Freedom of Information Act (FOIA) does not guarantee digital access.

To obtain local data: start small - work with your faculty; facilities management may create maps; survey agencies you can think of to see what they have - maybe you can work out an exchange; you might be able to catalog and organize data for them in return for getting a free copy; we are in a world where we have to build contacts and contracts to obtain data. Statewide research contracts may result in large amounts of data. The library may be able to participate in organizing or creating portions of the data for the project in return for obtaining a copy of or access to the data.

If you are looking for free data, remember it takes work so it isn't free. What is free on the web is often old; new stuff costs money. Sometimes what you find is just USGS stuff repackaged.

SUNDAY July 9, 2000

OCLC Breakfast

See handout

OCLC seems to be encouraging CATME or web. CATME is a Windows product, not just a terminal emulator. To insert a line, press insert at end of line, press enter, create SOME for new line. They are trying to figure out how to add NACO work. Can lock in batch and replace records and upgrade. Passport will not be enhanced. It is hard to go to authority files to verify headings while doing copy cataloging; it forces you to go to the local system. With PALS that means the base record is bad, PALS doesn't allow cut-n-paste easily, takes more steps to process, more segregation of books by how hard they are to catalog, you have to open Passport to find out if your locked record was replaced.

Is MARC dead? RUSA program very well attended.

Joe Matthews, Martin Dillon, Karen Coyle, Jerry Campbell

MARC standard - Joe Matthews

Martin Dillon

See http://www.oclc.org/institute.index.htm - Select: Martin Dillon's ALA Annual presentation 'Is MARC dead?'

What does it mean to say MARC is dead? Do we keep AACR but use XML instead of MARC; or, keep MARC in new formats like DC, but let's abandon AACR2; or, both are dead and full-featured bibliographic systems are no longer desirable.

But right now we have a LOT of MARC records. Should it be dying? MARC-AACR will continue to be significant as long as paper is significant to scholarship. Microsoft predicts 50% of book reading will be electronic, yet US book production trends show a decline in 1994/95 but now it is higher than it was in 1992.

The Web is out there now. There are 48 million sources (see Nature July 1999, p. 107-109). Web content doubles every twelve months; the arrearage is already incredible.

MARC-AACR vs. Dublin Core. Keyword is low richness and low effectiveness. Dublin Core is higher for each; and MARC-AACR is high for each - is there something better? MARC has many strengths; Dublin Core is based on XML, is international and web-based.

    1. Libraries need a form of bibliographic description appropriate for the web
    2. Dublin Core is becoming a standard
    3. MARC community would benefit from library's version of the Dublin Core
      1. training methods are already in place
      2. existing staff can do Dublin Core
      3. Thousands of catalogers could be doing it right now

Karen Coyle

Karen tried to address the complaints. Is our problem the container or contents? Is it the MARC record or systems? We are talking around the problem.

    1. MARC is not used beyond the library community. It is not a problem, because it is highly coded and content is unbelievably good - it could be in any format. If the content is good and marked according to a standard, you can move it to any format
    2. MARC - it is a format for bibliographic data, not anything else; but for that it is good. Its purpose is to share and avoid duplication of work. It isn't meant to do other library functions. We have problems like 856's on records with local library access modes; issue of multiple vs. single record which then is not shareable.
    3. Often limited library systems, not the format
    4. We have sometimes kludged things into MARC that may have better ways to be handled, e.g. Community Information Format which is now better as a database on the Web

She explained we are working with the wrong model. Why do we force users to get through the catalog to the web? It is overwhelming. Each person should have their own information space.

We need a core bibliographic record in some format so we can share records, whether MARC, XML, or whatever.

Don't shoe-horn other stuff in the bib record.

This is a question about library systems.

Let's use the right tool for the job.

OPAC     Web     A & I db     e-journals
  |       |         |             |
          Overwhelmed user

OPAC     Web     A & I db     e-journals
                |
Remote interaction (e.g. Z39.50, COBRA, SFX)
                |
          Enhancements
                |
              Tools
                |
Happy User with own information and profiles for preferred searching, saving, etc.

Jerry Campbell

MARC should have been a failure. It is a failure because we have been unimaginative using it. We didn't reconceptualize it for the digital world.

MARC was developed to embody AACR which at the time was a fundamentally brilliant achievement.

MARC succeeded overwhelmingly to the point of being a religion - i.e. without question.

He believes AACR is the villain.

    1. cost to create
    2. changes are coming

Let's follow Dublin Core; keep it simple; do not turn it into MARC

Our online systems ought to be able to handle XML, EAD, images, and possibly MARC-AACR

Comments following presentations:

Libraries are not up to the standards of dot.com 's

Short of creating a better Web presence, we will lose our patrons; we are not the only game in town

One point we have vs. business - is we provide information without being watched (which marketers do)

Users see libraries as a place of last resort; they are confused by stuff in OPACS

We refuse to change and we are losing our customers; we need visionary leadership

What about future users? What will they ask, that if we don't put it in the record now, it will be lost? We still need to get stuff cataloged faster, forget the unnecessary stuff

We should be able to see hierarchies

Quality issue has been held high by libraries; there is a change in the environment.

Managers in libraries don't put money behind catalogers to allow them to try new things. We need to demand of directors to try new things. Catalogers feel managers won't support them. Managers feel catalogers are "stuck in the mud" which is what happens when your only priority is get the books out

Time spent cataloging is not always the fault of AACR or MARC - it is the trying to figure out what on earth you have in your hand and what you should do with it

We need much more continuing education. We have a major revolution to undertake. We need education. We can start with Dublin Core/MARC records.

Are we supposed to lower our quality to meet the dot.com 's? Are we the place of last resort because we are good or because we are bad?

The number of reference questions answered is declining

Users are willing to accept a bad, quick answer in place of a good, slow one.

Mr. Kilgour: sounds like it did 30 years ago; use the computer; he claims we will all do known item searches - at least the final search - but you need to do some other search to get there

Authority Control Interest Group

See handouts

David Miller at Curry College

He has been copying 150 records into his 155 index and occasionally gets rid of 680 and 053 fields.

Genre authorities: the genre index in his online database is for 655, with LCSH, GSAFD, MIM (now MIGFG moving image)

He is working on a project to convert 650 candidates to 655's

III can't change tags globally so he has to copy them and change them one-by-one

When he ends up with a copied record with 155, it is still valid in the fixed fields.
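The copy-and-retag step described above can be pictured with a small Python sketch. It is an assumption-heavy illustration, not III or OCLC functionality: a record is modeled as a plain list of (tag, value) pairs, the sample values are placeholders, and any reference fields are left untouched because the notes do not say how they were handled.

```python
def topical_to_genre(record, drop_tags=("053", "680")):
    """Return a copy of an authority record with 150 retagged as 155.

    record is a list of (tag, value) pairs; fields whose tag is in
    drop_tags are left out of the copy. The original list is not changed.
    """
    converted = []
    for tag, value in record:
        if tag in drop_tags:
            continue  # the notes say 680 and 053 are occasionally removed
        if tag == "150":
            tag = "155"  # topical heading becomes a genre/form heading
        converted.append((tag, value))
    return converted


# Placeholder record: the 053 and 680 values are invented for illustration.
topical = [
    ("053", "XX 0000"),
    ("150", "Diaries"),
    ("680", "Placeholder scope note"),
]

print(topical_to_genre(topical))
print(topical_to_genre(topical, drop_tags=()))
```

Passing an empty drop_tags keeps the 680 and 053 fields, which matches the "occasionally" in the notes.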

Adam Schiff at University of Washington

He also uses III

They proposed genre terms be in a separate index.

LCSH/AC they converted all to LCSH

They indexed #v as genre

It still isn't indexed in the OPAC for the public

Because III can do rotated headings, when the user searches "fi", the list starts with "fiction" but it isn't the beginning of the subject heading, it is the #v

In OCLC they write over 150 with 155, then export to their system and it just goes to their database index for genre

They decided to add #v Juvenile to headings when they got rid of LCSH/AC

GSAFD - relationships were not done carefully; RT's and BT's were either missing or incorrect

They have a problem with multiple thesauri and #v

They have added 670 fields with thesauri indication.

They are trying to include authoritative thesauri for other 655 and 670's; decided LCSH and GSAFD are primary, but when they don't address the subject, they use authoritative thesauri.

155 Poems

655 Poetry

670 LCSH ...

670 RBGENR

Need to educate reference librarians

Check http://staffweb.lib.washington.edu/monos/catdocs/ArtistsBooks.htm

Check http://catalog.lib.washington.edu/search/...

 

Brigham Young University

They are doing 655 but not #v.

They have a glitch - the #2 shows up in the OPAC.

There doesn't seem to be a way to search them to find them

SIRSI good at allowing you to create records for authority control

They converted or created records.

155 Historical fiction #2 gsafd

555 Biographical fiction #2 gsafd

555 #w h #a Edwardian novels #2 gsafd

670 [they don't commonly use]

Had to decide which fields they would call valid so they wouldn't report as errors: #2, #y #z

They also use free-floating subdivisions unless inappropriate

They use LCSH as a genre and do whatever with it you would if it was LCSH

They made an authority record in the subject index that says see the genre index.

There are 5 genre lists valid for archives - had to decide on the term from one thesaurus, therefore they need only one autho record for the idea.

150 Diaries is in the subject index

155 Diaries #2 is in genre autho index

They use LCSH first, then use other thesauri for certain topics of materials unless they have used some other term heavily in their library.

They are working on RBMS thesauri to reconcile conflicts (5 are listed in LC's list, but they are really subsets of rarebook thesauri)

AAT eg.

155 Architectural drawings #2 aat

455 Architectural drawing

555 #w g #a Drawings #2 aat

555 #w h #a Mechanical drawings ...

667 non public note

155 Diaries #2 rbgenr

455 Journals (notebooks) #2

555 #w g Record-keeping works #2 rbgenr

555 #w h Overland journals #2 rbgenr

Check: http://www.lib.byu.edu/~catalog/people/rlm/bsc/home.html

Gary Strawn at Northwestern on MeSH

They have 800 qualifiers (i.e. subdivisions), 19,778 main headings (mostly topical), 478,687 combinations and precious few genre

All their genres have [publication type]

008 says a language record is language otherwise they look like topical headings

MeSH geographic are not the same as LCSH

There is almost a complete lack of cross-references; instead they have codes for hierarchies

He has written a program to go through the tree structure to find broader and narrower terms and then generate 550's
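A minimal Python sketch of that idea follows; it is not Gary Strawn's program. It assumes tree numbers are dot-separated, so that dropping the last segment gives the broader term, and it writes the references in the #w g / #w h notation used elsewhere in these notes. The tree numbers and headings are invented sample data, and the sketch ignores headings that sit at more than one place in the tree.

```python
# Invented sample data: tree number -> heading (not taken from MeSH).
TREE = {
    "A01": "Body regions",
    "A01.100": "Abdomen",
    "A01.100.050": "Groin",
    "A01.200": "Back",
}


def parent_number(tree_number):
    """Drop the last dot-separated segment; top-level numbers have no parent."""
    return tree_number.rpartition(".")[0] or None


def see_also_fields(tree_number, tree):
    """Build 550 lines for the broader term (g) and narrower terms (h)."""
    fields = []
    parent = parent_number(tree_number)
    if parent in tree:
        fields.append("550 #w g #a " + tree[parent])
    for number in sorted(tree):
        if parent_number(number) == tree_number:
            fields.append("550 #w h #a " + tree[number])
    return fields


for line in see_also_fields("A01.100", TREE):
    print(line)
```

Walking every tree number in the file this way would produce the cross-references that the source data lacks.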

MeSH doesn't allow a word to be either topical or form.

Ann Della Porta of Library of Congress

SAC - Poor people's policy work has created 15 new headings

GSAFD on individual works of fiction - will be regular work on new acquisitions of American and English language literature

Still working on creating form authority records - 70% done and 2/3 done on form subdivision records

Authority records - still not displaying. Plan to display by the end of the year via Z39.50. Endeavor can identify in OPAC difference between MeSH and LCSH but neither deals with conflict

120,000 authority records have been corrected to indicator 1; no bib records are corrected

Pinyin. OCLC will convert 1XX fields in name authorities, personal, corporate, geographic, uniform, series. They are coordinating subject authorities and classification changes. OCLC will identify subfields in Wade-Giles - if fields already have Pinyin nothing will happen to them. LC will manually review records that appear to contain Pinyin AND Wade-Giles or something else. They have to watch for already converted names so they aren't changed again. In the 008/07 - romanization c=pinyin

Afro-Americans will change to African Americans - 580 subject records. They will start changing bibs in 2001

PCC Meeting

The Automation Committee is looking at better BFM procedures. Standards is considering Core Record for Collections. Training Committee - several programs are progressing

BIBCO - decided that only records with AACR2 compliant access points are labeled as BIBCO

Series - standardize captions in non-English languages; and series numbering should be displayed in the OPAC in numerical order

CONSER is working on bib level "I" and publication patterns

NACO training manual is being translated into Spanish, Chinese, etc.

SACO proposals were delayed due to online; still testing web

Carol Hixson - ALCTS has some training initiatives for continuing education that they want to coordinate with PCC

Joan Schuitema - Standards Committee really looking at the cross-reference recommendations from last Midwinter; they want to figure out a way to evaluate if core records are adequately effective in the online environment

Two new Task Forces: 1) 042 - maybe a place to say an old record has been updated to current AACR2 headings; 2) multiple manifestations of electronic resources (John Riemer)

Karen Calhoun - Automation Committee. Still working on aggregators. Plan a forum at Midwinter; "jake" - there is a jake to MARC utility available to convert info from jake to MARC. BFM - don't report changes on new changes to OCLC because WLN is working on it. OCLC is looking at adding NACO work in CatME

BIBCO is working on guidance for 78X linking fields on monograph records

Jean Hirons - reported on "seriality". The AACR revision for "I" has lots of support; CC:DA set up a committee to solicit comments. The question is, who will be in charge of integrating resource bibs, URLs, etc. Publication patterns (891) are nearly ready to take off. VTLS has been able to use it and is able to predict checkin immediately.

Cataloging Productivity (Production standards) by Sever Bordeianu.

At his library, they wanted to provide catalogers (para-professional) with completely flexible schedules so they wanted to know what to set as an expectation. There are all kinds of reasons for not setting quotas, but still people want to know what is reasonable. They had to try 2 different surveys to get information from other libraries. Since professionals do much more advanced and original work and things like NACO they could not get any consistent enough data to make any conclusions. They decided to assign points to tasks and they had to earn their points quota and were virtually free to go home. See Technical Services Quarterly v. 16, no. 3, 1999 and v. 17, no. 3, 2000.

There has been lots of controversy because professionals can't operate under the point system since they are faculty. The professionals also get points, but they are also evaluated on things like ALA, etc.

They don't have anyone checking quality. They still need to set that up.

Productivity has not gone up much at all; they just got trusted flexibility and better morale.

Compensation: if there is no money, it doesn't make any difference; if there is, the numbers ensure more money for those people with higher numbers

They have a gifts backlog to 1977, but no new books backlog. Gifts are not seen as a priority.

 

MONDAY, July 10, 2000

CORC Under the Hood

CORC is a SiteSearch application rewritten from C to Java where they are able to define classes. JASSI embeds language in web pages. Mantis is on top of JASSI. XML is used to control pages and displays of XML

XML - template to rapidly do different data elements -> populate -> populated template -> display -> HTML -> edit (back to populated template). From populated template, can do data extraction -> databases

When you ask for a form, XML generates the HTML form, posts the form, parses the form, makes application changes

Template THREE.xml - top level hooks

Data is stored in BER format. Element - qualifier - scheme - data
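
The element - qualifier - scheme - data layout can be pictured as a simple four-part structure. The following is only a sketch of that idea: the class name, the helper function, and the sample values are assumptions for illustration, and the BER encoding CORC actually uses for storage is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    element: str               # e.g. "title", "subject", "date"
    qualifier: Optional[str]   # refinement of the element, or None
    scheme: Optional[str]      # vocabulary or encoding scheme, or None
    data: str                  # the value; a Python str is already Unicode


def values(record: List[Statement], element: str,
           qualifier: Optional[str] = None) -> List[str]:
    """Return the data of every statement matching element (and qualifier, if given)."""
    return [s.data for s in record
            if s.element == element
            and (qualifier is None or s.qualifier == qualifier)]


# Made-up sample record, for illustration only.
record = [
    Statement("title", None, None, "Example Web resource"),
    Statement("subject", None, "LCSH", "Metadata"),
    Statement("date", "modified", "ISO 8601", "2000-07-10"),
]
```

Keeping the qualifier and scheme as separate parts, rather than folding them into the element name, is what lets a lookup such as values(record, "date", "modified") work without string parsing.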

Everything is stored in Unicode

Crosswalk philosophy is that it be bi-directional. It is fairly extensive - not everything can be controlled by tables - tables are in Excel. They match on display - tag must match in tables, indicators are matched by regular expression, optional match specified.
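
The table-driven matching described above (exact match on tag, regular-expression match on indicators) might look roughly like this. It is a minimal sketch, not CORC's implementation: the table rows, the tuple layout, and the function names are assumptions, and the scheme is folded into the qualifier column for brevity.

```python
import re

# Hypothetical crosswalk rows:
# (MARC tag, indicator pattern, subfield code, DC element, DC qualifier)
CROSSWALK = [
    ("245", r"..", "a", "title", None),
    ("246", r"..", "a", "title", "alternative"),
    ("650", r".0", "a", "subject", "LCSH"),
    ("856", r"4.", "u", "identifier", "URI"),
]


def marc_to_dc(fields):
    """fields: list of (tag, indicators, {subfield_code: value})."""
    out = []
    for tag, indicators, subfields in fields:
        for row_tag, ind_pattern, code, element, qualifier in CROSSWALK:
            # Tag must match exactly; indicators must match the pattern.
            if (tag == row_tag and re.fullmatch(ind_pattern, indicators)
                    and code in subfields):
                out.append((element, qualifier, subfields[code]))
                break
    return out


def dc_to_marc(statements):
    """statements: list of (element, qualifier, value)."""
    out = []
    for element, qualifier, value in statements:
        for row_tag, ind_pattern, code, row_element, row_qualifier in CROSSWALK:
            if (element, qualifier) == (row_element, row_qualifier):
                # Wildcard positions in the pattern become blank indicators.
                out.append((row_tag, ind_pattern.replace(".", " "), {code: value}))
                break
    return out
```

Even in this toy version the reverse direction cannot recover indicator values that the pattern treated as wildcards, which hints at why not everything can be controlled by tables alone.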

Title processing - have to deal with: single 245 in MARC but Dublin Core allows more; have to decide how to determine which is 245 and which is 246; have to decide what to do with indicators
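
One possible rule for the title decision is sketched below, purely as an assumption: the first Dublin Core title becomes the 245 and any further titles become 246 fields. The notes do not say which rule CORC settled on, and the indicator values here are placeholder choices (245 "00", 246 "3" plus blank).

```python
def titles_to_marc(dc_titles):
    """Map a list of DC titles to MARC fields.

    Assumed rule, for illustration only: the first title is the
    title proper (245); every later title is a varying form (246).
    """
    fields = []
    for position, title in enumerate(dc_titles):
        if position == 0:
            # 245 is not repeatable in MARC, so only one title can go here.
            fields.append(("245", "00", {"a": title}))
        else:
            fields.append(("246", "3 ", {"a": title}))
    return fields
```

A real crosswalk would also have to set the 245 nonfiling-characters indicator from the title text, which is part of the "what to do with indicators" question.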

DC date goes in fixed field

OCLC genre - control over bib lvl, record type - how much validation do we need?

Challenges: subfield order, table functionality, punctuation, validation (right now coming through PRISM in the background because there are so many rules), additional formats (EAD, CIMI, etc.)

MARC in WorldCat is pretty simple; in CORC it is MARC on steroids.

Can CORC be used for other materials? It CAN be, particularly for museum objects, images, etc., but still focusing on electronic.

CIMI has established a version of Dublin Core on its page you could use in CORC

You can include CORC records in a pathfinder -- search CORC, mark for pathfinder, create a pathfinder and it is incorporated

Microsoft and Windows 2000 embrace XML

Z39.50 isn't dealing with Unicode. They are working on it.

Vendors

Liz Bishoff said she would come to any group for travel expenses to explain or demo Colorado Digital project.

Checked with Demco, etc. for U-matic video cases; also label covers for Special Learner Collection

I got a personal demonstration of Content for managing digitization projects (University of Washington product incorporated with OCLC SiteSearch)

I got a demo of ILLiad - looks like OCLC has contracted out ILL

CC:DA - MARBI joint meeting

Presentation by Dick Miller

( http://xlmarc.stanford.edu/ALA_2000_Chicago/index.htm)

Frustrations: kludgey intersystem links, multitude of incompatible interfaces, limited ILS interface flexibility

We can't continue redundancy just because we want it in a different format

XML is the lingua franca of the Web

Libraries have a history of being impartial, confidential, thorough, trustworthy, and free of commercial intervention

NLM disseminates its Web resources in XML

There has been a dramatic decline in libraries of print material

MARC is an impediment to effective interface to Web resources

XML supports Unicode and is extensible. Entities can be named and stored separately; can contain hierarchies and nested elements; containers can contain text or elements; expresses granularity and hierarchy of data.

XML has adjacent standards - XSL (XML style sheet) for display

XML problems - element names can't begin with a number, Unicode character representation (font) may conflict, etc.
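
The element-name restriction matters for MARC because tags such as 245 cannot be used directly as XML element names. A small check with Python's standard parser illustrates this; the element names "marc245" and "datafield" are made up for the example and are not taken from any DTD mentioned in the presentation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A MARC tag used directly as an element name: not well-formed XML.
numeric_name = "<245>Example title</245>"

# Two possible workarounds: prefix the tag, or carry it in an attribute.
prefixed_name = "<marc245>Example title</marc245>"
attribute_form = '<datafield tag="245">Example title</datafield>'
```

Here is_well_formed(numeric_name) is False, while both workarounds parse without error.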

Search access and presentation are being worked on.

Example of a MARC problem: coverage and beginning dates - create and update dates have different format

MARC encodes things multiple times

Form is in 006, 007, 655

Mixing elements and their properties makes data more difficult to deal with: 700 = surname, ind. 1 type of name

MARC codes are redundant

MARC has some relation codes; XML is inherently hierarchical

Cataloging is intensive. It is an intellectual investment (not to be wasted). Some simplification seems less than simple. There is overemphasis on description. There is under-emphasis on relationships. Access is critical.

Access. Don't strain for consistency. Mixing languages in headings seems dubious and hard to share internationally.

There is lots of pre-coordination - time intensive and inconsistent; why are some dependent on other fields?

Software could provide users option to pair ideas together

Why do we give titles so much emphasis and many identities? Plus we add fields for made up titles

Why don't we adopt a formal title only occurring in the bib record?

Lots of relationships aren't even coded.

It isn't always clear why a record shows up from a search.

Bib-bib : series link to main or authority

Bib - authority : link to host - host to item

Authority - authority : link to a bib - title as subject (Ex Libris ) XML can handle the link once

Bib - Auth

Works (titles - relationships)

Aspects (access - authorities)

Subject - word or phrase (UMLS already does this)

XML - analyzing relationships is essential; should there be 1 format for bibs

Types - relation, analytic, subunit - we don't show well

Authority records are underutilized: why not put publishers in authority records?

Bib with formal title with qualifier -> relation.

|___ Title with version qualifier -> relationship

|___ Item

|___ Item
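
One reading of the diagram above, expressed as nested XML elements, is sketched here. The nesting (items under a version, versions under a work), the element names, and the attribute values are all assumptions for illustration; the presentation's own DTD is not reproduced in these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical hierarchy: work -> version -> item
work = ET.Element("work", title="Example formal title")
version = ET.SubElement(work, "version", qualifier="electronic")
ET.SubElement(version, "item", location="http://www.example.org/copy1")
ET.SubElement(version, "item", location="http://www.example.org/copy2")


def item_locations(work_element):
    """Collect the locations of all items anywhere below a work."""
    return [item.get("location") for item in work_element.iter("item")]


xml_text = ET.tostring(work, encoding="unicode")
```

Because the relationship is carried by the nesting itself, no linking field is needed to get from the work to its items.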

Paul Weiss response

It is not XML vs. MARC -- it is AND

MARC gets used for:

Input - except for those for whom MARC is easy, a better interface would be better

Storage - is no longer a big deal

Communication - MARC is okay, but XML is an industry standard we should be paying attention to; how can we take advantage of it?

Display - we don't like raw records for users (even if it makes sense to some of us); XML isn't pretty either, but we would redisplay it too

Processing - a lot of library systems use parts of MARC to do things; still could but might get more and take less work with XML

We ought to look to places we haven't standardized or implemented yet; e.g. ISO for Circulation

Let's experiment - but we need to evaluate and change as we go

Will MARC go away? No - MARC records will still exist. What XML tells us should lead us in the right direction for discussion of what we do with MARC. We could just wrap MARC in XML
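
"Wrapping MARC in XML" in the simplest sense could mean carrying tags, indicators, and subfield codes as attributes around the unchanged content. The sketch below assumes hypothetical element names ("record", "field", "subfield") and a made-up sample record; it does not follow any particular DTD.

```python
import xml.etree.ElementTree as ET


def wrap_marc(fields):
    """fields: list of (tag, two-character indicators, [(code, value), ...])."""
    record = ET.Element("record")
    for tag, indicators, subfields in fields:
        field = ET.SubElement(record, "field", tag=tag,
                              ind1=indicators[0], ind2=indicators[1])
        for code, value in subfields:
            subfield = ET.SubElement(field, "subfield", code=code)
            subfield.text = value
    return record


# Made-up sample data, for illustration only.
fields = [
    ("245", "10", [("a", "Example title /"), ("c", "Example Author.")]),
    ("856", "40", [("u", "http://www.example.org/")]),
]
xml_text = ET.tostring(wrap_marc(fields), encoding="unicode")
```

This keeps the MARC structure flat and intact, so it buys XML tooling (parsers, XSL) without yet changing how the content is organized.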

XML - also need to use DTD's, XSL, etc. to make it work

What other data besides bib data do we want to share?

Maybe we should decide on a standard DTD for bibs

We need to experiment, but we still need to work through the MARBI process which has been working well.

It takes time to learn and to train people to use something new; that doesn't mean we don't do this, but we need to plan for the future

Matthew Beacom response

Sees it as a new life for MARC; it's not a matter of XML replacing MARC and therefore "who needs cataloging anymore?"

Saying MARC is dead - it's like saying spaghetti is being replaced by pasta

Just because we use a DTD doesn't mean we are going to change the content (except the content may be structured very differently)

With XML it will be more appropriately expressed for our users, by our providers

The environment is very different

XML is the structure into which we put our fill

We might have a chance to get stuff from the information industry rather than the narrow niche of ILS vendors

Cataloging and catalogs: FRBR and Cutter principles still have value

Our records may be contained in the material itself

Access // description: definition separation makes sense in the Web environment and XML can address it if we try it out and find a new way to do things

Mulver might be easier to deal with because XML is inherently hierarchical vs. MARC which is flat

We often are including URL's referring to related items; this makes a lot of sense in the XML environment

510's - instead we could go to "jake"

 

Discussion

We can use the same tools the dot-coms use but do a better job of providing quality information (Dick M)

We need to remember the value of what we've already done; semantics is what's valuable about what we put in MARC; syntax (structure) could be anything

We take MARC and AACR's direction of what we include for granted

The marked up content can be ported to another structure

Many industries are trying to create what we already have

Many can work on a small scale but don't know how to scale up

We have to be careful not to get carried away with relationships stuck in records that would make the records harder to port; MARC worked hard to make each record independent and that has value (there is a beginning and an end) (John Attig)

DTD stuff is on LC's Web page; we need to standardize one or several DTD tag sets but as long as they crosswalk, that is okay (Sally McCallum)

We are not a rich community - we need to edge in the direction of XML, but we don't even have a guarantee the Web will be there in 10 years

If we want to link to the world out there, since we can't catalog the web, we need to do something; if data is not there, there is nothing to port in the future (Paul W). For example, having done detailed subfields for geospatial stuff has helped us with the ability to port, so as we develop DTD's we need to design them carefully

You could populate an element from external sources if you didn't have it originally; maps being bi-directional and hierarchical make a variety of restructuring possible; just putting MARC in XML without changing the structure is not a solution - we need to decide what content to put where and how. Dick is working on making DTD schemas (Dick M)

How much time do we have to make big fancy authority records? We can still link out of it to another resource

You can use filters so some stuff is already done when you import the item

If each institution did its own institution's records, or those in its library's specialty, and we shared them, we could all benefit

What about historical data? (John Attig) Authority records for old dead faculty?

We have to be careful about the manipulation of data, particularly that which is hierarchical. Some of our search categories are extensive and get at hierarchical data faster than a database can reconstruct it for display (Mary Larsgaard)

Let's make sure we know what relations users want (Paul W)

Independence of records and relations of records - what are our expectations?

We need to look at Functional Requirements for Bibliographic Records (FRBR) - if you talk about modularization, you need to talk about relationships

We need the tools around XML (Karen Coyle); LITA has an open source group that might be interested in creating software. You need a Unix system to get the current editors to run - we need a nice PC-based one for catalogers

They are working with 3 models. They believed Oracle when they said they could handle XML last Sept. You need an in/out mechanism, then you need a database management system (they are frustrated with Oracle); they are experimenting with OZONE which is object oriented (Dick M)

Multilingual Task Force. How can you adjust your OPAC to match language choice of the user? What actually happens? Just transliterate or translate?

TUESDAY, July 11, 2000

CORC Pathfinders

Usage among adults 18 to 24 years old: 84% use the net; 16% use the library. But .02% of the Web is useful information, hence there needs to be a tool to find the good stuff

What is CORC? It is a tool for librarians to: select, describe, present the Web

CORC pathfinders are:

Electronic subject bibliographies

Avatar of the professional librarian on the Web. We need to create a presence on the Web which people will know to use

Maybe we can regain some of the market share with the "library brand"

One librarian's page says: technological and print solutions to meet your educational, professional and personal needs

Pathfinder issues: time consuming to create, get 404 messages, static, not integrated with local issues, duplication of effort

CORC benefits: leverage benefits of selection by other librarians. E.g. LC has Doing business on the Internet. University of Minnesota is one of the most active pathfinder creators - biomedical library

You can put your own logo on the pathfinder if you have a cascading style sheet to apply to the page

Promotes integration with local resources. E.g. tribes of the plateau region (U of Wash; Content software); e.g. Archival practices pathfinder is CORC cataloged stuff and print stuff; e.g. one library put together information of interest to legislators and emailing web addresses to legislators

Can create a dynamic search in a pathfinder, save in CORC, whenever the search is initiated it will search the database with updated information

The librarian who created the CORC pathfinder will be notified to fix the URL in CORC if it fails a checker. The checker only checks CORC records' URL's not other sites you might have listed w/o CORC records. When the one library fixes it, it is fixed for everyone using the CORC database. They are looking at something similar to bib notification to let us know of a change when we've exported the pathfinder to our server

Responsive to new content

You can use the pathfinder query builder (of CORC resources) and you have Boolean options; it is a way of putting a search into a pathfinder; e.g. one library made a pathfinder with stuff at the top they selected then put the query on the bottom to lead users on to other stuff

Hennepin County gets about 10,000 hits a week.

CORC bibs right now have to export to a file; but it will replace Passport

OPAC-centered approach - export in MARC to a file, upload; can include a 956 which is a locally viewable URL and it's private

Gateway-centered approach - if I license or buy, I'll catalog; if it's free, I'll create a pathfinder and not even bother to export; instead you select Export Link - you get a little browser box, and you paste that into your homepage so the user is taken to the CORC database (for which you will be charged)

In other words, choice is: 1) export pathfinder as a whole but info is static, or 2) export the link

Web-Z - capture Dublin Core records and put into a database or your OPAC

So you start in CORC - costs you 40 cents a search, mark (tag) the records you want to put in the pathfinder ("create") - that doesn't cost; export link (costs like a bib export - 30 cents); monthly you pay 10 cents per pathfinder to keep it in the CORC database where they will maintain the URLs. If the user searches and finds it, it is a search charge, but if you included a query search, those don't get charged. If you don't want to pay the 10 cents, you can export the pathfinder but you lose dynamic searching, updates, and URL checking. If you create a bib (MARC), you get a credit; you get 1/2 credit for a Dublin Core record. If you upgrade a record, you get a credit.
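
As a rough worked example of the charges quoted above, the arithmetic could be sketched as follows. The three rates come from these notes; the function name and the sample usage figures are assumptions, and credits are left out because their dollar value was not given.

```python
# Rates as quoted in the session, in cents.
SEARCH_CENTS = 40               # per CORC search
EXPORT_LINK_CENTS = 30          # per exported pathfinder link
MONTHLY_PATHFINDER_CENTS = 10   # per pathfinder per month kept in CORC


def pathfinder_cost_cents(searches, link_exports, months_hosted):
    """Total charge in cents; marking records for a pathfinder is free."""
    return (searches * SEARCH_CENTS
            + link_exports * EXPORT_LINK_CENTS
            + months_hosted * MONTHLY_PATHFINDER_CENTS)


# Hypothetical usage: 5 searches to build one pathfinder,
# 1 link export, kept in CORC for 12 months.
example_cents = pathfinder_cost_cents(searches=5, link_exports=1, months_hosted=12)
```

With those made-up figures the total is 350 cents ($3.50): 200 for searching, 30 for the link, and 120 for a year of hosting.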

What about the subject heading Afro-Americans changing in CORC records and then pathfinders? When you do a record and you select "control", it will dynamically fix the CORC record and hence pathfinders, but your search query will still be wrong.

NetFirst fate? The database is still in FirstSearch; CORC is in WorldCat and in NetFirst - who knows about the future

The pathfinder you create is yours and you can add to it

For your library, you can clone someone else's and add to it (even if all you wanted to add was your logo) and export to your webpage; you could be nice and mention whose record you copied, but it isn't necessary any more than we tell people whose bib record we used.

Fixed-fee applies to use of pathfinder searches.

WEDNESDAY, July 12, 2000

CORC Users Group Meeting

Message from last meeting: connect with WorldCat and do pricing similar to bibs because we understand that

Last 6 months work: new crosswalk to MARC, CORC with WorldCat, billing, production mode, new user interface, now online as a resource for libraries

Plans: extending and expanding capabilities, foundation of extended WorldCat, our input

Gary Houk on OCLC emerging strategies

Web-based System Interface

CONTENT DISCOVER FULFILLMENT METADATA

Extended WorldCat Resources

    1. Content
    2. Discovery and navigation
    3. Fulfillment
    4. Metadata

- services

Library services portal

Extended WorldCat

Example: if you send me an ISBN, I will provide the service of sending you a table of contents

See http://names.oclc.org - open name services where an agency can register; includes SFX and OpenURLs, in addition to ISSN, SICI, Handle

They are still looking at putting patron authentication in the front.

Idea or purpose - link to service, rather than store on system; some vendors may need educating that they need to continue to store, so you might have to store, but at least you don't have to type

The idea is to use a PURL; e.g. on a names resolution server; OCLC would send a PURL with name for server, ISSN number to ISSN server and bring back a link to an e-source if available and the ISSN metadata

Service Model Goal - to develop Internet service model of global library cooperation

Summary of seven "R's"

Passport - will it last? Not forever; but need good solutions on web for macros, etc.

"I pledge to OCLC and the library cooperative for which it stands, one product, integrated, Web-based, serving every library and patron and subscription pricing for all."

Lynn Kellar discussed some technical details.

Presentations by CORC users

University of Chicago

They have done very traditional selection and cataloging. He told reference librarians to select 50 sites; they would catalog them within the next 2 months; then start a new 50. At the same time, other materials got backlogged. Cataloging took longer than traditional cataloging because some of the process is usually done by other staff in his library, but with this the catalogers did the downloading, duplicate checking, etc. Catalogers trained the reference staff.

Jackie Shieh - University of Michigan

See handout

Cooperation and innovation is needed. They found they had to convert CORC to MARC since their system couldn't use DC. They utilized the "in process", "in review", downloaded into their system to make sure it worked, then "set complete". They put the Dublin Core in local web sites.

W. Sukantarat - UVM & CORC

They still have unresolved issues. Need policies. They are short of staff already; don't know what they will do. Decided to put Dublin Core in a separate database and use Z39.50 to search it. Their policies are on the Web.

Especially with trying to get reference librarians to do some of the work, the biggest issue is subject headings - what to use? Authoritative or keyword? No one was interested in authority work. They don't want to do call numbers of any sort.

FAST status

OCLC is taking LCSH and allowing it to be manipulated into its parts.

Still investigating one at a time; trying to develop algorithms

 

Authority Breakout Session

Any thoughts, send to Rebecca Dean rebecca_dean@oclc.org

Current functionality

Future

 

shelby_harken@und.nodak.edu

Chester Fritz Library, University of North Dakota

Last update: 8/10/00