ALA Summer, Atlanta, June 2002
THURSDAY--June 13
1:00-5:30
Congress Center -- A403
Maps Cataloging: Learning to Describe Cartographic Materials -- Training
This was an excellent course. My notes are in the workbook prepared for the conference. I will be able to use it for training A/BC staff to catalog maps.
FRIDAY--June 14
8:00 AM-5:30
Congress Center -- A403
Maps Cataloging: Learning to Describe Cartographic Materials -- Training
8:00-10 PM
Hilton -- Gwinnett
SAC Subcommittee on Semantic Interoperability -- meeting
This was an organizational meeting.
Discussion:
Lois Mai Chan has prepared a paper investigating a) integration of terminology, and b) methodology of projects dealing with interoperability. It is at: < http://www.ifla.org/IV/ifla68/prog02.htm#5 >
Daniel suggested one approach could be end-user defined thesauri, with the end goal of providing semantic interoperability. Other approaches could include crosswalks; multilingual lists between two or more languages; classification and subject matching or linking, e.g. DDC and LCSH, HILT; thesauri-to-thesauri searching; and front-end and searching language technologies. Lois said what she has found is that the heart of interoperability is mapping. Then you have to determine a) how to define equivalence; b) since one keeps finding partial equivalence, how to construct those relationships; c) how to do linking; d) link management and maintenance, i.e. where and under whose control the links reside; and e) where these records are stored, how links are made, and who will do it.
Various methods have been suggested. One idea might be to have one record chosen with links/references to others. This can be done using concordance tables where you choose one and map to others, but this doesn't handle partial matches very well. Alternately, there could be multiple records, allowing users to use the cross-references to choose which term to use because it is valid in their chosen context. This assumes we would be able to display this in a way that assists the user in knowing which term IS appropriate to their context.
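To make the concordance-table idea concrete, here is a minimal sketch in Python. It is not anything presented at the meeting: the scheme names "A" and "B", the sample terms, and the names Link and candidates are all made up for illustration. Each link records whether the match is exact or partial, so a display could show the user every candidate term instead of silently picking one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source_scheme: str
    source_term: str
    target_scheme: str
    target_term: str
    match: str  # "exact" or "partial"


# Illustrative links only; schemes "A" and "B" and these terms are hypothetical.
LINKS = [
    Link("A", "Cookery", "B", "Cooking", "exact"),
    Link("A", "Nurses and nursing", "B", "Nurses", "partial"),
    Link("A", "Nurses and nursing", "B", "Nursing", "partial"),
]


def candidates(scheme, term, target_scheme, links=LINKS):
    """Return (target_term, match_type) pairs, exact matches first."""
    hits = [
        (link.target_term, link.match)
        for link in links
        if link.source_scheme == scheme
        and link.source_term == term
        and link.target_scheme == target_scheme
    ]
    # False sorts before True, so "exact" links come ahead of "partial" ones.
    return sorted(hits, key=lambda hit: (hit[1] != "exact", hit[0]))
```

A one-to-one concordance would have to drop one of the two partial links; keeping both and labelling them is one way to leave the choice to the user.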
We will begin by doing a survey of related projects. As we survey, we will identify certain characteristics: 1) stated purpose, 2) concepts involved, 3) where is it, 4) who is responsible, 5) what methodology is used, 6) what is the status of the project.
We actually need to define what we mean by semantic interoperability for subjects. Broadly, the purpose of semantic interoperability appears in Dublin Core.* We need to look at semantic interoperability among access points, subjects, topics, vocabularies, and how they work together. We need to look for ways to facilitate user searching, and facilitate broadcast searching. We need to see how indexing languages, or information languages would help us. We should determine whether we want to focus on controlled or free text as we see it on the Web. [We didn't continue to discuss this, but it would seem to defeat the purpose of semantic linking]
How are we going to define "subject"? Are we only going to address topicals, or are we going to include place/geographic names, personal names, corporate names, genre, titles? -- "history" without "United States" would be useless.
Similar work is going on that we should be aware of. Lois is working on "metadata schemes interoperability," dealing with issues such as one scheme using the term "name of resource" while another scheme uses "title." Does applying Unicode impact our issue? Could MARBI's work that added 750, 751, 755 to authority records to allow linking to other thesauri provide an option? The Library of Congress has done crosswalks between MARC and Dublin Core. The Cataloging Group in IFLA is working on semantic interoperability in a similar, but slightly different, context.
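A crosswalk between two metadata schemes can be pictured as a simple lookup from one scheme's element names to the other's. The sketch below is only an illustration: apart from the "name of resource" / "title" pair mentioned above, the element names, the sample record, and the function name translate are assumptions, and this is not the Library of Congress crosswalk.

```python
# Hypothetical element names for two metadata schemes; only the
# "name of resource" -> "title" pair comes from the notes.
CROSSWALK_A_TO_B = {
    "name of resource": "title",
    "maker": "creator",
    "topic": "subject",
}


def translate(record, crosswalk):
    """Rename a record's elements via the crosswalk.

    Returns (translated, unmapped) so that elements with no
    equivalent in the target scheme are reported, not lost.
    """
    translated, unmapped = {}, {}
    for element, value in record.items():
        if element in crosswalk:
            translated[crosswalk[element]] = value
        else:
            unmapped[element] = value
    return translated, unmapped
```

Returning the unmapped elements separately is a deliberate choice here: the elements that have no equivalent are exactly the ones a crosswalk project would need to review.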
This Subcommittee needs to remember its focus: topical, genre/form and what is subject related, i.e. what reflects the content of the resources. Geographic, personal, and corporate names are often not discussed in projects, so we might need to define the parameters within which we will include them. We also need to determine if a "best practice" would be one big project or something by category. Mapping gets very difficult when you mix things.
Analysis will come later, along with a clearer definition of semantic interoperability, how to retain semantic relationships, and finally if possible a "best practices" guideline. So in summary, we will look first for compatibility, harmonization, crosswalks, integration, interoperability, etc. in relation to subjects.
Other items:
Mary Charles mentioned that the last OCLC Steering by Standards addressed subjects to some extent, indicating it is the weakest link in digital archives and stressing the need for training, sharing, and standards. A person working with ONIX talked about structured, controlled vocabulary and semantic interoperability.
David explained the link with the Library of Congress Action Plan and the ALCTS Task Force for item 2.3. SAC's Subcommittee on Semantic Interoperability is a logical extension of the SAC Subcommittee on Metadata and Subject Analysis committee and SAC would have probably had the subcommittee anyway, but since ALCTS has a task force to work with LC's Plan 2.3, this subcommittee is the logical one and the chair is the logical person.
Meeting time was discussed. Friday is okay, but people preferred 6-8 PM.
Shannon announced the Authority Control Interest Group's presentations, half on interoperability from a systems view and half addressing subjects.
ACTION ITEMS:
We need to first find what all is out there on the topic; do some research. Lois has a bibliography at the end of her paper, at: < http://www.ifla.org/IV/ifla68/papers/008-122e.pdf >
Shannon will send copies of papers from the IFLA satellite meeting at Dublin last Aug. to those who marked the attendance sheet "send". Shelby will send her the addresses.
Develop an operational definition for subject semantic interoperability.
Compile a bibliography of pertinent works.
Compile a list of projects. Shannon, Tony, and Shelby will work on this.
Comments, ideas, project names and citations, would be helpful from any committee member.
Next meeting at Midwinter, Friday, 6-8 PM.
* An Overview of the Dublin Core Metadata Initiative
"The focus of DCMI is the development of semantic metadata standards to support resource discovery. A supporting technical architecture, however, is essential to enable deployment of such standards."
"Supporting the evolution of the semantics with metadata registries is both an important operational responsibility as well as an evolving research topic for the community."
"The characteristics of the Dublin Core that distinguish it as a prominent candidate for resource discovery systems fall into several categories:
Simplicity: The Dublin Core is intended to be usable by non-catalogers as well as resource description specialists. Most of the elements have a commonly understood semantics of roughly the complexity of a library catalog card.
Semantic Interoperability: In the Internet Commons, disparate description models interfere with the ability to search across discipline boundaries. Promoting a commonly understood set of descriptors that helps to unify other data content standards increases the possibility of semantic interoperability across disciplines.
International Consensus: Recognition of the international scope of resource discovery on the Web is critical to the development of effective discovery infrastructure. The Dublin Core benefits from active participation and promotion in some 20 countries in North America, Europe, Australia, and Asia.
Extensibility: The Dublin Core provides an economical alternative to more elaborate description models such as the full MARC cataloging of the library world. Additionally, it includes sufficient flexibility and extensibility to encode the structure and more elaborate semantics inherent in richer description standards."
SATURDAY--June 15
8:00-10:00
Hyatt Regency -- Piedmont Room
Project Muse breakfast -- rsvp'd
They discussed their new titles and new services.
9:30-12:30
Renaissance -- Georgia West
MARBI - http://www.loc.gov/marc/marbi/an2002_age.html - Report Sunday at SAC on Proposal 2002-13: Changes for Faceted Application of Subject Terminology (FAST) Subject Headings, and DP 2002-08: Multiple Versions, FRBR and MARC
Proposal 2002-13:
FAST development is a result of work done by the SAC Subcommittee on Metadata and Subject Analysis. Some of OCLC's goals: retain the vocabulary in a simplified way, reduce the "red books", and have all headings established, except chronology. Authority and bibliographic record formats both need revision for topical, geographic, form, and period/chronological headings. The proposal suggests X48 (or any acceptable number) for 148/748 for authority records and 648 for bibliographic records. They expect a number of splits, so they would like to retain, in a note, the ability to indicate the type of replacement and the type of processing that would need to be done with the record. The hierarchy present in pre-coordinated headings will continue, so $x will continue to be part of the heading.
"d" as proposed was still a concern. "d" usually means delete to vendors, so despite the fact OCLC says these records with "d" wouldn't get distributed, the MARC format is a distribution format, so let's not confuse vendors.
Discussion ensued on purpose and value of FAST. It is intended to be used by metadata creators who might embed these headings in metadata. Catalogers could also use it. It would be accessed via a simplified display. Access to OCLC - how will others get access who are not OCLC members? How does the end user get helped? - It is simpler. Isn't it just another thesaurus? - Yes, but simpler. Each heading is a facet and there is still some pre-coordination. You could end up making a different combination than you do with LCSH, but that is the point. It is intended to fall in between keyword and full pre-coordinated strings. The front-end isn't there - yet. We were reminded that the proposal was to meet an OCLC need for MARC, not to evaluate how well it may be used. OCLC was looking for a way to a) identify headings that totally replace, and b) identify those that need judgement, e.g. nurses and nursing.
If we use "d" or "o", the $w still is used to determine the automatic processing; all processing possibilities would be too many codes for the LDR, so $w will continue to be used. Proposed to change the proposal definitions (in essence): x = obsolete, replaced by another heading; s = obsolete, replaced by split headings; d = deleted, drop the record; o = obsolete, the heading is considered obsolete within the thesaurus of that heading.
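The four amended code definitions can be written down as a small lookup table. The sketch below is only an illustration of how a system might sanity-check a record against them. The meanings are taken from the notes, but the replacement counts in EXPECTED_REPLACEMENTS and the function name check_status are my assumptions, not part of the proposal, and $w processing is left out.

```python
# Codes and meanings as recorded in the notes (amended proposal).
RECORD_STATUS = {
    "x": "obsolete, replaced by another heading",
    "s": "obsolete, replaced by split headings",
    "d": "deleted, drop the record",
    "o": "obsolete within the thesaurus of that heading",
}

# Assumption: (minimum, maximum) number of replacement headings implied by
# each code. A maximum of None means no upper limit; an entry of None means
# the notes give no basis for a check.
EXPECTED_REPLACEMENTS = {"x": (1, 1), "s": (2, None), "d": (0, 0), "o": None}


def check_status(code, replacements):
    """Return True if the number of replacement headings fits the code."""
    if code not in RECORD_STATUS:
        raise ValueError(f"unknown status code: {code!r}")
    bounds = EXPECTED_REPLACEMENTS[code]
    if bounds is None:
        return True
    low, high = bounds
    count = len(replacements)
    return count >= low and (high is None or count <= high)
```

For example, a record coded "s" with only one replacement heading would fail the check, which is the kind of case that needs a cataloger's judgement.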
Proposal accepted as amended in the LDR and the actual tag number may change.
Discussion Paper 2002-08: Multiple versions, FRBR and MARC. Jennifer Bowen gave some background information. Discussion followed. There may be multiple solutions that may happen simultaneously, perhaps with an authority record that would get you an expression level. Questions: What IS a work, and can we make this clearer? What IS our goal? How IS this to help our users? Can we do this in the MARC record? What about other mechanisms? We need an operational definition of "work". AACR already says something, but can we make it better? This may solve two things: multiple versions and complex relationships - we need better matching so we can help users more easily identify and find (FRBR) what they want. MARC is for communication; then there is the ILS or whatever that could use MARC identifiers to create a useful display for users. We are trying to show users bibliographic relationships. Let's think about the catalog first, then what ways we can create data a machine can handle. Thom said in Florida when they tried to address FRBR they found each library had followed different policies - some with separate records, some logically including multiple formats, and some just glommed together. FRBR should allow separate cataloging for those libraries and publications that don't need the specificity or have no relation to another work, etc. We need to keep in mind, a user may place a hold on a title they can't check out, e.g. a rare book; can FRBR help us? CONSER is also looking at FRBR. "Work" may end up being defined differently by different constituencies - serials vs art works vs books vs AV.
OCLC has two projects and has run into various issues. It doesn't fit into cataloging workflow as done today. It requires that you read and follow the rules. It requires that you review other literature, or data in online catalogs. The conceptual model does work but it has holes; but, you can work with the records we now have. Are works and expressions eligible for authority data? Does everything need an authority record? - many things are singletons. We need to pay attention to labeled displays. We could look at Dublin Core. There needs to be data in a record that allows the machine to do algorithmic work to create the hierarchy. If it is non-existent or not in added entries, it is nearly impossible. Isn't the authority record that source for linking and doing the algorithmic work? We also need to consider what goes on in the holdings record. The authority record approach allows linking when needed, but nothing is necessary for the 80% of publications that are single works.
Proposal 2002-06R passed. The paper proposed adding additional codes to field 008/07 to identify additional methods of acquisition. The paper also proposed to rename field 008/08 to "Expected End Date." Should be able to track consortial payments.
11:30-12:30
Hilton -- Clayton Room
ALCTS-CCS Research Discussion Group -- The FRBR Entity-Relationship Cataloging Model: a case study
Ed O'Neill from OCLC described a project to test the ability to apply FRBR and to retrieve an appropriate set of records from the OCLC database. OCLC intends to develop a FRBR-ized database with goals: accommodate different views, improve retrieval, improve navigation, and improve cataloging productivity. Currently we catalog the physical, single entity. With FRBR, one catalogs in context of other works.
FRBR entities.
Group 1. Publication - product of intellectual creation
Group 2. Responsible parties (author)
Group 3. Concept, object, event, place.
Group 1.
<abstract>
Work. Intellectual creation
Expression. Realization of the intellectual creation
<physical>
Manifestation. Publication
Item. Copy.
What makes the work different? Paraphrases, adaptations for children, abstracts, digests, summaries, revisions, updates, abridgements, enlargements, translations, etc. Manifestations that are the same will have the same physical and intellectual characteristics. Different manifestations will have different publishers, type font, binding, ISBN, etc. An item may be the same as a manifestation, or it may differ from another item by being signed by the author, rebound after publication, be hand annotated, etc.
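The Group 1 hierarchy described above (work, expression, manifestation, item) can be sketched as nested Python dataclasses. This is my own illustration, not OCLC's data model; the attribute names (description, publisher, isbn, note) and the method manifestation_count are assumptions chosen to echo the examples in the notes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy; the note holds copy-specific detail, e.g. "signed by the author"."""
    note: str = ""


@dataclass
class Manifestation:
    """A publication: same publisher, type font, binding, ISBN, etc."""
    publisher: str
    isbn: str = ""
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of the work, e.g. a translation or revision."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def manifestation_count(self) -> int:
        """Count manifestations across all expressions of this work."""
        return sum(len(e.manifestations) for e in self.expressions)
```

A work with one expression and one manifestation, which the talk said describes most of the database, is simply a chain of single-element lists in this model.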
OCLC worked from 32,000,000 records. They found 1.5 manifestations per work on average, but that is skewed, because 80% have a single manifestation and single expression. Ca. 30,000 (less than 1%) have over 20 manifestations. They found these types of changes: augmented - with illustrations, notes, additional prefaces, but the core text is the same; revised - updated, new content in the body of the publication; selected/collected - that contain multiple works - not sure if FRBR intends collections by one author as a work or separate works. Augmented by title was the smallest percent, but the largest group and most complex manifestations. So very few titles have lots of versions.
Goals:
1. examine benefits and problems of trying to FRBR-ize
2. better understand relations between bibliographic records and the objects they represent
3. test to see if there is enough information in the records to do this
4. create a useful test database
A single work/expression/manifestation doesn't benefit from FRBR-izing. OCLC picked the Humphry Clinker book because it seemed to exemplify a number of identified problems. OCLC was trying to organize objects, not just the bib records (need to ignore typos, etc.). OCLC's search for Clinker got 179 records, with some translations, etc. They physically looked at 38 books and photographed parts of each book. They found very few minor changes - type font, shifting placement of dates with the letters, bibliographies added, chapter titles added, etc. - are these expressions? OCLC didn't count what they deemed minor. If you strictly follow FRBR they would probably be new expressions, except for those with type font changes, paper size, etc. Illustrations, notes, bibliographies, glossaries, etc. added would mean it is a new expression. They even found pagination the same, but the bibliography was longer in one edition than in an older one. You'd never know that from the bibliographic records or current cataloging rules. Of the 179, 51 were simply different printings.
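One simple way to picture grouping bibliographic records into works is to build a normalized author/title key and collect records that share it. The sketch below is an assumption for illustration, not OCLC's actual algorithm: the record layout, the record ids, and the function names are hypothetical. It smooths over case, punctuation, and diacritics only; it does not catch typos, and translations with different titles would land in separate groups.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lower-case, strip diacritics and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def work_key(record):
    """Hypothetical work key: normalized author plus normalized title."""
    return (normalize(record["author"]), normalize(record["title"]))


def group_into_works(records):
    """Map each work key to the ids of the records that share it."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record["id"])
    return dict(works)
```

Deciding which records within one group are different expressions is the part the study found could not be done from the records alone.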
Are expression level records needed to make something discrete and identifiable? There seemed to be too much information missing to reliably identify what they were looking at. None of the MARC tag descriptive fields overlapped with Delsey's table. The only things that did were 245 $c or 7XX tags. In some cases we could add more information to books we already have. It would have helped to have things like the role of contributors, illustrators, editors, etc.
Conclusions. This type of assessment requires a study of many types of records and works. "Works" is a valuable concept. Works can easily be identified; expressions cannot - only with physical handling was one able to identify different expressions, which matches FRBR but may not matter to every user. A huge part of the OCLC database could not be amplified or helped in any way that would improve the catalog for users. VTLS has attempted a possible display method. At what point does "minor" matter? Aren't we going to have to define for "us" what an expression is - but at what level is "us" - at the library level you'd never be the same; at the lesser community level "us" might have to make decisions but it would have to apply across countries, national libraries, utilities, etc. to work for record sharing. Despite where the JSC work started, they are getting away from trying to catalog something as an expression, particularly because Delsey put things like illustrators as additional identifiers; you still almost have to start with a manifestation and then consider if there is a need for either an authority record or additional fields. < http://www.oclc.org/research/projects/frbr/ >
2:00-5:30
Marriott Marquis -- Marquis Ballroom IV
CC:DA -- Liaison
Votes of the last six months were approved. Feedback was sent to ALA Publishing. CC:DA is sponsoring a program here re: serials and the new rules, and a program in Toronto on consistency and AACR2. The ONIX Task Force has been discharged. CC:DA will be working with LC on its action plan on electronic resources.
6. Barbara Tillett. See Handout. LC's Cataloger's Desktop will have the 2002 amendments, updated blue pages for NACO, yellow pages for NACO, LCRI's, and PCC decisions. Issue #4 will have final LCRI's for the 2002 amendments. Library of Congress's authorities will be available July 1 on trial at < http://authorities.loc.gov > One will be able to search, display, download, but it will still have faulty diacritics (vendor hasn't solved), and it doesn't include Z39.50.
7. FRBR discussion. FRBR includes locating (description and access standards) and collocating (controlled vocabularies for precision searching). [SACSEM ?] See Handouts [recursive ¶ means works refer to other works] "Work" can be: sound, image, movement, music, notation, alpha-numeric characters, etc. Expression: once we record the work. Manifestation: we have a physical item in/on a carrier; multiples of this all share the same characteristics. Item: individual characteristics; used for circulation; specific notes - autographed, etc. - probably belong in the holdings record.
JSC Task Force on Format Variation. They discussed cataloging expression-based records - it was very difficult. They discussed the use of citations, updating terminology in AACR - not only "expression" but also names of access points, e.g. main entry vs. name entry or the like. Within Group 1 there are certain types of relationships: some relationships are inherent and they are passed down - work is realized by an expression. There are transitive relationships - some qualities or characteristics carry through from work to expression to manifestation. Expression is embodied in a manifestation. Manifestation is embodied by sibling relationships among all copies. There are whole-part relationships. Examples of within: aggregators, anthology, multi-part series, loose-leaf, sequential. Examples of between/among: accompanying - do we (should we) make links between? Companion - supplements, concordances, etc. Then there are specific shared relationships, e.g. everything in a certain language. All manifestations are in "Icelandic" or have "lang: ice". Author is a FRBR entity, NOT an attribute of the work. In Group 2, we try to show relationships, e.g. created by, realized by, produced by, owned by. In Group 3, we have all the things that can be subjects. We control Group 3 entities with authority records. [SACSEM relation?] There are several authority control models, e.g. the OAI Virtual International Authority File (< http://www.oclc.org/research/projects/viaf/index.shtm >), where each agency would maintain its own authorities. Then we could put this together with other "ontologies" (nature and relations of being; see also Semantic Web < http://www.w3.org/2001/sw/ >) and web search engines.
SEE ALSO: FRANAR. Within IFLA, an effort within the Universal Bibliographic Control and International MARC core program is the Working Group on Functional Requirements and Numbering of Authority Records (FRANAR). "…confirmed its conclusion to formally take up a new view of UBC, abandoning the older concept of a single heading for a person or entity that would be used in catalogs worldwide. Instead, the emphasis will be on the use of headings that are appropriate to the needs of end-users (script, language, cultural focus) while still providing control. This is made possible by current technology. Authority records representing these headings would be linked by numbering (the persistent, unique authority record control number is suggested for now) that would identify the authorized form of heading that represents the same person or entity in a linked national authority file. The FRANAR group also began discussions of an entity-relationship model drafted by Barbara Tillett." < http://www.ala.org/alcts/you/ifla/cataloguing4.html >
IFLA in 2003 will be sponsoring workshops on cataloging rules, Paris rules, where we need to retain unique characteristics, and where we can collaborate to meet the objects of the catalog (FRBR and its user tasks). FRBR is mostly for showing relationships - questions arise like: What changes are necessary to the rules? What changes to authority records will be required? What changes in fields or subfields will be needed? What algorithms might ILS's use to create a meaningful display for users, and what do they need to do that?
9. ALA Publishing. Lots of AACR changes have been made. Hope to be contemporary and show currency; using a newer type face, etc. It will be 8 1/2 x 11 in., 3-hole punch or international 4-hole. They will look into a professional binder and tabs. Amendments will be called updates and the year will be on the footer and available separately. There will be annual updates with a new index. Prices
10. Matthew Beacom JSC report. See Handout. 1) Constituencies should give initial responses by Aug. for strategic plan - not a formal report. 2) Matthew, Jennifer, VTLS and OCLC met to discuss tests of FRBR with real records and hence be able to address format variation and effectiveness of FRBR displays for the user. 3) FRBR terminology - there is a problem finding a better word for "item" - the real world thing we are trying to catalog. A group will review the document draft and get to Matthew by Aug. 4) 5) AACR chapter 21 rule of 3 - going to review the chapter as a whole. 6) LC to draft a Part 3 to AACR. 7) Michael Chopey's email "philosophical treatise" is well thought out and we may want to review it. We still can't solve where the eliminated area 3 should go and there is not enough time to give a response because a number of communities would need to review a document they have not yet seen. 9) GMD - there may be a document coming out of JSC's fall meeting for us to look at.
SUNDAY--June 16
8:00-9:30
Marriott Marquis -- Ballroom IV
Bowker breakfast - rsvp'd
Primarily they discussed new features of Books In Print online. I talked with Thom from Florida about how migration is proceeding to Ex Libris.
9:30-12:30
Renaissance Atlanta -- Atlanta B
SAC - General Meeting
I gave a report on the MARBI decision on FAST and the FRBR discussions. Will we end up with more uniform titles for more things? Where would subject analysis take place? - at the expression level? - maybe we could, for once, get the best subjects on a record that would apply downward. 18th ed. of Sears and 14th ed. of Dewey will come out at the same time so DDC can include Sears.
LC report. See Handout.
GSAFD. 155 7 $2 should be changed to 0 and no $2 - so what is LC's plan? It is a massive job to write new records. It is important for the FAST project. SAC's recommendation was to eventually include GSAFD in LCSH.
Dewey DC and Forest Press. Web Dewey: updates are in through Apr. 2002. If you are a subscriber, OCLC displays it across the top. Have met with Gary Strawn to take his software and make it work with Web Dewey for all of us. (see midwinter SAC 2002 notes)
FAST. Lois reported from Ed O'Neill's PowerPoint. It can be used alone and in addition to any other thesaurus. The vocabulary should be easy to apply, intuitive, and not require training; it should be logical, scalable, suitable to metadata, retain LCSH richness, and enable machine validation. It follows through with Airlie House simplification: topic, geographic, form, chronology. It is working through three phases: 1) create initial files, 2) validate the headings, 3) create authority records including cross-references. In addition to topic, geographic, form, and chronology, they need to address person, corporate body, conference, and uniform title headings. OCLC so far has completed topical headings (379,009 records) plus all red book headings; 2564 form headings; 51,498 geographic - US places. Personal names are being validated: 1XX, 6XX, 7XX = 13,000,000, which includes typos, etc. - they filtered out errors and got over 423,000 headings used as subjects. Additional coding was approved so LC can begin work on coding headings. FAST is LCSH faceted. It will then be able to match Dublin Core: coverage, geographic, chronological, and topic. There will be a linking field $w and 7XX to keep headings in sync with LCSH changes. The chronological ..th century will be acceptable no matter what. Defined periods like 1861-1865 for Civil War would be a heading. This is probably the most difficult and it is the last thing being addressed by the FAST team - 1861-1865 is not logical for French history, but it is for US. An IFLA group is researching different metadata schemes. So far they find topic, form, and geographic are separate identifiers.
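"FAST is LCSH faceted" can be illustrated by splitting one pre-coordinated heading into the four facets. The sketch below is my own assumption about how that might look, not OCLC's conversion program. It handles only a topical (650) heading, uses the standard MARC subdivision subfields ($x general, $z geographic, $y chronological, $v form), and keeps $a and $x together, since the notes say $x continues to be part of the heading. The sample heading and the function name facet_heading are made up for illustration.

```python
# Subdivision subfields of a MARC 650 heading and the facet each feeds.
# $a and $x are handled separately because they stay together as the topic.
OTHER_FACETS = {"z": "geographic", "y": "chronology", "v": "form"}


def facet_heading(subfields):
    """Split one 650 heading, given as (code, value) pairs, into facets."""
    topic_parts = []
    facets = {"geographic": [], "chronology": [], "form": []}
    for code, value in subfields:
        if code in ("a", "x"):
            topic_parts.append(value)
        elif code in OTHER_FACETS:
            facets[OTHER_FACETS[code]].append(value)
        # Any other subfield (e.g. $2) is ignored in this sketch.
    facets["topic"] = "--".join(topic_parts)
    return facets
```

For a heading such as Nursing $z United States $x History $y 20th century $v Periodicals, this yields the topic "Nursing--History" plus separate geographic, chronology, and form values, which sits between keyword access and the full pre-coordinated string.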
David's report. ALCTS Strategic plan. We need to tie work and reports of SAC to the ALCTS strategic plan. We need to prioritize our work and SAC's web site. We could submit reports electronically and put on the web page.
12:00-1:30
Congress Center -- Room B405
EBSCO Luncheon -- Update on electronic journals service
They discussed a new service, EJS. Partly it is a new administrative interface. Additionally, for a fee, there are services for getting statistics, managing non-EBSCO titles in the same database, etc. It still didn't appear to get me bib records, but might save Helen some work - provided the EBSCO data is correct.
1:30-5:30
Congress Center - A305
ALCTS/LITA Authority Control in the Online Environment IG -- Real World Steps to Interoperability between Electronic Resources
LITA's website will have some URL's related to the presentations.
William Moen, Texas Center for Digital Knowledge, University of North Texas.
The idea is to get systems vendors to follow developed (and developing) standards to implement interoperability. Assume: a) your users will interact with other resources, b) your system will interact with other systems, c) others will interact with your system, d) systems were designed to be used for local or proprietary use.
Interoperability. System definition: ability of two or more systems to exchange information and use the exchanged information without special effort. User definition: user's ability to search and retrieve or otherwise interact with information in a meaningful way and use the information with little effort. It is about communication/software; it is about protocols to interchange data/bytes/bits. Protocols support different activities. Z39.50 is to support information retrieval. Protocols are building blocks for interoperability. We need to separate the protocols from the data and its mark-up (MARC, XML, EAD, etc.)
Factors affecting interoperability: multiple systems, protocols (HTTP), formats and syntax (MARC), schemas (Dublin Core), vocabularies (MeSH, LCSH), languages and character sets. There are communities (libraries, museums, archives, etc.) that generally want to provide information and have some things in common in their information. Interoperability could work because we have standards (Z39.2 - MARC, Z39.50, controlled vocabularies). If we can reduce the differences and find the commonalities, we can get closer to interoperability. Z39.50 is a "language" to express searches and express results. It is a technical foundation for resource discovery, resource sharing, and integrating access to diverse resources; it is a strategic tool for our business transactions. Z39.50 provides a "vocabulary" for creating searches with attribute sets: use, relation, position, structure, truncation, completeness; and with values both paired and used for searching. E.g. something near [position] water [value]
What threatens interoperability? Differences in implementation of Z39.50, differences in local information retrieval systems, differences in local indexing policies. Wouldn't we benefit from agreement among disparate library systems? The goal of a profile is to improve interoperability within a particular application environment, e.g. Bath Profile is for information in libraries.
What are acceptable levels of interoperability? What are appropriate measures of interoperability? What is conformance to a profile? What does it mean? To what extent are each of us willing to agree on certain elements to be indexed?
ZINTEROP is a "free" (so far) testbed you could ask your vendor to check its version or your data against.
Carrol Lunau, Bath Profile Maintenance Agency, NLC - Profiles : what are they and why do I care : real world steps to Interoperability
Why a profile? Why do we need Bath if we already have Z39.50? It's a standard that allows interpretation. The minute you make "local" choices, you reduce interoperability, you lose trust. A profile is a community consensus; it identifies specifications to support requirements, improves search and retrieval sets, provides specs for vendors, and makes it easier for users. The Bath Profile is the lowest common denominator; it is what they could get common consensus on internationally. They tried to define searches very precisely as desired by the user, with the objective of increasing predictability and precision in searching library catalogs.
Bath is modular.
A = basic bibliographic search and retrieval (library OPACs)
B = holdings
C = cross-domain searching
D = authority record search and retrieval
Each has conformance level specs.
A) Basic Search and Retrieval.
All six attributes must be specified for any of the 29 searches.
Level 0 - 4 searches all keyword
Level 1 - 15 keyword & phrase; browsing with 3 scans defined
Level 2 - 10 keyword and phrase searches
It is cumulative; to do level 1 you must also do level 0. Types of searches: author, title, key title, subject, standard identifier, date published, format/type, language, possessing institution. A system can do a basic OR a precise search.
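The cumulative rule can be sketched in a few lines. This is only an illustration of the counts recorded in these notes (4, 15 and 10 searches); the names `SEARCHES_ADDED`, `searches_required` and `conforms` are made up for the example and do not come from the Bath Profile itself.

```python
from __future__ import annotations

# Searches each Functional Area A level adds, as recorded in these notes.
SEARCHES_ADDED = {0: 4, 1: 15, 2: 10}


def searches_required(level: int) -> int:
    """Total searches needed to claim `level`, counting every lower level too."""
    if level not in SEARCHES_ADDED:
        raise ValueError(f"unknown conformance level: {level}")
    return sum(count for lvl, count in SEARCHES_ADDED.items() if lvl <= level)


def conforms(supported_levels: set[int], claimed: int) -> bool:
    """A claim holds only if every level from 0 up to `claimed` is supported."""
    return all(lvl in supported_levels for lvl in range(claimed + 1))


if __name__ == "__main__":
    for level in SEARCHES_ADDED:
        print(level, searches_required(level))
```

Under these assumptions, claiming level 2 means supporting all 29 searches, and a system that implements only the level 1 searches without the level 0 ones does not conform at level 1.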
Example:
Attribute Type | Attribute Value | Attribute Name
Use (1) | 4 | Title
Relation (2) | 3 | Equal
Position (3) | 3 | Any position
Structure (4) | 2 | Word
Truncation (5) | 100 | No truncation
Completeness (6) | 1 | Incomplete
If the server on the other end is not properly configured, you get no hits, or vice versa: if the client doesn't structure the search right, the server can't do anything. E.g. if a phrase gets coded as a word, it won't work.
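To make the six attribute/value pairs concrete, here is a minimal sketch that writes them out in Prefix Query Format (PQF), the `@attr type=value` notation used by Z39.50 toolkits such as YAZ. The function name `to_pqf` is hypothetical, and the sketch simply refuses terms containing a double quote rather than attempting any escaping.

```python
# Bib-1 attribute type numbers, matching the table in these notes.
ATTRIBUTE_TYPES = {
    "use": 1,
    "relation": 2,
    "position": 3,
    "structure": 4,
    "truncation": 5,
    "completeness": 6,
}


def to_pqf(term: str, **values: int) -> str:
    """Build a PQF query string that carries all six Bib-1 attributes."""
    missing = sorted(set(ATTRIBUTE_TYPES) - set(values))
    if missing:
        raise ValueError(f"all six attributes are required; missing: {missing}")
    unknown = sorted(set(values) - set(ATTRIBUTE_TYPES))
    if unknown:
        raise ValueError(f"unknown attribute names: {unknown}")
    if '"' in term:
        raise ValueError("this sketch does not handle quotes inside the term")
    parts = [f"@attr {number}={values[name]}" for name, number in ATTRIBUTE_TYPES.items()]
    return " ".join(parts + [f'"{term}"'])


if __name__ == "__main__":
    # The title keyword search from the table above.
    print(to_pqf("water", use=4, relation=3, position=3,
                 structure=2, truncation=100, completeness=1))
```

Changing only the structure value (in Bib-1, 1 denotes a phrase and 2 a word) produces a different query, which is the kind of mismatch between client and server described above.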
Bibliographic vs. holdings: bib is MARC, but holdings is XML. Bib conformance levels cumulate; holdings levels do not. Holdings occur in three situations: location only (central union catalog with no detailed holdings), locations and summary information (summary only), summary copy-level holdings (individual items with circ information possible).
Cross-domain searching is similar to bibliographic in conformance and level structure; it also requires the SUTRS and XML record syntaxes.
Summary: It defines 29 searches. You must be able to specify all 6 attributes. US version: level 0 - 4 keyword, 15 same as Bath plus 5 additional, level 1 - 3 scans (same as Bath), level 3 - 7 Bath plus 40 additional (e.g. MeSH, LCSH, etc.). US element names for holdings: title level, summary, extended piece. Profiles won't solve every problem, but their precise specifications for searches will enhance interoperability.
Matt Goldner , Fretwell-Downing - vendor's perspective
The question is, do customers really understand Z39.50? What do they expect from it? Z39.50 can have lots of services. Searching is just one service, scan is one kind of searching, and then there are various types of scan. There is lots of translating going on. Customers need to state their actual needs, then match those needs to the profiles, e.g. we want all 6 attributes, at level 0 with 3 extra searches. Considering proprietary (vendor) vs. open (Z39.50), we need to determine if our needs are still identical to what the former proprietary software did. Vendors themselves have to work with many other ILS systems. It used to be easy: they just designed a system for you to use internally. Now, for example, Fretwell-Downing has to test with 32 different ILS systems. Customers are asking for: OpenURL (SFX), NCIP, Bath Profile. See the Z39.50 maintenance agency site < http://www.loc.gov/z3950/agency/ > There is an effort to develop a web service for searching. ZING is the project investigating either SOAP or HTTP in a like environment using current tools - SOAP, XML, etc.
Larry Dixon, Network Development & MARC Standards Development Office, Library of Congress - Authorities
Reasons you specify a profile in an RFP: a) to provide a stable specification of search and retrieval requirements for the vendor, and b) for procurement specs. LC has done comparison studies of Functional Area D (i.e. authorities) with the Bath Profile, US NISO AV Profile, LC requirements, MARC21 Authority Format, and UNIMARC. They decided certain searches are not to be in the core/Bath/international profile: LC control number, LC call number, subject subdivision, FRANAR. A server should support receiving and delivering, via Z39.50, MARC21 (Z39.2) records and Simple Unstructured Text Record Syntax (SUTRS) records, i.e. text for painting the screen of the client. The Z-client must send all 6 Bib-1 attributes. The Z-server must receive and process all 6 Bib-1 attributes.
Bath does NOT specify indexing. Libraries need to agree on something basic to make interoperability work. Bath does NOT specify how the server should behave with a search it gets, especially if the search is incorrect or lacking data.
Level 1 - Name, Title, Subject |
Keyword |
Keyword | With right truncation
Keyword | Exact match
Keyword | 1st characters in field
Keyword | 1st words in field
Level 1 - scan (3) |
Name | Exact match
Title | Exact match
Subject | Exact match
Level 2 (first must be level 1 compliant) |
Personal name |
Corporate name |
Conference name |
Uniform title |
Geographic name |
Topical subject |
Genre/form subject |
ISSN |
Remote system record number |
Title | Unanchored phrase
Subject | Unanchored phrase
Name | Unanchored phrase <it will probably be order adjacent word>
Notes | Keyword
 | Keyword with right truncation
Any (includes names, titles, subject) | Keyword
 | Keyword with right truncation
Level 2 scan (includes level 1) |
Name | Keyword
Title | Keyword
Subject | Keyword
Future. Revise in 2002 and distribute to Bath and ZIG to ISO TC46. We should also follow things like FRANAR.
Glenn Patton, Metadata Standards & Quality Division, OCLC. International efforts to improve interoperability
International efforts to improve interoperability - there are lots of projects going on in Europe, especially interoperability of record content. Issues in Europe: multiple languages between and within, scripts, cataloging rules vary, subject access traditions (some LCSH, some home-grown, some national lists), financial support IS there to cross and combine communities in different cultures so they're also interested in museums, archives, etc.
AUTHOR A project in the mid 1990's to try to re-use author names.
http://www.bl.uk/gabriel/cobra/author.pdf
MACS Multilingual Access to Subjects. It combines LCSH, Rameau, and SWD (German).
http://wwwinfolab.kub.nl/prj/macs It does cross-database searching using mapped lists. Human intellectual interpretation was done to determine the mappings. E.g. actors - acteurs - Schauspieler
MALVINE Completed in 2001.
http://www.malvine.org/ Manuscripts and Letters Via Integrated Networks in Europe.
LEAF A follow-on project of MALVINE. Linking and Exploring Authority Files.
http://www.crxnet.com/leaf/ Cross-database searching of authority and bibliographic records in OPACs AND other databases, e.g. Saur. Some is based on user searching and hence not the authorized form. They are just starting, but it needs watching. The way IFLA is looking at names - different forms of names linked together when it’s recognized as being the same person or corporate body.
AFNOR Cross-community cultural heritage effort involving French libraries, museums, archives, documentation centers, etc. It is sponsoring a group of experts to study authority metadata. They found they not only didn't have a common understanding of terms, they didn't understand what each other does. They realize this is a first step.
INTERPARTY Successor to Interoperability of Data in E-commerce Systems (INDECS). It continues to build a "directory of parties" for rights transactions in e-commerce. It proposed to build a register of unique identifiers for persons and corporate bodies relative to rights transactions. It has partners like BookData, EDItEUR, British Library, Royal Swedish Library, KOAISTO (sp?), IFLA.
MLAR Completed in 1998. Minimum Level Authority Record. It involved exchange of essential authority data elements.
http://www.ifla.org/VI/e/p1996-2/
FRANAR Functional Requirements and Numbering of Authority Records. It is a conceptual model for authority data. It builds on the FRBR model - why we do what we do. "number" - do we need a number like an ISBN for a person's record?
FRANAR extensions to FRBR
Real world | <-> | FRBR | <-> | Access points in authority records
Individual | | Persons | |
Group | | Family | | Heading
Event | | Corporate body | | Target
Content | | ---- | | Tracing
Object | | Work | |
Concept | | Expression | |
Time | | Manifestation | |
Place | | Item | |
Tony Olson, Northwestern University. Integration of Information Languages and Interoperability.
Indexing languages
Natural languages
Information languages
  Class systems
    DDC
    LCC
  Controlled vocabulary
    Thesauri
      MeSH
      AAT
    Subject heading lists
      LCSH
      Rameau
By their very nature, controlled vocabularies are incompatible because they are designed for specific communities, and hence not designed to be compatible.
Problems.
1. Conflicts between cross-references in one vocabulary and headings in another
2. No links or references between established headings in different controlled vocabularies
3. Differences in syntaxes of construction of headings
4. Pre-coordinated vs. post-coordinated.
5. Differences in semantic relationships, e.g. one in LCSH may be five in MeSH
6. Identical headings in multiple vocabularies in one OPAC -- this is heightened with web and Z39.50 access to many libraries' OPACs; hence there are groups studying how to address the problem.
UMLS Integrates 60 biomedical vocabularies and classifications. It includes the syntactic and semantic definitions used.
MACS Incorporates three languages, maintaining autonomy of the headings, using mapping. The database is outside of the three authority files.
HW Wilson has mapped 12 different vocabularies into a single metathesaurus.
One can also use a reference language, e.g. Dewey (DDC) is the core, and corresponding terms in MeSH, LCSH, etc. link back to it.
HILT maps to LCSH, Unesco, AAT, DDC, UDC - mapping offered the best option. Mapping service should be set up to link to DDC as the backbone.
Renardus is local class systems mapped to DDC and can be cross searched and/or cross-browsed.
LCSH/MeSH. Each vocabulary remains autonomous. Instead of a separate database to link, mapping (linking) is done with data in the authority records. This could apply to more vocabularies and other databases beyond OPAC's.
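The MACS-style arrangement described above, where each vocabulary stays autonomous and the links live in a database outside the authority files, can be sketched roughly as follows. The class names, the `link-0001` identifier and the exact-match lookup are all assumptions made for illustration; the only data taken from these notes is the actors / acteurs / Schauspieler example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LinkRecord:
    """One cluster of equivalent headings, held outside the authority files."""
    link_id: str  # hypothetical identifier
    headings: dict[str, str] = field(default_factory=dict)  # vocabulary -> heading


class LinkTable:
    """Looks up equivalents across vocabularies without altering any of them."""

    def __init__(self) -> None:
        self._by_heading: dict[tuple[str, str], LinkRecord] = {}

    def add(self, record: LinkRecord) -> None:
        for vocabulary, heading in record.headings.items():
            self._by_heading[(vocabulary, heading.casefold())] = record

    def equivalents(self, vocabulary: str, heading: str) -> dict[str, str]:
        """Return the linked headings in the other vocabularies, if any."""
        record = self._by_heading.get((vocabulary, heading.casefold()))
        if record is None:
            return {}
        return {v: h for v, h in record.headings.items() if v != vocabulary}


if __name__ == "__main__":
    table = LinkTable()
    table.add(LinkRecord("link-0001",
                         {"LCSH": "Actors", "RAMEAU": "Acteurs", "SWD": "Schauspieler"}))
    print(table.equivalents("LCSH", "actors"))
```

A table like this handles one-to-one equivalence only; the partial and one-to-many matches discussed in these notes would need a richer link record.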
Northwestern Project. The project establishes equivalencies between terms (i.e. maps information languages by humans). It utilizes mapping of data to integrate information languages in merged catalogs, databases or other information systems. They have developed a combination of machine and human work. 7XX (750's and 788 (complex linking entry data)) fields are used as linking to 1XX fields in the other thesaurus. They continue to update it.
Example
150 -0 Breast $x Cancer
750 -2 Breast neoplasms
150 -0 Art therapy for children
750 -2 $8 $w b $a Art therapy
750 -2 $8 $w b $a Children
788 -2 $i Search also under …. $a Art therapy $i and $a Children
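As a rough sketch of how the 750 links in the example could be used, the code below builds a lookup from each established heading (150) to the headings in its 750 fields, plus the reverse lookup. The in-memory record layout, the function names and the `--` subdivision separator are assumptions for illustration; indicators and the 788 field are left out, and this is not a description of Northwestern's actual software.

```python
from typing import Dict, List, Tuple

Field = Tuple[str, Dict[str, str]]  # (tag, {subfield code: value})
Record = List[Field]


def display_heading(subfields: Dict[str, str]) -> str:
    """Join $a with an optional $x subdivision, e.g. 'Breast--Cancer'."""
    parts = [subfields["a"]]
    if "x" in subfields:
        parts.append(subfields["x"])
    return "--".join(parts)


def link_index(records: List[Record]) -> Dict[str, List[str]]:
    """Map each 150 heading to the headings carried in its 750 fields."""
    index: Dict[str, List[str]] = {}
    for record in records:
        established = [display_heading(sf) for tag, sf in record if tag == "150"]
        linked = [display_heading(sf) for tag, sf in record if tag == "750"]
        for heading in established:
            index.setdefault(heading, []).extend(linked)
    return index


def reverse_index(index: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each linked heading back to the 150 headings that point at it."""
    reverse: Dict[str, List[str]] = {}
    for source, targets in index.items():
        for target in targets:
            reverse.setdefault(target, []).append(source)
    return reverse


# The two records from the example above, with indicators omitted.
RECORDS: List[Record] = [
    [("150", {"a": "Breast", "x": "Cancer"}), ("750", {"a": "Breast neoplasms"})],
    [("150", {"a": "Art therapy for children"}),
     ("750", {"w": "b", "a": "Art therapy"}),
     ("750", {"w": "b", "a": "Children"})],
]

if __name__ == "__main__":
    forward = link_index(RECORDS)
    print(forward)
    print(reverse_index(forward))
```

The second record shows the one-to-many case: a single heading in one vocabulary maps to two headings in the other.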
One of the most difficult things to address is the one-to-multiple relations. They decided to make one mapping because if you pick the right level, that record will have NT and BT the user can pick from in the syndetic structure. The syndetic structure is not explicit because of the tree structure included in 072 ($a subject category code and $b subject category subdivision). Gary Strawn wrote software to correlate back to the 072. Syndetic structure is clear in the "red books" because all references are listed there, but LC doesn't distribute the records fully; and not all subjects have the possible syndetic structure outlines.
Another problem is integrating this in the OPAC - enhancements still need to be done: 1) index the 7XX entry fields in the authority records (no system currently does this, hence they can't provide the see reference and can't do catalog maintenance), 2) provide see also references between corresponding headings, 3) deal with duplicate retrieval from identical headings, 4) resolve conflicts between references in established headings. No one does this yet (their vendor, Endeavor, isn't paying any attention), so they have nothing yet to present to their users. Until then they can do research on user behavior and benefit.
Other uses of Northwestern's work. Mapping data could be used to find headings in opposite thesauri. It could be integrated into UMLS or HILT or other efforts to aid in development of reference languages. They hope to have files available at Northwestern. It is still hard to do anything because their bib records are stored in binary large object format (BLOB).
9:00P-12
Hyatt Regency -- Room ??
Indigo Girls -- Scholarship Bash
MONDAY--June 17
8:00-12:30
Renaissance Atlanta – Z Atlanta Ballroom
CC:DA -- Liaison
12 & 13. MAGERT proposals a) capitalize E in earth and b) add a comma after cm. in the SMD (MARC 300) to clarify for visual understanding in the OPAC. Approved.
14. Rule 7.0B discussion. The proposal tries to address ambiguity about the chief source for videos and motion pictures, with the intent of reducing duplicate records in utilities and making it clearer for users. There were concerns about the negative sections - prefer positives. The glossary terms suggested don't all relate back to the proposal, and if something is generally found in a specialized dictionary in the field, the word shouldn't be in the AACR glossary. The proposal was withdrawn and they will bring it back later.
15. Task Force on Appendix of Major and Minor Changes. They are going to publish a separate document. They need to give ALCTS right of first refusal. If they aren't interested, Barbara Tillett offered LC. Communities have until Nov. 15 to comment. This is really "when to create a new record" rather than "major and minor changes". Approved.
16. Report of the Task Force on Specific Characteristics of Electronic Resources. Along with removing area 3 for computer files, it is also appropriate for moving image and sound items, along with using conventional names, e.g. 1 sound file (mp3). Again there is the issue of whether to put it in area 5 and/or area 7. The cartographic community still has a need for putting area 3 stuff in area 5. We have now brought this up for ch. 3, 6, 7, and 9 - what about ch. 8? We seem to be merging SMD and conventional terminology. It is tricky when using other chapters because if it is electronic, we are to ALSO use ch. 9. The cartographic community (ch. 3) has always stated content and carrier information in area 5 (MARC 300), e.g. 1 map on 3 sheets. We are still waiting for comment from JSC.
17. Task Force on Consistency across Part I of AACR. Work has been done on areas 2 and 3. Work on 5 is beginning. Problem so far is consistent numbering of rules from ch. 1 throughout the rest of the rules. There was a proposal that a new numbering system be developed. Matthew Beacom will take it to JSC as a "heads up". The existing task force will look at how much consistency there is across rules as well as duplication across chapters. For 250 edition statement, they are going to be matter of fact; if there is a statement on the item, transcribe.
8. Program in Toronto. Pre-conference on Knowledge without Boundaries.
19. Program on FRBR approved by ALCTS and Program Committee. Logistically the time should not conflict with CC:DA or MARBI (that leaves SAC) Speakers will include Barbara Tillett and Matthew Beacom. They may publish the proceedings.
21. Webmaster - John will include FRBR stuff.
10. It was moved that the proposal to add dates of incumbency parenthetically to main entry headings for heads of state (24.20C1), for their documents, be accepted in substance but that it be optional. Passed.
22. How to Submit a Rule Change Proposal to CC:DA was approved with minor modification and will be forwarded to JSC.
23. CC:DA Procedures. Decided we need one month lead time for proposals. Approved and will be forwarded to CCS.
25. Review of ISBD Review Group task force document on the use of multiple ISBDs. It was moved that a task force be formed to address multiple ISBD work being done by IFLA. ISBD(G) never says to look at any of the other ISBDs. The GMD (MARC 245 $h) wording also needs to be addressed, e.g. (content/carrier/form of issuance) ?
Joint meeting with MARBI regarding Functional Requirements of Bibliographic Records (FRBR) and MARC21
Tom Delsey discussed his Functional Analysis of the MARC 21 Bibliographic and Holdings Formats <
Glenn Patton discussed the rules and FRBR. There has been some discussion, research, etc. that in a way brings us back to the "catalog" and relationships that we included with filing rules and cross-reference cards, etc. "It's the relationships, stupid!" This includes journal title changes, edition change notes, added entries, uniform titles, linking fields especially in journals, etc. The relationships have not been made an integral part of the online catalog - we assumed they would help the user. OCLC studied Harry Potter and was able to identify relations fairly easily, pick out identifiers, etc. They found older materials cause problems from brief cataloging, inconsistent use of uniform titles, and discontinuance of 7XX fields for other names of people contributing in some way to the expression/manifestation, i.e. roles and functions. The attribute form in Z39.50 - our GMDs and perhaps SMDs - suddenly becomes more important to correctly create searches and obtain correct retrieval sets. The way we catalog is expression/manifestation. The way AACR is arranged focuses on description rather than access, leaving references until the last chapter. FRBR will require that in your head you create an expression-level record to determine access points or check for possibly related records.
Tom's response to - why do the analysis? It is an exercise in semantics (meaning or relationships of meaning). At one level, it is a simple way to understand data in the format. The format has been evolving for 30 years and our understanding has continued along as we've added formats, communities, conventions. We have moved from the card catalog, which is what the format was created to produce, to OPACs and now to interoperability with outside databases and providers (commercial vendors e.g. Gale and publishers e.g. ONIX) and communities (e.g. museums). There is interest in importing/exporting records with others that have different structures; we need to be able to explain to "others" and stand honestly behind it. It is a semantic exercise to deal with the issue of migrating to new data structures, whether to a new ILS or new conceptual structures.
Conclusions. There is a substantial correspondence between MARC and FRBR. There are 2300 data elements in MARC, excluding LDR and the tag numbers themselves. 1200 of them mapped quite directly. 200 data elements could not be mapped to FRBR as it stands, so he extended FRBR. 150 "wild cards" were too loosely defined to map to anything via machine; most could be intellectually mapped. Some values apply to multiple points (indicated by a wavy equal sign in the document). He just jammed some things into subfields which part of the text fit into but the rest didn't. For some, it was hard to tell from the MARC documentation for the field if it's where the text really belonged. Those fuzzy things will be problematic in any conversion/migration/FRBR-ization. He had to add entities to "work" that weren't in FRBR originally but for which the data is in a MARC record, e.g. Project. Relationships could be expanded extensively but it wasn't done, and there doesn't seem to be room in MARC for all those relationship types. The use of 006/008 to structure classes of materials is hard to deal with logically. The headings in MARC do fairly well reflect AACR, so sometimes they do carry over to FRBR entities, but there are times they don't. We don't do a good job of articulating relationships.
Discussion. (J Attig) - The concept of authority records needs to be added to discussion of FRBR for modeling. (J Esply) - We need more clarification as to when an expression is a new one. We need guidance on 1st paperback or 1st American, which don't appear different according to FRBR, although 1st edition seems pretty clearly different. Version is an attribute of the expression. Edition is an attribute of the manifestation. It is only a state that may or may not reflect anything about the content; it may just assist in identifying the expression. What about a movie and the script used by the actors? Journal title changes - are they or aren't they? Very pesky are things like Field and Stream mid-west edition vs. southern edition. VTLS did try pulling data via programming and it worked sort of, but he did have to hand manipulate the records. (T. Delsey) - The boundaries around expression will probably end up being defined differently. "Work" is not well dealt with in AACR2 once you go beyond the one author/one title simple work/expression. Manifestation is really what is mostly addressed in AACR. We have to remember differentiation has to be meaningful or desired by the user. (J Attig) - We have to decide how "right" we're going to do this. We need research; we need to decide what we're trying to do.
An article Glenn Patton will send out expresses concern about things like variations in performance even by the same performer - over time it changes. One of the expression level's roles is to be a display organizer for the user. Then do you organize a musical recording by date, by type of format, by version, or do we let the user do something? (M. Turitz) - are we trying to collocate with uniform titles or are we trying to differentiate titles? In books they usually try to pull books together, but in journals we use them to differentiate. Some uniform titles are addressing the expression level and others are at the manifestation level or even item (e.g. Bodleian Library. Manuscript $n 32) on a shelf. (P Wiess) We also have to remember other things we do with our OPACs. We should be able to prevent a user from even thinking of placing a hold on a rare book; assist ILL in selecting "any" or a "specific ed."; acquisitions/collection development - which one do we want or is any okay?; preservation of what? - the expression or manifestation. Do we create some other field/authority record? A new citation field? Can we do something so our displays don't appear to diminish the role of a second author or have a title display with the first author even though you searched the second author? The Format Variation Working Group is looking at a "citation" uniform title or author/title. Work and expression and manifestation are relative concepts; communities need to define them, and we probably need to find ways to map once we have definitions and constructs for each.
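The idea of the expression level as a "display organizer" can be sketched as below, following the statement in the discussion that version belongs to the expression and edition to the manifestation. The class and field names and the sample values are hypothetical and only illustrate the grouping; FRBR itself defines many more attributes and relationships.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    work: str     # title standing in for the work
    version: str  # version as an attribute of the expression


@dataclass(frozen=True)
class Manifestation:
    expression: Expression
    edition: str  # edition as an attribute of the manifestation
    carrier: str


def group_for_display(items: list[Manifestation]) -> dict[Expression, list[Manifestation]]:
    """Collect manifestations under their expression, keeping input order."""
    grouped: dict[Expression, list[Manifestation]] = defaultdict(list)
    for manifestation in items:
        grouped[manifestation.expression].append(manifestation)
    return dict(grouped)


if __name__ == "__main__":
    original = Expression("Example work", "original text")
    abridged = Expression("Example work", "abridged text")
    shelf = [
        Manifestation(original, "1st ed.", "hardback"),
        Manifestation(original, "1st paperback ed.", "paperback"),
        Manifestation(abridged, "1st ed.", "paperback"),
    ]
    for expression, group in group_for_display(shelf).items():
        print(expression.version, [m.edition for m in group])
```

In this sketch a first paperback printing lands under the same expression as the hardback, which mirrors the point in the discussion that such editions do not appear different according to FRBR.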
2:00-4:00
Congress Center - B313B
SAC -- General meeting
LC Action Plan Report (Diane Dates Casey). See handout. Updated information and presentations, etc. will be on LC's page in a few weeks. Up to 2003 the goal is to develop a plan. In 2003 they plan to start programming.
Reference structures. Report to be turned in at Midwinter 2003. They are developing a set of recommendations for systems designers, people working to modify their systems, or for people to request of their vendors: a) reference structures, b) navigation, c) display, d) modifying existing practices. The program in Toronto will include recommendations and implications. Canadian records already have 7XX fields to link between languages. How ought the guidelines be published? - LRTS might be a place to put a paper both in print and on the web.
Fiction. Partly they are trying to set up "best practices" guidelines for application of GSAFD headings. Question: does SAC continue to be the sponsoring agency? Guidelines will include discussion and application of issues, which are not really guidelines: introduction, subject approach to individual works of fiction, genre approach to individual works of fiction, analysis in comparison to other major related thesauri. Their charge says "recommendations", not guidelines. David thought if the subcommittee had a problem with the wording of the charge, it could be dealt with.
Subject Analysis Training Materials. The group is working closely with the Library of Congress's PCC SCT Task Force on Subject Analysis Training Materials. They have an outline for course content. Check <
http://darkwing.uoregon.edu/~chixson/subj/subject.html > They are following the format of the tools the serials training group is using. The program will show how PCC and ALCTS can cooperate and show a part of the training to entice practitioners to want the whole training.
Semantic Interoperability. Shelby (see above)
David brought up the question of how individuals actually do subject analysis and classification internally (in daily life), and hence how that affects what they DO when they come into a library. Do users approach a catalog in a logical way? (S Harken) - another ALCTS Subcommittee on Metadata found that users are at once an expert and a novice, e.g. they may know their subject but don't know your catalog. Much of the talk of FRBR thus far has included mention of the need for empirical research. (David) - Does cultural background have an effect? Do they apply common sense? Would semantic interoperability help? Does that help get them the vocabulary that most matches their need? Maybe it's not "logic" but orientation, both by creators of subject/classification and by users. Check for Hope A. Olson's work in general and for IFLA's Satellite program in Dublin, Ohio 2001. How does users' use of the core record differ from catalogers' use? At PCC there was a study but it only asked if the basic record was okay. But a user wouldn't know what was left out. How do we deal with local needs vs. internationally shareable records? (A. Taylor) We can't catalog, classify, and assign subjects for each expert and each novice user - all we can do is hope for the best subject/classification we can do in a controlled way (i.e. manageable way).
Arlene mentioned a book by Alenka Sauperl, Subject Determination During the Cataloging Process.