PROJECTS ADDRESSING OR RELATING TO INTEROPERABILITY ISSUES
(compiled by Lois Mai Chan and Marcia Lei Zeng)
Subject coverage: no limit
Types of
Languages involved: English
Website: http://www.alexandria.ucsb.edu/~gjanee/thesaurus/specification.html
Status: operational, prototype
Subject coverage: alternative and complementary medicine
Types of
Languages involved: English, French
Website: http://circe.slis.kent.edu/mzeng/tmshome.html
Status: prototype
In an international collaborative project CAMed, a comprehensive resource for complementary and alternative medicine (
Subject coverage: mathematics, physics, social science
Types of
Languages involved: German, English
Website: http://www.mathematik.uni-osnabrueck.de/projects/carmen/
Status: Research and development
CARMEN (Content Analysis, Retrieval, Metadata: Effective Networking), a specially funded project within the Global Info German Digital Library Project in
Subject coverage: general
Types of
Languages involved: English
Website: http://classweb.loc.gov/
Status: operational
For years, the Library of Congress of the
Finnish project, Finland
Subject coverage: general
Types of
Languages involved: Finnish
Status: operational
Himanka and Kautto (1992) reported their work to convert assigned class numbers based on the Finnish abridged edition of UDC into General Finnish Subject Headings (GFSH). First, a dictionary is created that maps UDC numbers to subject headings. Second, the dictionary is mechanically applied to convert the bibliographic databases.
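The two steps described above (build a dictionary, then apply it mechanically) can be sketched as follows. This is a minimal illustration, not the procedure Himanka and Kautto implemented: the dictionary entries, the record layout, and the function names are all assumptions made for the example.

```python
# Step 1: a dictionary that maps UDC numbers to subject headings.
# The entries below are illustrative placeholders, not data from the Finnish project.
UDC_TO_HEADINGS = {
    "51": ["mathematics"],
    "53": ["physics"],
    "61": ["medicine"],
}


def convert_record(record, mapping):
    """Return a copy of the record with headings derived from its UDC numbers.

    The record layout (a dict with a "udc" list) is a hypothetical one.
    Numbers missing from the dictionary are collected for manual review.
    """
    headings = []
    unmapped = []
    for number in record.get("udc", []):
        if number in mapping:
            for heading in mapping[number]:
                if heading not in headings:  # avoid duplicate headings
                    headings.append(heading)
        else:
            unmapped.append(number)
    converted = dict(record)  # leave the original record untouched
    converted["subject_headings"] = headings
    converted["unmapped_udc"] = unmapped
    return converted


def convert_database(records, mapping):
    """Step 2: apply the dictionary mechanically to every record."""
    return [convert_record(record, mapping) for record in records]
```

Keeping the unmapped numbers alongside each converted record makes the gaps in the dictionary visible instead of silently dropping them.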
HEREIN, Council of
Subject coverage: cultural heritage
Types of
Languages involved: Spanish, French, English
Website of the project: http://inf2.pira.co.uk/factsheets/inform/digicult/herein2.html
Status: operational through website: http://www.european-heritage.net/sdx/herein/index.xsp
The HEREIN Project (European Heritage Information Network on cultural heritage policies) of the European Heritage Net has developed an interlingua with no direct reference to the terms or to the structure of any pre-existing thesaurus (http://www.european-heritage.net/en/index.html, select Thesaurus). Most of the terms in the thesaurus come from reports on cultural heritage policies in
In building its trilingual vocabularies, the HEREIN thesaurus follows the five types of equivalence relationships defined in the ISO 5964 standard (ISO, 1985): exact equivalence; inexact equivalence; partial equivalence; single-to-multiple term equivalence; and non-equivalence. However, HEREIN deviates from the ISO 5964 standard in that it does not designate a source language (Thérond, 2002).
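One way to picture the five equivalence types, and the absence of a source language, is the sketch below. It is an assumption-laden illustration rather than the HEREIN data model: the class names, the unordered link structure, and the sample terms are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Equivalence(Enum):
    """The five equivalence relationships listed above."""
    EXACT = "exact equivalence"
    INEXACT = "inexact equivalence"
    PARTIAL = "partial equivalence"
    SINGLE_TO_MULTIPLE = "single-to-multiple term equivalence"
    NON_EQUIVALENCE = "non-equivalence"


@dataclass(frozen=True)
class Term:
    language: str  # e.g. "en", "fr", "es"
    label: str


@dataclass(frozen=True)
class Link:
    # An unordered set of terms: no language acts as the source language.
    terms: frozenset
    kind: Equivalence


def equivalents(term, links):
    """Return (other_term, kind) pairs for a term, starting from any language."""
    found = []
    for link in links:
        if term in link.terms:
            for other in link.terms:
                if other != term:
                    found.append((other, link.kind))
    return sorted(found, key=lambda pair: (pair[0].language, pair[0].label))


# Illustrative terms only; they are not taken from the HEREIN thesaurus.
castle = Term("en", "castle")
chateau = Term("fr", "château")
castillo = Term("es", "castillo")
links = [Link(frozenset({castle, chateau, castillo}), Equivalence.EXACT)]
```

Because each link stores an unordered set, a lookup that starts from the French term works exactly like one that starts from the English or Spanish term.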
The project originally involved governmental services in charge of cultural heritage in six European countries, but the network was later expanded. The www.european-heritage.net site was opened in 1999 and was concerned with collecting information on national heritage policies across
HILT (High-Level Thesaurus Project),
Subject coverage: general and special
Types of
Languages involved: multiple languages
Website: http://hilt.cdlr.strath.ac.uk/
Status: Research and Development
HILT (High-Level Thesaurus Project) Phase I was funded by the Joint Information Systems Committee (JISC) and the Research Support Libraries Programme (RSLP) of the
After a series of surveys on the main literature, stakeholders, and machine solutions and interfaces had been completed, a stakeholder workshop was held in 2001 to reach consensus on the best approach to the issues. A clear consensus emerged that the “best way forward” was to establish a pilot mapping service. The proposed approach is to “map key schemes like LCSH, UNESCO, DDC, UDC (Universal Decimal Classification), AAT, and perhaps user and regional terminologies and local adaptations of standard schemes, perhaps using one of them such as DDC as the central spine of the approach” (Nicholson, Wake and Currier, 2001; Nicholson and Wake, 2003). HILT1 concluded that many different subject schemes and practices were in use; that cross searching by subject was considered of value to users and staff; and that an online terminologies route map (TeRM), which would map subject schemes to user terminologies and to each other, was the preferred solution. HILT1 also concluded that there was a strong consensus favoring a project to create a pilot TeRM and investigate these issues.
HILT Phase II moves the Phase I process into the 'Pilot Project' stage, focusing on terminology and thesauri requirements at the collection level, but also bearing in mind the need to extend this in due course to the needs of item-level retrieval. The initial illustrative TeRM would be based on the RDN terminologies, on terminologies available as part of the Wordmap taxonomies set (which include, in particular, a set of terms used by general internet users), and on selective subsets of LCSH, DDC, UNESCO, and AAT. OCLC will provide an LCSH to DDC mapping, and may also be able to provide a DDC to Conspectus subject headings mapping. The aim would be a selective mapping sufficient for the purposes of the pilot in the first instance, i.e., not a comprehensive terminologies map (Nicholson, 2002).
Subject coverage: general, medicine
Types of
Languages involved: English
Status: Operational
LCSH (Library of Congress Subject Headings) and MeSH (Medical Subject Headings) often co-exist in online public access catalogs (OPACs). In an attempt to facilitate cross-vocabulary searching, the Northwestern University Libraries in the
In mapping terms, one-to-one correspondences are preferred. These include identical/co-extensive headings, main heading to main heading/subdivision, and main heading to cross reference(s). Since not all terms can be mapped precisely between the two vocabularies, various degrees of correspondence or matching are also recognized, including one-to-two and two-to-one correspondences (Olson, 2003).
MACS (Multilingual Access to Subjects),
Subject coverage: general
Types of
Languages involved: English, French, German
Website: http://infolab.kub.nl/prj/macs/
Status: operational
MACS is a European project designed to allow users to search across cataloging databases of the partner libraries in different languages: English, French, and German for the moment. The partners are the Swiss National Library (SNL), which is the project leader, the Bibliothèque nationale de France (BnF), The British Library (BL), and Die Deutsche Bibliothek (DDB). The project is running under the auspices of the Conference of European National Librarians (CENL). It aims "to provide multilingual subject access to library catalogs" by establishing equivalence links among three subject headings lists: SWD/RSWK (Schlagwortnormdatei / Regeln für den Schlagwortkatalog) for German, Rameau (Répertoire d'autorité-matière encyclopédique et alphabétique unifié) for French, and LCSH for English. The mapping method consists of comparing subject headings in the three monolingual lists and checking the consistency of the bibliographic records retrieved with these headings. The links were analyzed on three levels: the terminological level (subject heading), the semantic level (authority record), and the syntactic level (application). For creating and maintaining link equivalences among the three vocabularies, a "link management" interface was developed. It contains a classification field currently based on about sixty broad domains. The use of a classification ensures the creation of homogeneous groups of headings by subject (Freyre and Naudi, 2003).
Megathesaurus - H.W. Wilson Company,
Subject coverage: general and special
Types of
Languages involved: English
Status: operational
To facilitate multi-file searching
across
, multiple indexes simultaneously, or the
combined indexes in the multi-file OMNI
Index in a transparent manner, by using search terms based on any of the source vocabularies (Kuhr, 2003).
Subject coverage: cultural heritage, art and architecture
Types of
Languages involved: English, French
Website: http://www.culture.fr/documentation/merimee/accueil.htm
Status: operational
To index the complexes, buildings, and built works described in "Merimee", the national database on French heritage, the Thesaurus of Architecture (Le thésaurus de l'architecture) was created and mapped to the Art and Architecture Thesaurus (AAT, http://www.getty.edu/research/tools/vocabulary/aat/, published by The J. Paul Getty Trust) and the English Heritage Thesaurus (http://www.rchme.gov.uk/nmr.html, published by The National Monuments Record (NMR)). When mapping from Merimee’s Thesaurus of Architecture to the AAT and NMR, the Boolean operators “AND” and “OR” are used to indicate equivalence, in addition to the conventional equivalence types (exact and partial). (See statistics reported in Doerr, 2001 and http://www.culture.gouv.fr/documentation/thesarch/pres.htm).
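A mapping that uses Boolean operators can be read as a small expression over target-thesaurus terms. The sketch below shows one possible encoding and how such an expression could be evaluated against the terms assigned to a record. The tuple encoding, the function name, and the placeholder term labels are assumptions for illustration; they do not reproduce the Merimee mapping tables.

```python
def matches(expr, assigned_terms):
    """Evaluate a Boolean mapping expression against a set of target terms.

    An expression is either a single target term (a string) or a tuple
    ("AND" | "OR", operand, operand, ...), where operands are expressions.
    """
    if isinstance(expr, str):
        return expr in assigned_terms
    operator, *operands = expr
    if operator == "AND":
        return all(matches(operand, assigned_terms) for operand in operands)
    if operator == "OR":
        return any(matches(operand, assigned_terms) for operand in operands)
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical mapping table: every label here is a placeholder.
MAPPINGS = {
    "source-term-1": "target-a",                       # simple one-to-one mapping
    "source-term-2": ("AND", "target-a", "target-b"),  # needs both target terms
    "source-term-3": ("OR", "target-b", "target-c"),   # either target term suffices
}
```

Under this reading, a source term mapped with "AND" corresponds only to records that carry all of the listed target terms, while "OR" accepts any one of them.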
MSC and DDC 510 schedule, State University of New York in Albany, USA
Subject coverage: mathematics
Types of
Languages involved: English
Status: Research
A project that maps the American Mathematical Society (AMS) Mathematics Subject Classification (MSC) to the DDC 20th edition Schedule 510 (mathematics) was conducted at the State University of New York in Albany, New York. The mapping rules included: exact matches, specific to general, general to specific, many to one, cyclic mapping, no matches, and specific and broad class mapping. These mapping strategies were examined in an object-oriented, frame-based analysis for implementation in expert system shell software (Iyer and Giguere, 1995).
Polish
Subject coverage: general
Types of
Languages involved: English, Polish
Status: research
At the Institute for Scientific, Technical and Economic Information (ISTEI) in
Renardus,
Subject coverage: general
Types of
Languages involved: multilingual
Website: http://www.renardus.org/
Status: operational, research
Renardus is an EU project (coordinated by the National Library of the Netherlands with partners from Denmark, Finland, Germany, the Netherlands, Sweden, and the UK) with the purpose of producing a cross-browsing feature based on the Dewey Decimal Classification (DDC) and improved subject searching across distributed and heterogeneous European subject gateways. The initial investigation included the use of classification systems by Renardus partners’ gateways, general mapping approaches and issues, the definition of mapping relationships, and information on technical solutions and the mapping tool. The approach adopted by the project is a harmonization process that maps local class schemes to a common scheme, thereby enabling users to browse a single subject hierarchy. DDC was chosen as the switching language and common browsing structure.
Each DDC class in Renardus presents links to "related collections", which enable the user to jump to the mapped classes in the participating local gateways and to continue browsing in the local classification structure there. In addition, a virtual browsing feature allows the merging of all local related records from all mapped classes into one common Renardus result set (Koch, Neuroth, and Day, 2003).
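The switching-language idea can be sketched as a lookup from local classes to DDC classes, from which both the "related collections" links and a merged result set follow. This is a minimal sketch under stated assumptions: the gateway names, local class codes, record identifiers, and function names are hypothetical, and only the DDC numbers 510 (mathematics) and 530 (physics) are real class numbers.

```python
# (gateway, local class) -> DDC class used as the common browsing structure.
# Gateway names and local class codes are invented for the example.
LOCAL_TO_DDC = {
    ("gateway-a", "MATH"): "510",
    ("gateway-b", "Q.2"): "510",
    ("gateway-b", "Q.5"): "530",
}

# Records held under each local class (hypothetical identifiers).
RECORDS_BY_CLASS = {
    ("gateway-a", "MATH"): ["a-001", "a-002"],
    ("gateway-b", "Q.2"): ["b-101"],
    ("gateway-b", "Q.5"): ["b-201"],
}


def related_collections(ddc_class, local_to_ddc):
    """List the local classes mapped to a DDC class, in a stable order."""
    return sorted(key for key, ddc in local_to_ddc.items() if ddc == ddc_class)


def merged_results(ddc_class, local_to_ddc, records_by_class):
    """Merge the records of all local classes mapped to a DDC class."""
    merged = []
    for key in related_collections(ddc_class, local_to_ddc):
        merged.extend(records_by_class.get(key, []))
    return merged
```

With a single common scheme, each gateway needs only one mapping (local scheme to DDC) rather than a separate mapping to every other gateway's scheme.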
SAB and DDC,
Subject coverage: general
Types of
Languages involved: Swedish, English
Status: operational
In Sweden, a concordance between the 7th edition of Klassifikationssystem för svenska bibliotek (SAB), the classification system used by the Royal Library as well as most university libraries and all public libraries, and DDC 21 was presented in 2000 in the form of a booklet and an online database (IFLA, 2001:34).
Subject coverage: science, engineering
Types of
Languages involved: English, French, German, Russian, Spanish
Website: http://metadata.sims.berkeley.edu/GrantSupported/unfamiliar.html
Status: prototype, research
The project "Mapping Entry Vocabulary to Unfamiliar Metadata Vocabularies" was conducted at the
UMLS Metathesaurus, National Library of
Subject coverage: medicine, health, biomedicine, and related areas
Types of
coding system, list of controlled terms
Languages involved: multiple languages
Website: http://www.nlm.nih.gov/pubs/factsheets/umls.html
Status: operational
UMLS (Unified Medical Language System), led by the National Library of Medicine in the
Subject coverage: general
Types of
Languages involved: English
Website: http://www.oclc.org/dewey/products/webdewey/about.htm
Status: operational
In WebDewey, produced by OCLC, DDC numbers are linked to assigned LC subject headings in MARC records intellectually or statistically where feasible. Such linking particularly facilitates the subject cataloging and classification process, because only the appropriate class number or the appropriate subject heading(s) needs to be identified for each document. These links, however, do not appear in LCSH (WebDewey website, 2002).
REFERENCES
ADL Homepage.
Chan, L. M. & Pollard, R. (1988). Thesauri Used in Online Databases: An Analytical Guide.
Chan, L. M., Childress, E., Dean, R., O'Neill, E.T., & Vizine-Goetz, D. (2001). A Faceted Approach to Subject Data in
The
Chemical titles. (1960 --).
Cleverdon, C. (1967). The Cranfield tests on index language devices. Aslib Proceedings, 19: 173-192.
Dachelet, R. (1997). Multilingual querying and multilingual thesauri in Aquarelle, Technical Report, INRIA-Aquarelle, March. (Indirect source, see Doerr 2001).
English Heritage. (1999). National Monuments Record Thesauri homepage. Available: http://www.rchme.gov.uk/thesaurus/thes_splash.htm (Last accessed
Ferrari, R.C. (1999). The art of classification: alternative classification systems in art libraries. Cataloging and Classification Quarterly, 28(2):73-98.
Foskett, D.J. (1980). Thesaurus. In: A. Kent et al. ed.: Encyclopedia of Library and Information Science, Volume 30, pp. 416-462.
Freyre, E. & Naudi, M. (2003). MACS: Subject access across languages and networks. In: I.C. McIlwaine ed.: Subject Retrieval in a Networked Environment, Proceedings of the IFLA Satellite Meeting held in
Garfield, E. (1955). Citation indexes for science. Science, 122(3159):108-111.
Getty Research Institute. (2000). Vocabulary Databases. Available at: http://www.getty.edu/research/tools/vocabulary/ (Last accessed
Gilreath, C.T. (1992). Harmonization of terminology - an overview of principles. International Classification 19(3):135-139.
HEREIN 2 website. European Heritage Network. Summary: Project Facts & Consortium Info. Available: http://inf2.pira.co.uk/factsheets/inform/digicult/herein2.html (Last accessed
HILT. (2000). HILT: High-Level Thesaurus Project Proposal. Available: http://hilt.cdlr.strath.ac.uk/AboutHILT/proposal.html. (Last accessed
Himanka, J., & Kautto, V. (1992). Translation of the Finnish Abridged Edition of UDC into General Finnish Subject Headings. International Classification 19(3):131-134.
Hudon, M. (1997). Multilingual thesaurus construction: integrating the views of different cultures in one gateway to knowledge concepts. Knowledge Organization 24(2):84-91.
IFLA (International Federation of Library Associations and Institutions). (2001). Section on Classification and Indexing. Newsletter Nr. 24, December 2001.
ISKO (International Society for Knowledge Organization). (1995). Recommendations of the Research Seminar on Compatibility and Integration of Order Systems, organized by the International Society for Knowledge Organization (ISKO) and the Society for Professional Information (TIP), Warsaw, Poland, Sept. 13-15, 1995. ISKO Press Release,
ISO (International Organization for Standardization). (1985). Guidelines for the establishment and development of multilingual thesauri. ISO 5964.
ISO (International Organization for Standardization). (1986). Guidelines for the Establishment and Development of Monolingual Thesauri. ISO 2788.
Iyer, H. & Giguere, K. (1995). Towards designing an expert system to map mathematics classificatory structures. Knowledge Organization 22(3/4):141-147.
Janée, G., Ikeda, S., & Hill, L. L. (2002). The ADL Thesaurus Protocol.
Jouguelet, S. (1995). Evolution of subject indexing practice in
Koch, T., Neuroth, H., & Day, M. (2003). Renardus: cross-browsing European subject gateways via a common classification system (DDC). In: I.C. McIlwaine ed.: Subject Retrieval in a Networked Environment, Proceedings of the IFLA Satellite Meeting held in
Kuhr, P.S. (2003). Putting the world back together: mapping multiple vocabularies into a single thesaurus. In: I.C. McIlwaine ed.: Subject Retrieval in a Networked Environment, Proceedings of the IFLA Satellite Meeting held in
Kunz, M. (2002). Sachliche Suche in verteilten Ressourcen: ein kurzer Überblick über neuere Entwicklungen [Subject retrieval in distributed resources: a short review of recent developments]. Paper presented at 68th IFLA Council and General Conference, Aug. 18-24, 2002, Glasgow, UK. Available: http://www.ifla.org/IV/ifla68/papers/007-122g.pdf English translation available: http://www.ifla.org/IV/ifla68/papers/007-122e.pdf (Last accessed
Lancaster, F. W., & Warner, A.J. (1993). Information Retrieval Today.
Library of Congress Cataloging Distribution Service. (1999). MARC 21 Format for Bibliographic Data Including Guidelines for Content Designation.
Library of Congress Thesauri website. Available: http://www.loc.gov/lexico/servlet/lexico/tgm1/brsearch.html (Last accessed
Luhn, H.P. (1959). Potentialities of Auto-Encoding of Scientific Literature. Technical report RC-101.
Luhn, H.P. (1961). The automatic derivation of information retrieval encodements from machine-readable texts. In: A. Kent ed.: Information Retrieval and Machine Translation. Vol. 3, Pt 2, pp. 1021-1028.
NAS
(
Nicholson, D., Wake, S., & Currier, S. (2001). High-Level Thesaurus Project: investigating the problem of subject cross-searching and browsing between communities. In Global Digital Library Development in the New Millennium: fertile ground for distributed cross-disciplinary collaboration, edited by Ching-Chih Chen.
Nicholson, D. (2002). Subject-based interoperability: issues from the High Level Thesaurus (HILT) project. Paper presented at 68th IFLA Council and General Conference,
Nicholson, D. & Wake, S. (2003). HILT: subject retrieval in a distributed environment. In: I.C. McIlwaine ed.: Subject Retrieval in a Networked Environment, Proceedings of the IFLA Satellite Meeting held in
Niehoff, R., & Mack, G. (1985). The Vocabulary Switching System: Description and Evaluation Studies. International Classification, 12(1):2-6.
NKOS (Networked Knowledge Organization Systems). (2000). Taxonomy of Knowledge Organization Sources/Systems. Draft
NLM (National Library of Medicine). (2003). Fact Sheet: UMLS® Metathesaurus®. Last updated:
NLM (National Library of Medicine). (2001). Fact Sheet: UMLS® Semantic Network. Last updated:
Noy, N.F., & Musen, M.A. (2001). Anchor-PROMPT: Using non-local context for semantic matching. Workshop on Ontologies and Information Sharing at the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-2001), Seattle, WA, 2001. Available: http://smi-web.stanford.edu/pubs/SMI_Abstracts/SMI-2001-0889.html (Last accessed
Olson, T. (2003). Integrating LCSH and MeSH in information systems. In: I.C. McIlwaine ed.: Subject Retrieval in a Networked Environment, Proceedings of the IFLA Satellite Meeting held in
Riesthuis, G.J.A. (2003). Information languages and multilingual access. In: I.C. McIlwaine ed.: Subject Retrieval in a Networked Environment, Proceedings of the IFLA Satellite Meeting held in
Schweitzer, A. (1995). Subject access to library materials in
Scibor, E. & Tomasik-Beck, J. (1994). On the establishment of concordances between indexing languages of universal or interdisciplinary scope (Polish Experiences). Knowledge Organization 21(4):203-212.
Sparck Jones, K. (1981). Retrieval system tests 1958-1978. In: K. Sparck Jones ed.: Information Retrieval Experiment.
Sparck Jones, K. (1997). History. In: K. Sparck Jones and P. Willett ed.:
Taube, M. and Associates. (1953-1959). Studies in Coordinate Indexing,
VRA (Visual Resources Association). (2002). VRA Core Categories. Version 3.0. A project of the Visual Resources Association Data Standards Committee, last modified on
WebDewey website. (2002). Available: http://www.oclc.org/dewey/products/webdewey/about.htm (Last accessed
WordNet 1.7.1 Database Statistics. (2002). In: WordNet 1.7.1 Reference Manual. Available: http://www.cogsci.princeton.edu/~wn/man1.7.1/wnstats.7WN.html (Last accessed
WordNet Homepage. Cognitive Science Laboratory,
Zeng, L. (1992). Achieving compatibility of indexing languages in online access environment. In: A. Kent ed.: Encyclopedia of Library and Information Science, vol. 50: 1-24. NY: Marcel Dekker, Inc.
Zeng, M.L., & Chen, Y. (2003). Features of an integrated thesaurus management and search system for the networked environment. In: I.C. McIlwaine ed.: Subject Retrieval in a Networked Environment, Proceedings of the IFLA Satellite Meeting held in