Project Name: MACS (Multilingual Access to Subjects)
Project URL: http://laborix.kub.nl/prj/macs/
Project Description: “MACS aims to provide multilingual subject access to library catalogues. MACS enables users to simultaneously search the catalogues of the project's partner libraries in the language of their choice (English, French, German). The partners are: the Swiss National Library (SNL), project leader, the Bibliothèque nationale de France (BnF), The British Library (BL) and Die Deutsche Bibliothek (DDB). The project is running under the auspices of the Conference of European National Librarians (CENL).
This multilingual search is made possible thanks to the equivalence links created between the three indexing languages used in these libraries: SWD (for German), RAMEAU (for French) and LCSH (for English). Topics (headings) from the three lists are analysed to determine whether they are exact or partial matches, of a simple or complex nature. The end result is neither a translation nor a new thesaurus but a mapping of existing and widely used indexing languages.
On the basis of this approach, a prototype has been developed by Index Data (Denmark) and Tilburg University Library (Netherlands) which contains a small subset of data from the indexing languages and the libraries' databases so that link creation and management and subsequent searching can be explored and tested.” (MACS)
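The linking approach described in the project description can be pictured with a minimal Python sketch. It is not the MACS prototype's data model: the `Link` class, the `expand` function, the reading of "complex" as "a heading in one list corresponds to a combination of headings in another", and the sample headings are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Link:
    """One equivalence link across the three lists (hypothetical shape)."""
    headings: Dict[str, Tuple[str, ...]]  # scheme name -> heading(s) in that scheme
    match: str  # "exact" or "partial"

    @property
    def nature(self) -> str:
        # Assumption: a link is "simple" when every scheme contributes exactly
        # one heading, and "complex" when any scheme contributes several.
        one_each = all(len(labels) == 1 for labels in self.headings.values())
        return "simple" if one_each else "complex"


def expand(scheme: str, label: str, links: List[Link]) -> Dict[str, List[str]]:
    """Return, per other scheme, the headings linked to the given heading."""
    result: Dict[str, List[str]] = {}
    for link in links:
        if label not in link.headings.get(scheme, ()):
            continue
        for other, labels in link.headings.items():
            if other == scheme:
                continue
            bucket = result.setdefault(other, [])
            for candidate in labels:
                if candidate not in bucket:  # avoid repeating a heading
                    bucket.append(candidate)
    return result


# Illustrative sample links only; these are not taken from MACS data.
LINKS = [
    Link({"LCSH": ("Libraries",),
          "RAMEAU": ("Bibliothèques",),
          "SWD": ("Bibliothek",)}, match="exact"),
    Link({"LCSH": ("Library catalogs",),
          "RAMEAU": ("Catalogues de bibliothèques",),
          "SWD": ("Bibliothek", "Katalog")}, match="partial"),
]

equivalents = expand("LCSH", "Libraries", LINKS)
```

A search interface could use the result of `expand` to query each partner catalogue with the heading from that catalogue's own indexing language, which is why the outcome is a mapping rather than a translation or a new thesaurus.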
Provisional Checklist
Please use the following form to evaluate semantic interoperability projects.
1. Types of data being integrated
Does the project have:
(a) different controlled vocabularies in same language?
(b) different controlled vocabularies in different languages?
(c) different classification schemas (e.g., DDC, UDC, LCC)? If yes, which ones?
(d) controlled vocabularies combined with classification schemas?
(e) different metadata framework schemas (e.g., XML, MARC, Dublin Core)? If yes, which ones?
Subject heading lists in three languages: English, French and German
(f) different communication protocols?
(g) other:
2. Autonomy and Integrity of Constituent Parts
(a) Is standardization, reconciliation, or conversion of semantic data reversible?
(a.1) Can precoördinated strings, once filtered or deconstructed for semantic matching, later be put back together again?
(b) Is the full complement of metadata and indigenous subject hierarchies preserved? If so, how?
(c) Does the project rely on the principle of least common denominator? If so, many data sets may be able to coexist in the database, but given the resulting stripped-down or ‘dumbed-down’ resource descriptions, the database may no longer serve the interests of readers (cf. recently cited problems with Dublin Core). How does the use of the least common denominator affect the quality of service?
(d) How is data stored: gathered into a union catalog (e.g., American Memory Project, NSDL), or in a distributed database?
Union catalogs
(e) How are metadata (including SI links) stored? (e.g., via authority records, concordance tables, a central switching language, semantic networks, lexical databases, semantic layers, etc.)
In three subject heading lists
3. Reconciliation of heterogeneous vocabularies
(a) How are correlations established when a single term in one source has no equivalent term in the other?
Through linking
(b) Certain vocabularies are highly structured and hierarchical, while others contain terms lacking any structure at all aside from serial numbers or other unique identifiers. How are these differences reconciled?
Not all differences are reconciled, but linking is used where possible
(c) How are conflicts resolved when an established heading in one vocabulary matches a cross reference in other vocabularies? (E.g., Tumors is an established LCSH heading, but in MeSH it is a cross reference to Neoplasms; and vice versa)
Through linking
(d) If multiple vocabularies are used in a single bibliographic record, and the headings from such vocabularies are identical (after normalization), how are duplicate retrievals handled?
The crossing link here is language, not subject, as identical headings are unlikely
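To make question (d) concrete, here is a small Python sketch of what "identical after normalization" and "duplicate retrievals" could mean in practice. It is a generic illustration, not a description of how MACS handles the case: the `normalize` rules (case folding, stripping diacritics, collapsing whitespace), the `search` function, and the sample records are all assumptions.

```python
import unicodedata
from typing import Dict, List, Tuple


def normalize(heading: str) -> str:
    """Fold case, strip diacritics and collapse whitespace (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", heading)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_marks.casefold().split())


def search(query: str, records: Dict[str, List[Tuple[str, str]]]) -> List[str]:
    """Return each matching record id once, however many vocabularies match.

    `records` maps a record id to its (scheme, heading) pairs.
    """
    key = normalize(query)
    hits: List[str] = []
    for record_id, headings in records.items():
        # any() stops at the first matching heading, so a record carrying the
        # same normalized heading from two vocabularies is still listed once.
        if any(normalize(heading) == key for _scheme, heading in headings):
            hits.append(record_id)
    return hits


# Hypothetical records: r1 carries the same heading from two vocabularies.
RECORDS = {
    "r1": [("LCSH", "Information"), ("RAMEAU", "Information")],
    "r2": [("SWD", "Bibliothek")],
}

hits = search("information", RECORDS)
```

Matching at the level of the record rather than the individual heading is one simple way to avoid returning the same record twice.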
4. Effective and Efficient Resource Discovery (Precision and Recall), Satisfying User Needs
(a) Does the project provide high or satisfactory levels of precision and recall?
Unknown
(b) To what extent does the project rely on precoördination? If mostly post-coordinate, then:
i) by what means is recall maximized?
ii) by what means is precision maximized?
(d) Does the project provide a faceted approach (facilitating polysemy) while retaining the option of a browsable hierarchy (facilitating navigation)?
(e) Are the following objectives and functions supported in the SI environment?
i) Locate entities in the system via surrogates (find)
ii) Identify a surrogate that matches an entity (collocate)
iii) Select an entity appropriate to a user’s need via surrogates (choice facilitation)
iv) Obtain access to the entity via the system and its surrogates (acquisition)
v) Navigate the system and its surrogates (navigation)
(f) Has the developer released a beta version for general testing?
(g) Have user satisfaction surveys been conducted?
5. Ease of Use (this is actually part of our definition, i.e., SI should function “without special effort by the user,” where “users” include information creators and managers as well as end-users)
(a) Intuitive interface for data entry, searching, browsing, etc.?
(b) Automate validation, mapping, metadata extraction, etc., as much as possible?
(c) Availability of documentation?
6. Long-term viability
(a) Master plan for life-cycle management and data migration?
(b) Reliance on open-source international standards versus proprietary standards?
(c) Viable business model (e.g., not based exclusively on a research grant with likely expiration)?