ALCTS - Association for Library Collections & Technical Services

Committee on Cataloging: Description and Access

MARBI (Machine-Readable Bibliographic Information), ALCTS/LITA/RUSA

Joint meeting with CC:DA

July 8-10, 2000, Chicago, IL

Monday, July 10, 2000, 2:00-4:00 p.m.
Hyatt Regency Chicago, Grand Ballroom A

  1. XML and MARC: A Choice or a Replacement?: Dick R. Miller (2:00, 30 min.)

  2. Background statement:

    In April, David Dorman cited Lane Medical Library's (Stanford University) XMLMARC conversion software (announced in mid-February) under the header "The End of MARC?" Earlier, in November 1999, Dorman's "Making Progress, Part Four" explained more fully why XML (or, more precisely, a suite of XML DTDs or schemas) will replace MARC. At about that time, the NLM was completing plans to convert the huge MEDLINE database to XML, a French government agency was nearing release of BiblioML to convert UNIMARC to XML, and Lane was putting the finishing touches on XMLMARC. These are indicators of a growing trend, beginning perhaps as early as LC's literal mapping of MARC to SGML from 1995 to 1998, followed by work in Hong Kong and Australia and by commercial mapping software. Common to all, perhaps, is the recognition that the MARC formats limit the effective deployment and integration of bibliographic data with other resources on the Web. Lane's investigation differs in advocating changes to MARC to take advantage of XML's strengths: a permanent change to XML rather than another version used as an adjunct to "real" MARC.

    Dorman's citation of XMLMARC elicited lively discussion on several listservs. Because a misquote occurred, it is important to note that XMLMARC was developed partly as a feasibility study for converting MARC data to XML, but also to explore ways in which cataloging data could be restructured for greater economy and elegance, perhaps lessening the recent erosion of quality. The speaker received an invitation to write an article on XML for Library Journal's NetConnect supplement, which is scheduled to appear in conjunction with ALA. This article advocates not only XML replacement of the MARC formats, but also XML replacement of the proprietary "library information" formats used by ILS vendors (e.g., ILL, patron data, circulation transactions, orders, check-in data), and it predicts an XML-based ILS in the near future.

    The speaker believes that it is possible to recast MARC, leveraging untold person-years of effort in defining content, identifying relationships, and resolving problems and conflicts, producing a more coherent and eloquent version using XML. This could add luster to librarianship, engendering respect for librarians and needed technical underpinnings at a time when the profession is facing external as well as internal challenges. Courage is all we need to exercise leadership and seize this unprecedented opportunity.

    To elucidate related issues, the speaker will address the topics listed below and/or others that may be identified in the course of preparing the presentation.

    Issues to be discussed:
    1. XML's suitability as a universal data format for the Web
      • Open standards and extensibility
      • Separation of content, presentation, linking
      • Computer platform and software application neutrality, interoperability
      • Unicode and data longevity
      • Database interfacing

    2. XML synopsis
      • Entities, elements, attributes
      • Hierarchical structure/nesting
      • Document Type Definitions (DTDs) and schemas
      • Adjunct standards: XSL (Extensible Stylesheet Language), intelligent hyperlinking
      • Library-oriented examples
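
As a small illustration of the synopsis topics above (elements, attributes, and hierarchical nesting), the following Python sketch builds and re-parses a tiny bibliographic record using the standard library's xml.etree.ElementTree module. The element names, the attribute names, and the sample values are all hypothetical, chosen only for illustration; they are not taken from the XMLMARC DTDs or from any other schema mentioned here.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, for illustration only;
# they are NOT drawn from the XMLMARC DTDs or any published schema.
record = ET.Element("record", {"type": "bibliographic"})

# Nesting: a <title> element contains <main> and <edition> child elements.
title = ET.SubElement(record, "title")
ET.SubElement(title, "main").text = "An Example Title"
ET.SubElement(title, "edition").text = "2nd ed."

# Attribute: the creator's role is a property of the element,
# kept separate from the data value (the name itself).
creator = ET.SubElement(record, "creator", {"role": "author"})
ET.SubElement(creator, "name").text = "Doe, Jane"

# Serialize to text, then parse it back to show the round trip.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)

parsed = ET.fromstring(xml_text)
main_title = parsed.find("title/main").text    # path follows the nesting
author_role = parsed.find("creator").get("role")
```

Because the structure is explicit in the markup, any XML-aware tool can read the record back without knowing how it was produced.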

    3. XMLMARC software
      • Flexible data conversion (any byte string, algorithms, concatenation, conditions, etc.)
      • DTDs for authorities and bibliographic records
      • Maps to permit alternate versions, localization
      • Free availability for non-commercial use with over 300 licensees in over 40 countries
      • Should it be free for commercial use? (two requests so far)
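
The idea of map-driven conversion listed above (concatenation, conditions, and alternate maps) might be sketched as follows. This is a minimal illustration under stated assumptions, not the XMLMARC implementation: the in-memory field layout, the FIELD_MAP table, the convert function, the output element names, and the sample values are all hypothetical. The only MARC fact relied on is that the second indicator of field 245 records the number of non-filing characters.

```python
import xml.etree.ElementTree as ET

# MARC-like fields held as plain Python data (hypothetical sample values).
fields = [
    {"tag": "245", "ind2": "4",
     "subfields": [("a", "The example title :"), ("b", "a subtitle")]},
    {"tag": "100", "ind2": " ", "subfields": [("a", "Doe, Jane")]},
    {"tag": "999", "ind2": " ", "subfields": [("a", "local note")]},
]

# Hypothetical map: MARC tag -> (output element name, subfield codes to join).
# Supplying a different map would yield an alternate or localized version.
FIELD_MAP = {
    "245": ("title", "ab"),
    "100": ("creator", "a"),
}

def convert(fields, field_map):
    """Build an XML record from MARC-like fields according to field_map."""
    record = ET.Element("record")
    for field in fields:
        rule = field_map.get(field["tag"])
        if rule is None:
            continue  # condition: tags without a mapping are skipped
        element_name, codes = rule
        # Concatenation: join the selected subfield values in order.
        parts = [value for code, value in field["subfields"] if code in codes]
        element = ET.SubElement(record, element_name)
        element.text = " ".join(parts)
        # Condition: keep the 245 non-filing count only when it is nonzero.
        ind2 = field["ind2"]
        if field["tag"] == "245" and ind2.isdigit() and ind2 != "0":
            element.set("nonfiling", ind2)
    return record

record = convert(fields, FIELD_MAP)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping the mapping in a table rather than in the conversion code is what would let the same routine serve different local conventions.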

    4. MARC problems (unordered; subject to change)
      • Segregation of bibliographic data
      • Accretion; unnecessary complexity, continued with format integration
      • Requirements (primary format), repeatability (fixed fields vs. form/genre/format), etc.
      • Fixed field limitations
      • Blurring description, access, and relationships
      • Mixing data values and data properties
      • Relationships (omitted linkage; limited effectiveness of linkage; co-occurrence; content notes)
      • Excessive vs. insufficient subfields
      • Arbitrary, lacking non-filing indicators
      • Character set issues

    5. Selected cataloging problem areas (unordered; subject to change)
      • Functional requirements for bibliographic records (FRBR)/AACR2 revision
      • Delineation of bibliographic and authority records; granularity
      • Relationship of indexing and cataloging
      • Bibliographic relationships, analytics, component parts, sibling conditions, versions
      • Form/genre/format and GMD
      • Data independence (precoordinated headings; generic titles)
      • Title as identifier: titles, uniform titles, editions, series, serials
      • Digital materials (databases, websites) as collections
      • Limited role of authorities; grouping vs. instances
      • Citation and metadata relationships
      • Current publisher/publishing history
      • Geo-political organizations, geographics, structures
      • Central vs. distributed control; coordination; "local" data
      • Access (types; structured indexes; words/phrases)

    6. Problems with XML

    7. What next?

    8. W3C as model?