“data about data”

In DITA (Digital Information Technologies and Architectures) we have been looking at how information is organised and retrieved. In this post I will explore the use of metadata and bibliographic frameworks in library technologies.

Metadata is often described as “data about data”, but if this sounds too vague it is helpful to think of it in terms of something familiar, like a book:

Jeffrey Pomerantz (2015), Metadata, MIT Press

The information given on the cover of this book is metadata: the title, its author, an image related to its contents, and often a brief description. By looking at the cover, we can tell what the book is about without having to read everything held within its pages. In the digital world, the word metadata is increasingly used to describe the data attached to digital resources. Metadata is very important in an information society, as it allows us to organise information and to access it when we want to. This is an essential aspect of library and information science, and so I believe it is important to think critically about the ways in which we organise data, and whether these can be improved upon for the internet age.

Libraries use metadata in the form of a bibliographic record, which acts as a surrogate for the book (or other item such as a journal article, DVD, or online resource). Traditionally this metadata would have been held on a catalogue card, and is now in the form of a digital record to be accessed using a computer.

Most libraries use a standard called AACR2 (Anglo-American Cataloguing Rules, 2nd ed.) to determine what information about an item is recorded. Another standard is RDA (Resource Description and Access), designed to be compatible with and eventually replace AACR2, but its adoption has not been universal.

MARC (Machine Readable Cataloguing) is the format used to communicate this bibliographic data to the computer (MARC21 is the most commonly used version today). MARC uses a 3-digit number to identify each field, e.g. 100 to indicate the author field, 245 for title, 260 for publication information and so on. (See the Library of Congress guide to MARC for more detail.)
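
To make the idea of numbered fields more concrete, here is a minimal sketch in Python of how the Pomerantz book might look as a set of tagged fields. It is not a real MARC21 record (those also have a leader, indicators and many more fields), and the names `record`, `FIELD_LABELS` and `describe` are my own inventions for illustration. The subfield codes ($a, $b, $c) follow the usual MARC conventions for these three fields.

```python
# Simplified, illustrative stand-in for a MARC record. Real MARC 21 records
# also carry a leader, indicators, and many more fields than shown here.
FIELD_LABELS = {"100": "author", "245": "title", "260": "publication"}

# Each 3-digit tag maps to its subfields (subfield code -> value).
record = {
    "100": {"a": "Pomerantz, Jeffrey"},
    "245": {"a": "Metadata"},
    "260": {"b": "MIT Press", "c": "2015"},
}


def describe(record):
    """Return one readable line per field, in tag order."""
    lines = []
    for tag in sorted(record):
        label = FIELD_LABELS.get(tag, "unknown")
        subfields = " ".join(
            f"${code} {value}" for code, value in record[tag].items()
        )
        lines.append(f"{tag} ({label}): {subfields}")
    return lines


for line in describe(record):
    print(line)
```

The point of the sketch is simply that the computer never needs to know what an "author" is: it only needs to know that whatever sits in field 100 should be treated as one.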

Having these standards allows bibliographic information to be shared between libraries. Unless they are working with very rare materials, it is common for a cataloguer to copy over an existing record created by another librarian and make adjustments to it, rather than having to create a whole new record from scratch.

By learning about the history behind these standards, I am beginning to understand their advantages and disadvantages in a modern context. For example, MARC was developed in the 1960s by Henriette Avram at the Library of Congress. As it was designed at a time when computer processing was much slower and memory was much more of a concern than it is today, it is excellent at using a small amount of memory to create stable records, like catalogue cards. However, this also means that it is not ideal for expressing relationships with external resources such as webpages.

Dublin Core was designed to be a simple metadata standard for use on the web. This makes it easier for people not trained in cataloguing to use, but in simplification it inevitably loses a lot of the detail which cataloguers have achieved in AACR2 over the years. Sacrificing detail may save time, but also makes it harder to facilitate a more nuanced search that will fetch the most relevant results.
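
A rough sketch of what this simplification can look like in practice: the Python below converts a few MARC-style fields into Dublin Core elements (creator, title, publisher and date are all genuine Dublin Core element names). The `CROSSWALK` table and the function `to_dublin_core` are hypothetical and written only for illustration; this is not an official mapping, and I have deliberately left place of publication unmapped to show how detail can fall away.

```python
# Hypothetical crosswalk from (MARC tag, subfield code) to a Dublin Core
# element. This is an illustrative assumption, not an official mapping.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}

# Simplified MARC-style fields (illustrative values only).
record = {
    "100": {"a": "Pomerantz, Jeffrey"},
    "245": {"a": "Metadata"},
    "260": {"a": "Cambridge, Massachusetts", "b": "MIT Press", "c": "2015"},
}


def to_dublin_core(record):
    """Return (dublin_core_dict, list_of_dropped_subfields)."""
    dc = {}
    dropped = []
    for tag, subfields in record.items():
        for code, value in subfields.items():
            element = CROSSWALK.get((tag, code))
            if element is None:
                # No matching element in this crosswalk: the detail is lost.
                dropped.append((tag, code))
            else:
                dc.setdefault(element, []).append(value)
    return dc, dropped


dc, dropped = to_dublin_core(record)
print(dc)
print("Dropped:", dropped)
```

The resulting record is flatter and easier to read, but anything the crosswalk does not recognise (here, the place of publication) is no longer available to search on.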

BIBFRAME is in the early stages of development, and is intended as a replacement for MARC. In addition to creating a new way of communicating bibliographic data, the BIBFRAME Initiative is also investigating new ways of describing bibliographic resources. It focuses on the relationships between resources, rather than the self-contained records that MARC is so good at. This is more in line with the developing culture of Linked Data and the Semantic Web. As described on the BIBFRAME website, “It is designed to integrate with and engage in the wider information community and still serve the very specific needs of libraries.”
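
To illustrate the difference between a self-contained record and a set of relationships, here is a small Python sketch in the Linked Data style, where every statement is a (subject, predicate, object) triple. The identifiers and predicate names (`hasCreator`, `instanceOf` and so on) are made up for this example and are not the actual BIBFRAME vocabulary; the idea of separating an abstract work from a particular published instance is borrowed from BIBFRAME.

```python
# Each statement links two things with a named relationship.
# All identifiers and predicate names here are hypothetical.
triples = [
    ("work:metadata", "hasCreator", "person:pomerantz"),
    ("instance:metadata-2015", "instanceOf", "work:metadata"),
    ("instance:metadata-2015", "publishedBy", "org:mit-press"),
    ("person:pomerantz", "hasName", "Jeffrey Pomerantz"),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def creator_names(instance):
    """Follow instance -> work -> creator -> name across the triples."""
    names = []
    for work in objects_of(instance, "instanceOf"):
        for person in objects_of(work, "hasCreator"):
            names.extend(objects_of(person, "hasName"))
    return names


print(creator_names("instance:metadata-2015"))
```

Notice that the author's name is stored once, against the person, rather than being typed into every record. Any other work by the same person could point to the same identifier, which is the kind of connection a stand-alone MARC record struggles to express.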

I am interested to see whether a new bibliographic framework such as BIBFRAME will be embraced by the library community. I think that this would help to keep libraries actively engaged with new technologies, and that it would be exciting to be a part of!


Library of Congress guide to MARC

Dublin Core


YouTube video about BIBFRAME


4 thoughts on “data about data”

  1. A well written post, showing that you have understood the concepts of library related standards for metadata. You have read around the subject, and considered how standards relate to practice, and what they mean for LIS professionals.

    I like the style of your blog too! These things take time and effort.

    Good work 🙂

