Wednesday, July 28, 2010

Format, Philosophy Of

I've been working with Clair Meade on how we've mapped formats in Horizon to Summon and have learnt things that make me ponder.

In advanced search, Summon lists a bunch of formats you can limit by that our collections do not use. If a user selects one of these they get zero hits, not because we have nothing that matches their requirement, but because we haven't applied the same format name.

The format names are open to interpretation, overlap and ambiguity – for example ‘Government Document’ and ‘Report’ and ‘Web Resource’ could all be applied to the same document, but we can only map one to an item.

I’ve realised that many of the formats we haven’t used are listed because they are used in the ‘Beyond your library’s collection’, but as that’s not the default search target I smell misled clients. I've suggested via the Summon Clients list that the ‘Beyond’ box be above the format dropdown and the dropdown be populated with available formats live depending on whether ‘Beyond’ is checked.
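To make the suggestion concrete, here is a minimal sketch of a dropdown that only offers formats with something behind them. It is not how Summon works internally; the `Record` class, the `in_local_collection` flag, and the sample records are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical, much simplified index record."""
    title: str
    format: str
    in_local_collection: bool  # False means it only appears under 'Beyond'


def available_formats(records, include_beyond):
    """Return the format names that would give at least one hit.

    With include_beyond False, only formats used in the local
    collection are offered, so no choice can lead to zero hits.
    """
    return sorted(
        {r.format for r in records if include_beyond or r.in_local_collection}
    )


# Invented sample data, not real catalogue records.
RECORDS = [
    Record("Annual report", "Report", True),
    Record("Reef survey", "Journal Article", True),
    Record("Old map", "Map", False),
]

print(available_formats(RECORDS, include_beyond=False))
print(available_formats(RECORDS, include_beyond=True))
```

Ticking the 'Beyond' box would simply rebuild the list with `include_beyond=True`, which is why the box needs to sit above the dropdown.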

What is a Format?

On the Summon Clients list there was some discussion about whether an ebook is a separate format or just a book that’s available fulltext online, and that's the approach Summon takes to journal articles (no distinction made between e and print). Obviously anything can be digitised so every traditional format will have ‘online’ equivalents.

I think the basic problem is one of philosophical approach. We (‘we’ being librarians overseeing the profession in the evolution from physical to virtual information containers) have accidentally blended two approaches:

  1. The format of the information content (a picture, dictionary, moving image with sound)

  2. The format of delivery (for a picture that could be a poster, a painting, or an 800x600px gif; for a dictionary that could be a book, a web site (OEDOnline), or an ebook (from say Credo Reference); and for moving image with sound that could be Super8 film, videocassette, DVD, streamed media, or a download of an MP4 file)

I guess I’m wondering how many of our users will be happy to know that limiting to ‘journal article’ will exclude web resources, government documents, and reports that have the same ‘content size and structure’ as a journal article.

I worry that we use one field to describe two very different facets of a piece of information.
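As a rough sketch of the alternative, each item could carry the two facets as separate fields, so a limit on one does not silently exclude on the other. The `Item` class, the `limit` function, and the sample items are hypothetical; this is not how Horizon or Summon store formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A hypothetical item described by two independent facets."""
    title: str
    content: str   # format of the information content
    delivery: str  # format of delivery


def limit(items, content=None, delivery=None):
    """Keep items matching whichever facets were specified.

    A facet left as None is not used to filter at all.
    """
    return [
        i for i in items
        if (content is None or i.content == content)
        and (delivery is None or i.delivery == delivery)
    ]


# Invented sample data for illustration only.
ITEMS = [
    Item("A dictionary (print)", "dictionary", "book"),
    Item("A dictionary (online)", "dictionary", "web site"),
    Item("Reef footage (disc)", "moving image with sound", "DVD"),
    Item("Reef footage (stream)", "moving image with sound", "streamed media"),
]

# Every dictionary, however it is delivered.
for item in limit(ITEMS, content="dictionary"):
    print(item.title, "-", item.delivery)
```

With a single format field, the print and online dictionaries above would have to be given different names, and a user limiting to one would never see the other.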

Has anyone else pondered this and decided on an approach?


Alan @JCU Library said...

Jon Rochkind has talked about this in greater depth, and with more brains, in his blog post "print" format limit in a MARC-based catalog

bibwild said...

Has anyone else pondered this?

Oh yeah, librarians have been pondering it for quite a while. I am too lazy to go try and look up literature at the moment, but there have been numerous articles written on it and cataloging task forces attempting to address it.

It turns out it's possibly even more complex than your two-dimensions. The RDA effort wound up with THREE dimensions: 1) "Content type" (like your "format of information content", for instance "spoken word" or "text"), 2) "Carrier type" (kind of like your 'format of delivery', for instance 'audio disc' or 'scroll', but not sure if 'online' is one or not) 3) "Media type" (um, this is supposed to be another aspect of delivery mechanism, but ends up reading more like content: Audio, video, etc.).

The RDA effort comes closer to actually rationally identifying formats on independent axes, but as a result:

1) It is nowhere NEAR users' actual internal mental models of this stuff, because users' internal mental models are _not rational_, and

2) It STILL doesn't capture some of the stuff we're talking about here, it doesn't distinguish between an 'article' and a 'book' (both are text, perhaps on paper or perhaps online) -- that aspect of article vs book is more like a sort of "genre", yet a fourth dimension!

This stuff is tricky. Precisely because the concepts actual humans use actually are bundles of cross-cutting dimensions, not a rational taxonomy -- and because our concepts are in a state of flux right now too! 10 years ago, a "book" was something made out of paper and bound. Now... what exactly is the difference between an "ebook" and a really long "web page"? Got me, except for what people consider it, and they'll probably consider the same thing different in another 10 years.

bibwild said...

Oh, and I'd dispute Alan that my original blog post has either more depth or more brains than yours.

But this is something librarian theorists have been pondering for a while -- too bad our library science literature is so hard to actually find! (And that there's been a dearth of it the past 20 years; librarianship has somehow lost its theoretical arm). But this would be a great topic for some library student (or librarian!) to write a paper on, the _history_ of librarian examinations of controlled vocabulary for form/format/carrier/media/content type/genre.