Monday, November 03, 2008

Picture this - a way to improve the quality of our serial collections

This thought exercise began as I contemplated what visualizations libraries could add to their websites, since visualizations are all the rage now, don't you know. I was wondering whether there was a logical place where a sparkline or two would be useful.

So I set myself a challenge: develop a graphic that could help students decide which of our research tools would best suit their research need. I created a graph comparing JSTOR and BioOne with an x axis of 'year' and a y axis of '# of titles'. It's pretty crude and far from a sparkline, but here it is:

My ugly graph

While this graph does convey a potentially useful snapshot comparison between these two products, I do have one major reservation with endorsing the use of such a visualization: I don't want to imply that having a higher number of titles means that one research product is better than another.

What comes to mind is the research paper "Aggregated Interdisciplinary Databases and the Needs of Undergraduate Researchers" that I think didn't get the play it deserved in the library blogosphere:

Amy Fry, Julie Gilbert and I just published an article in portal (a self-archived copy is here) that had some surprising findings about the long tail in aggregated interdisciplinary databases: looking at use of one of the market leaders at 14 largely undergraduate institutions, 4% of titles accounted for half of downloads, and these were largely popular titles; articles in 40% of full text journals were not downloaded even once at all 14 institutions. We also found that, in aggregate, the number of articles downloaded fell from 2005 to 2006 by 10%, even though the database itself was growing. Curiously, a survey of librarians show they think these growing databases are about the right size and that more full text would be an improvement. Is more always a better investment? Really? [ACRLog]

No, more is not always a better investment. But it's an easy sell and, sadly, librarians are unthinkingly buying into it.

Here's an alternative that came to mind. Let's apply a new limitation to our online serial collections. We can get the filler out of our indexes and aggregators by setting a minimum threshold by circulation count. A library could say that it only wants to pay for titles that have over 12,000 readers. And voila! Those annoying regional newsletters that clog up your search results when you are searching a business database? Gone!

Yes, this is a decidedly populist approach and goes against the recommendations by Fister et al. in their paper. But consider this - such a limitation would have many additional benefits. First, it would be a strike against the Cambrian explosion of new academic titles for every microtopic and splinter school of thought. And some existing fields could stand to have fewer journal titles to improve the quality of the work of the whole. For example, according to Ulrich's, there are almost 200 scholarly titles dedicated to library science published in the United States alone - and this doesn't include the open access journals that are now found online.

Secondly, a circulation statistics filter provides a useful means of selling one product (with a moveable threshold) to many libraries rather than having to create a multitude of different research products to meet the needs of all the disparate library communities in the world. In this way, online collections would mirror their print ancestors: smaller, more general educational institutions have libraries with smaller, more general publications while larger research libraries have the budgets to subscribe to many more smaller 'niche' titles.
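To make the idea concrete, here is a rough sketch in Python of what a moveable circulation threshold might look like. Everything in it is an assumption for illustration: the `Title` record, the `filter_by_circulation` function, the title names, and the circulation figures are all invented, and no vendor offers this as far as I know. Only the 12,000 figure comes from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    """A hypothetical serial title with an invented circulation figure."""
    name: str
    circulation: int


def filter_by_circulation(titles, threshold):
    """Keep only the titles whose circulation is over the threshold."""
    return [t for t in titles if t.circulation > threshold]


# Made-up catalogue: one product that every library licenses.
catalogue = [
    Title("Journal of Broad Appeal", 45000),
    Title("Niche Quarterly", 3000),
    Title("Regional Business Newsletter", 800),
]

# A smaller, more general library sets the bar high...
college_view = filter_by_circulation(catalogue, 12000)

# ...while a research library with a bigger budget sets it low.
research_view = filter_by_circulation(catalogue, 1000)

print([t.name for t in college_view])
print([t.name for t in research_view])
```

The point of the sketch is that the product stays the same and only the threshold moves from library to library.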

Yes, there are obvious problems with filtering by circulation statistics and no, I am not a populist myself when it comes to my reading choices. But until someone else comes up with a reasonable alternative, the sparklines that indicate the number of titles paid for but never looked at are going to go up.

3 comments:

art said...

Is another option, heretical as it may be, to implement some sort of "pay per use" for the less used titles instead of preemptive licensing of content that rarely gets used? Ingenta, at least, has titles that require an Ariel endpoint and there seem to be ways of using a common account. The pricing per article in this scheme can be quite high (I seem to remember an average of $30 US a few years ago), but I suspect this is still a far cheaper approach for many titles than direct subscriptions.

Mita said...

I dare say that a 'pay per use' scenario is much more likely than the one that I thought of.

It wouldn't surprise me at all if the for-profit scholarly publishers introduce a 'pay per use' structure during these times of fiscal constraint in academia.

art said...

I would still love to see an associated visual in the mix. Jon Udell has a great example on where oil comes from to supply the United States; it would be so cool to have some sort of equivalent for where article downloads come from. I think we even have a lot of this data from our resolver.