New Jack Librarian: May 2011

Thursday, May 26, 2011

Zotero Why Before Zotero How

Yesterday, I gave a short presentation at the University of Windsor's 2011 Campus Technology Day called "Building your library with Zotero".

The title is a little misleading because most of the 25-minute talk was dedicated to the recognition that it is difficult to change our habits - especially research habits - that have served us well in the past.

I've squirmed through too many "talks" that were instead a walk-through of a piece of software's feature set and I didn't want to be guilty of the same crime. So I played the 3-minute Zotero quick start video for the audience and then did a quick 2-minute live demo of capturing a half-dozen citations from our course reserve system and dropping them, fully formatted, into a document. I think you need to trust your audience that if you give them good reasons to follow their curiosity, they will explore on their own.

Then I gave a personal testimonial: I was planning to use this summer to re-build my research habits and incorporate Zotero into them, because I had reached the point where my own hybrid library of papers and PDFs was beyond my personal recollection skills, and there had been too many webpages that I had gone back to read only to find that they had disappeared. Judging by the immediate feedback I received after the talk, this framing of Zotero as a webpage capturing service generated the most interest in this audience.

I don't have the script for my ten slides, but in short, I presented three reasons to build a personal library using Zotero:

Zotero allows the easy capture, storage, and backup of research material and, as such, is the easiest way to establish a workflow for digital research
Zotero's ability to share annotated citations brings a visibility to research work which can be brought into the classroom
Zotero is open and extensible and makes possible a new type of scholarship that is invested in people and not products

Much thanks goes out to Jason Puckett who has made his Zotero teaching materials open for others like myself to build on and to Dan Scott who kindly repaired the COinS in our library catalogue even during the aftermath of an Evergreen upgrade.

(And I almost forgot to mention that I had only 5 minutes to set up and had gone in with severe anxieties that the projector would refuse to listen to my Ubuntu laptop but there were no problems to be had. I mention this because this particular scenario was the exact reason I had resisted investing in a Linux lappy until now. Please consider it one less reason not to invest in open source.)

Monday, May 09, 2011

code4lib north presentation: we're jamun and we hope that you like jamun too

On Friday, May 6th, I had the pleasure of presenting a project called Jamun that we are working on at my place of work to the fine folks at Code4Lib North.

Jamun is the work of Art Rhyno and it is at the proof of concept stage. That being said, I'm hopeful that it will develop into our library's $100 discovery layer...

There was one point that didn't make it into the presentation slides that I regret not including: Our current version of Jamun doesn't include a pane of search results drawn from the text of the library website, but it is intended that we would prominently include one. As the results of a search for "primary source documents" from NCSU library nicely illustrate, such results can act as de facto *in context* help.

I'd like to also note that during the Q and A after my talk, a connection was made between my plea to libraries to stop throwing out course reading lists and Dan Cohen's recent release of a million-syllabus data set.

Much thanks to McMaster University for hosting Code4Lib and mad props to Nick Ruest and John Fink who were kind enough to host the wonderful event.

Monday, May 02, 2011

Making Links and Open Linked Data at The Great Lakes THATCamp

Usually I wait a couple days after a conference before I try to write down and share what I’ve learned. But because it will only be two short days until I’m on my way to code4lib North and because I have two deadlines to bravely confront before then, I have to write down what I have learned from this weekend’s Great Lakes THATCamp *right now* for better or for worse.

I was going to work on the aforementioned deadlines on Friday night as I would be holed away alone in a hotel room in East Lansing, but the workshop that I had attended that morning - Jon Voss’s workshop, Intro to Linked Open Data in Libraries, Archives & Museums (#lodlam) - had triggered an epiphany...

About half-way through the morning's workshop, Jon had told a story of when he and some “computer people” had visited the folks at the Library of Congress. Jon and his friends had been collecting data by screen-scraping the Library of Congress website and the folks at LoC were surprised and asked them why they didn’t just collect the data through z39.50 or OAI-PMH. And, to paraphrase the response, they were all like wot’s dat?

I mention this because the same thing happened to me during Jon’s session. Because I was coming from a library context, I did not recognize the tools of those working with the semantic web. But once I was introduced to them, I recognized them. Let’s see if I can help you recognize them too. One caveat: what follows is what I understand about open, linked data and that’s not necessarily reality. If I am mistaken about something, feel free to school me, oh pedantic web.

Ok. Let’s start with z39.50. Generally a user will visit a library website to search a library catalogue, but there is a standard called z39.50 that allows a user (or a computer program) to search a library catalogue from outside of the website. This is the protocol that allows users to search library catalogues from within citation managers such as RefWorks and EndNote and allows users to search multiple library catalogues in such services as RACER.

Next, OAI-PMH. This stands for Open Archives Initiative Protocol for Metadata Harvesting. Instead of library catalogues, OAI-PMH is designed for Digital Collections, like institutional repositories. The idea is that if you make your individual repository open for “harvesting”, another service can gather information about your collection. You might know about one particular service that uses OAI to harvest metadata from repositories. It’s called Google Scholar.
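To make "harvesting" a little more concrete, here is a minimal sketch of what a harvester asks a repository for. OAI-PMH requests are plain web addresses with a `verb` parameter, and `ListRecords` with the `oai_dc` metadata prefix asks for records in simple Dublin Core. The repository address and the function name below are made up for illustration; they are assumptions, not a real service.

```python
from urllib.parse import urlencode

# Hypothetical repository address, used only for illustration.
BASE_URL = "https://repository.example.edu/oai"


def build_harvest_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access here)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional: only ask for records added or changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


harvest_url = build_harvest_url(BASE_URL, from_date="2011-05-01")
print(harvest_url)
```

A harvester would fetch that address and read the XML that comes back; the sketch stops at building the request so that it runs without a live repository.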

Now, we are entering the exciting new world of linked, open data. And it’s early days - like 1997 all over again. And people and organizations are starting to provide open, linked data... for programs that haven’t been written yet.

Well, there are some programs that have been written... but I’ll get to them in just a moment.

If you’ve read through Jon’s presentation on Linked Open Data, you know that linked, open data relies on descriptions that are meant to be read by machines. If you want to earn 5 stars of open, linked data you can create a FOAF to introduce yourself to code. Here’s Ed Summers’s FOAF (Firefox will display the file but Chrome and Safari might attempt to download it -- so I’ve provided a screenshot below).

(You can create one yourself using it as a template, or you can find a FOAF generator to do it for you.)
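If you are curious what such a generator does under the hood, here is a rough sketch that writes a very small FOAF description as RDF/XML. The two namespaces are the standard RDF and FOAF ones; the person's name, the homepage, and the function name are placeholders I made up, and a real FOAF file would usually say a good deal more.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def build_foaf(name, homepage):
    """Return a tiny FOAF description of one person as an RDF/XML string."""
    # Ask ElementTree to use the familiar rdf: and foaf: prefixes in the output.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("foaf", FOAF_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    person = ET.SubElement(root, f"{{{FOAF_NS}}}Person")
    ET.SubElement(person, f"{{{FOAF_NS}}}name").text = name
    # The homepage is a link to another resource, not a plain string.
    ET.SubElement(
        person,
        f"{{{FOAF_NS}}}homepage",
        {f"{{{RDF_NS}}}resource": homepage},
    )
    return ET.tostring(root, encoding="unicode")


# Placeholder person, for illustration only.
foaf_xml = build_foaf("A. Librarian", "https://example.org/~alibrarian")
print(foaf_xml)
```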

Now, imagine that you are responsible for creating a staff directory for your institution of higher education. You *could* create individual FOAF’s for your staff. If you go this route, you would of course first check to see if there is an established way to semantically describe departments and programs in higher education, so you don’t re-invent the wheel and so you will make it easier for future machines to make connections between your staff and the staff of other institutions. Here’s one for UK institutions that I found via Patrick Murray-John’s post, Thoughts Toward a Giant EduGraph. Or, you could download an application that creates a directory of researchers and their research interests and works that automagically produces semantic descriptions for you. Like VIVO from Cornell University Library.

During the workshop, I asked Jon the question, “If you download VIVO onto your own server, will your information be available to be searched like the other VIVO instances?” and he replied that he thought so, as VIVO would presumably have a SPARQL endpoint. And I replied, wot’s dat?

And that’s when I learned SPARQL is the query protocol that is like SQL but for semantic information in RDF. In other words, it’s like the z39.50 that connects and queries library catalogues and OAI that connects and queries digital repositories.
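For the curious, here is a small sketch of what asking a SPARQL endpoint a question might look like: a query for people and their names, sent as the `query` parameter of an ordinary web request. The endpoint address and the function name are hypothetical (I am not claiming any particular VIVO instance lives there), and the sketch only builds the request rather than sending it.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "https://vivo.example.edu/sparql"

# Ask for up to ten things described as a foaf:Person, along with their names.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
}
LIMIT 10
"""


def build_sparql_url(endpoint, query):
    """Build a GET request URL that carries the query in a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


sparql_url = build_sparql_url(ENDPOINT, QUERY)
print(sparql_url)
```

The query itself reads a bit like SQL: `SELECT` names the variables you want back, and the `WHERE` block is a pattern that the stored descriptions have to match.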

Now, I know that it’s more complicated than that. I suspect that semantic data can be harvested without SPARQL. I think this because on Friday night I was fiercely determined to use my improved understanding of open linked data to find out how Drupal 7 incorporates RDF. And my persistence paid off because I found this illuminating screencast, The story of RDF in Drupal 7 and what it means for the Web at large. And by watching this one hour presentation, I learned of a tool called Sindice Inspector that makes visible the semantic descriptions of a given webpage. And using this tool, I found RDF descriptions in Drupal 7 that I had already been creating without even knowing it.
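Here is a toy version of that "make the descriptions visible" idea, to show that the semantic bits can sit right inside ordinary HTML as attributes. The sample markup is hand-written to resemble the kind of RDFa attributes (`typeof`, `property`) a content management system might emit; it is an assumption for illustration, not copied from a real Drupal page, and the class name is my own.

```python
from html.parser import HTMLParser

# Hand-written sample markup, for illustration only.
SAMPLE_HTML = """
<div about="/node/1" typeof="sioc:Item foaf:Document">
  <h2 property="dc:title">Hello, linked data</h2>
  <span property="dc:date" content="2011-05-02T09:00:00-04:00">May 2, 2011</span>
</div>
"""


class RdfaCollector(HTMLParser):
    """Collect (tag, attribute, value) for a few common RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        for key in ("typeof", "property", "rel"):
            if key in attr_map:
                self.found.append((tag, key, attr_map[key]))


collector = RdfaCollector()
collector.feed(SAMPLE_HTML)
for tag, key, value in collector.found:
    print(f"<{tag}> {key} = {value}")
```

This only lists the attributes it finds; a real inspector would go further and turn them into proper RDF statements.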

So the Next Action is now clear. I need to read up on semantic Drupal and to muck about to find out how I can create sound semantic data using Drupal 7. I suspect that I will be writing up these adventures in my workplace blog about our website development.

I hope I will be able to write a little bit more about the other wonderful things that I learned at the Great Lakes THATCamp. The semantic web runs on love. And so does THATCamp.

Only connect.