Tuesday, July 30, 2013

Library as Copy Machine: Part Two: Libraries Are for Use

I was at a THATCamp workshop led by Jon Voss when he casually mentioned that he had found a particular map from 'crate-digging' in a library. 

Crate-digging is a term that DJs use to describe looking through old records for samples that they can mix and remix into new work.

But it's not just music where everything is a remix. Literature has been in a plundered, fragmentary state for a long time.

And the state of copying and its influence in the arts is only going to increase. I agree with Cory Doctorow when he says that "contemporary art that's not designed to be copied is not contemporary."

As such, libraries, by and large, are no longer contemporary institutions.

First Law: Books are for Use. And Use Costs The Reader 

Libraries provide access to words, sounds, maps, software, data, film, videos, etc. But, due to copyright and other instruments of intellectual property protection, only a relatively small percentage of what we offer can actually be used in a commercial work like an app or be re-mixed or re-interpreted for an artistic piece - at least not without getting permission from a publisher or paying a fee first. 

By and large, libraries don't go out of their way to tell their users what they have that is in the public domain or under Creative Commons licenses and available to artists and entrepreneurs looking for inspiration or plunder.

If the reader is lucky, each item from the library will be described with clear rights information. And ideally - a digital collection would have such rights information available as a facet so one would have the ability to only see what works can be re-used without having to ask permission first.  Barring that, it would be wonderful if our library catalogues and digital repositories had this metadata machine-readable for tools such as Open Attribute.
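To make the idea of a rights facet concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the records, the field names, and the list of rights statements treated as re-usable are hypothetical and not taken from any real catalogue or repository.

```python
from collections import Counter

# Hypothetical catalogue records; titles and rights labels are illustrative only.
RECORDS = [
    {"title": "Harbour survey map, 1890", "rights": "Public Domain"},
    {"title": "Oral history interview", "rights": "CC BY"},
    {"title": "Licensed journal article", "rights": "All Rights Reserved"},
    {"title": "City air photo", "rights": "CC BY-NC"},
    {"title": "Open dataset", "rights": "CC0"},
]

# Assumption: these statements permit re-use, including commercial re-use,
# without asking permission first.
REUSABLE = {"Public Domain", "CC0", "CC BY", "CC BY-SA"}


def rights_facet(records):
    """Count records per rights statement, as a facet sidebar would."""
    return Counter(record["rights"] for record in records)


def reusable_only(records):
    """Keep only the records whose rights statement is in REUSABLE."""
    return [record for record in records if record["rights"] in REUSABLE]


if __name__ == "__main__":
    for statement, count in rights_facet(RECORDS).items():
        print(f"{statement}: {count}")
    for record in reusable_only(RECORDS):
        print("Re-usable:", record["title"])
```

Note that the non-commercial item is filtered out here, which is a design choice: a reader who wants to build an app needs to see only the works that allow it.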

But more often than not, the right to use is traded away by libraries for a discount in access from entities that are not forthcoming about when and how much they will charge for use of the materials they make available.

And this is problematic for so many reasons. Restrictions on data and software prevent our scientists from replicating experiments which is, you know, doing science. And it prevents our (digital) humanists from being able to make things with the very materials that we license on their behalf.

Fourth Law: Save the time of the reader.  Unless it makes the publisher uncomfortable.

Here's another example. Suppose you are a professor of a course with 80 students in your class that's starting in a week. You decided that you will have two assigned readings for each of the 12 weeks in the course, with some being journal articles and some being book chapters. You already have all the works as PDFs and because you want to make sure your students do the readings, you plan to upload them into the course management system to give them no excuse for coming unprepared to class. And then you are told that you can't because the library had signed a license agreement that gave that right away and thus, you are required to instead add "durable links with a proxy prefix" (whatever that means - you're a pretty savvy Internet user and you've never heard of that before) for some of the articles (and how were we supposed to know that?) And now you have to trust that each of your 80 students will find, use, and download or print those articles from these links. Once you re-find those links again.

After how many minutes do you expect our hypothetical professor to struggle with finding out how to make a durable link to the chapter she wants her class to read (see above) before that professor decides to toss it and to discreetly give the 80 students a link to her Dropbox account?
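For the curious, a "durable link with a proxy prefix" is usually just a stable address for the article (a DOI link or a publisher permalink) with the library's proxy login address stuck in front of it. A minimal sketch in Python, assuming a hypothetical proxy at proxy.example.edu and a made-up DOI; the real prefix and its exact format vary by library and would have to come from your own institution:

```python
# Hypothetical proxy prefix; every library has its own.
PROXY_PREFIX = "https://proxy.example.edu/login?url="


def durable_link(stable_url, prefix=PROXY_PREFIX):
    """Prepend the library's proxy prefix to a stable article URL."""
    return prefix + stable_url


if __name__ == "__main__":
    # Made-up DOI, for illustration only.
    print(durable_link("https://doi.org/10.1000/example"))
```

The string-gluing is the easy part. Finding a stable URL for each of the 24 readings is where the professor's minutes go.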

Here's another question. Will budget-starved libraries continue to sell away every type of use of a document just as long as they can have access?

Second Law: Every person his or her book.  Even if they want to make money off of that book

I'm on the board of directors of one of the more recently created hackerspaces in Canada, and I'm proud of the group for many reasons - but one reason is the group's interest and work in Open Data. Hackforge volunteers have already contributed to an Open Data CodeJam and just a couple weeks ago, we had our first meeting of an Open Data Special Interest Group. After the formal meeting, we had our post-meeting meetup, during which a local software developer complained that he wanted to make an app using some government-produced geographic data that was tantalizingly readily available but was restricted to non-commercial purposes by its license.

Even though I agreed with him, I took the role of the apologist. I tried to explain that many people who come from the non-profit sectors of government, social services, and academia (wait, we're still non-profit, right?) have a bias against commercialization and think they are acting on their good intentions when they make their works available but only for non-commercial use (unfortunately without realizing that this will turn their contributions into orphan works).

I got an eye-roll in response, and I unfortunately can't remember the exact wording of the scathing retort but it was along the lines of 'oh so they want people to use their works unless it actually becomes valuable.'

I've found that software developers are more aware than librarians of the ramifications of licensing and pretty much all of them have a strong opinion on what's better, GPL or BSD.  Furthermore, there's a growing understanding that the success of apps that require local information may only be sustainable if they can be replicated across communities, which means that having common licenses is increasingly important. Luckily, there is movement to adopt such licences among the governments in Canada.

But what about libraries?  Well, I'm not the only one who wonders if libraries can really be effective advocates of Open Data. 

This is one of those questions that I ponder every now and again, because I wonder how effective libraries really can be as open data advocates when our current practice demonstrates that we don’t fully believe in the concept.  Well, I should qualify that – we have no problem believing that other people have a moral obligation to make their research and data open to the world using the most permissive (CC0) licenses available, but we have an extremely difficult time doing the same.

Fifth Law: The library is a growing organism. But the Internet is much much bigger and grows much much faster

There is much derision around the phrase Web 2.0 but I don't think we should be completely dismissive of its promises. Personally, the Web 2.0 We Lost bit that I miss the most was this:

BitTorrent thus demonstrates a key Web 2.0 principle: the service automatically gets better the more people use it

In Publishing 2.0, Tim O'Reilly provides other examples that can fit in a Library 2.0 context. Here's a brief summary of that talk from 2008:

  • Google. With Google, every time a user makes a link to another site, Google uses that hyperlink to better inform its search algorithm.

  • Amazon. Borders and Barnes & Noble have the same stock of books, but Amazon integrates user reviews and commentary to add more value to their literary collection. With each review, the site gets more valuable.

It's almost been 10 years since the first Web 2.0 conference. And at this point I was going to write again about the current state of library software but I can't even.  Not anymore.

Instead, I recommend you read this intriguing post on Headless libraries (h/t Lisa Hinchliffe).

Third Law: ????

When I think of the future of education, I don't think of MOOCs.

Instead, I think of the person who decides to learn something and works at it by doing it for the better part of a year, documents the process for themselves and others, and at the end of the self-imposed challenge, that person is able to show off a remarkable transformation:

Libraries aren't always part of a formal educational system but it is generally understood that learning is part of our collective mission. Now combine that with the growing understanding that making and learning are deeply intertwined.

Libraries need to become places where people learn by doing and we need to start sharing our ideas and our spaces in order to support this mission. This doesn't mean we have to give up work providing literature; I'm suggesting we supplement this work with author readings, book clubs, NaNoWriMo support groups, and help with self-publishing. Likewise with film, audio and video.

Our public needs work that they can use in their learning.

Profit!!! or

What do libraries that are built for re-use look like?

There are exceptional libraries that have been established and maintained specifically for the re-use of work by artists, including the Prelinger Library and the Reanimation Library.

I also highly recommend following the work of The Library As Incubator Project that highlights specific projects and exhibits that libraries big and small have developed in collaboration with artists:

On our website, we:
  • Feature artists, writers, performers and libraries who exemplify the “library as incubator” idea.
  • Highlight physical and digital collections and resources that may be of particular use to artists and writers.
  • Provide ideas for art education opportunities in libraries with our program kit collection and practical how-to’s for artists and librarians.

Richard Veevers, in a comment on the first part of this post, kindly recommended watching Eli Neiburger's 2012 talk, and after watching it, I whole-heartedly second that recommendation.

It's called Access, schmaccess.

Monday, July 15, 2013

The Library as Copy Machine: Part I : Pirates of the Alexandrian

Librarians, publishers, and authors are all struggling with the burden of copyright, that being the exclusive right to copy in a digital world. Copyright has become a burden because copying is an inherent property of digital transmission:

As computers retrieve images from the web or displays from a server, they make temporary, internal copies of those works. Every action you invoke on your computer requires a copy of something to be made. Many methods have been employed to try to stop the indiscriminate spread of copies, including copy-protection schemes, hardware-crippling devices, education programs, and statutes, but all have proved ineffectual. The remedies are rejected by consumers and ignored by pirates.

The evidence is in. Copyright hurts readers as well.

So let’s change the context. Let's stop thinking of copying in terms of the exclusive domain of publishers. Let us remember that the history of the library pre-dates the history of the publisher.

Let us remember how the greatest library in the world was built from copies. And piracy. Literally piracy.

During the reign of Ptolemy Euergetes, the Library borrowed Athens’ official versions of the plays of Aeschylus, Sophocles, and Euripides, giving Athens an enormous amount of money; the modern equivalent of millions of dollars, as surety for their return. The scribes of the Library made fine copies of these books on the highest quality of parchment. The originals were kept for the Great Library and the copies were returned to Athens, causing the Alexandrians to forfeit their bond. Other ethically dubious means for procuring materials were also employed. It is said that during a famine in Athens, ambassadors from the Great Library forced the sale of valuable original manuscripts owned by that city in exchange for food. A more conventional technique employed by the Ptolemies was to send people out to buy books, looking especially for rare texts and libraries which might be bought en masse. In addition to buying books, the Ptolemies acquired books through plunder. It is widely reported that upon entering the Alexandrian harbor, ships were inspected, and any books they were carrying were seized. A copy was made and given to the original owner, but the original was kept for the Great Library. It was through such means that the Great Library amassed its large collection.

Let us remember that Google’s search engine is not built from the Internet. It is continually built from a copy of the indexable Internet. One library was built by scribes. Another one was built by spiders.

And when the printing press replaced the use of scribes, the act of creating copies occurred outside of the library and the collecting continued. Many of the world's national libraries have been built through the establishment of a legal deposit program that requires publishers to give their national library a copy (or several) of each work they publish.

By law, a copy of every UK print publication must be given to the British Library by its publishers, and to five other major libraries that request it. This system is called legal deposit and has been a part of English law since 1662.
From 6 April 2013, legal deposit also covers material published digitally and online, so that the Legal Deposit Libraries can provide a national archive of the UK’s non-print published material, such as websites, blogs, e-journals and CD-ROMs.

And while academic libraries have not framed their work this way, the mandates passed by University Senates that require authors and creators to deposit copies of their works into the local institutional repository sound a lot like legal deposit to me. Except it's not so much "legal deposit" as 'deposit when legal' because a university is not a nation and so its mandate cannot trump the contract agreements between publishers and authors. If it could, can you just imagine what a powerful force it could be? Well, Elsevier did, and those crafty publishing devils specifically state in their Article Posting Policy that it's okay if authors deposit a copy of their paper in an institutional repository - as long as it is voluntary.

I'm going to suggest that all libraries - public, special, and academic - should think of local copying as a means of collection development. While, as I have shown, it is not a new way of developing library collections, it will be new to most non-national libraries, which have enjoyed a long, comfortable existence as middlemen between commercial publishers and a local reading public.

I think this project - Winnipeg Public Library's Local Music History Digitization Days - is a wonderful example that highlights what such a re-imagining of libraries and archives can achieve. The project itself was billed as a two-day event and it was held this past April:

Past Forward (http://wpl-digitization.winnipeg.ca/ ) is WPL's ongoing Public History project, where people can discuss, share, research and contribute to our past. Since one of the fantastic things about Winnipeg is our local music scenes, we thought preserving this part of our city's past would be a great contribution to the project.

So, if you were involved with music in Winnipeg, this is your chance to get all that stuff out from under your bed and online! Get your show posters, venue photographs, & handbills digitized, and contribute to our public history collection.

Register for a space to get high quality digital scans of your memorabilia...

I love that everyone benefits from this project. Those from the public who participate get digital copies of their memorabilia, perhaps made with equipment they don't have access to and with expert guidance they might have needed. And not only does the library get a copy for its public history collection, it has a showcase of work that it can add to in a similar event the following year, if it chooses to do so. Furthermore, the library could use this model repeatedly, applying it to a completely different interest group each time as a form of community outreach. Collecting from one community at a time.

It should be noted that even here the burden of copyright has not gone away:

Also, out of respect to bands and artists, we are asking people to only bring materials that they helped produce, so that there are no copyright issues!

Right. Like we could get our rock and roll stars to make their promotional work freely available to their fans. Like... The Clash!

Mick Jones of The Clash: transformed his own archive of nearly 10,000 artefacts into one unique 5-week "guerrilla-library." Users were able to scan (courtesy of the U.K distributor of the Book2net Kiosk) certain objects and via memory stick carry them away.

And now, I'd like to depart on a tangent that the quote above affords me. I want to focus on the technologies - like the Book2net Kiosk - that not only make our scanning possible but also make entirely new libraries possible.

(Indeed, it only takes a photocopier to build a library.)

Both the Windsor Public Library and the University of Windsor have an Espresso Book Machine, a print-on-demand contraption that both prints and binds softcover books within minutes. I've printed my own book with the machine and I can say from experience that it is a delight to make beautiful papery things to share. But what intrigued me about the machines is how the self-published and public domain titles produced by an Espresso can be added to EspressNet - “a network of content, enabling EBMs to order and print books by retrieving, encrypting, and transmitting files from a multitude of content sources.” Every year, the collection of EspressNet grows along with the use of their printers.

Other self-publishing and print-on-demand services such as Lulu.com and Blurb.com have created similar collections of works for sale. The distribution system of the written work is baked right into the publishing process, although sometimes it requires an additional cost. Publish with Lulu and you can distribute your print books with Amazon and Ingram (myilibrary) and your ebooks with Apple iBooks and Barnes and Noble nook store. Of note, Ingram has its own business unit that does combined on-demand book printing and distribution at the publisher level through Lightning Source.

Libraries should also consider how we too can bind publishing and distribution together so that both services reinforce each other.

Two of my favourite examples of such consideration are PUMA and BibApp.  PUMA is a system “where the upload of a publication results automatically in an update of both the personal and institutional homepage, the creation of an entry in BibSonomy, an entry in the academic reporting system of the university, and its publication in the institutional repository.”  BibApp is similar. It "matches researchers on your campus with their publication data and mines that data to see collaborations and to find experts in research areas. With BibApp, it’s easy to see what publications can be placed on the Web for greater access and impact. BibApp can push those publications directly into an institutional or other repository." 

And it's not too late for us to get into the self-publishing end of ebook publishing. You can't tell me it isn't possible because it's already being done:
“I realized we needed to do something,” LaRue says. “The vendors were screwing us.” In December 2010, with all of these ingredients mixing in his mind, he had a moment of clarity. As with the music industry before it, a common analogy in these conversations, he decided that the publishing industry’s future didn’t rest with the legacy conglomerates that had dominated it in the past. Its strength resided in the independent presses and self-publishing writers who had seized the opportunity that e-books offered: the democratization of publishing. Libraries, he reasoned, needed to harness that creative outburst. He devised a plan to do it.
It was remarkable in its simplicity: LaRue decided to build a digital warehouse and contracting system, which would allow his libraries to purchase directly from smaller publishers and authors, cutting out the Big Six and OverDrive, which would mean lower prices. In January 2011, Douglas County Libraries purchased Adobe software that for $10,000 would serve as the backbone of the new system, safely transferring files from the provider to the library to the reader. LaRue wrote “Dear Publishing Partner” letters, setting simple yet firm expectations for how the content would be handled and eliminating the restrictions that accompanied the major publishers’ products. The whole enterprise cost $200,000, but LaRue says the libraries have already saved that much in a year because the prices they’re paying for the independent and self-published materials are much lower, up to 45 percent below retail.
The system went live in February 2012, and LaRue went to work finding partners. They soon flooded Douglas County’s digital shelves. The libraries have so far purchased e-books from more than 900 smaller publishers and hundreds of individual authors. They make up 21,000 of the 35,000 titles in his virtual catalog. The rest come from the major publishers, sold through intermediaries at much higher prices. Those mainstream titles are still more popular with readers, making up 65 percent of the county’s loans, but it’s clear that the appetite for the independent and self-published content is growing...

Copy that.