For a while I’ve wondered whether CIP data—that block of text that appears on the copyright page of most books—had outlived its usefulness, and I’ve spoken to a number of people in an attempt to confirm this. It turns out it’s not quite the case.
Back before computers, if you wanted to find books in a library’s collection, you consulted the catalog. The oldest catalogs were themselves books, but for many decades leading up to computerized catalogs, these catalogs were kept on index cards in drawers. Libraries agreed long ago on standard rules for creating these cards, so we got to the point where if two libraries bought the same book, their cards would in theory look identical. Instead of two librarians each consulting the cataloging rules only to end up creating identical cards, it was more efficient if the publisher printed the text that would go on the catalog card in an unobtrusive location in the book. Then all the librarian would need to do is transcribe the text onto a card!
In general, every national library (usually the largest library in the country, run by the government to preserve the country’s cultural heritage) took on the role of creating this text for publishers in that country. The publisher would send a near-final draft of the manuscript to the national library, and the library would send back the text for the publisher to insert on the copyright page before sending the book to press.
Attitudes toward CIP data in libraries seem to vary. If a librarian doesn’t know cataloging rules very well, or doesn’t trust that they will remember all the exceptions to the rules, they might take the CIP data on blind faith. According to a recent message on TEI-L, that is apparently the case in Slovenia. On the other hand, cataloging rules are meant to deal with all kinds of materials, even those that lack CIP data. So the cataloging rules tell you, for example, to trust only the title page, not the dust jacket or spine label, but they never say “just look at the CIP data”. So libraries that take their cataloging seriously basically think, “well, the CIP data is a nice gesture, but we trust only ourselves, not the national library, to catalog things correctly”.
But no library really has its staff do everything by themselves. Even if it did, the staff would still make mistakes. So most libraries get electronic catalog records to add to their own online catalog from the national library, from OCLC, from the company that sold the book to the library, or from a combination of these. This tends to work quite automatically with records from book vendors: you have a shipment of books and a set of records, and you load all of those records into your catalog and trust they all match. But if you get your records from the national library or OCLC, you’ll have to search for the record matching the book in front of you. Only if your library catalog lacks the capability (or subscription access) to import records from the Library of Congress (LC) or OCLC would you rely on the CIP data, as a shortcut to following cataloging rules and transcribing from the title page.
So I asked colleagues at the University of Michigan how they find a matching catalog record in a case where they weren’t given the record by a book vendor. When the monograph receiving team gets a new book, they first try searching OCLC’s database using the ISBN. If that doesn’t match, they try searching on the title from the title page. If they still don’t find a match, they send it to the catalogers, who will catalog the book according to cataloging rules: using the title page, not the CIP data. Now, every library has its own procedures for finding records in OCLC or elsewhere. Instead of using an ISBN, they might use the Library of Congress Control Number (LCCN), which is included in the CIP data. While books with LCCNs almost certainly also have ISBNs, this provides a nice backup in case the ISBN is wrong. Perhaps, then, the LCCN is the most important part of the CIP data. After all, the LCCN is usually recorded by publishers in their ONIX metadata in case any book vendors or libraries consuming the ONIX metadata want to make use of it to retrieve the full catalog record from LC or OCLC.
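The search order described above can be sketched as a simple fallback chain. This is only an illustration, not any library's actual software: the in-memory list stands in for a shared database such as OCLC's, the ISBNs, LCCNs, and titles are made up, and the names `SHARED_RECORDS`, `search`, and `find_matching_record` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical stand-in for a shared record database; all values are invented.
SHARED_RECORDS = [
    {"isbn": "9780000000001", "lccn": "2013000001", "title": "An Example Monograph"},
    {"isbn": "9780000000002", "lccn": "2013000002", "title": "Another Example"},
]


def search(field: str, value: Optional[str]) -> Optional[dict]:
    """Return the first shared record whose field equals value, ignoring case."""
    if not value:
        return None
    wanted = value.strip().lower()
    for record in SHARED_RECORDS:
        if record.get(field, "").lower() == wanted:
            return record
    return None


def find_matching_record(
    book: dict, order: Sequence[str] = ("isbn", "title")
) -> Tuple[str, Optional[dict]]:
    """Try each identifier in turn; if none matches, hand off to catalogers.

    The default order (ISBN, then title-page title) follows the procedure
    described in the text. Passing an order that includes "lccn" models a
    library that searches on the LCCN instead of or alongside the ISBN.
    """
    for field in order:
        record = search(field, book.get(field))
        if record is not None:
            return field, record
    # No match anywhere: the book needs original cataloging from the title page.
    return "original cataloging", None


# A book whose ISBN was recorded wrongly but whose LCCN is correct.
book = {"isbn": "9780000000009", "title": "Unknown Title", "lccn": "2013000002"}
print(find_matching_record(book))
print(find_matching_record(book, order=("isbn", "lccn", "title")))
```

With the default order this book falls through to original cataloging, while the order that includes the LCCN finds the existing record, which is the sense in which the LCCN serves as a backup for a wrong ISBN.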
But even if the book vendor or library refers to the LCCN to find someone else’s catalog record as a starting point, how does someone else’s record get created initially? Someone had to do it the first time, and if the publisher applies for CIP data, the Library of Congress will be the first, saving time for others. Indeed, Ingram and Bowker use records from the Library of Congress in addition to data directly from publishers in the catalog records they distribute to libraries, though I spoke to representatives of each at AAUP 2013 who said that they knew of no data fields which come only from the CIP data; rather, the CIP data is just a backup in case the other sources of data are incomplete. On the other hand, a representative of Project Muse at the same conference said that Project Muse retrieves the full catalog record from LC as a starting point for fuller records that they provide to subscribing libraries.
So while the CIP data printed on the page isn’t actually used directly by libraries today, an application for CIP data still leads to the creation of a catalog record, which in turn leads to data that is used by libraries and book vendors.
Informed readers may wonder how CIP data relates to Preassigned Control Numbers (PCNs). Unfortunately, an application for a PCN does not lead to the creation of a catalog record that is distributed beyond LC. Some day, perhaps, LC will offer a hybrid of the CIP and PCN programs, producing a full catalog record without requiring that the publisher print the CIP data in the book.