Can Libraries Protect Users' Privacy While Tracking Books?


One of the bigger challenges for librarians is protecting the privacy of patrons while stopping them from stealing the books. In many cases, there's not much interesting in our reading choices, but sometimes there is. A spy might look for hints or clues in the list of books taken out by researchers from the nearby army base. A blackmailer may try to subvert some of the local police or security personnel by looking at what they read. This may be why some librarians are so careful to protect the choices of their customers.

Is it possible to protect the reading choices of library patrons from hackers, insiders, and snoops while catching thieves? At first glance, this seems difficult because the library must keep track of the books on loan to defend itself against people who don't bring them back. Some libraries try to delete all records after a book is returned, but that doesn't stop the curious from looking at the list of books that are currently checked out.


The surprising result is that the library doesn't need to keep a list of what people are reading to stop theft. A few simple one-way functions can lock out even the most adept snoops. (A good one-way function is the Secure Hash Algorithm, or SHA. Many toolkits now include an implementation of it, along with HMAC, a more general keyed construction built on top of a hash function.)


One-way functions scramble information so it is unreadable, but they don't remove all of its usefulness. If you want to see what functions like SHA do to text, try scrambling some book data yourself: take a book title and an author, and compute SHA(title/author).
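For readers who want to try this offline, here is a minimal sketch in Python. It assumes SHA-1 (the 40-character hex strings in this article are the right length for it) and assumes the title and author are joined with a slash and encoded as UTF-8; the function name sha_title_author is just an illustrative choice, and the exact input format the original page used is not known.

```python
import hashlib


def sha_title_author(title: str, author: str) -> str:
    """Return the hex SHA-1 digest of the string "title/author".

    The slash separator and UTF-8 encoding are assumptions made
    for this sketch.
    """
    text = f"{title}/{author}"
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints a 40-character hexadecimal string.
    print(sha_title_author("All's Well That Ends Well", "William Shakespeare"))
```

Changing even one character of the title or author produces a completely different digest. A newer member of the family, such as hashlib.sha256, could be swapped in the same way.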




The results should be inscrutable if the one-way function is working correctly. There should be no way to take the result of SHA and work backward to the title and author.

The trick is to pass the book title and the author through the one-way function, SHA(), before storing it away. That is, put SHA("All's Well That Ends Well/William Shakespeare") in the database instead of just the plaintext title "All's Well That Ends Well". Here's what a librarian's database might look like:

Name        SHA(book title)                           Due Date  Replacement Cost
Bob Jones   19ded208e1d4f03f18f54bfead142edb3971632c  Jan 1     $20
            201a0d9e68c174c0a8664b4d8510204ccad5583d  Jan 3     $21
            873e637cc6d6eb2466ff2d0e02da8a54d1103cb7  Jan 3     $25
Mary Jones  51f33f597ecef6886e5426a304a2dec9b7b248f8  Jan 2     $15
            4f697e91504dd8754afa0c9d29e040fb94b9e52c  Jan 3     $15



This table tells us that Bob Jones has some book checked out that is due on January 1. If he doesn't bring it back, he will owe $20. When he does return it, the library can read the title and author off the book, compute SHA(title/author), and delete the matching entry. Bob Jones is then relieved of his responsibilities. If he doesn't return it, he can be billed.
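The check-out and return steps described above can be sketched as a small in-memory ledger. This is only an illustration under stated assumptions: the Loan and Ledger names, the use of SHA-1, and the "title/author" string format are all hypothetical choices, not a description of any real library system.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def book_hash(title: str, author: str) -> str:
    """Hex SHA-1 of "title/author" (format assumed for this sketch)."""
    return hashlib.sha1(f"{title}/{author}".encode("utf-8")).hexdigest()


@dataclass
class Loan:
    patron: str
    book_digest: str       # SHA(title/author), never the plaintext title
    due: date
    replacement_cost: int  # in dollars


class Ledger:
    def __init__(self) -> None:
        self.loans: list[Loan] = []

    def check_out(self, patron, title, author, due, cost) -> None:
        # Only the digest of the book is stored.
        self.loans.append(Loan(patron, book_hash(title, author), due, cost))

    def check_in(self, title, author) -> bool:
        """Recompute the digest from the returned book and delete
        the first matching loan. Returns True if one was removed."""
        digest = book_hash(title, author)
        for i, loan in enumerate(self.loans):
            if loan.book_digest == digest:
                del self.loans[i]
                return True
        return False

    def overdue(self, today: date):
        """Patrons who can be billed, with the amount owed."""
        return [(l.patron, l.replacement_cost)
                for l in self.loans if l.due < today]
```

One limitation of this sketch: if the library owns several copies of the same title, check_in removes whichever matching loan it finds first, so a real system would need a per-copy identifier.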

This solution does have a few weaknesses. SHA(title/author) may be easy to guess. Someone can take the list of titles from Amazon or Books in Print, hash each one, and look for matches. There aren't too many books, so trying them all is feasible. One solution is to give each book a unique, random ID number, something many libraries do already. If the number is long enough and chosen at random, then it is not practical for a spy or a blackmailer to guess the titles. Another solution is to add a password to the hashing equation by storing SHA(title/author/password) or SHA(title/author/patron's name). This significantly increases the complexity of any brute-force attack, although it does not make one impossible.
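To make the guessing attack and the two countermeasures concrete, here is a rough sketch. The tiny catalog, the function names, and the key handling are all assumptions for illustration. For the password variant it uses HMAC from Python's standard library rather than plain concatenation of the password; that is a substitution on my part, not necessarily what the text above intends.

```python
import hashlib
import hmac
import secrets


def plain_digest(title: str, author: str) -> str:
    """Unkeyed SHA-1 of "title/author"."""
    return hashlib.sha1(f"{title}/{author}".encode("utf-8")).hexdigest()


def keyed_digest(key: bytes, title: str, author: str) -> str:
    """HMAC-SHA1 of "title/author" under a secret library key."""
    msg = f"{title}/{author}".encode("utf-8")
    return hmac.new(key, msg, hashlib.sha1).hexdigest()


def new_book_id() -> str:
    """A random 128-bit ID, as 32 hex characters, for one copy of a book."""
    return secrets.token_hex(16)


def dictionary_attack(stored_digest, catalog, digest_fn):
    """Hash every catalog entry and return the one that matches, if any."""
    for title, author in catalog:
        if digest_fn(title, author) == stored_digest:
            return (title, author)
    return None


# A stand-in for a list of titles from a public catalog.
catalog = [
    ("Hamlet", "William Shakespeare"),
    ("All's Well That Ends Well", "William Shakespeare"),
]

library_key = secrets.token_bytes(32)  # must be kept secret by the library
stored_plain = plain_digest("All's Well That Ends Well", "William Shakespeare")
stored_keyed = keyed_digest(library_key, "All's Well That Ends Well",
                            "William Shakespeare")

# Without a key, the snoop recovers the title from the stored digest.
print(dictionary_attack(stored_plain, catalog, plain_digest))
# Against the keyed digest, the same attack finds no match.
print(dictionary_attack(stored_keyed, catalog, plain_digest))
```

The protection is only as good as the secrecy of the key: an insider who can read library_key can run the same dictionary attack with keyed_digest.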

Removing the Reader's Name

The system can be made a bit more secure by also locking away the identities. Instead of storing the name in the first column, the system can store SHA('name') or, better, SHA('random id string'). When a person returns a book, they can present their library card so the system can recompute the hash and delete the book from their record.

SHA(Name)                                 SHA(book title)                           Due Date  Replacement Cost
06fda7eb8124bc3fa05ed70feb92c1c157d8fb23  19ded208e1d4f03f18f54bfead142edb3971632c  Jan 1     $20
                                          201a0d9e68c174c0a8664b4d8510204ccad5583d  Jan 3     $21
                                          873e637cc6d6eb2466ff2d0e02da8a54d1103cb7  Jan 3     $25
6980b62ba56b08d90e1f6103fdc42391856d12dc  51f33f597ecef6886e5426a304a2dec9b7b248f8  Jan 2     $15
                                          4f697e91504dd8754afa0c9d29e040fb94b9e52c  Jan 3     $15
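A table like the one above could be maintained roughly as follows. This sketch assumes each library card carries a long random ID string and that the database is keyed by the SHA-1 of that string; the names new_card_id, check_out, and check_in are hypothetical.

```python
import hashlib
import secrets


def sha_hex(text: str) -> str:
    """Hex SHA-1 of a UTF-8 string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def new_card_id() -> str:
    """Random ID string printed on (or encoded in) the library card."""
    return secrets.token_hex(16)


# Maps SHA(card id) -> list of (SHA(title/author), due date, cost).
# Neither the patron's name nor the book title appears in plaintext.
loans: dict[str, list[tuple[str, str, int]]] = {}


def check_out(card_id, title, author, due, cost) -> None:
    record = (sha_hex(f"{title}/{author}"), due, cost)
    loans.setdefault(sha_hex(card_id), []).append(record)


def check_in(card_id, title, author) -> bool:
    """The patron presents the card and the book; both hashes are
    recomputed and the matching entry is deleted."""
    key = sha_hex(card_id)
    digest = sha_hex(f"{title}/{author}")
    records = loans.get(key, [])
    for i, record in enumerate(records):
        if record[0] == digest:
            del records[i]
            if not records:
                del loans[key]  # no loans left for this card
            return True
    return False
```

Billing for an unreturned book still requires some way to get from SHA(card id) back to a person, so the library must hold that mapping somewhere, ideally apart from the loan records.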



Removing the person's name from the record entirely would require taking some anonymous bond or deposit for the loan, a practice that might be too unwieldy. If people post a cash deposit with each loan, then it would be possible to remove the identity completely.

The library could also use some form of reputation database to see who was trustworthy and who wasn't. This isn't much different from what they do now.

If anyone has thoughts about the advantages and limitations of the approach taken here, please write.   --- Peter Wayner, p3 (a) wayner (dot) org