When Google’s the library, who’s the librarian?

PaidContent says it’s a false alarm that Viacom will get personally identifiable information on our video viewing from YouTube and Google as part of its self-destructive lawsuit. Nonetheless, the episode has sparked the question I pose in the headline: When Google becomes our library, who acts as the librarian to protect our privacy as a matter of principle?

And what is the principle? Any site with content (Google, Amazon, a newspaper, a blog, an ISP) is now the moral equivalent of a library or bookstore, two institutions that try hard not to hand over information on what content we seek and consume, arguing that doing so would violate our First Amendment rights. The controversy in the telco immunity legislation is that the searches at issue there were made without warrants. In this case, there is a court order. When I ran sites, we got subpoenas all the time and handed over IP addresses when ordered; that was company policy. I always found that troubling and, as a result, ordered that we change our data retention policy and get rid of IP addresses as soon as possible. Should Google and other sites erase IPs and rely only on cookies without personally identifiable information?
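For what it's worth, here is a minimal sketch of what "getting rid of IP addresses" could look like in practice. It is an illustration, not a description of what any of the sites mentioned here actually do. It assumes plain-text access logs containing IPv4 addresses; the names `IPV4_RE`, `mask_ipv4`, and `scrub_line` are made up for this example. It zeroes the last octet of each address, which is only one possible approach: a stricter policy could drop the address entirely, and a masked address is not guaranteed to be anonymous.

```python
import ipaddress
import re

# Hypothetical pattern: finds candidate dotted-quad IPv4 strings in a log line.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask_ipv4(text: str) -> str:
    """Zero the last octet of a valid IPv4 address; return other text unchanged."""
    try:
        addr = ipaddress.IPv4Address(text)
    except ipaddress.AddressValueError:
        return text  # not a valid address (e.g. 999.1.1.1), leave it alone
    # Keep only the /24 network part of the address.
    network = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(network.network_address)


def scrub_line(line: str) -> str:
    """Mask every valid IPv4 address that appears in one log line."""
    return IPV4_RE.sub(lambda match: mask_ipv4(match.group(0)), line)


if __name__ == "__main__":
    sample = '203.0.113.77 - - "GET /watch HTTP/1.1"'
    print(scrub_line(sample))
```

Running a scrubber like this over logs before they are stored, rather than after a subpoena arrives, is the point: a site cannot hand over what it never kept.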

I say all this more as a question than as a statement. Viacom could just as easily have gotten our addresses and account names. As blind as Viacom is to the new reality (the suit itself is the proof of that), it realized, as PaidContent points out, that getting our personal viewing information would have turned it into a corporate peeping-Tom pariah. So what are the principle and the law, in your view? What should they be? And what are the practical tactics we should expect content sites to take? Should I be erasing my logs? Is that pointless because Google Analytics has them too? What gives?

: LATER: Bob Wyman adds in the comments:

PaidContent was spun… They are wrong. Viacom claims that they will receive no “personally identifiable information” because they managed to get the judge to accept that “login id is a pseudonym … which … ‘cannot identify specific individuals’” (See pages 13-14 of the ruling). The judge granted Viacom’s demand to receive “all data from the Logging database” — including login id.

I don’t know about you, but I sure think my Google “login id” does a pretty good job of identifying me…

: UPDATE: The Journal has a good July 4 story outlining how Google is trying to get Viacom to agree to scrub personally identifiable information out of the data because of the uproar over it.

We need a principle here, just as we have one governing the ethics and, if possible, the behavior of bookstores and libraries. Google is the library.