Noise Could Mask Web Searchers' IDs
New Scientist (03/07/09) Marks, Paul
Microsoft researchers say that adding noise to search engine records could protect Web users' identities, and that implementing such a technique would be a major step toward provable privacy. Records of Web searches are extremely useful to software engineers looking to improve search technology, and can provide valuable insight for scientists exploring digital search behaviors. However, attempts to make search data anonymous have been mostly unsuccessful. Microsoft researchers Krishnaram Kenthapadi, Nina Mishra, Alex Ntoulas, and Aleksandra Korolova say they have developed a safe way to release search data. The researchers propose publishing data associated only with the most popular queries, so that specific, rarely performed searches, such as for individual names or unique interests, cannot be used to identify people. The researchers also inserted noise into the data by adding digits to the data's figures. Korolova says that adding the noise gives the data provable privacy, and the amount of noise added defines the level of privacy that can be guaranteed. She says the added noise strikes a balance between guaranteeing privacy and providing useful data sets.
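For concreteness, here is a minimal sketch of the two steps the summary describes: release counts only for popular queries, and perturb those counts with random noise. The summary does not say what kind of noise the researchers use or how they pick the cutoff, so everything specific here is an assumption for illustration: the Laplace distribution, the `threshold` and `noise_scale` parameters, and the function names are all hypothetical. This simplified version also applies the cutoff to the true count, so it should not be read as carrying the formal privacy guarantee the researchers claim for their own method.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered on zero (assumed noise model).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_popular_queries(queries, threshold, noise_scale, seed=None):
    """Return noisy counts for queries seen at least `threshold` times.

    Hypothetical helper: rare queries are dropped entirely, and the counts
    that remain are perturbed so that exact figures are never published.
    """
    rng = random.Random(seed)
    counts = Counter(queries)
    released = {}
    for query, count in counts.items():
        if count >= threshold:  # cutoff on the true count (a simplification)
            released[query] = count + laplace_noise(noise_scale, rng)
    return released


# Made-up search log: two popular queries and one rare, identifying one.
log = ["weather"] * 50 + ["news"] * 30 + ["jane q. public home address"]
published = release_popular_queries(log, threshold=10, noise_scale=2.0, seed=1)
```

In this sketch a larger `noise_scale` hides the true counts more thoroughly but makes the published figures less accurate, which is the trade-off Korolova describes.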
Ok... I'm all for privacy. One of the strengths of the Internet and particularly the World Wide Web has been anonymity in communication. Online, you're not young or old, male or female, white or black or Asian or Hispanic, unless, of course, you want to be. You're your ideas and beliefs, and the anonymity of the Web enables you to express those ideas without prejudice or fear of repercussions in the "real" world (unless you live somewhere like China or North Korea where Internet activity is actively monitored and the free exchange of ideas harshly suppressed). Sure, there are always consequences to actions or words, but online those consequences are limited to heated exchanges of ideas and at worst social ostracism from a particular online community.
Because of the hodge-podge of technologies cobbled together to create the current Internet, most of which didn't have security as a priority, we've lost some of that anonymity. Companies and other organizations can glean a disturbing amount of data about our real-world identities and online activities from various sources, including the Web searches mentioned by the article quoted above. Security and privacy certainly need to be concerns in the design of future Internet technologies and our usage of current technologies.
But what is Microsoft's innovative answer to the problem of data mining Web search results? To withhold information and falsify the information provided. Genius!
The first half of this "solution" is nothing but common sense. If data mining of personal information is a problem, then the sources of that information should be particular about what information is provided, and to whom. On the consumer side, we implement this idea by not agreeing to Terms of Service that do not protect our privacy. On the provider side, organizations refuse to publish information that might result in bad press or a decline in customer confidence.
The second half of the so-called solution is profoundly stupid. Records of Web searches can be quite valuable to legitimate research. There's absolutely no point in publishing these records if they're intentionally falsified. Falsifying the data renders it completely worthless to real research, while making it only somewhat less attractive to those who would use it for less noble applications. You might as well not publish the information at all. So how, exactly, is this a solution?
Way to go, Microsoft. I'm looking forward to your next big idea. By the way, how is that "Life without walls" ad campaign going? Because it seems to me that without walls you don't really need Windows...