Messing With Google

It looks like leading search scholar Helen Nissenbaum and Daniel Howe have devised a practical way of subverting search engines’ tracking and profiling of users. Called TrackMeNot, the software

runs in Firefox as a low-priority background process that periodically issues randomized search-queries to popular search engines, e.g., AOL, Yahoo!, Google, and MSN. It hides users’ actual search trails in a cloud of ‘ghost’ queries, significantly increasing the difficulty of aggregating such data into accurate or identifying user profiles.

This is a fascinating innovation; if it were to become widely used, it could undermine search engines’ capacity to build the kind of “database of intentions” their business has so far been modeled on. So we can expect some kind of backlash. Might search engines refuse to operate if they detect TrackMeNot running in the background? Might they add terms-of-use provisions that bar users from installing the software?

On the one hand, one might construe use of the search engine as an implied contract; in exchange for giving up data to the search engine (including queries), you get free use of it. So TrackMeNot might be construed as an interference with such an implied contract.

On the other hand, virtually no one thinks of searching in that way. The “consideration” may well be my looking at a page with ads once my query is run. So I’d think the search engines would have to make explicit the expectation that one only run “sincere” queries. (I wonder what they think of the game suggested by Greg in the comments to the last post!)

Perhaps the most interesting angle of all of this is the AI processes that will allow TrackMeNot to build on, say, one’s extant pattern of searches, to generate a series of initially similar and then wildly divergent ones. I would think only such a strategy would amount to the “Ring of Gyges” necessary to seamlessly merge one’s search signals into the “noise” generated by TrackMeNot.
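
To make the “initially similar, then wildly divergent” idea concrete, here is a minimal sketch of how such decoy queries might be generated. It is an illustration only, not TrackMeNot’s actual code: the function name drifting_decoys, the toy vocabulary, and the linear replacement schedule are all assumptions of mine.

```python
import random


def drifting_decoys(seed_query, vocabulary, steps, rng):
    """Return `steps` decoy queries that start close to `seed_query` and drift away.

    Hypothetical illustration: each successive decoy replaces a larger share
    of the seed query's terms with random words from `vocabulary`.
    """
    terms = seed_query.split()
    if not terms or steps < 1:
        return []
    decoys = []
    for step in range(1, steps + 1):
        # The number of replaced terms grows with each step, reaching all
        # of them on the final step; at least one term is always replaced.
        n_replace = max(1, round(len(terms) * step / steps))
        positions = rng.sample(range(len(terms)), n_replace)
        decoy = list(terms)
        for pos in positions:
            decoy[pos] = rng.choice(vocabulary)
        decoys.append(" ".join(decoy))
    return decoys


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same decoys
    vocab = ["garden", "violin", "harbor", "comet", "ledger"]
    for query in drifting_decoys("acid reflux treatment", vocab, 4, rng):
        print(query)
```

The early decoys keep most of the real query’s words and the last one keeps none. A scheme this crude would presumably be easy to filter out; the point is only to show the shape of the drift, not a workable hiding strategy.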

Hat Tip: Battelle’s Search Blog.

4 thoughts on “Messing With Google”

  1. Use of TrackMeNot considered harmful. This extension, and almost anything built along its model, is nearly useless in disguising your specific search queries from anyone watching them. It might, in general, slightly pollute the quality of overall search engine query analysis logic. But that’s a very different political use of anonymity than hiding from query inspectors looking at you. It doesn’t do anything to mitigate the harms of an AOL-style data leak; it just undermines aggregated marketing. Tal Zarsky, e.g., has argued that there is harm in the creation of such targeted marketing, but it’s a harder case to make.

    Note that automated queries cause some trouble for ad-serving business models, as well, because they result in page loads that are never seen by a human. Since search engines already deal with serious click fraud detection problems, they can probably screen TrackMeNot without much trouble, and indeed may not have even noticed it as something distinct. In the end, it may only cost them some bandwidth.

  2. Frank, this tool may create some uncertainty about some aspects of your searches, and thus make it harder to serve well-targeted ads all the time. However, TrackMeNot does not provide the sort of broad privacy protection its website advertises. For a list of just some of the problems, see Bruce Schneier:
    http://www.schneier.com/blog/archives/2006/08/trackmenot_1.html

    I don’t think most search users are worried about the database of intentions because it means they can be served ads related to their searches. They’re worried about something like the AOL data leak or a DoJ subpoena revealing their deepest secrets and dirtiest laundry to strangers — and possibly to the entire world. TrackMeNot, while intriguing as a concept, doesn’t do that in practice.

  3. James and Derek: Thanks for the link to Schneier, who has indeed provided some devastating critiques. I guess I was too quick to take seriously the “noise-to-signal” aspect of the program because I’ve been thinking about information overload too much!

    I hinted in the post at the possibility that the program could improve with time as a hiding device. An aspect of that might be taking queries in “dicier” directions via some iterated algorithms (As in: “I wasn’t searching for the bad stuff! The program did it!”). But after reading Schneier’s post, I’m worried that it may end up setting up users for more suspicion than their original set of queries. The only way that *wouldn’t* happen would be if it were to very quickly become widespread. But ironically, observations like Schneier’s are likely to prevent that from happening.

    As for what TrackMeNot really manages (making targeted marketing harder), I’ll have to read the Zarsky. I’ve been thinking along those lines as well…namely, that

    a) search engines “collapse” into one search space an array of commercial, cultural, religious, political, and other “spaces”
    b) the search engine is likely to be more profitable to the extent its advertisers get attention and sales.

    So if you’ve got one political item to put on a front page, and one commercial, it’s far more profitable to put the commercial one on (if both are equally relevant and thereby neither advances the SE’s reputation for reliability). So I’m happy to see some initiatives like Quaero to generate some public sponsorship of SE’s.

  4. Oops–in the last paragraph in the last comment, I should have said something a bit more subtle about the “choice” between equally relevant sites. I think that it is in an advertising-driven search engine’s interests to “frame” or “push” any given query in a commercial direction…toward an approach to the problem that would be solved by buying something advertised on the paid search results.

    For example, consider someone typing in “acid reflux” as a search query. The search engine might find that a “top page” consisting entirely of commentaries about pharmaceutical approaches to the problem leads to more “clickthroughs” for advertisers than one that has, say, sites that read “You don’t need drugs to cure your digestive problems!” (even if it turns out the latter sites are slightly more popular).

    Market advocates will say “well, if they do that too much, they’ll end up like Overture” (which basically “sold off” top slots, and ended up not doing nearly as well as Google). But my question is: do many people ever compare search engine results anymore? Is there really much of a “market” here? I suppose if Google’s “organic” search results appear too obviously to divert people away from non-commercial approaches to the searches they’re doing, they’ll lose trust (and may lose popularity rapidly given the low switching costs). But I’m talking about subtle changes here.
