The demand for a week's worth of random searches from Google (and all other top-tier search engines) is getting the attention it deserves from the major media, as the linked headline above at ABC News shows. All major networks and most internet news sites are incredulous at the request by the US Justice Department for a million “random” sites from each of the top search engines' databases, along with a week's worth of search queries.
The furor is focusing on the wrong area, though. The government claims that its investigation into user queries is intended to show how innocuous search phrases lead to porn sites, in order to help it build a case for reversing a Supreme Court ruling that blocked enforcement of the Child Online Protection Act (COPA).
The fear should be that the DOJ or any other government or law enforcement agency can demand information purportedly for one purpose and use it for many other purposes. The biggest concern is that vast amounts of data could then be used by the feds as a part of their longing for “Total Information Awareness”. That crazy program, headed by John Poindexter through the Defense Advanced Research Projects Agency (DARPA), was dropped by Congress like a hot potato due to public outcry opposing another overreaching hunger for data, ostensibly for Homeland Security, but clearly meant to feed Bush Administration Big Brother programs.
Search behavior, which is retained by Google and its competitors Yahoo and MSN, holds different interest depending on who holds that information. The search engines inevitably use it to increase profits: it helps them understand how users interact with their search engine and entice users to click on the pay-per-click ads that make them money or, in the case of Yahoo and MSN, to visit their shopping sections and navigate to more profitable parts of their portals. (Google’s Froogle is free and earns them no income outside of the AdWords ads.)
Privacy advocates such as Sherwin Siy of the Electronic Privacy Information Center say, “If they didn’t keep and store this data they wouldn’t be in this bind,” … “It highlights the potential for misuse, whether it’s unreasonable search and seizure by the government or sale of the information to private companies.”
Of course, the commercial value of the information kept by each of the search engines is vast, but the value changes dramatically depending on who looks at that same information and what they are attempting to mine from it. The request for a week's worth of search queries represents a vast amount of data, considering the huge number of searches done at Google in seven days. By my admittedly rough estimate, based on information available from online sources, the request to Google could amount to upward of 15 billion search queries.
Google made a statement to the press which characterized the DOJ request as “over-reaching”, which is a huge understatement. The data may have been requested for one purpose but can then be used for literally endless other purposes. And, as Danny Sullivan of SearchEngineWatch.com has said, this information is available from other sources and can be obtained (purchased) without warrants.
That simple fact makes it appear that the DOJ is simply attempting to avoid paying for the information, and it supports the Google claim that the request is “burdensome” to fulfill, in addition to the “over-reaching” issue. If law enforcement or government agencies were to request those records on a regular basis, the cost to the search engines of retrieving, sorting, saving, storing and delivering the data could be gigantic. It would be most expensive for Google, which serves up more than half of all searches online, more than any other search engine.
The DOJ should at least cover the costs involved in this request. If Google eventually caves and provides the information to Justice, it should be paid for all the resources and employees required to fulfill the demand, and also for the future expenses of every related request that will follow from the precedent set in this case.
Most are looking at and talking about user privacy as the reason to deny DOJ this expensive request. Privacy is not at issue here, as Danny Sullivan pointed out repeatedly: no user information was requested, and none was given by MSN or Yahoo when they responded. More attention should be paid to the “over-reaching” issue claimed by Google and to the “random searches” issue, which makes the entire demand for data laughable and makes the claim of COPA enforcement doubtful.
The appearance is of the Bush administration seeking vast troves of information for use in any purpose it wishes (and no doubt for specific reasons having little to do with protecting children from porn). Filtering software is more effective and sensible in protecting kids from seeing things they shouldn’t see. The demand is ostensibly for a laudable purpose, yet it is difficult to believe that the information would be of any use whatsoever to investigators seeking to protect kids from naughty photos.
This news comes as the Bush administration is being buffeted by criticism from Congress over warrantless phone taps. Unless more targeted search data is requested, the demand appears to be simply another overreaching data mining request from an information-hungry monster.