Your first girlfriend -- and the other things search engines store about you
Microsoft Live Search records the type of search you conduct, while Google stores your browser type and language
July 10, 2007 12:00 PM ET
Computerworld - What if there were a giant database that contained your hidden insecurities, embarrassing medical questions and the fact that you still think from time to time about your high school girlfriend? Well, such a data store does exist -- if you've ever plugged such private topics into a search engine.
Search engines such as Google, Yahoo and Microsoft Live Search all record and retain in their vast data banks every term you query, along with the date and time the query was processed, the IP address of your computer and a cookie-based unique ID. Unless you delete that cookie, it lets the search engine recognize requests coming from your particular computer, even if the connection changes.
Microsoft Live Search also records the type of search you conducted (image, Web, local, etc.), while Google additionally stores your browser type and language. And when you click on a link displayed on Google, that may also be recorded and associated with your computer's IP address.
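To make the list of stored fields concrete, here is a minimal sketch of what one log record could look like. The class name, field names and sample values are hypothetical; none of the companies has published its actual log format, so this only mirrors the fields described above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class SearchLogEntry:
    """Hypothetical record holding the fields the article says are logged."""
    query: str                           # the search term as typed
    timestamp: datetime                  # date and time the query was processed
    ip_address: str                      # IP address of the requesting computer
    cookie_id: str                       # cookie-based unique ID
    search_type: Optional[str] = None    # image, Web, local, etc. (Live Search)
    browser_type: Optional[str] = None   # stored by Google
    language: Optional[str] = None       # stored by Google
    clicked_url: Optional[str] = None    # link clicked on the results page, if recorded


# Illustrative values only; the IP comes from a range reserved for documentation.
entry = SearchLogEntry(
    query="high school girlfriend",
    timestamp=datetime(2007, 7, 10, 12, 0),
    ip_address="192.0.2.44",
    cookie_id="a1b2c3d4",
    browser_type="Firefox",
    language="en",
)
```

Nothing in such a record names the user, which is why the cookie ID and the IP address matter: they are the only fields that tie separate queries together.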
While Google Inc. recently announced that it would make its search logs anonymous after 18 months' time by deleting part of the IP address and obfuscating cookies associated with search queries, Microsoft Corp. and Yahoo Inc. haven't yet made their retention policies public. AOL LLC stores this data for just one month.
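Google has not said exactly how it will do this, so the following is only a sketch of one way "deleting part of the IP address and obfuscating cookies" might work. It assumes that the deleted part is the last octet of an IPv4 address and that cookies are replaced with a keyed hash; both choices, and the function names, are assumptions made for illustration.

```python
import hashlib
import hmac
import ipaddress


def truncate_ip(ip: str) -> str:
    """Zero the last octet of an IPv4 address (assumed meaning of 'deleting part')."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


def obfuscate_cookie(cookie_id: str, secret: bytes) -> str:
    """Replace a cookie ID with a keyed hash so the original value is not stored."""
    return hmac.new(secret, cookie_id.encode("utf-8"), hashlib.sha256).hexdigest()


print(truncate_ip("192.0.2.44"))
print(obfuscate_cookie("a1b2c3d4", b"hypothetical-secret-key"))
```

In this sketch, `truncate_ip("192.0.2.44")` returns `192.0.2.0`, so up to 256 addresses share one logged value. The hashed cookie hides the original ID, but the same cookie always produces the same hash, so queries made under it could still be grouped together.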
The upshot: If someone were to ask one of these search engine companies to produce a list of the IP addresses or cookie values that searched on a particular term, the company conceivably could. Or, conversely, given an IP address or cookie value, the search engine firm could produce a list of the terms searched by the user of that address or cookie value.
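Both lookups are easy to picture in code. The sketch below uses a tiny in-memory log with made-up entries; the tuple layout and the function name are hypothetical, and a real search engine would run such queries against far larger stores.

```python
from collections import defaultdict

# Made-up log entries: (ip_address, cookie_id, search_term)
log = [
    ("192.0.2.44", "a1b2c3d4", "knee pain"),
    ("192.0.2.44", "a1b2c3d4", "high school girlfriend"),
    ("198.51.100.7", "e5f6a7b8", "knee pain"),
]


def build_indexes(entries):
    """Index the log both ways: term -> who searched it, cookie -> what it searched."""
    by_term = defaultdict(set)
    by_cookie = defaultdict(list)
    for ip, cookie, term in entries:
        by_term[term].add((ip, cookie))
        by_cookie[cookie].append(term)
    return by_term, by_cookie


by_term, by_cookie = build_indexes(log)

# Who searched for a given term?
print(sorted(by_term["knee pain"]))
# What did a given cookie search for?
print(by_cookie["a1b2c3d4"])
```

The first lookup answers "who searched for this?" and the second answers "what did this user search for?", the two requests described above.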
Don't worry; be happy
Some people say there's not much to worry about, since the server logs don't associate these search terms with personally identifiable information, such as your name or e-mail address. However, if you have an account with a search engine site or have registered for any of its additional services -- e-mail, social networks, calendars, shopping lists -- that connection could feasibly be made, says Brad Templeton, chairman of the board at the Electronic Frontier Foundation, a group that protects liberties and privacy in cyberspace. In the case of Microsoft and Yahoo, that information can be extensive because of how much personal detail these search engine firms ask for on their account registration forms, including your occupation, job title, marital status and the number of children in your household.
According to Whitney Burk, public relations manager at Microsoft, "there is no systematic way of identifying, isolating or cross-referencing search data with personally identifiable information." Google also says it stores the two types of information separately. However, according to Templeton, "it would be very difficult to make it impossible for someone to make that correlation."