AOL search data reportedly released
'It was a mistake, and we apologize,' says an AOL spokesman
August 7, 2006 12:00 PM ET
IDG News Service - AOL LLC apparently released details of Internet searches performed over a period of three months by hundreds of thousands of its subscribers, raising privacy concerns.
The data, ostensibly made available for research purposes, is no longer accessible at the Web site http://research.aol.com, but details about it were cited on the technology blog TechCrunch, and the page linking to it was cached by Google's search engine.
The cached copy of the page said the data comprised about 19 million Web searches performed by 658,000 users from March through May. The page warned of sexually explicit language in some of the queries and said of the data, "This collection is distributed for noncommercial research use only." The page contained a link to a compressed copy of the data archive.
The page asked researchers using the information to cite a research paper based on the data, titled "A Picture of Search," which names two AOL employees as co-authors. That paper is still available online.
Andrew Weinstein, a spokesman for Dulles, Va.-based AOL, said today that the company learned of the data posting late yesterday and removed the information almost immediately.
"This was a screw-up, and we're angry and upset about it," Weinstein said. "It was an innocent-enough attempt to reach out to the academic community with new research tools, but it was obviously not appropriately vetted, and if it had been, it would have been stopped in an instant.
"Although there was no personally identifiable data linked to these accounts, we're absolutely not defending this," he said. "It was a mistake, and we apologize. We've launched an internal investigation into what happened, and we are taking steps to ensure that this type of thing never happens again."
The data request should have been reviewed by a special privacy group within AOL that is supposed to review any requests for research data, Weinstein said. The investigation will look into how the information was posted without undergoing the proper reviews, he said.
The information that was posted included search data for roughly 658,000 anonymized users over the three-month period, but it did not include any personally identifiable data. The search queries themselves, however, can sometimes include such information, according to AOL.
According to comScore Media Metrix, the AOL search network had 42.7 million unique visitors in May, so the data set covered roughly 1.5% of May search users. The data set held about 20 million search records, or roughly 0.33% of all searches conducted through the AOL network over that period. The searches were conducted by users in the U.S. within AOL's client software, according to the company.
The release of such information could pose serious privacy concerns, and major search engine companies fought a request by the U.S. Department of Justice last year for similar data on user searches. The U.S. government wanted the data to check the effectiveness of a federal law aimed at restricting minors' access to harmful material.
In January the Justice Department filed a motion with the court to compel Google to comply with its subpoena and turn over a "random sample" of 1 million Web site addresses found in its search engine index. It also asked the company for the text of all queries filed on the search engine during a specific week.
Although Google fought the request, AOL, Yahoo Inc. and Microsoft Corp.'s MSN were also subpoenaed and complied to varying degrees.
Computerworld's Todd R. Weiss contributed to this report.
Reprinted with permission from
Story copyright 2009 International Data Group. All rights reserved.