AOL search data reportedly released

'It was a mistake, and we apologize,' says an AOL spokesman

By Jeremy Kirk
August 7, 2006 12:00 PM ET

IDG News Service - AOL LLC apparently released details of Internet searches performed over a period of three months by hundreds of thousands of its subscribers, raising privacy concerns.

The data, ostensibly made available for research purposes, is no longer accessible at the Web site http://research.aol.com, but details about it were cited on the technology blog TechCrunch, and the page linking to it was cached by Google's search engine.

The cached copy of the page said the data comprised about 19 million Web searches performed by 658,000 users from March through May. The page warned of sexually explicit language in some of the queries and said of the data, "This collection is distributed for noncommercial research use only." The page contained a link to a compressed copy of the data archive.

The page asked researchers using the information to cite a research paper based on the data, titled "A Picture of Search," which names two AOL employees as co-authors. That paper is still available online.

Andrew Weinstein, a spokesman for Dulles, Va.-based AOL, said today that the company learned of the data posting late yesterday and removed the information almost immediately.

"This was a screw-up, and we're angry and upset about it," Weinstein said. "It was an innocent-enough attempt to reach out to the academic community with new research tools, but it was obviously not appropriately vetted, and if it had been, it would have been stopped in an instant.

"Although there was no personally identifiable data linked to these accounts, we're absolutely not defending this," he said. "It was a mistake, and we apologize. We've launched an internal investigation into what happened, and we are taking steps to ensure that this type of thing never happens again."

Weinstein said the data request should have been reviewed by a special privacy group within AOL that is supposed to review any requests for research data. The investigation will look into how the information was posted without undergoing the proper reviews, he said.

The information that was posted included search data for roughly 658,000 anonymized users over the three-month period, but it did not include any personally identifiable data. The search queries themselves, however, can sometimes include such information, according to AOL.

According to comScore Media Metrix, the AOL search network had 42.7 million unique visitors in May, so the data set covered roughly 1.5% of May search users. The data set contained about 20 million search records, or roughly 0.33% of all searches conducted through the AOL network over that period. The searches were conducted by users in the U.S. within AOL's client software, according to the company.
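As a rough check on the user-share figure above, the snippet below divides the number of users in the data set by comScore's May visitor count. It is only an illustrative sketch: the variable names are made up for this example, and it assumes the 658,000 users can be compared directly with May unique visitors.

```python
# Figures quoted in the article.
users_in_data_set = 658_000        # anonymized users in the released data
may_unique_visitors = 42_700_000   # comScore Media Metrix count for May

# Share of May search users covered by the data set (assumes the two
# counts are directly comparable, which the article implies).
share = users_in_data_set / may_unique_visitors

print(f"{share:.2%}")  # 1.54%
```

The result rounds to the "roughly 1.5%" cited in the story.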

The release of such information could pose serious privacy concerns, and major search engine companies fought a request last year by the U.S. Department of Justice for similar data on user searches. The U.S. government wanted the data to check the effectiveness of a federal law aimed at minors' access to harmful material.

In January the Justice Department filed a motion with the court to compel Google to comply with its subpoena and turn over a "random sample" of 1 million Web site addresses found in its search engine index. It also asked the company for the text of all queries filed on the search engine during a specific week.

Although Google fought the request, AOL, Yahoo Inc. and Microsoft Corp.'s MSN were also subpoenaed and complied to varying degrees.

Computerworld's Todd R. Weiss contributed to this report.

Reprinted with permission from IDG.net. Story copyright 2010 International Data Group. All rights reserved.