Here are papers, articles and books about Web harvesting.
- "Text Mining, Web Mining, Information Retrieval and Extraction from the WWW References," by Weiguo "Patrick" Fan
- "Web Harvesting: New Problems, New Solutions," by Chris Buckingham, Caesius Software
- Mining the Web: Analysis of Hypertext and Semi Structured Data, by Soumen Chakrabarti; Morgan Kaufmann, 2002
- "Web-Mining Technology and Academic Librarianship: Human-Machine Connections for the Twenty-First Century," by May Y. Chau
- Web Farming for the Data Warehouse, by Richard Hackathorn; Morgan Kaufmann, 1998
- "Farming Web Resources for the Data Warehouse," by Richard Hackathorn; DM Review magazine, June 1999
- Web Content Mining With Java, by Tony Loton; John Wiley & Sons, 2002
- "Uncovering Information Hidden in Web Archives: A Glimpse at Web Analysis Building on Data Warehouses," by Andreas Rauber, et al.
- "Bringing Order to Data Chaos," white paper by QL2 Software Inc.