ClearForest: Scaling Dow's Paper Mountain
Computerworld -
CATEGORY: Data management
LOCATION: New York
TECHNOLOGY: ClearTags
HOW IT WORKS: ClearForest's technology assimilates text data of any size and structure, extracts key terms, assigns the terms to meaningful categories (a taxonomy) and establishes their interrelationships. By combining semantic, statistical and structural analysis, ClearTags automatically classifies documents and discovers pertinent entities, facts, events and relationships buried deep within the text.
CUSTOMER SAMPLING: Credit Suisse Group, General Motors Co., Dow Chemical Co., Eastman Kodak Co.
TIP: "Get ugly early," says ClearForest CEO Barak Pridor. In other words, figure out what you need to get done and what your requirements are from your content, then put a solution in place that will address the problem.
WHAT'S IN STORE: "What ClearForest does is part of a solution," says Laura Ramos, an analyst at Giga Information Group Inc. in Santa Clara, Calif.
"The problem that people are struggling with is how to get their information organized," she explains. "In the future, this technology plays an important role across a lot of different types of applications, whether focused specifically on content management, knowledge management or even business intelligence and analytics and decision-making. So the ability to add structure to unstructured information is clearly a missing piece to that."
User Profile
After merging with Union Carbide Corp. in February 2001, The Dow Chemical Co. had to figure out how to index and categorize the several hundred thousand technical reports it inherited.
The reports covered an 80-year period and included information from companies acquired by Union Carbide. Most of the collections were in paper form. In addition, documents and chemical substance registries were scattered across many disparate collections, and document management and indexing practices were inconsistent.
"We needed to find a way to integrate those reports into the Dow system, preferably without hiring 100 human indexers to sit and read through them," says Anne Rogers, leader of proprietary information services at Midland, Mich.-based Dow.
She says Dow was able to use ClearForest's technology to do automatic indexing and categorizing by document type and then set up rules for understanding the chemistry.
Rogers says Dow worked with ClearForest, which tuned and developed its ClearTags software to index all of the company's content. Before doing that, however, Dow had to digitize the reports so the ClearForest tools could read them electronically.
Rogers says ClearForest was able to take all the content and sort it by document type: reports that needed to be saved for a long time vs. reports to be saved for a short time.
The second step was to teach the autotagger about chemistry and use some existing chemical indexes.
"[We] set it up so it could read through these reports and say, 'OK, here's what the report's about,' " Rogers says. "We actually did this in conjunction with human indexers, so it was a machine-aided indexing process rather than just purely automatic." ClearTags was able to speed up the human interaction component, she says.
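The two-step, machine-aided process Rogers describes can be illustrated with a minimal Python sketch. It is not ClearForest's or Dow's implementation: the keyword rules, the tiny chemical index, the function names and the review rule are all hypothetical stand-ins, and simple keyword matching takes the place of the semantic, statistical and structural analysis that ClearTags is said to combine.

```python
import re
from dataclasses import dataclass

# Hypothetical retention rules (not Dow's): keywords that hint at a document type.
DOC_TYPE_RULES = {
    "long_term": ("patent", "process design", "toxicology"),
    "short_term": ("trip report", "meeting minutes", "status update"),
}

# Hypothetical stand-in for an existing chemical index: term -> category.
CHEMICAL_INDEX = {
    "ethylene oxide": "epoxides",
    "polyethylene": "polymers",
    "vinyl acetate": "monomers",
}


@dataclass
class IndexedReport:
    doc_type: str
    tags: dict
    needs_review: bool


def classify_doc_type(text):
    """Step 1: sort a report by document type using keyword rules."""
    lowered = text.lower()
    for doc_type, keywords in DOC_TYPE_RULES.items():
        if any(keyword in lowered for keyword in keywords):
            return doc_type
    return "unknown"


def tag_chemistry(text):
    """Step 2: tag the report with whole-word matches from the chemical index."""
    lowered = text.lower()
    return {
        term: category
        for term, category in CHEMICAL_INDEX.items()
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def index_report(text, min_tags=1):
    """Combine both steps and flag uncertain results for a human indexer."""
    doc_type = classify_doc_type(text)
    tags = tag_chemistry(text)
    # Assumed review rule: send the report to a person when the machine
    # cannot decide the type or finds too few chemistry terms.
    needs_review = doc_type == "unknown" or len(tags) < min_tags
    return IndexedReport(doc_type, tags, needs_review)


report = index_report(
    "Toxicology study of ethylene oxide exposure in the polyethylene unit."
)
print(report)
```

In this sketch the machine handles the reports it can classify and tag, and the `needs_review` flag routes the rest to human indexers, which is one simple way to picture a machine-aided process rather than a purely automatic one.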
Today, the entire Union Carbide collection is fully integrated with Dow's electronic collection system, which provides search capabilities for litigation support, divestitures, and research and development projects, all of which are mission-critical to Dow.