Sidebar: Text Mining Glossary

Text miners use a variety of approaches to extract and present relevant information. Below are definitions of common methods:

Categorization - Presenting search results in categories, rather than as an undifferentiated mass.

Clustering - Grouping similar documents based on their content.
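
As a rough illustration of clustering, the sketch below groups documents whose word sets overlap strongly. It is a deliberately simplified stand-in for what commercial tools do: the function names, the Jaccard similarity measure and the 0.3 threshold are all assumptions made for this example, not part of any particular product.

```python
import re

def words(text):
    """Lowercase a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(documents, threshold=0.3):
    """Greedy grouping: a document joins the first cluster whose
    first member is similar enough, otherwise it starts a new one.
    The threshold is an arbitrary value chosen for illustration."""
    clusters = []  # each cluster is a list of document names
    for name, text in documents.items():
        doc_words = words(text)
        for group in clusters:
            representative = words(documents[group[0]])
            if jaccard(doc_words, representative) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters

docs = {
    "a": "server sales rose sharply",
    "b": "server sales fell sharply",
    "c": "new cooling unit installed",
}
print(cluster(docs))
```

Documents "a" and "b" share three of their five distinct words, so they end up in one group, while "c" shares none and forms its own.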

Extraction - Pulling relevant information out of a document, for example, all the company names in a data set.
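
The company-name example might be sketched as follows. This assumes, purely for illustration, that a company name is a run of capitalized words ending in a suffix such as Inc or Corp; real extraction tools rely on far richer rules or trained models, and the pattern and function name here are hypothetical.

```python
import re

# Assumption: one or more capitalized words followed by a corporate suffix.
COMPANY = re.compile(r"\b((?:[A-Z][A-Za-z]*\s+)+(?:Inc|Corp|Ltd|LLC)\b\.?)")

def extract_companies(text):
    """Return the distinct company names found in text, in order of first appearance."""
    found = []
    for match in COMPANY.findall(text):
        name = match.rstrip(".")  # drop the trailing period of "Inc."
        if name not in found:
            found.append(name)
    return found

sample = (
    "Acme Widgets Inc. agreed to buy Globex Corp. last week. "
    "Acme Widgets Inc. declined to comment."
)
print(extract_companies(sample))
```

A heuristic like this will also pick up any capitalized word that happens to precede a name, which is one reason production tools go well beyond pattern matching.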

Keyword search - Searching documents for the occurrence of a particular word or set of words.
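
In its simplest form, a keyword search can be sketched as below. The function name, the sample documents and the choice to require every keyword (rather than any one of them) are assumptions made for this example.

```python
import re

def keyword_search(documents, keywords):
    """Return the names of documents containing every keyword,
    matched as whole words and ignoring case."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for name, text in documents.items():
        found = set(re.findall(r"[a-z0-9']+", text.lower()))
        if wanted <= found:  # all keywords are present
            hits.append(name)
    return hits

docs = {
    "memo.txt": "Quarterly revenue rose on strong server sales.",
    "note.txt": "The server room needs a new cooling unit.",
}
print(keyword_search(docs, ["server", "sales"]))
```

Only memo.txt contains both words, so it is the only match; searching for "server" alone would return both documents.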

Natural-language processing - Determining the meaning of written words, taking into account their context, grammar, colloquialisms and so on.

Taxonomy - Categorization of data according to a predefined framework, either industry-standard or customized. Some tools can automatically generate a taxonomy based on analysis of the data store.

Visualization - Graphically presenting the mined data so relationships are easier to spot and understand.

Copyright © 2004 IDG Communications, Inc.