Text miners use a variety of approaches to extract and present relevant information. Below are definitions of common methods:
Categorization - Presenting search results in categories rather than as an undifferentiated mass.
Clustering - Grouping similar documents based on their content.
Extraction - Pulling relevant information out of a document, for example, all the company names in a data set.
Keyword search - Searching documents for the occurrence of a particular word or set of words.
Natural-language processing - Determining the meaning of written words, taking into account their context, grammar, colloquialisms and so on.
Taxonomy - Categorization of data according to a predefined framework, either industry-standard or customized. Some tools can automatically generate a taxonomy based on analysis of the data store.
Visualization - Graphically presenting the mined data so relationships are easier to spot and understand.
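Three of the methods defined above (keyword search, extraction and taxonomy-based categorization) can be illustrated with a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for illustration: the sample documents, the taxonomy terms, the function names and the company-name pattern are hypothetical, and production text-mining tools rely on far more robust techniques than these simple rules.

```python
import re

# Hypothetical sample documents, invented for illustration only.
DOCS = {
    "d1": "Acme Corp. reported higher revenue after its merger with Globex Inc.",
    "d2": "The new vaccine trial enrolled patients at three hospitals.",
    "d3": "Initech Ltd. shares fell as revenue missed forecasts.",
}

# Hypothetical predefined taxonomy: category -> indicative terms.
TAXONOMY = {
    "finance": {"revenue", "shares", "merger"},
    "health": {"vaccine", "patients", "hospitals"},
}

# Naive pattern (an assumption): capitalized words followed by Corp., Inc. or Ltd.
COMPANY_RE = re.compile(r"\b((?:[A-Z][a-z]+ )+(?:Corp|Inc|Ltd))\.")


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_search(docs, keywords):
    """Keyword search: return sorted ids of documents containing every keyword."""
    wanted = {k.lower() for k in keywords}
    return sorted(
        doc_id for doc_id, text in docs.items() if wanted <= set(tokenize(text))
    )


def extract_companies(text):
    """Extraction: pull out strings that look like company names."""
    return COMPANY_RE.findall(text)


def categorize(docs, taxonomy):
    """Taxonomy: assign each document the category sharing the most terms with it."""
    result = {}
    for doc_id, text in docs.items():
        tokens = set(tokenize(text))
        scores = {cat: len(tokens & terms) for cat, terms in taxonomy.items()}
        best = max(scores, key=scores.get)
        # Documents matching no taxonomy term are left uncategorized.
        result[doc_id] = best if scores[best] > 0 else "uncategorized"
    return result


if __name__ == "__main__":
    print(keyword_search(DOCS, ["revenue"]))
    print(extract_companies(DOCS["d1"]))
    print(categorize(DOCS, TAXONOMY))
```

The keyword search matches whole words regardless of case, the extraction step depends entirely on the naive suffix pattern, and the categorization step simply counts overlapping terms. Each of these is a deliberately simple stand-in for the statistical or linguistic methods a real tool would apply.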