Sentiment analysis is not a new concept. Business, marketing and investment strategists have long been scanning newspapers, industry publications and the like to scope out how the public, consumers and influential pundits view a particular company, brand or product.
Two recent developments are dramatically changing the game, however. First, the Web has seen a rapid proliferation of online news services, blogs and other social media that post up-to-the-minute information and commentary about companies and their products. Second, semantic-software vendors -- including Lexalytics, Expert System, Endeca, Cambridge Semantics and Cymfony -- now offer sentiment analysis tools as components of their platforms.
In fact, a growing number of semantic platforms can query and analyze material from Web 2.0 sources such as blogs, Twitter, and social networks like Facebook and LinkedIn. (For more about semantic technology, see our main story.)
Other vendors offering software or services for sentiment and semantic analysis of Web and Web 2.0-sourced material include Dachis Group, Evolve24, Radian6, OpenAmplify, DNA 13 and recent start-up AdmantX.
Sentiment analysis platforms use two main methodologies. One involves a statistical or model-based approach wherein the system learns to assess sentiment by analyzing large quantities of pre-scored material. The other method utilizes a large dictionary of pre-scored phrases.
With either option, the customer often needs to add the scoring for domain- or specialty-specific phrases.
Dow Jones, in collaboration with Columbia University and the University of Notre Dame, has compiled a dictionary of about 3,700 words that can signal changes in sentiment. Positive words include ingenuity, strength and winner; negative ones include litigious, colludes and risk.
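To make the dictionary method concrete, here is a minimal sketch in Python. It is not the Dow Jones dictionary or any vendor's implementation: the six words are the examples cited above, but the numeric weights, the `score_text` function and the `DOMAIN_TERMS` additions are assumptions invented for illustration. It also scores single words only, whereas commercial dictionaries also cover multiword phrases.

```python
import re

# The six words come from the article; the +1/-1 weights are assumed.
BASE_LEXICON = {
    "ingenuity": 1, "strength": 1, "winner": 1,
    "litigious": -1, "colludes": -1, "risk": -1,
}

# Hypothetical domain-specific terms a customer might add (invented here).
DOMAIN_TERMS = {"recall": -2, "upgrade": 1}


def score_text(text, lexicon):
    """Sum the scores of every lexicon word that appears in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


# Merge the base dictionary with the customer's domain terms.
lexicon = {**BASE_LEXICON, **DOMAIN_TERMS}

headline = "Analysts praise the ingenuity of the winner despite litigation risk"
print(score_text(headline, lexicon))
```

A score above zero suggests net positive wording and a score below zero net negative wording. Merging `DOMAIN_TERMS` into the base dictionary mirrors the step where the customer supplies scores for specialty vocabulary.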
The dictionary is the basis for Dow Jones Lexicon, a service that allows securities traders and analysts to determine sentiment, frequency and other relevant complex patterns within news coverage to develop predictive trading strategies. In addition, Dow Jones has developed a tool that it calls the Economic Sentiment Indicator, which performs sentiment analysis on material from 15 daily newspapers to track and predict the health of the U.S. economy.
Getting to nuances of opinion
The marriage of semantic and sentiment analysis can provide a powerful tool for determining the import and meaning of a piece of information, industry sources agree.
Say a Ford marketing executive wants to check out current consumer attitudes toward a particular car model. A platform such as Expert System's Cogito can semantically query various industry blogs and postings and come back with potentially relevant comments such as: "The seats of my ugly Ford Explorer are great." Using a combination of semantic and sentiment analysis, it can then determine that "this is a positive judgment of seats, but not of the brand," says Expert System Partner Luca Scagliarini.
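The distinction in that example, one opinion about the seats and another about the vehicle, can be illustrated with a toy Python sketch. This is not how Cogito or any commercial product works. The word lists, the `aspect_sentiment` function and the two attachment rules (an opinion word following "is" or "are" is attached to the first target mentioned; any other opinion word is attached to the next target that follows it) are simplifying assumptions made purely for illustration.

```python
import re

# Hypothetical opinion scores and target vocabulary, invented for this sketch.
OPINION_WORDS = {"great": 1, "ugly": -1}
ASPECTS = {"seats": "seats", "ford": "brand", "explorer": "brand"}
COPULAS = {"is", "are", "was", "were"}


def aspect_sentiment(sentence):
    """Attach each opinion word to a target using two crude rules."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    # Treat the first target mentioned as the grammatical subject.
    subject = next((ASPECTS[t] for t in tokens if t in ASPECTS), None)
    scores = {}
    for i, token in enumerate(tokens):
        if token not in OPINION_WORDS:
            continue
        if i > 0 and tokens[i - 1] in COPULAS:
            # Predicate adjective ("... are great") describes the subject.
            target = subject
        else:
            # Attributive adjective ("ugly Ford") describes the next target.
            target = next((ASPECTS[t] for t in tokens[i + 1:] if t in ASPECTS), None)
        if target is not None:
            scores[target] = scores.get(target, 0) + OPINION_WORDS[token]
    return scores


print(aspect_sentiment("The seats of my ugly Ford Explorer are great."))
```

Under these assumptions the sentence yields a positive score for the seats and a negative one for the brand. Real platforms rely on full linguistic parsing rather than word-position rules like these.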
Another tool, Thomson Reuters' Machine Readable News service, is designed to assess the positive or negative impact of news reports on industries and individual companies, says Rich Brown, the service's global business manager. For example, a major hurricane threatening the Gulf Coast could benefit energy companies by pushing up oil prices and the prices of energy companies' stocks, but it could hurt companies that have rigs in the Gulf of Mexico.
The potential paybacks of using semantic and sentiment analysis tools and services are impressive. Deutsche Bank recently conducted a study to determine the ROI of using a service like Thomson Reuters' Machine Readable News to pick stocks. It found that picking stocks for a portfolio based on the results of news sentiment analysis generated yearly returns of 5% for a low-risk strategy and 12% for a high-risk strategy, after trading costs. The report concluded that "News sentiment is a promising alpha signal for quantitative investors.... because it opens up a whole new data source -- textual data -- that was previously impossible to use in a systematic fashion."
Still, analyzing Web content can be tricky, particularly when the material being analyzed includes emotional commentary peppered with "profanity, acronyms, sarcasms, multiple exclamation points" and the like, says Brown. Also, industry experts (and common sense) suggest that you'll get better results if you stick with well-known and credible sources and disregard postings by Joe Schmo Blogger who just loves to spout.
Horwitt, a freelance reporter and former Computerworld senior editor, is based in Waban, Mass. Contact her at email@example.com.