Sentiment analysis

Sentiment analysis refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials. Sentiment analysis is widely applied to reviews and social media for a variety of applications, ranging from marketing to customer service.
Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).

The idea was launched by Richard Thaler and Cass Sunstein in Nudge:

The Civility Check. We have saved our favorite proposal for last. The modern world suffers from insufficient civility. Every hour of every day, people send angry emails they soon regret, cursing people they barely know (or even worse, their friends and loved ones). A few of us have learned a simple rule: don’t send an angry email in the heat of the moment. File it, and wait a day before you send it. (In fact, the next day you may have calmed down so much that you forget even to look at it. So much the better.) But many people either haven’t learned the rule or don’t always follow it. Technology could easily help. In fact, we have no doubt that technologically savvy types could design a helpful program by next month.
We propose a Civility Check that can accurately tell whether the email you’re about to send is angry and caution you, “WARNING: THIS APPEARS TO BE AN UNCIVIL EMAIL. DO YOU REALLY AND TRULY WANT TO SEND IT?”
(Software already exists to detect foul language. What we are proposing is more subtle, because it is easy to send a really awful email message that does not contain any four-letter words.) A stronger version, which people could choose or which might be the default, would say, “WARNING: THIS APPEARS TO BE AN UNCIVIL EMAIL. THIS WILL NOT BE SENT UNLESS YOU ASK TO RESEND IN TWENTY-FOUR HOURS.” With the stronger version, you might be able to bypass the delay with some work (by inputting, say, your Social Security number and your grandfather’s birth date, or maybe by solving some irritating math problem!).
The Reflective System can be nicer as well as smarter than the Automatic System. Sometimes it’s even smart to be nice. We think that Humans would be better off if they gave a boost to what Abraham Lincoln called “the better angels of our nature.”

A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level — whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, “beyond polarity” sentiment classification looks, for instance, at emotional states such as “angry,” “sad,” and “happy.”

Early work in this area includes that of Turney and of Pang, who applied different methods for detecting the polarity of product reviews and movie reviews respectively. This work is at the document level. One can also classify a document’s polarity on a multi-way scale, which was attempted by Pang and Snyder among others: Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predicting star ratings on either a 3- or a 4-star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). Most statistical classification methods ignore the neutral class, under the assumption that neutral texts lie near the boundary of the binary classifier. Several researchers nevertheless suggest that, as in every polarity problem, three categories must be identified. Moreover, it can be shown that specific classifiers, such as maximum entropy classifiers and SVMs, can benefit from the introduction of a neutral class, which improves the overall accuracy of the classification.
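The assumption that neutral texts lie near the boundary of a binary classifier can be illustrated with a minimal sketch. It assumes the classifier already produces a signed score (positive above zero, negative below); the function name label_from_score, the margin value and the example scores are hypothetical and chosen only for illustration. An explicit three-class approach, as the researchers cited above recommend, would instead train on labeled neutral examples.

```python
def label_from_score(score: float, margin: float = 0.25) -> str:
    """Map a signed binary-classifier score to a three-way polarity label.

    Scores within `margin` of the decision boundary (zero) are treated as
    neutral. The margin is an illustrative assumption, not a tuned value.
    """
    if score > margin:
        return "positive"
    if score < -margin:
        return "negative"
    return "neutral"


# Hypothetical scores, as if produced by some binary sentiment classifier.
example_scores = {
    "great food": 1.4,
    "it was a restaurant": 0.1,
    "awful service": -2.0,
}
labels = {text: label_from_score(s) for text, s in example_scores.items()}
```

Widening the margin assigns more texts to the neutral class; setting it to zero recovers the plain binary decision, except for a score of exactly zero.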

A different method for determining sentiment is the use of a scaling system in which words commonly associated with a negative, neutral or positive sentiment are given a number on a -10 to +10 scale (most negative up to most positive). When a piece of unstructured text is analyzed using natural language processing, the concepts it contains are examined to see which of these sentiment words occur and how they relate to each concept. Each concept is then given a score based on the way sentiment words relate to it and on their associated scores. This allows movement to a more sophisticated understanding of sentiment based on an 11-point scale. Alternatively, a text can be given separate positive and negative sentiment strength scores if the goal is to determine the sentiment expressed in the text rather than its overall polarity and strength.
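A rough sketch of such a word-scoring approach follows. The lexicon, its scores and the function names are hypothetical, and the sketch scores a whole text at once; it does not attempt the concept extraction step described above. It returns either an averaged overall score or separate positive and negative strength scores.

```python
import re

# Hypothetical lexicon: word -> score on a -10 (most negative) to +10
# (most positive) scale. A real lexicon would be far larger.
LEXICON = {"excellent": 8, "good": 5, "okay": 0, "bad": -5, "terrible": -9}


def matched_scores(text):
    """Return the lexicon scores of every known word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return [LEXICON[w] for w in words if w in LEXICON]


def overall_score(text):
    """Average of matched word scores; 0.0 when no lexicon word is found."""
    scores = matched_scores(text)
    return sum(scores) / len(scores) if scores else 0.0


def strength_scores(text):
    """Return (positive_strength, negative_strength) for the text.

    Each strength is the most extreme matched score of that sign,
    or 0 if no word of that sign occurs.
    """
    scores = matched_scores(text)
    positive = max((s for s in scores if s > 0), default=0)
    negative = min((s for s in scores if s < 0), default=0)
    return positive, negative


review = "The food was excellent but the service was terrible"
print(overall_score(review))
print(strength_scores(review))
```

For a mixed review like the one above, the averaged score lands near zero while the two strength scores keep both the strong praise and the strong complaint visible, which is the distinction the last sentence of the paragraph draws.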