Sentiment analysis has emerged as a leading technique for automatically identifying affective information in text. In sentiment analysis, affective states are generally represented using either categorical or dimensional approaches (Calvo and Kim, 2013). The categorical approach represents affective states as several discrete classes (e.g., positive, negative, neutral), while the dimensional approach represents affective states as continuous numerical values on multiple dimensions, such as the valence-arousal (VA) space (Russell, 1980), as shown in Fig. 1. Valence represents the degree of pleasant or unpleasant (positive or negative) feeling, and arousal represents the degree of excitement or calm. Based on this two-dimensional representation, any affective state can be represented as a point in the VA coordinate plane by determining the degrees of valence and arousal of given words (Wei et al., 2011; Malandrakis et al., 2011; Yu et al., 2015; Wang et al., 2016) or texts (Kim et al., 2010; Paltoglou et al., 2013). Dimensional sentiment analysis has emerged as a compelling research topic, with applications including antisocial behavior detection (Munezero et al., 2011), mood analysis (De Choudhury et al., 2012), and product review ranking (Ren and Nickerson, 2014).
In 2016, we hosted the first dimensional sentiment analysis task for Chinese words (Yu et al., 2016b) at the 20th International Conference on Asian Language Processing (IALP 2016), which attracted 22 registered teams (including 3 private firms). This year, we extend the task to cover both word-level and phrase-level dimensional sentiment analysis for Chinese.
Figure 1. Two-dimensional valence-arousal space.
Sentiment lexicons with valence-arousal ratings are useful resources for the development of dimensional sentiment applications. Due to the limited availability of such VA lexicons, especially for Chinese, the objective of the task is to automatically acquire the valence-arousal ratings of Chinese affective words and phrases.
Given a word or phrase, participants are asked to provide a real-valued score from 1 to 9 for each of the valence and arousal dimensions, where valence ranges from most negative to most positive and arousal ranges from most calm to most excited. The input format is “term_id, term”, and the output format is “term_id, valence_rating, arousal_rating”. The examples below show the input and output for the words and phrases 好 (good), 非常好 (very good), 滿意 (satisfied), and 不滿意 (not satisfied).
Example 1:
Input: 1, 好
Output: 1, 6.8, 5.2
Example 2:
Input: 2, 非常好
Output: 2, 8.500, 6.625
Example 3:
Input: 3, 滿意
Output: 3, 7.2, 5.6
Example 4:
Input: 4, 不滿意
Output: 4, 2.813, 5.688
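The input and output formats above can be handled with a few lines of code. The following is a minimal sketch, not part of the task definition: the function names `parse_input_line` and `format_output_line` are hypothetical, and both the clamping of ratings to the 1 to 9 range and the fixed three-decimal formatting are assumptions made for illustration (the examples above show ratings with one or three decimals).

```python
def parse_input_line(line):
    """Split a 'term_id, term' line into (term_id, term)."""
    # Split only on the first comma so the term itself is kept intact.
    term_id, term = line.split(",", 1)
    return int(term_id.strip()), term.strip()


def format_output_line(term_id, valence, arousal):
    """Build a 'term_id, valence_rating, arousal_rating' line."""
    # Assumption: out-of-range predictions are clamped to the 1-9 scale.
    valence = min(9.0, max(1.0, float(valence)))
    arousal = min(9.0, max(1.0, float(arousal)))
    # Assumption: three decimals, as in Examples 2 and 4.
    return f"{term_id}, {valence:.3f}, {arousal:.3f}"


if __name__ == "__main__":
    term_id, term = parse_input_line("2, 非常好")
    # The ratings here are copied from Example 2, not predicted by a model.
    print(format_output_line(term_id, 8.5, 6.625))
```

A real system would replace the hard-coded ratings with the predictions of its own model for each parsed term.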
The performance is evaluated by examining the difference between machine-predicted ratings and human-annotated ratings (valence and arousal are treated independently). The evaluation metrics include:
A detailed description of the metrics can be found here.
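As an illustration of comparing machine-predicted ratings with human-annotated ratings, the sketch below computes two measures commonly used for real-valued predictions: the mean absolute error and the Pearson correlation coefficient. Treating these as the task's metrics is an assumption made here for illustration. The function names and the sample ratings are hypothetical. Following the description above, each function is applied to one dimension at a time, so valence and arousal are scored independently.

```python
import math


def mean_absolute_error(gold, pred):
    """Average absolute difference between gold and predicted ratings."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


def pearson_correlation(gold, pred):
    """Pearson correlation coefficient between gold and predicted ratings."""
    if len(gold) != len(pred) or len(gold) < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    n = len(gold)
    mean_g = sum(gold) / n
    mean_p = sum(pred) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, pred))
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / math.sqrt(var_g * var_p)


if __name__ == "__main__":
    # Hypothetical valence ratings for four terms (gold vs. predicted).
    gold_valence = [6.8, 8.5, 7.2, 2.813]
    pred_valence = [6.5, 8.0, 7.0, 3.5]
    print(mean_absolute_error(gold_valence, pred_valence))
    print(pearson_correlation(gold_valence, pred_valence))
```

Lower error and higher correlation both indicate closer agreement with the human annotations; the same two calls would be repeated with the arousal ratings.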