Sentiment analysis has emerged as a leading technique to automatically identify affective information within texts. In sentiment analysis, affective states are generally represented using either categorical or dimensional approaches (Calvo and Kim, 2013). The categorical approach represents affective states as several discrete classes (e.g., positive, negative, neutral), while the dimensional approach represents affective states as continuous numerical values on multiple dimensions, such as the valence-arousal (VA) space (Russell, 1980), as shown in Fig. 1. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Based on this two-dimensional representation, any affective state can be represented as a point in the VA coordinate plane by determining the degrees of valence and arousal of given words (Wei et al., 2011; Malandrakis et al., 2011; Yu et al., 2015; Wang et al., 2016) or texts (Kim et al., 2010; Paltoglou et al., 2013). Dimensional sentiment analysis has emerged as a compelling topic for research, with applications including antisocial behavior detection (Munezero et al., 2011), mood analysis (De Choudhury et al., 2012), and product review ranking (Ren and Nickerson, 2014).
Figure 1. Two-dimensional valence-arousal space.
Sentiment lexicons with valence-arousal ratings are useful resources for the development of dimensional sentiment applications. Due to the limited availability of such VA lexicons, especially for Chinese, the objective of the task is to automatically acquire the valence-arousal ratings of Chinese affective words.
Given a word, participants are asked to provide a real-valued score from 1 to 9 for each of the valence and arousal dimensions, ranging from most negative to most positive for valence, and from most calm to most excited for arousal. The input format is “word_id, word”, and the output format is “word_id, valence_rating, arousal_rating”. Below are example inputs and outputs for the words 勝利 (victory), 痛苦 (pain), 乏味 (tedious), and 放鬆 (relaxed).
Input: 0001, 勝利
Output: 0001, 7.8, 7.2
Input: 0002, 痛苦
Output: 0002, 2.4, 6.8
Input: 0003, 乏味
Output: 0003, 3.4, 3.0
Input: 0004, 放鬆
Output: 0004, 6.2, 2.0
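The input and output formats above can be handled with a few lines of code. The following is a minimal sketch in Python, not an official tool of the task: the names `Rating`, `parse_input_line`, and `format_output_line` are hypothetical, and the one-decimal formatting simply mirrors the examples above rather than a stated requirement.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    """One output record: a word id with its valence and arousal scores."""
    word_id: str
    valence: float
    arousal: float


def parse_input_line(line: str) -> tuple[str, str]:
    """Split an input line of the form 'word_id, word' into its two fields."""
    word_id, word = (field.strip() for field in line.split(",", 1))
    return word_id, word


def format_output_line(rating: Rating) -> str:
    """Render 'word_id, valence_rating, arousal_rating', checking the 1-9 range."""
    for name, value in (("valence", rating.valence), ("arousal", rating.arousal)):
        if not 1.0 <= value <= 9.0:
            raise ValueError(f"{name} {value} is outside the 1-9 range")
    # One decimal place is an assumption taken from the examples above.
    return f"{rating.word_id}, {rating.valence:.1f}, {rating.arousal:.1f}"


word_id, word = parse_input_line("0001, 勝利")
print(format_output_line(Rating(word_id, 7.8, 7.2)))
```

Rejecting out-of-range scores at output time is a design choice of this sketch; a system could instead clip its predictions to the interval from 1 to 9 before writing them.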
The performance is evaluated by examining the difference between machine-predicted ratings and human-annotated ratings (valence and arousal are treated independently). The evaluation metrics include:
A detailed description of the metrics can be found here.
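The metrics themselves are not listed in this section. Purely as an illustration of comparing machine-predicted ratings with human-annotated ratings one dimension at a time, the sketch below assumes two common choices, mean absolute error and the Pearson correlation coefficient; neither is confirmed here as the task's official metric. The function names are hypothetical, the gold valence values are taken from the four example words above, and the predicted values are invented for demonstration.

```python
import math
from typing import Sequence


def mean_absolute_error(gold: Sequence[float], pred: Sequence[float]) -> float:
    """Average absolute difference between gold and predicted ratings."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


def pearson_r(gold: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between gold and predicted ratings."""
    if len(gold) != len(pred) or len(gold) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(gold)
    mean_g = sum(gold) / n
    mean_p = sum(pred) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, pred))
    norm_g = math.sqrt(sum((g - mean_g) ** 2 for g in gold))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    if norm_g == 0 or norm_p == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / (norm_g * norm_p)


# Gold valence ratings of the four example words (from the examples above).
gold_valence = [7.8, 2.4, 3.4, 6.2]
# Hypothetical system predictions, invented for this illustration only.
pred_valence = [7.0, 3.0, 4.0, 6.0]

print("MAE:", mean_absolute_error(gold_valence, pred_valence))
print("Pearson r:", pearson_r(gold_valence, pred_valence))
```

Because valence and arousal are treated independently, the same two functions would be called a second time on the arousal ratings.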