The Chinese Valence-Arousal Text (CVAT) is an affective corpus of 2,969 sentences collected from the web in six categories: news articles, political discussion forums, car discussion forums, hotel reviews, book reviews, and laptop reviews. Each sentence is manually annotated with a real-valued score on each of the valence and arousal dimensions. Valence represents the degree of positive or negative sentiment, and arousal represents the degree of calm or excitement. Valence ranges from 1 (highly negative) to 9 (highly positive), and arousal ranges from 1 (highly calm) to 9 (highly excited). The scatter plot of the CVAT is shown below.


No. Text Category Valence Arousal
357 很多車主抱怨新車怠速抖動嚴重----冷車時更嚴重。 Car 3.250 5.667
805 房間裏黴味,煙味撲鼻,沒有窗戶通風,骯髒的地毯上的斑斑點點的污蹟,令人觸目驚心。 Hotel 1.889 6.875
982 CPU顯卡也完全夠用,接口也非常齊全,總體來說很滿意! Laptop 7.143 5.000
1078 飛安帶來更多保障,也提供旅客更安心的服務品質。 News 7.000 4.222

  • Valence/Arousal: Mean of the valence/arousal ratings.

  • Download

    CVAT 1.0 (2,009 sentences; released on August 1, 2016).

    CVAT 2.0 (2,969 sentences; released on August 1, 2019).
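
As a rough illustration of working with rows laid out like the sample table above (No., Text, Category, Valence, Arousal), the Python sketch below parses one whitespace-separated row and checks that both scores fall in the 1 to 9 range described earlier. The layout of the downloaded files is not specified on this page, so the delimiter and column order are assumptions, and `Row` and `parse_row` are hypothetical names used only for this example.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One annotated sentence (hypothetical container for this sketch)."""
    no: int
    text: str
    category: str
    valence: float
    arousal: float


def parse_row(line: str) -> Row:
    """Parse a row in the assumed order: No., Text, Category, Valence, Arousal."""
    # Take the id from the left and the last three fields from the right,
    # so any whitespace inside the sentence itself is preserved.
    no, rest = line.strip().split(maxsplit=1)
    text, category, valence, arousal = rest.rsplit(maxsplit=3)
    row = Row(int(no), text, category, float(valence), float(arousal))

    # Both dimensions are documented as ranging from 1 to 9.
    for name, score in (("valence", row.valence), ("arousal", row.arousal)):
        if not 1.0 <= score <= 9.0:
            raise ValueError(f"{name} score {score} is outside the 1-9 range")
    return row


# Sample row copied from the table above.
sample = "982 CPU顯卡也完全夠用,接口也非常齊全,總體來說很滿意! Laptop 7.143 5.000"
row = parse_row(sample)
print(row.no, row.category, row.valence, row.arousal)
```

Splitting from both ends rather than on every space keeps the sentence intact even if it contains spaces. If the actual files use tabs or a different column order, the two split calls are the only lines that would need to change.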


    Liang-Chih Yu, Lung-Hao Lee, Shuai Hao, Jin Wang, Yunchao He, Jun Hu, K. Robert Lai, and Xuejie Zhang. 2016. Building Chinese affective resources in valence-arousal dimensions. In Proceedings of NAACL/HLT-16, pages 540-545.


    Liang-Chih Yu


    Department of Information Management, Yuan Ze University