Dataset for Targeted Sentiment Analysis in Turkish

Mutlu, Mustafa Melih; Özgür, Arzucan


Files

test_dataset.csv (58.87 KB)

train_dataset.csv (186.98 KB)

validation_dataset.csv (46.51 KB)

Date

2022

Authors

Mutlu, Mustafa Melih

Özgür, Arzucan

Publisher

Boğaziçi University

Contact Person

Halil Burak Pala, palahb@gmail.com, Boğaziçi University

Description

This dataset contains 3440 public Turkish tweets about six different brands, with timestamps spanning the six-month period from January 2020 to June 2020. The tweets were collected via the official Twitter API by searching separately for each of the 6 targets, which were selected from well-known companies and brands. The dataset is manually annotated with three labels: positive, negative, and neutral. Two factors are considered in the annotation process, namely sentence sentiment and targeted sentiment, so each tweet has two labels. The sentence sentiment label expresses the overall sentiment of the sentence, regardless of the target word, as in traditional sentiment analysis. The targeted sentiment label, in contrast, reflects the sentiment toward the target in that sentence. The dataset is split into train, validation, and test sets containing 2200, 548, and 692 tweets, respectively.
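As a quick sanity check after downloading, the split sizes stated above can be compared against the three CSV files. The following is a minimal sketch, not part of the dataset itself: the helper names `count_rows` and `check_split_sizes` are hypothetical, and it assumes that each file is UTF-8 encoded and starts with a single header row, which this page does not confirm.

```python
import csv
from pathlib import Path

# Split sizes as stated in the description; together they sum to 3440.
EXPECTED_SIZES = {
    "train_dataset.csv": 2200,
    "validation_dataset.csv": 548,
    "test_dataset.csv": 692,
}


def count_rows(path, has_header=True):
    """Count non-empty CSV records; a quoted multi-line tweet counts once.

    Assumption: the file is UTF-8 and, by default, has one header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        n = sum(1 for row in csv.reader(f) if row)
    return max(n - 1, 0) if has_header else n


def check_split_sizes(data_dir="."):
    """Return {filename: (rows_found, rows_expected)} for each split file."""
    return {
        name: (count_rows(Path(data_dir) / name), expected)
        for name, expected in EXPECTED_SIZES.items()
    }
```

Calling `check_split_sizes()` from the directory that holds the downloaded files returns the found and expected counts side by side. If the files turn out to have no header row, the counts will be off by one and `count_rows` can be called with `has_header=False` instead.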

Keywords

targeted sentiment

Referenced by

https://arxiv.org/abs/2205.04185

URI

https://tulap.cmpe.boun.edu.tr/handle/20.500.12913/86
