Dataset for Targeted Sentiment Analysis in Turkish

Date
2022
Publisher
Boğaziçi University
Contact Person
Halil Burak Pala, palahb@gmail.com, Boğaziçi University
Description
This dataset contains 3440 public Turkish tweets about six different brands, with timestamps spanning the six-month period from January 2020 to June 2020. The tweets were collected via the official Twitter API by searching separately for each of the six targets, which were selected from well-known companies and brands. The dataset is manually annotated with three labels: positive, negative, and neutral. Two factors are considered in the annotation process, sentence sentiment and targeted sentiment, so each tweet carries two labels. The sentence sentiment label expresses the overall sentiment of the sentence regardless of the target word, as in traditional sentiment analysis. The targeted sentiment label, by contrast, reflects the sentiment toward the target in that sentence. The dataset is split into train, validation, and test sets containing 2200, 548, and 692 tweets, respectively.
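The annotation scheme and split sizes described above can be sketched in Python as follows. This is only an illustrative sketch: the record does not specify the file format or column names, so the class `AnnotatedTweet`, its field names, and the sample tweet are hypothetical and not taken from the dataset. Only the three label values and the split counts come from the description.

```python
from dataclasses import dataclass
from typing import Dict, Literal

# The three labels named in the description.
Sentiment = Literal["positive", "negative", "neutral"]


@dataclass(frozen=True)
class AnnotatedTweet:
    """Hypothetical record layout; field names are assumptions."""
    text: str
    target: str                     # the brand the tweet was retrieved for
    sentence_sentiment: Sentiment   # overall sentiment, ignoring the target
    targeted_sentiment: Sentiment   # sentiment expressed toward the target


# Split sizes as stated in the description.
SPLIT_SIZES: Dict[str, int] = {"train": 2200, "validation": 548, "test": 692}
TOTAL_TWEETS = 3440


def split_fractions(sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's share of the whole dataset."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# The three splits account for every tweet in the dataset.
assert sum(SPLIT_SIZES.values()) == TOTAL_TWEETS

fractions = split_fractions(SPLIT_SIZES)

# Invented example (not from the dataset): the two labels can disagree.
example = AnnotatedTweet(
    text="My day was terrible, but BrandX delivered on time.",
    target="BrandX",
    sentence_sentiment="negative",
    targeted_sentiment="positive",
)

for name, share in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} tweets ({share:.1%})")
```

The split works out to roughly 64% train, 16% validation, and 20% test. The invented example shows why two labels are kept per tweet: a sentence can be negative overall while still being positive toward its target.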
Keywords
targeted sentiment