Dataset:
BOUN Treebank v2.11

dc.contributor.author: Marşan, Büşra
dc.contributor.author: Türk, Utku
dc.contributor.author: Atmaca, Furkan
dc.contributor.author: Özateş, Şaziye Betül
dc.contributor.author: Berk, Gözde
dc.contributor.author: Bedir, Seyyit Talha
dc.contributor.author: Köksal, Abdullatif
dc.contributor.author: Başaran, Balkız Öztürk
dc.contributor.author: Güngör, Tunga
dc.contributor.author: Özgür, Arzucan
dc.contributor.author: Uskudarli, Susan
dc.contributor.author: Akkurt, Salih Furkan
dc.date.accessioned: 2023-03-03T22:22:46Z
dc.date.available: 2023-03-03T22:22:46Z
dc.date.issued: 2022
dc.description: This dataset is the re-annotated version of the BOUN Treebank. Extracted from the Turkish National Corpus (TNC), the BOUN Treebank consists of 9,761 sentences (121,214 tokens) from five different text types: biographical texts, national newspapers, instructional texts, popular culture articles, and essays. The syntactic dependency relations and morphological features of the sentences were manually annotated by linguists following the Universal Dependencies (UD) scheme. Some statistics on the treebank:
- Although the dataset shows word order variation, more than 70% of the sentences have OV and SV word order.
- The average token count of the updated treebank is 12.74, and the average arc length is 2.90.
dc.description.sponsorship: TÜBİTAK, 16909, Dilbilim Temelli Türkçe Doğal Dil İşleme Platformu, nationalFunds
dc.identifier.uri: https://tulap.cmpe.boun.edu.tr/handle/20.500.12913/65
dc.language.iso: tur
dc.publisher: Boğaziçi University
dc.relation.isreferencedby: https://arxiv.org/abs/2207.11782
dc.subject: dependency annotation
dc.subject: universal dependencies
dc.title: BOUN Treebank v2.11
dc.type: corpus
dspace.entity.type: Dataset
local.contact.person: Büşra, Marşan, busra.marsan@boun.edu.tr, Boğaziçi University
local.size.info: 9761, sentences
Files
Original bundle
Name:
tr_boun_v2-dev.conllu
Size:
944.41 KB
Format:
Unknown data format
Description:
Turkish BOUN Treebank v2, dev file
Name:
tr_boun_v2-test.conllu
Size:
933.25 KB
Format:
Unknown data format
Description:
Turkish BOUN Treebank v2, test file
Name:
tr_boun_v2-train.conllu
Size:
7.55 MB
Format:
Unknown data format
Description:
Turkish BOUN Treebank v2, train file
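
The description quotes an average token count and an average arc length for the treebank. As a rough illustration of how such figures could be recomputed from the .conllu files listed above, here is a minimal Python sketch that uses only the standard library. It rests on several assumptions that the record does not confirm: that "token count" means syntactic words per sentence, that arc length is the absolute distance between a word's ID and its HEAD, and that root arcs, multiword token ranges, and empty nodes are excluded. The function names `read_sentences` and `treebank_stats` are hypothetical, and the results may differ from the published 12.74 and 2.90 if the authors counted differently.

```python
from typing import Iterable, Iterator, List, Tuple


def read_sentences(lines: Iterable[str]) -> Iterator[List[Tuple[int, int]]]:
    """Yield each sentence as a list of (token_id, head_id) pairs."""
    sentence: List[Tuple[int, int]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as sent_id or text
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns, integer ID and HEAD.
        # This skips multiword token ranges ("3-4") and empty nodes ("5.1").
        if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
            continue
        sentence.append((int(cols[0]), int(cols[6])))
    if sentence:
        yield sentence


def treebank_stats(lines: Iterable[str]) -> Tuple[float, float]:
    """Return (average words per sentence, average arc length)."""
    n_sentences = n_words = n_arcs = total_arc_length = 0
    for sentence in read_sentences(lines):
        n_sentences += 1
        n_words += len(sentence)
        for token_id, head_id in sentence:
            if head_id == 0:
                continue  # assumption: the root arc has no linear length
            total_arc_length += abs(token_id - head_id)
            n_arcs += 1
    avg_words = n_words / n_sentences if n_sentences else 0.0
    avg_arc = total_arc_length / n_arcs if n_arcs else 0.0
    return avg_words, avg_arc
```

To try it on one of the files, open it as UTF-8 text and pass the file object directly, for example `treebank_stats(open("tr_boun_v2-dev.conllu", encoding="utf-8"))`; the file is read line by line, so the train file does not need to fit in memory at once.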