Dataset:
BOUN Treebank v2.11 - Unrestricted

Abstract
Description
The BOUN Treebank was created by TABILAB and supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under grant number 117E971. It includes a total of 9,761 manually annotated sentences on various topics, drawn from biographical texts, national newspapers, instructional texts, popular culture articles, and essays. The texts are taken from the Turkish National Corpus (TNC). The dependency relations in the BOUN Treebank are manually annotated in the UD framework. The morphological features and UPOS information are retrieved from the morphological parser of Sak et al. (2011) and converted to UD morphology automatically using our script. The morphological features, UPOS tags, XPOS tags, and lemma forms are then manually corrected. A version of this treebank is published in the UD repository and on the UD webpage. This version diverges from the UD version in the following respects:
- Question particles are marked with the PART UPOS tag instead of AUX, as this tag is linguistically more accurate.
- The dependency tag of question particles is part:q instead of aux:q.
- The word form of null copulas is N/A instead of “null”, to indicate that they are abstract forms.
- The MISC attribute that denotes the root or stem of a derived form is df= instead of DerivedFrom=.
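The four divergences listed above can be mapped back to the UD conventions with a small script. The following is a minimal sketch, not the treebank's own tooling. It assumes the files use the standard 10-column, tab-separated CoNLL-U layout, that question particles can be identified by the part:q relation, and that N/A occurs as a word form only for null copulas. The function name `to_ud_conventions` and the sample token are hypothetical.

```python
def to_ud_conventions(line: str) -> str:
    """Map one CoNLL-U line (no trailing newline) to the UD-style conventions.

    Assumption: standard 10-column CoNLL-U; columns are
    ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
    """
    if not line.strip() or line.startswith("#"):
        return line  # blank lines and sentence-level comments pass through
    cols = line.split("\t")
    if len(cols) != 10:
        return line  # not a standard token line; leave it untouched

    form, upos, deprel, misc = cols[1], cols[3], cols[7], cols[9]

    # Question particles: part:q -> aux:q, and PART -> AUX on those tokens.
    if deprel == "part:q":
        deprel = "aux:q"
        if upos == "PART":
            upos = "AUX"

    # Null copulas: word form N/A -> null (assumes N/A is used only for them).
    if form == "N/A":
        form = "null"

    # MISC: rename the df= attribute to DerivedFrom=, keeping other attributes.
    if misc != "_":
        misc = "|".join(
            "DerivedFrom=" + item[len("df="):] if item.startswith("df=") else item
            for item in misc.split("|")
        )

    cols[1], cols[3], cols[7], cols[9] = form, upos, deprel, misc
    return "\t".join(cols)


# Hypothetical token line, for illustration only (not taken from the treebank).
sample = "3\tmi\tmi\tPART\tQues\t_\t2\tpart:q\t_\tdf=x"
converted = to_ud_conventions(sample)
```

Applying the function line by line leaves sentence boundaries and comment lines intact. Any other differences between this version and the UD release are outside the scope of this sketch.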
Keywords
dependency, treebank
Citation
Sponsor