Dataset: BOUN Treebank v2.11 - Unrestricted
dc.contributor.author | Marşan, Büşra | |
dc.contributor.author | Türk, Utku | |
dc.contributor.author | Atmaca, Furkan | |
dc.contributor.author | Özateş, Şaziye Betül | |
dc.contributor.author | Berk, Gözde | |
dc.contributor.author | Bedir, Seyyit Talha | |
dc.contributor.author | Köksal, Abdullatif | |
dc.contributor.author | Başaran, Balkız Öztürk | |
dc.contributor.author | Güngör, Tunga | |
dc.contributor.author | Özgür, Arzucan | |
dc.contributor.author | Uskudarli, Susan | |
dc.contributor.author | Akkurt, Salih Furkan | |
dc.date.accessioned | 2023-03-05T10:26:54Z | |
dc.date.available | 2023-03-05T10:26:54Z | |
dc.date.issued | 2022 | |
dc.description | The BOUN Treebank was created by TABILAB and is supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under grant number 117E971. It contains a total of 9,761 manually annotated sentences covering biographical texts, national newspapers, instructional texts, popular culture articles, and essays. The texts are taken from the Turkish National Corpus (TNC). The dependency relations in the BOUN Treebank are manually annotated in the UD framework. The morphological features and UPOS information are retrieved from the morphological parser of Sak et al. (2011) and converted to UD morphology automatically using our script. The morphological features, UPOS tags, XPOS tags, and lemma forms are manually corrected. A version of this treebank is published in the UD repository and on the UD webpage. This version diverges from the UD version in the following respects: - Question particles are marked with the PART UPOS tag instead of AUX, as this tag is linguistically more accurate. - The dependency tag of question particles is part:q instead of aux:q. - The word form of null copulas is N/A instead of “null”, to indicate that they are abstract forms. - The MISC attribute that denotes the root or stem of the derived form is df= instead of DerivedFrom=. | |
dc.identifier.uri | https://tulap.cmpe.boun.edu.tr/handle/20.500.12913/81 | |
dc.language.iso | tur | |
dc.publisher | Boğaziçi University | |
dc.relation.isreferencedby | https://arxiv.org/abs/2207.11782 | |
dc.rights | The MIT License (MIT) | |
dc.rights.uri | http://opensource.org/licenses/mit-license.php | |
dc.source.uri | https://github.com/BOUN-TABILab-TULAP/UD_Turkish-BOUN_v2.11_unrestricted | |
dc.subject | dependency | |
dc.subject | treebank | |
dc.title | BOUN Treebank v2.11 - Unrestricted | |
dc.type | corpus | |
dspace.entity.type | Dataset | |
local.contact.person | Büşra, Marşan, busra.marsan@boun.edu.tr, Boğaziçi University |
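The four differences from the UD release listed in the description can be sketched as a per-token mapping. This is only an illustration, not the authors' conversion script: the function name `to_ud_conventions` is hypothetical, and it assumes standard 10-column, tab-separated CoNLL-U token lines, that question particles can be identified by the `part:q` relation, and that a FORM of `N/A` always marks a null copula.

```python
# Column positions in a standard 10-column CoNLL-U token line.
FORM, UPOS, DEPREL, MISC = 1, 3, 7, 9


def to_ud_conventions(line: str) -> str:
    """Map one line from this release's conventions to the UD-release ones.

    Hypothetical helper; comments, blank lines, and malformed lines are
    returned unchanged (minus the trailing newline).
    """
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return line
    cols = line.split("\t")
    if len(cols) != 10:
        return line
    # Assumption: question particles are exactly the tokens attached as part:q.
    if cols[DEPREL] == "part:q":
        cols[DEPREL] = "aux:q"
        if cols[UPOS] == "PART":
            cols[UPOS] = "AUX"
    # Assumption: the form N/A is used only for null copulas.
    if cols[FORM] == "N/A":
        cols[FORM] = "null"
    # Rename the df= key in MISC to DerivedFrom=, leaving other keys alone.
    if cols[MISC] != "_":
        cols[MISC] = "|".join(
            "DerivedFrom=" + item[len("df="):] if item.startswith("df=") else item
            for item in cols[MISC].split("|")
        )
    return "\t".join(cols)
```

Applying the function line by line to one of the files listed under Files would give an approximation of the UD-style annotation; it would not reproduce any other differences that may exist between the two releases.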
Files
Original bundle
- Name: test-unr.conllu
- Size: 933.28 KB
- Format: Unknown data format
- Description: test file
- Name: dev-unr.conllu
- Size: 944.44 KB
- Format: Unknown data format
- Description: dev file
- Name: train-unr.conllu
- Size: 7.55 MB
- Format: Unknown data format
- Description: train file
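As a quick sanity check after downloading the three files, the number of sentences in each split can be counted. The sketch below assumes the files follow the usual CoNLL-U layout, in which sentences are blocks of non-blank lines separated by blank lines; the helper name `count_sentences` is hypothetical, and the files are assumed to sit in the current directory.

```python
from pathlib import Path


def count_sentences(conllu_text: str) -> int:
    """Count sentence blocks: runs of non-blank lines separated by blank lines."""
    count = 0
    in_sentence = False
    for line in conllu_text.splitlines():
        if line.strip():
            if not in_sentence:
                count += 1
                in_sentence = True
        else:
            in_sentence = False
    return count


if __name__ == "__main__":
    # File names as listed in the original bundle; skipped if not present.
    for name in ("train-unr.conllu", "dev-unr.conllu", "test-unr.conllu"):
        path = Path(name)
        if path.exists():
            print(name, count_sentences(path.read_text(encoding="utf-8")))
```

If the figure in the description applies to this release, the three counts would be expected to add up to 9,761.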