The Turkish Language Processing Platform (TULAP) provides open-source Turkish NLP resources developed at Boğaziçi University, including several corpora and tools. Demos are available for all tools, and their source code and Dockerized versions are also provided.

What's New

corpus
Description:
This dataset is the re-annotated version of the BOUN Treebank. Extracted from the Turkish National Corpus (TNC), the BOUN Treebank consists of 9,761 sentences (121,214 tokens) from five different text types: biographical texts, ...
 This item contains 3 files (9.35 MB).
toolService
Description:
The Boğaziçi University Annotation Tool (BoAT) is a desktop annotation tool designed specifically for dependency parsing; it supports the CoNLL-U format. Annotation tools are fundamental to the facilitation of the ...
 This item contains no files.
toolService
Description:
The web-based Boğaziçi University Annotation Tool (BoAT) supports grammar annotation and is especially suitable for morphologically rich languages (MRLs). It is useful for creating treebanks and conforms to Universal Dependencies ...
 This item contains no files.

Most Viewed Items

Top Last Week
corpus
Description:
This dataset is the re-annotated version of the BOUN Treebank. Extracted from the Turkish National Corpus (TNC), the BOUN Treebank consists of 9,761 sentences (121,214 tokens) from five different text types: biographical texts, ...
 This item contains 3 files (9.35 MB).
toolService
Description:
Dependency parsing is the process of parsing a sentence with respect to a dependency grammar for a language. Given a sentence, the dependency parser extracts the binary dependency relations between the words in the sentence. ...
 This item contains no files.
corpus
Description:
The corpus is a text file containing 229,554 data instances (sentence pairs). Each data instance consists of a sentence ID, a Turkish sentence, and an English sentence. Example: Sentence id: 148514 Turkish ...
 This item contains 2 files (12.94 MB).
 
Publicly Available