The Turkish Language Processing Platform (TULAP) provides open-source Turkish NLP resources developed at Boğaziçi University, including several corpora and tools. You can try out demos of all the tools; their source code and dockerized versions are also available.

What's New

 corpus 
Description:
This dataset is the re-annotated version of the BOUN Treebank. Extracted from the Turkish National Corpus (TNC), the BOUN Treebank consists of 9,761 sentences (121,214 tokens) from five different text types: Biographical texts, ...
 This item contains 3 files (9.35 MB).
 toolService 
Description:
The Boğaziçi University Annotation Tool (BoAT) is a desktop annotation tool designed specifically for dependency parsing that supports the CoNLL-U format. Annotation tools are fundamental to the facilitation of the ...
 This item contains no files.
 toolService 
Description:
The web-based Boğaziçi University Annotation Tool (BoAT) supports grammar annotation and is especially suitable for morphologically rich languages (MRLs). It is useful for creating treebanks and conforms to Universal Dependencies ...
 This item contains no files.

Most Viewed Items

Top Last Week
 corpus 
Description:
The corpus is a text file containing 229,554 data instances (sentence pairs). Each data instance consists of a sentence ID, a Turkish sentence, and an English sentence. Example: Sentence id: 148514 Turkish ...
 This item contains 2 files (12.94 MB).
 
Publicly Available
 corpus 
Description:
The corpus consists of 5 files: geography factoid questions, geography open-ended questions, biology factoid questions, biology open-ended questions, and student questions. Factoid question files in each domain include ...
 This item contains 1 file (348.77 KB).
 
Publicly Available
 corpus 
Description:
The corpus is provided as an Excel file containing 1,950 Turkish sentences and their descriptions in Turkish Sign Language. For each sentence, the first column shows the Turkish sentence and the second column shows the ...
 This item contains 1 file (235.4 KB).
 
Publicly Available