The Turkish Language Processing Platform (TULAP) provides open-source Turkish NLP resources developed at Boğaziçi University, including several corpora and tools. You can try out demos of all the tools; their source code and Dockerized versions are also available.

What's New

toolService
Description:
Newer version of BoAT.
 This item contains no files.
corpus
Description:
The BOUN Treebank was created by TABILAB and is supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under grant number 117E971. The BOUN Treebank includes a total of 9,761 manually annotated ...
 This item contains 3 files (9.39 MB).
 
Publicly Available
corpus
Description:
The corpus is derived from SQuAD2.0 using Amazon Translate. It consists of 61,293 question-answer pairs and 18,776 paragraphs containing the answers to the questions.
 This item contains 1 file (45.99 MB).
 
Publicly Available

Most Viewed Items

Top Last Week
toolService
Description:
Text summarization is the process of automatically generating brief, fluent, and salient text output from a longer input document. Summarization tasks can be broadly divided into two groups: abstractive summarization and ...
 This item contains no files.
toolService
Description:
The web-based Boğaziçi University Annotation Tool (BoAT) supports grammar annotation especially suitable for morphologically rich languages (MRLs). It is useful for creating treebanks and conforms to Universal Dependencies ...
 This item contains no files.
corpus
Description:
The corpus includes raw Turkish text collected from the web and consists of three parts. Newscor is a news corpus collected from news sites. It is provided both in split form (train, development, and test splits) and in full ...
 This item contains 8 files (4.63 GB).
 
Publicly Available