[CWB] Is it possible to use CWB with Chinese?

Hardie, Andrew a.hardie at lancaster.ac.uk
Fri Jun 27 20:16:02 CEST 2025


Yes, it is, but you need to tokenise the text first.

There aren’t really any technical details specific to any given language. The whole system makes no assumptions about the language, except that the data is divided into words.
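
As a rough illustration of what "tokenise first" could look like in practice, here is a minimal Python sketch that turns raw Chinese sentences into one-token-per-line (verticalised) text, which is the input layout cwb-encode reads. Everything named here is an assumption for illustration only: naive_tokenise and to_vertical are hypothetical helpers, the <text> and <s> tags are just example structure markers, and splitting into single characters is a crude stand-in for a proper Chinese word segmenter, which you would plug in via the tokenise argument.

```python
def naive_tokenise(sentence):
    """Placeholder segmenter (assumption): one token per non-space character.

    A real Chinese word segmenter should replace this; character-level
    splitting is only used here to keep the sketch self-contained.
    """
    return [ch for ch in sentence if not ch.isspace()]


def to_vertical(sentences, tokenise=naive_tokenise):
    """Return verticalised text: one token per line, sentences wrapped in <s>.

    The <text> and <s> tag names are illustrative, not required names.
    """
    lines = ["<text>"]
    for sentence in sentences:
        tokens = tokenise(sentence)
        if not tokens:
            continue  # skip empty sentences rather than emit empty <s> regions
        lines.append("<s>")
        lines.extend(tokens)
        lines.append("</s>")
    lines.append("</text>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the result to a file and encode that file as UTF-8 corpus data.
    print(to_vertical(["我爱北京", "今天 天气 很好"]), end="")
```

Because the tokeniser is passed in as a function, the rest of the pipeline does not care how the word boundaries were found, which matches the point above that the only requirement is that the data is divided into words.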

best

Andrew.

From: CWB <cwb-bounces at sslmit.unibo.it> On Behalf Of Diana Santos
Sent: 27 June 2025 15:18
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: [CWB] Is it possible to use CWB with Chinese?

Dear all,
I wonder if it is possible to use CWB with Chinese, and if so, if anyone could point me to information where technical details can be found (for those who read/process Chinese).
Thanks a lot in advance,
Diana

