[CWB] Maximum corpus size

Austin Yang austin.yang.2014 at gmail.com
Mon Feb 6 09:53:18 CET 2023


Dear all,
I'm trying to encode a corpus larger than 2 GiB. The CWB encoding tutorial
notes that this is possible by changing CL_MAX_CORPUS_SIZE in the CWB
source code. I increased that parameter (CL_MAX_CORPUS_SIZE) tenfold in
the cl.h file (I'm not sure whether this is the CWB source code the
tutorial refers to), but the CQPweb site still shows a maximum corpus
size of 2,147,483,647 tokens. Did I miss something in the tutorial? Any
comments would be greatly appreciated!
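
For what it's worth, the figure reported by CQPweb is exactly the largest
value of a signed 32-bit integer (2^31 - 1). The short Python sketch below
only checks that arithmetic. The variable names are my own, not CWB
identifiers, and it is purely an assumption on my part that a 32-bit
signed integer type is involved anywhere in CWB or CQPweb.

```python
import ctypes

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1

# Maximum number of tokens reported by the CQPweb site (from this message).
reported_limit = 2_147_483_647

# Ten times that limit, mirroring the tenfold increase I made in cl.h.
# (Illustrative name only; this is not a CWB variable.)
tenfold = reported_limit * 10

# ctypes does not range-check its integer types, so this yields the value
# a signed 32-bit integer would hold after wrapping around.
# ASSUMPTION: 32-bit signed storage is used here only for illustration.
wrapped = ctypes.c_int32(tenfold).value

print(reported_limit == INT32_MAX)  # does the reported limit equal 2^31 - 1?
print(tenfold > INT32_MAX)          # does the tenfold value exceed that range?
print(wrapped)                      # value after 32-bit wrap-around
```

If 32-bit integers are indeed part of the picture, the enlarged value
would not fit in them, which might explain why changing the constant alone
had no visible effect.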

CWB version 3.5.0


Best,
Austin Yang (楊承洋)
MS in Cognitive Neuroscience, NCU
BS in Psychology, CYCU