[CWB] CQPweb index files
Stephanie Evert
stefanML at collocations.de
Mon Mar 11 19:43:51 CET 2024
Hi Simon,
you need to keep both corpora: <corpus> is the actual corpus that you've pre-indexed with CWB, and <corpus>__freq is a database with per-text frequency information. CQPweb stores this data as a CWB corpus because, at the time CQPweb was developed, MySQL was so horribly slow at aggregating large frequency tables that we got much better performance by abusing CWB as a sort-of relational database.
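If it helps to see how the disk space is split between the two, here is a minimal Python sketch that adds up the file sizes for <corpus> and <corpus>__freq. It assumes both index directories sit side by side under one data directory, which may not match your installation; the function names and the path in the usage line are made up for illustration and are not part of CWB or CQPweb.

```python
import os


def dir_size_bytes(path):
    """Total size in bytes of all regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def corpus_sizes(data_root, corpus):
    """Sizes of the <corpus> and <corpus>__freq index directories.

    Assumption: both are sibling subdirectories of `data_root`.
    Adjust the paths if your server lays them out differently.
    """
    names = [corpus, corpus + "__freq"]
    return {name: dir_size_bytes(os.path.join(data_root, name)) for name in names}
```

For example, corpus_sizes("/path/to/cwb/data", "mycorpus") returns a dictionary with one byte count per directory; a directory that doesn't exist is simply reported as 0.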
Best,
Steph
> On 11 Mar 2024, at 17:08, Simon Meier-Vieracker <simon.meier-vieracker at tu-dresden.de> wrote:
>
> Hi,
>
> just to be sure about the index files on CQPweb:
>
> Our usual workflow is to import the corpus to CWB with cwb-encode and then "install a corpus you have already indexed in CWB". As I understand it, CQPweb then creates a new folder "corpus__freq" where the index files which CQPweb needs are created.
>
> Since we are running out of disk space on our server: Do we still need the normal CWB index files after having the corpus installed in CQPweb?
>
> Thanks in advance
> Simon