[CWB] How to add new data to a corpus without re-indexing it

Blätte, Andreas andreas.blaette at uni-due.de
Thu Jul 8 10:55:09 CEST 2021


Dear colleague,

Technically, adding p- and s-attributes to an existing corpus is no problem. If you do not mind working with R, the ‘cwbtools’ R package provides the respective functionality: https://CRAN.R-project.org/package=cwbtools

One limitation: adding documents or further text is not possible.
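
For illustration only, here is a rough sketch of what adding an s-attribute to an existing corpus can look like with the CWB command-line tools instead of R. It is not the cwbtools API. It assumes the usual cwb-s-encode input format (one region per line: start position, end position, annotation value, separated by tabs) and that the new attribute is afterwards declared in the registry with cwb-regedit. The corpus name, directories and attribute name are hypothetical placeholders, and the function names are made up for this example.

```python
from pathlib import Path


def write_region_file(regions, path):
    """Write (start, end, value) triples as tab-separated lines.

    Regions are corpus positions (token offsets) and must be sorted
    and non-overlapping, which is what is assumed for s-attributes here.
    Returns the number of regions written.
    """
    lines = []
    previous_end = -1
    for start, end, value in regions:
        if start > end or start <= previous_end:
            raise ValueError(f"bad or overlapping region: {start}-{end}")
        lines.append(f"{start}\t{end}\t{value}\n")
        previous_end = end
    Path(path).write_text("".join(lines), encoding="utf-8")
    return len(lines)


def encode_commands(corpus, data_dir, registry_dir, attribute, region_file):
    """Build (but do not run) the two command lines.

    Assumption: 'cwb-s-encode -V' encodes an s-attribute with values,
    and 'cwb-regedit ... :add :s' declares it in the registry file.
    """
    encode = ["cwb-s-encode", "-d", data_dir, "-f", region_file, "-V", attribute]
    declare = ["cwb-regedit", "-r", registry_dir, corpus, ":add", ":s", attribute]
    return encode, declare
```

The returned lists can be passed to subprocess.run(), ideally on a copy of the corpus first. The token stream itself is not touched, which is why this only works for annotation layers and not for appending new text.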

Greetings, Andreas

From: <cwb-bounces at sslmit.unibo.it> on behalf of wu liangping <liangpingwu at 126.com>
Reply-To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Date: Thursday, 8 July 2021 at 10:47
To: "cwb at sslmit.unibo.it" <cwb at sslmit.unibo.it>
Subject: [CWB] How to add new data to a corpus without re-indexing it

Dear all,

Has anyone managed to add new data to a corpus without re-indexing it?

The "Latest news" of a recent 3.2-branch CQPweb installation says that, as of version 3.2.31, CQPweb has "[c]ompleted the feature that adds new data to a corpus without re-indexing it (this can now be done for p-attributes as well as s-attributes and corpus metadata)". However, an earlier discussion from 2012, in the thread titled "Appending text to an existing corpus", clearly says that we "need to re-index from scratch" if we want to append text to an existing corpus.

Has anyone tried the new feature with success? Or better still, is there any documentation for this new feature?

Thanks for any hints before we decide to dive into the actual code.


Best,
WU Liangping
