[CWB] export corpus

Andrés Chandía andres.chandia at upf.edu
Mon Apr 24 13:13:42 CEST 2023


Thanks Stephanie, we will definitely have to do it via the shell; the
export started from the web interface never finishes with a big corpus.
Thanks also for the advice on XML tags: one of the corpora we were trying
to export is XML-tagged.

*... Andrés Chandía*
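
For readers of this thread, here is a minimal sketch of what the
shell-based export discussed here might look like. It assumes CWB v3.5 is
installed and that the corpus is visible in the default registry. The
function name export_corpus, the CWB_DECODE override variable, the corpus ID
MYCORPUS and the output file name are hypothetical, chosen only for
illustration. The options -Cx (XML-compatible compact output) and -ALL (all
positional and structural attributes) are taken from the cwb-decode usage
message; please verify them with "cwb-decode -h" on your installation.
Piping straight into gzip means the export is streamed to disk and never
held in memory as a whole.

```shell
#!/bin/sh
# Sketch only: stream a CWB corpus to a gzip-compressed file.
# export_corpus and CWB_DECODE are hypothetical names, not part of CWB.
export_corpus() {
    corpus="$1"    # CWB corpus ID, e.g. MYCORPUS (placeholder)
    outfile="$2"   # destination file, e.g. mycorpus.vrt.gz (placeholder)
    # -Cx  : compact, XML-compatible output
    # -ALL : decode all p-attributes and s-attributes
    # The pipe streams the output, so the corpus is never fully in memory.
    "${CWB_DECODE:-cwb-decode}" -Cx "$corpus" -ALL | gzip > "$outfile"
}
```

A call such as "export_corpus MYCORPUS mycorpus.vrt.gz" would then write
the compressed export. Whether nested XML tags and attribute-value pairs
come out as intended depends on how the corpus was encoded; Sec. 8 of the
Corpus Encoding Manual, mentioned in the quoted message, covers that.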


Message from Stephanie Evert <stefanML at collocations.de> on Sun, 23
Apr 2023 at 9:29:

> If you do it on the command-line rather than via CQPweb, make sure you
> have CWB v3.5 and read Sec. 8 of the Corpus Encoding Manual carefully to
> see how you can reconstruct nested XML tags and attribute-value pairs in
> the start tags (if they have been split up by cwb-encode).
>
> Best,
> Stephanie
>
> > On 23 Apr 2023, at 01:26, Josep M. Fontana <josepm.fontana at upf.edu> wrote:
> >
> > Thanks. We'll try that.
> >
> > JM
> >
> > On 22/4/23 23:48, Hardie, Andrew wrote:
> >> With cwb-decode.
> >>
> >> best
> >>
> >> Andrew
> >>
> >> From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of Andrés Chandía
> >> Sent: Thursday, April 20, 2023 6:23 PM
> >> To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
> >> Subject: [CWB] export corpus
> >>
> >> How do I export a big corpus without exhausting the machine's resources?
> >> I could not find anything about this in the manuals...
>