[CWB] Question about translations for sentences

Diana Santos dianamsmpsantos at gmail.com
Tue Dec 8 21:54:04 CET 2020


Thanks a lot, Stefan! I will upgrade CQP by the end of the year, and
will let you know if I run into any problems.
Diana

Stefan Evert <stefanML  collocations.de> wrote on Tuesday, 8/12/2020
at 15:48:

> In addition to what Andrew explained, you should also (when you can afford
> the time :) …
>
> > Thanks a lot. However (maybe this is because I am using a version of cqp
> > which is too old? 3.0.0)
>
> 1) Get a current version of CWB (3.4.27 at the moment).  There are a lot
> of improvements and bug fixes that haven't been ported back to the old 3.0
> branch.
>
> You'll need to check CWB out from the SVN repository and compile from
> source, but that's not too difficult (internal note: I guess we should
> provide some instructions on the Web site).  The one exception is Ubuntu
> 20.04, where the install script is broken.
>
> > The corpus is encoded with eg.
> > <mwe lema=one=example=of lema pos=N>
>
> 2) Encode your XML tags as proper XML, i.e. with attribute values quoted:
>
> <mwe lema="some noun" pos="N">
> </mwe>
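
For an existing input file, step 2 could be automated along the following lines. This is only a minimal sketch, assuming that every unquoted attribute value is free of whitespace and quote characters; the function name `quote_attributes` and the sample tag are hypothetical and not part of CWB.

```python
import re
from xml.sax.saxutils import quoteattr

# Opening tag whose attribute values are all unquoted,
# e.g. <mwe lema=one=example=of pos=N>
OPEN_TAG = re.compile(r"^<(\w+)((?:\s+[\w-]+=[^\s>\"']+)+)\s*>$")
ATTR = re.compile(r"([\w-]+)=([^\s>\"']+)")


def quote_attributes(line: str) -> str:
    """Quote unquoted attribute values in an opening tag.

    Token lines, closing tags and tags that are already quoted
    are returned unchanged.
    """
    m = OPEN_TAG.match(line.strip())
    if m is None:
        return line
    name, attrs = m.group(1), m.group(2)
    # quoteattr() adds the quotes and escapes &, < and > in the value
    parts = [f"{key}={quoteattr(value)}" for key, value in ATTR.findall(attrs)]
    return f"<{name} {' '.join(parts)}>"


print(quote_attributes("<mwe lema=one=example=of pos=N>"))
```

Applied line by line to the verticalized input, this leaves everything except unquoted opening tags untouched, so it should be safe to run more than once.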
>
> > and created with the flag -V mwe.
>
> 3) Encode with -S mwe:0+lema+pos
>
> This will split out the annotations on <mwe> tags into separate attributes
> mwe_lema and mwe_pos; the ":0" checks that your open and close tags are
> properly balanced and will ignore any nested <mwe> regions (with warnings).
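
To show where the -S declaration from step 3 sits in a complete call, here is a small sketch that only assembles the command line and does not run it. The file and directory names are placeholders, and the helper `cwb_encode_command` is hypothetical; -d (data directory), -f (input file) and -R (registry file) are the usual cwb-encode options, but check them against `cwb-encode -h` for your version.

```python
import shlex


def cwb_encode_command(vrt_file: str, data_dir: str, registry_file: str) -> list[str]:
    """Assemble (but do not execute) a cwb-encode call for <mwe> regions."""
    return [
        "cwb-encode",
        "-d", data_dir,           # directory for the binary corpus data
        "-f", vrt_file,           # verticalized input with <mwe ...> tags
        "-R", registry_file,      # registry entry to be created
        "-S", "mwe:0+lema+pos",   # creates mwe, mwe_lema and mwe_pos
    ]


# Placeholder paths; replace them with your own corpus locations.
cmd = cwb_encode_command(
    "corpus.vrt", "/corpora/data/example", "/corpora/registry/example"
)
print(shlex.join(cmd))
```

Building the argument list first makes it easy to inspect the exact call before passing it to `subprocess.run`.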
>
> > However, when I query
> > [ ] :: match.mwe="/.*/";
>
> Then you can match lemma and pos directly:
>
>         … :: match.mwe_lema=".+ness" & match.mwe_pos = "N";
>
> Best,
> Stefan