[CWB] Bilingual corpus alignment

Austin Yang austin.yang.2014 at gmail.com
Tue Oct 4 02:46:48 CEST 2022


Hey Andrew and all of the community,
Thanks for the reply; it is greatly appreciated!
This is my first time working with a bilingual corpus, so please forgive my
ignorance in advance.
I'm still a bit confused about what the alignment attribute is. The
alignment command I ran was:

sudo cwb-align-import -r '/var/CQPweb/registry' -p test.algn

Output:

Use of uninitialized value $12_keys in split at
/usr/local/bin/cwb-align-import line 119, <$fn> line3.
Alignment TEST-EN => TEST-CHN has been created with 7 non-empty beads.

I tried 'show + test.algn', but it doesn't seem to work, and the registry
file doesn't seem to give much information in this regard.
Does this mean the alignment failed? Or did I not set a designated
alignment attribute?
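
For readers following along: CWB declares each alignment attribute in the source corpus's registry entry on a line starting with ALIGNED, and that name is what goes after 'show +' in CQP. The Python sketch below lists those names from the text of a registry entry. The sample registry text, the attribute name test-chn, and the function name aligned_attributes are assumptions made for illustration, not taken from the actual setup described here.

```python
def aligned_attributes(registry_text):
    """Return the names declared on ALIGNED lines of a CWB registry entry."""
    names = []
    for line in registry_text.splitlines():
        # Drop anything after '#', which starts a comment in registry files.
        line = line.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "ALIGNED":
            names.append(parts[1])
    return names


# Hypothetical registry entry for the source corpus (illustrative only).
sample = """\
NAME "Test corpus (English)"
ID   test-en
HOME /path/to/data/test-en
ATTRIBUTE word
STRUCTURE s   # sentence regions
ALIGNED test-chn
"""

# Print the CQP command that would switch on display of each declared attribute.
for name in aligned_attributes(sample):
    print(f"show +{name};")
```

If the list comes back empty, the registry entry declares no alignment attribute. In the CWB tutorial's example the attribute is named after the target corpus (holmes-de), not after the alignment file.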
Another, somewhat out-of-scope question: assuming everything works in cqp,
is it possible to upload the corpus and present the bilingual alignment in
CQPweb? For example, if someone queried 'Taiwan', it would show an English
segment containing 'Taiwan' and the aligned Chinese segment on the next line.
Once again, any help is desperately needed and deeply appreciated!


Best,
Austin Yang (楊承洋)
MS in Cognitive Neuroscience, NCU
BS in Psychology, CYCU


On Mon, Oct 3, 2022 at 5:30 PM Hardie, Andrew <a.hardie at lancaster.ac.uk>
wrote:

> You need to
>
> show +your_alignment_attribute
>
> in CQP
>
> best
>
> Andrew.
>
> *From:* cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> *On
> Behalf Of *Austin Yang
> *Sent:* 03 October 2022 02:44
> *To:* cwb at sslmit.unibo.it
> *Subject:* [CWB] Bilingual corpus alignment
>
> Dear all,
>
> Recently I've run into a problem using CWB's alignment encoding
> function.
>
> "Problem" might not be the most accurate word, but I used a different
> alignment tool, converted its output into CWB's standard format, and ran
> the regedit and encode procedure. This created an alx file among the
> source-language index files. The tutorial says "This procedure only creates
> an a-attribute in HOLMES-EN (source corpus), linking it to HOLMES-DE
> (target corpus).", but that's all I can find. I don't know how to use
> cqp/cwb to present the sentence alignment (i.e. I imagine that querying
> "Sherlock" in the source corpus would present both the English and the
> German sentence containing "Sherlock"). The attachment shows the command
> and output. I'm not even sure whether the alignment succeeded. Any help
> or information that sheds some light on this situation will be greatly
> appreciated!
>
> Best,
> Austin Yang (楊承洋)
> MS in Cognitive Neuroscience, NCU
> BS in Psychology, CYCU
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb
>