[CWB] Parallel corpus alignment question

Graham Ranger -- UAPV graham.ranger at univ-avignon.fr
Thu May 22 14:27:18 CEST 2025


OK... I'll answer my own questions (this is becoming a habit, 
apologies!). (Please correct me, elaborate, etc. if necessary.)

1) The admin password is "user", but it seems to work only with sudo, 
not with su.
2) I have aligned the source text with each of the target texts. cqpweb 
doesn't appear to be able to display all the alignments simultaneously, 
but it does enable quick switching between them, which is good.
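
For anyone repeating this, here is a minimal sketch of how the how-to 
quoted below might be looped over each source-target pair. It is a dry 
run: it only prints the commands rather than executing them. The corpus 
names TEST_DE, TEST_ES and TEST_IT are hypothetical, the 
"source_target.align" file naming is just a convention chosen here, and 
the lower-case index directory names are assumed from the test_en / 
test_fr example.

```shell
#!/bin/sh
# Dry-run sketch: print the alignment commands for one source corpus and
# several target corpora. Nothing is executed; review the output first.

REGISTRY=/var/cqpweb/registry/   # registry directory, as in the how-to
INDEX=/var/cqpweb/index          # parent of the per-corpus data directories
SOURCE=TEST_EN
TARGETS="TEST_FR TEST_DE TEST_ES TEST_IT"   # TEST_DE/ES/IT are made-up names

# Lower-case a corpus name (assumes data dirs and registry files are lower-case).
lower() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }

# Print the three commands of the how-to for every source-target pair.
align_commands() {
    src=$1; shift
    s=$(lower "$src")
    for tgt in "$@"; do
        t=$(lower "$tgt")
        file="${s}_${t}.align"   # hypothetical naming convention
        # Step 2: sentence-level alignment on the <s> regions.
        echo "cwb-align -r $REGISTRY -o $file $src $tgt s"
        # Step 3 is manual: reminder of the registry lines to append.
        echo "# registry: add 'ALIGNED $t' to $REGISTRY$s and 'ALIGNED $s' to $REGISTRY$t"
        # Step 4: encode the alignment in both directions (-R = reverse).
        echo "cwb-align-encode -d $INDEX/$s/ -r $REGISTRY $file"
        echo "cwb-align-encode -d $INDEX/$t/ -r $REGISTRY -R $file"
    done
}

align_commands "$SOURCE" $TARGETS
```

Once the printed commands look right, they can be run one by one (with 
sudo where needed). With this layout the source registry file ends up 
with one ALIGNED line per translation.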

Apologies for troubling you but perhaps this might be of help to other 
cqpweb[inabox] users.
Best,
Graham.

On 22/05/2025 at 14:03, Graham Ranger -- UAPV wrote:
> And a follow-up question... could somebody tell me what the admin 
> password is for cqpwebinabox? (I'm trying to do this on a VM with 
> cqpwebinabox, before putting it on a public server.)
> Thanks again!
> Graham.
>
>
> On 22/05/2025 at 13:38, Graham Ranger -- UAPV wrote:
>> Hello to all,
>> I'm currently trying to set up a parallel corpus including a source 
>> text and four different translations.
>> The method I use to set up a parallel corpus is this (copied and 
>> adapted from the cqp / cwb manuals):
>>
>> To set up parallel corpora:
>>
>> 1) Get them installed on cqpweb with the different xml tags declared, 
>> etc.
>> 2) Use cwb-align to generate an alignment file suffixed .align, i.e.
>> cwb-align -r /var/cqpweb/registry/ -o test.align TEST_EN TEST_FR s
>> This indicates the registry directory explicitly with the -r option.
>> 3) Modify the registry files using nano to indicate the other aligned 
>> corpus. This means modifying /var/cqpweb/registry/"my_corpus" and 
>> appending ALIGNED "other_corpus".
>> 4) Use cwb-align-encode to point to the alignment file. This needs to 
>> be done as admin, i.e. with su, using the -d and -r options to point 
>> to the data and registry directories.
>> The second command does the same thing backwards, i.e. reads the 
>> alignments the other way round, with the -R switch.
>> cwb-align-encode -d /var/cqpweb/index/test_en/ -r /var/cqpweb/registry/ test.align
>> cwb-align-encode -d /var/cqpweb/index/test_fr/ -r /var/cqpweb/registry/ -R test.align
>> 5) Test it out in cqpweb.
>>
>> Now, my question is: can I set up a parallel corpus in such a way 
>> that a search in the source will display all the aligned translations 
>> simultaneously?
>> If so, is it just a question of following this how-to for each 
>> source-target pair and then declaring multiple alignments in cqpweb, 
>> or do I align all the texts from the CLI?
>> I hope the question is clear and thank you in advance for any guidance.
>> Best,
>> Graham.
>>
>> _______________________________________________
>> CWB mailing list
>> CWB at sslmit.unibo.it
>> http://liste.sslmit.unibo.it/mailman/listinfo/cwb