[CWB] Corpus installation - admin v user

Hardie, Andrew a.hardie at lancaster.ac.uk
Mon Apr 24 11:28:15 CEST 2023


Hi Mike,

The job queue exists to stop too many user-installed corpora from clogging up the system all at once. Admin-installed corpora are assumed to be allowed to take up a full CPU, a ton of RAM, etc. immediately, so they don't go through that queue.

I think that your firewall situation does, however, imply that it would be good to move away from having the server/browser connection maintained throughout indexing (basically, to make the process disconnect from the browser once indexing starts). I'll add this as a feature request. However, it's a big re-engineering of the UI, so it won't happen soon.
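
To illustrate the general idea of detaching a long-running job from the browser connection, here is a minimal sketch in Python. It is not CQPweb code and does not reflect how CQPweb's queue is implemented; every name in it (submit, status, index_corpus, and so on) is hypothetical. The request handler only enqueues the job and returns an ID straight away, a background worker does the slow part, and the browser would check progress with short follow-up requests that each finish well inside a firewall timeout.

```python
import queue
import threading
import time
import uuid

# All names here are hypothetical; this is a generic pattern, not CQPweb's code.
jobs = {}                 # job_id -> "queued" | "running" | "done" | "failed"
pending = queue.Queue()   # jobs waiting for the single background worker
lock = threading.Lock()   # guards access to the jobs dict


def index_corpus(name):
    """Stand-in for a slow indexing step (assumed, not the real indexer)."""
    time.sleep(0.1)


def submit(name):
    """What the web request would do: enqueue the job and return at once."""
    job_id = uuid.uuid4().hex
    with lock:
        jobs[job_id] = "queued"
    pending.put((job_id, name))
    return job_id


def status(job_id):
    """What a short polling request would do: report the current state."""
    with lock:
        return jobs.get(job_id, "unknown")


def worker():
    """Process queued jobs one at a time, independently of any web request."""
    while True:
        job_id, name = pending.get()
        with lock:
            jobs[job_id] = "running"
        try:
            index_corpus(name)
            result = "done"
        except Exception:
            result = "failed"
        with lock:
            jobs[job_id] = result
        pending.task_done()


# One worker thread means only one job runs at a time.
threading.Thread(target=worker, daemon=True).start()
```

In this sketch, a caller would run something like job_id = submit("mycorpus") and then call status(job_id) periodically until it reports "done" or "failed". Using a single worker is also one simple way to keep several queued jobs from consuming the machine simultaneously.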

best

Andrew.

From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of Michael Lynch
Sent: Friday, April 21, 2023 5:14 AM
To: cwb at sslmit.unibo.it
Subject: [CWB] Corpus installation - admin v user

Hi all,

I've been looking at ways around a problem with installing large corpora on our installation of CQPweb.

Our university's firewall cuts off web connections after around 20 seconds, which means that indexing and installing corpora over a certain size via the admin interface gets interrupted.

The workaround for this so far has been to index and add metadata using command-line tools on the server, but I'd like to get installation working via the web interface.

I've been looking through the CQPweb source, and have noticed that the process for user-installed corpora is managed using a job queue, which means that in theory it wouldn't get interrupted by the firewall timeout.

Is there any plan to rework the admin corpus installation code so that it uses the same queuing system?

Alternatively, are there any differences between user- and admin-installed corpora, in terms of the functionality available once they're installed?

At the moment, we don't allow users to upload corpora, but if we could grant installation privileges to our admin users so that they can install corpora using the user-installed system, it could be a way around the firewall problem.

Regards,
Mike

Mike Lynch (he/him) | Research Engineer Group Lead
The University of Sydney
Sydney Informatics Hub | Core Research Facilities
M +61 478 872 039 | E m.lynch at sydney.edu.au
