[CWB] Timing issues when using the CQP Perl module

Stefan Evert stefan.evert at uos.de
Sat Jan 12 02:04:30 CET 2008


> Hi everyone!
>
> While using the CQP Perl module inside a Perl server script, I  
> notice some timing anomalies.
> These are introduced when I call the query method of a CQP object.
>
>     $my_cqp->query("$cqpQuery");
> This line in my code for simple queries (e.g. [word="haus"%c] 
> [word="mieten"%c] )
> sometimes takes more than 30 seconds to return, with the usual  
> response time being around 2-3 secs.
> (i do some very basic time counting using gettimeofday to get secs  
> and microsecs).
>
> a) Has anybody come up with a similar observation? Or even a useful  
> conclusion/solution?

I have never observed strange behaviour from the Perl module in this
respect (nor has anyone else reported such problems), so the most
likely answer is that CQP itself sometimes takes a long time to
execute the query.

Query execution times depend greatly on server load, memory usage,  
and whether the corpus data files are already cached in memory or  
have to be read from disk.  Even a simple query like the one you  
mentioned can take fairly long on a BNC-size corpus when the cache is  
still "cold"; a second query immediately afterward will complete in a  
few seconds or less.

How large are the corpora on which you've observed this behaviour?
There is absolutely no reason why CQP should take that long on a
5-million-word corpus.

> b) Could it be a problem of my server's setup? Is a CWB-rebuild  
> recommended?
>
> I am using Version:   2.2.b91 ( as reported by cqp -v).

Your version is a bit old, but the problem is most likely not  
something that a CWB update would solve.

One thing that comes to mind is that your Web server may impose a  
limit on the number of child processes in the CGI subsystem (to keep  
it from being overloaded by many parallel queries).  Since the Perl  
module has to spawn CQP as a subprocess, it might be kept on hold  
occasionally when the limit has been reached.  I'm not enough of an  
expert on Web servers to tell whether this is a probable explanation  
or not.  In any case, the delay should happen when you create a new  
CQP object rather than when executing a query.
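
One way to tell the two explanations apart is to time object creation
and query execution separately, and to run the same query twice. The
following is only a minimal sketch: the helper timed() and the two
stand-in closures are hypothetical and merely sleep, so that the
script runs without CWB installed. In a real script they would wrap
the CQP constructor and the $my_cqp->query($cqpQuery) call from the
original post.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Run a code reference, report the elapsed wall-clock time on STDERR,
# and return (elapsed seconds, return value of the code reference).
sub timed {
    my ($label, $code) = @_;
    my $t0      = [gettimeofday];
    my $result  = $code->();
    my $elapsed = tv_interval($t0);    # seconds since $t0
    printf STDERR "%-14s %.3f s\n", $label, $elapsed;
    return ($elapsed, $result);
}

# Hypothetical stand-ins (assumption): they only sleep for a moment.
# Replace them with closures that create the CQP object and run the
# query in the actual server script.
my $create_cqp = sub { Time::HiRes::sleep(0.05); return {}; };
my $run_query  = sub { Time::HiRes::sleep(0.02); return 1; };

my ($t_create, $my_cqp) = timed("create object", $create_cqp);
my ($t_first)           = timed("first query",   $run_query);
my ($t_second)          = timed("second query",  $run_query);
```

If the long delays show up under "create object", a process limit on
the Web server side is the more plausible culprit. If only the first
query is slow and the second one returns quickly, that points to a
cold disk cache.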

Best,
Stefan

