[CWB] CWB Digest, Vol 207, Issue 7

Chelsey MacPherson CMacPherson at dal.ca
Fri Jul 19 15:21:42 CEST 2024


Thank you everyone for your replies!

Query issue: Corpus na Gàidhlig, CQPweb, simple query (ignore case)

I see now that the issue persists when I select "accent insensitive". The query "ora[i,]n" then appears in the query history as ""ora[i,]n" %cd". I don't understand the "%cd" flag. Does it mean ignore case? Either way, the accent-insensitive option affects the query, and I only get results for variants of "orain". As a workaround I ran "[òra,òrai,ora,orai,óra,órai]n" <https://dasg.arts.gla.ac.uk/CQPweb/dasg/index.php?ui=search&insertString=%5B%C3%B2ra%2C%C3%B2rai%2Cora%2Corai%2C%C3%B3ra%2C%C3%B3rai%2Camhra%2Camhrai%5Dn&insertType=sq_nocase> and was able to get the accented versions of "oran" and "orain".
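
For readers unfamiliar with the flags: in CQP syntax, %c requests case-insensitive matching and %d diacritic-insensitive matching, so %cd asks for both. The Python sketch below only illustrates what the two flags are meant to do. It is not how CQP or CQPweb implements them, the helper names strip_diacritics and matches are hypothetical, and the regular expression ora(i|)n is an assumed stand-in for the simple query ora[i,]n.

```python
import re
import unicodedata


def strip_diacritics(text):
    """Remove combining marks so that e.g. 'ò' and 'ó' both become 'o'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(pattern, token, ignore_case=True, ignore_diacritics=True):
    """Whole-token regex match with optional case and diacritic folding.

    ignore_case roughly corresponds to %c, ignore_diacritics to %d.
    This is an illustration only, not CQP's actual matching code.
    """
    if ignore_diacritics:
        pattern = strip_diacritics(pattern)
        token = strip_diacritics(token)
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, token, flags) is not None


# Hypothetical token list, not taken from the corpus.
tokens = ["òran", "Òrain", "oran", "orain", "óran", "amhran"]

strict = [t for t in tokens
          if matches("ora(i|)n", t, ignore_case=False, ignore_diacritics=False)]
folded = [t for t in tokens if matches("ora(i|)n", t)]

print("strict:", strict)
print("folded:", folded)
```

Under this reading, a query with both flags set would be expected to match accented and unaccented forms of both words, which is why getting only variants of "orain" looks surprising.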

I don't know enough to verify the preprocessing or indexing question. I see in the corpus metadata that there is no word-level annotation and the STTR is not cached for the tokens—though I don't comprehend those things or know if that answers that question.

Thanks again,
Chelsey
________________________________
From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> on behalf of cwb-request at sslmit.unibo.it <cwb-request at sslmit.unibo.it>
Sent: Thursday, July 18, 2024 12:39 PM
To: cwb at sslmit.unibo.it <cwb at sslmit.unibo.it>
Subject: CWB Digest, Vol 207, Issue 7


Today's Topics:

   1. Re: query efficiency issue (graham.ranger)
   2. Re: query efficiency issue (Hardie, Andrew)


----------------------------------------------------------------------

Message: 1
Date: Thu, 18 Jul 2024 17:18:19 +0200
From: "graham.ranger" <graham.ranger at univ-avignon.fr>
To: Open source development of the Corpus WorkBench
        <cwb at sslmit.unibo.it>
Subject: Re: [CWB] query efficiency issue
Message-ID: <20240718151748.5460F200EE at zmtaauth05.partage.renater.fr>
Content-Type: text/plain; charset="utf-8"

Hello all,
Oddly, and for what it's worth, I created an account, ran the same query, and got the intended answers, i.e. oran and orain.
Best,
Graham.

Sent from my Galaxy device

-------- Original message --------
From: Stephanie Evert <stefanML at collocations.de>
Date: 17/07/2024 13:44 (GMT+01:00)
To: CWBdev Mailing List <cwb at sslmit.unibo.it>
Subject: Re: [CWB] query efficiency issue

> I'm having difficulties with a query in Corpus na Gàidhlig. When I search "ora[i,]n" it only retrieves "oran" instead of also retrieving "orain". Does anyone have any advice on this? Is this a bug?

I suspect we'll only be able to help you if you tell us which Web interface you used to run the query.  I suppose it is some CQPweb installation?

Your query

ora[i,]n

should work as a simple query (CEQL syntax) and find both words.  If it doesn't, there might be something wrong with corpus preprocessing or indexing, or the form simply doesn't exist in the corpus. Do you know it's actually there?

You could also try different variants of the query or search for both forms separately.

[oran,orain]
oran
orain

Best,
Stephanie
_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://liste.sslmit.unibo.it/mailman/listinfo/cwb

------------------------------

Message: 2
Date: Thu, 18 Jul 2024 15:37:23 +0000
From: "Hardie, Andrew" <a.hardie at lancaster.ac.uk>
To: Open source development of the Corpus WorkBench
        <cwb at sslmit.unibo.it>
Subject: Re: [CWB] query efficiency issue
Message-ID:
        <LO4P265MB34858A504F7884D46F9DB1D0CBAC2 at LO4P265MB3485.GBRP265.PROD.OUTLOOK.COM>

Content-Type: text/plain; charset="utf-8"

For the record, it's this server: https://dasg.arts.gla.ac.uk/CQPweb/

Andrew

From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of graham.ranger
Sent: Thursday, July 18, 2024 4:18 PM
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: [External] Re: [CWB] query efficiency issue

Hello all,
Oddly, and for what it's worth, I created an account, ran the same query, and got the intended answers, i.e. oran and orain.
Best,
Graham.


Sent from my Galaxy device


-------- Original message --------
From: Stephanie Evert <stefanML at collocations.de>
Date: 17/07/2024 13:44 (GMT+01:00)
To: CWBdev Mailing List <cwb at sslmit.unibo.it>
Subject: Re: [CWB] query efficiency issue

> I'm having difficulties with a query in Corpus na Gàidhlig. When I search "ora[i,]n" it only retrieves "oran" instead of also retrieving "orain". Does anyone have any advice on this? Is this a bug?

I suspect we'll only be able to help you if you tell us which Web interface you used to run the query.  I suppose it is some CQPweb installation?

Your query

ora[i,]n

should work as a simple query (CEQL syntax) and find both words.  If it doesn't, there might be something wrong with corpus preprocessing or indexing, or the form simply doesn't exist in the corpus. Do you know it's actually there?

You could also try different variants of the query or search for both forms separately.

[oran,orain]
oran
orain
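
To make the query variants above concrete, here is a rough Python sketch of how a bracketed alternative such as [i,] in a simple query can be read as a regular-expression group. It is an assumption-laden illustration, not the CEQL implementation: the function name simple_query_to_regex is hypothetical, nested brackets and the other simple-query wildcards are not handled, and everything outside the brackets is treated as literal text.

```python
import re


def simple_query_to_regex(query):
    """Turn [a,b] alternatives into a regex group (a|b).

    Simplified sketch: no nesting, no other wildcards, and all text
    outside the brackets is escaped and matched literally.
    """
    parts = []
    pos = 0
    for m in re.finditer(r"\[([^\[\]]*)\]", query):
        # Literal text before the bracketed alternatives.
        parts.append(re.escape(query[pos:m.start()]))
        # An empty alternative (as in "[i,]") becomes an empty regex branch.
        alternatives = m.group(1).split(",")
        parts.append("(" + "|".join(re.escape(a) for a in alternatives) + ")")
        pos = m.end()
    parts.append(re.escape(query[pos:]))
    return "".join(parts)


for query in ["ora[i,]n", "[oran,orain]"]:
    regex = simple_query_to_regex(query)
    hits = [w for w in ["oran", "orain", "orawn"] if re.fullmatch(regex, w)]
    print(query, regex, hits)
```

Read this way, ora[i,]n and [oran,orain] describe the same two word forms, so both queries would be expected to return the same hits when run with the same options.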

Best,
Stephanie

------------------------------

_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://liste.sslmit.unibo.it/mailman/listinfo/cwb


End of CWB Digest, Vol 207, Issue 7
***********************************