[CWB] CQP s-attributes constraints: select text_ids for searching

Ramiro Santa Ana dev at academia.org.mx
Fri Jun 27 15:15:20 CEST 2025


Thank you, Stephanie, this is what I was looking for. For the regex, I
guess I can do something like text_id = "(ID1|ID2|IDn)"
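
For a longer list, the disjunction could be generated by a script rather
than typed by hand. Below is a minimal Python sketch, assuming the text IDs
contain only letters, digits, underscores and hyphens, so that no regex
escaping is needed. The function names are hypothetical and not part of
CWB; the generated command follows the subcorpus pattern from Stephanie's
message quoted below.

```python
import re

# Assumption: IDs contain only letters, digits, "_" and "-",
# so they can be placed in the regex without any escaping.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def text_id_regex(ids):
    """Return a regex disjunction such as (ID1|ID2|ID3) for the given IDs."""
    ids = list(ids)
    if not ids:
        raise ValueError("need at least one text ID")
    for text_id in ids:
        # Reject anything outside the assumed safe character set.
        if not _SAFE_ID.fullmatch(text_id):
            raise ValueError(f"unsupported characters in text ID: {text_id!r}")
    return "(" + "|".join(ids) + ")"


def subcorpus_command(ids, name="Sub"):
    """Return a CQP command that defines a subcorpus of the listed texts."""
    return f'{name} = <text_id = "{text_id_regex(ids)}"> [] expand to text;'


print(subcorpus_command(["ID1", "ID2", "ID3"]))
# prints: Sub = <text_id = "(ID1|ID2|ID3)"> [] expand to text;
```

The printed line can then be pasted into a CQP session, or written to a
file together with the actual query.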

On the other hand, when you said “so this can be tedious (or not work at
all) if you have a very long list of text IDs”, what exactly might not work?
If I have, say, 100 docs, could this approach fail?

On Fri, 27 Jun 2025 at 4:28 a.m., Stephanie Evert (
stefanML at collocations.de) wrote:

> Dear Ramiro,
>
> I suppose that you're talking about doing the subcorpus search directly in
> CQP?  Then you can simply restrict your matches to those in the desired
> documents.  You can either do it in the CQP query
>
> .... query ... within text :: match.text_id = "...";
>
> or you can make a subcorpus of all relevant texts
>
> Sub = <text_id = "..."> [] expand to text;
>
> then activate it for a subcorpus search:
>
> Sub;
> ... query ...;
>
> The inconvenience is that you can't use word lists with s-attributes, so
> this can be tedious (or not work at all) if you have a very long list of
> text IDs. You have to match against a regular expression that is
> essentially a disjunction of the relevant text IDs.
>
> Hope this helps,
> Stephanie
>
>
>
> On 26 Jun 2025, at 21:21, Ramiro Santa Ana Anguiano <dev at academia.org.mx>
> wrote:
>
> I wonder how I can do a search but only on documents with specific
> text_ids.
>
> I think what I am looking for is in the CQP Manual
> <https://cwb.sourceforge.io/files/CQP_Manual.pdf> (probably on page 30),
> but I am not sure about the right syntax in order to achieve this.
>
>