[CWB] How to use int() to get all the sentences with a numeric positional attribute higher than some value

José Manuel Martínez Martínez chozelinek at gmail.com
Wed Jun 8 23:22:34 CEST 2022


Thank you for your answer, Stephanie! It makes sense now.
Cheers!
--
José Manuel Martínez Martínez
https://chozelinek.github.io


On Tue, May 31, 2022 at 7:51 PM Stephanie Evert <stefanML at collocations.de>
wrote:

>
> > I have a corpus in which each sentence has a structural attribute
> > s_lsent_linguistic_features that can take a value from 0 to 1.
> > I want to filter the sentences with a value lower than 0.3.
> > I've seen in the documentation that one can use the int built-in
> > function to make comparisons with values that should be interpreted as
> > numbers.
> > I'm using something like this:
> > <s>[_.s_sent_quality_score = "0\.[1|2|3].*"]
> > But I was wondering whether, using the int built-in function, it could
> > be written in a better/easier way.
>
> Not directly: int() does exactly what its name says and converts the
> annotated string to an integer if possible.  In this case, you'll probably
> always get a 0 result.
>
> However, you could convert your annotations to fixed-point representation,
> e.g. multiply by 1000 and round to integer for three decimal digits of
> precision.  So e.g. 0.3 would be stored as the string "300" in your
> s-attribute.
>
> Then your query translates to
>
>   <s> [int(_.s_sent_quality_score) >= 100 & int(_.s_sent_quality_score) < 400]
>
> Best,
> Stephanie
>
>
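
The fixed-point conversion described above could be done in a preprocessing
step before the corpus is encoded. The following is only a sketch, assuming
the scores are available as decimal strings; the helper names to_fixed_point
and below_threshold_query are hypothetical and not part of CWB. The generated
query follows the int() pattern from the reply, with the attribute name taken
from the original question.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_fixed_point(score: str, digits: int = 3) -> str:
    """Scale a decimal score (e.g. "0.3") to an integer string (e.g. "300").

    Decimal is used instead of float so that values such as 0.3 are scaled
    exactly rather than picking up binary rounding error.
    """
    scaled = Decimal(score) * (10 ** digits)
    return str(int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP)))


def below_threshold_query(attribute: str, threshold: str, digits: int = 3) -> str:
    """Build a CQP query string for sentences whose score is below threshold.

    Assumes the s-attribute already stores fixed-point values produced by
    to_fixed_point() with the same number of digits.
    """
    limit = to_fixed_point(threshold, digits)
    return f"<s> [int(_.{attribute}) < {limit}]"


if __name__ == "__main__":
    for raw in ["0", "0.05", "0.3", "0.2995", "1"]:
        print(raw, "->", to_fixed_point(raw))
    print(below_threshold_query("s_sent_quality_score", "0.3"))
```

With three digits of precision, "lower than 0.3" becomes a comparison against
300, so no lower bound or regular expression is needed.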