[CWB] cqp sorting by s-attribute?
Javier Pueyo
javier.pueyo at gmail.com
Tue Jun 11 13:43:44 CEST 2019
Thanks Stefan and Maarten for your help. This pipe works nicely for me:
tabulate Query match, matchend, match text_date > "| sort -k 3,3 | cut -f 1-2 > query_sorted.txt";
and then
undump Sorted < "query_sorted.txt";
set PrintStructures text_date;
cat Sorted;
I will also try Maarten's solution, but I need to sort by at least 4 more
columns (text_year, text_place, text_tipology, etc.), and I don't know
whether performance will be an issue with so many "fake"
p-attributes...
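
For the record, the external-sort route should extend to several keys without any extra p-attributes. Below is a minimal sketch run outside CQP, assuming a hypothetical tabulate output with the columns match, matchend, text_year, text_place and text_date; the sample rows and values are made up for illustration. It sets the field separator to TAB explicitly, so attribute values containing spaces do not shift the columns.

```shell
# Hypothetical output of something like:
#   tabulate Query match, matchend, match text_year, match text_place, match text_date > "query.txt";
# (sample rows invented for illustration; columns are TAB-separated)
printf '120\t124\t1890\tMadrid\t1890-05-01\n'  > query.txt
printf '10\t12\t1850\tSevilla\t1850-01-15\n'  >> query.txt
printf '300\t301\t1890\tBurgos\t1890-02-03\n' >> query.txt

# A literal TAB character, used as the field separator for sort
TAB="$(printf '\t')"

# Sort by year (numeric), then place, then date;
# keep only match and matchend so the file can be read back with undump
sort -t "$TAB" -k 3,3n -k 4,4 -k 5,5 query.txt | cut -f 1-2 > query_sorted.txt

cat query_sorted.txt
```

The same sort and cut stages could presumably replace "sort -k 3,3 | cut -f 1-2" in the CQP pipe above, although quoting the TAB character inside the CQP string may need some care.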
Thanks again,
Javier
On Tue, 11 Jun 2019 at 3:05, Stefan Evert (<stefanML at collocations.de>)
wrote:
> > I was wondering if there is some way to sort cqp KWIC results by
> s-attributes (text_id, text_date) instead of sorting them by p-attributes
> (word, lemma, etc.). I tried
> >
> > CORPUS> sort Last by text_date;
>
> That's not possible because s-attributes are implemented in an entirely
> different way than p-attributes, so the sorting code would have to be
> rewritten completely (and it would be less efficient).
>
> S-attributes also used not to work with "group", but special case code was
> added there at some time (which uses a trick I'd rather not speak about to
> achieve efficient counting).
>
> If you want your query sorted by s-attributes, you will have to rely on
> external tools. The basic procedure is as follows (assuming a named query
> Query rather than implicit Last):
>
> tabulate Query match, matchend, match text_date > "query.txt";
>
> Then open the file "query.txt" with spreadsheet software (preferably
> LibreOffice; with MS Excel, make sure to select "Open File" from the menu
> so you'll get the import dialog to read TAB-delimited data properly). You
> can now sort on the third column (or whatever other criteria you want to
> add), then remove everything except for the first two columns and save them
> as a TAB-delimited file (say "query_sorted.txt").
>
> It is important to make sure that only the "match" and "matchend" columns
> are left in this file, so it can be imported back into CQP. In CQP, the
> next step is:
>
> undump Sorted < "query_sorted.txt";
>
> You should now see that the query results are sorted by date:
>
> set PrintStructures text_date;
> cat Sorted;
>
> If you're familiar with Unix command line tools, you will be able to do
> the sort much more easily with a combination of "sort" and "cut". This can
> even be included in a pipe run from within CQP.
>
> If your query has target and/or keyword anchors, you will have to add them
> to the file and make sure they're read back in.
>
> Hope this helps,
> Stefan
>
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb
>