[CWB] Accessing Results from CQP child with Python
Andrew Nelson
an.linguist at gmail.com
Wed Mar 25 19:33:19 CET 2020
Hi José,
Thanks for getting back to me. This was an extremely helpful comment. I had
noticed previously that I was able to access dump results from Python, but
I didn't realize why. I ended up doing something very similar to what the
dump command does in my own CQP function, and it's working perfectly!! Thank you!
Also, here's a screenshot of my code for reference if anyone wants to do
this in the future. First, execute a query where formatted_query is
something like: query = 'this' 'is' [pos = 'dt'] 'query'
Then execute a command that cats that query. Finally, read the output
through StringIO into a dataframe and return the DF to use however you want!
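
For anyone who can't open the attachment, here is a minimal sketch of the same idea, not the code from the screenshot. It assumes a hypothetical callable run_cqp that sends one command string to the CQP child process and returns its output as text; that name and the helper query_to_df are placeholders, not part of cwb-ccc. The sketch uses dump rather than cat, because dump prints four tab-separated corpus positions per match (match, matchend, target, keyword), which pandas can read without a custom parser. Output from cat depends on your print settings and would need its own parsing.

```python
import io

import pandas as pd

# Column order of CQP's `dump` output; undefined anchors are printed as -1.
DUMP_COLUMNS = ["match", "matchend", "target", "keyword"]


def query_to_df(run_cqp, formatted_query, name="Results"):
    """Run a CQP query and return its dump as a DataFrame, without a temp file.

    run_cqp is a hypothetical callable: it takes a CQP command string and
    returns whatever text CQP printed in response.
    """
    # Step 1: store the query result under a name inside the CQP session.
    run_cqp(f"{name} = {formatted_query};")

    # Step 2: ask CQP to print the stored result and capture it as a string.
    raw = run_cqp(f"dump {name};")

    # No matches: return an empty frame instead of letting read_csv raise.
    if not raw.strip():
        return pd.DataFrame(columns=DUMP_COLUMNS)

    # Step 3: wrap the string in StringIO so pandas reads it like a file.
    return pd.read_csv(
        io.StringIO(raw), sep="\t", header=None, names=DUMP_COLUMNS
    )
```

Passing the CQP call in as a function keeps the sketch independent of any particular wrapper, so it can be tried out with a stub before wiring it to a real session.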
Thanks again!
Andrew
On Wed, Mar 25, 2020 at 11:24 AM José Manuel Martínez Martínez <
chozelinek at gmail.com> wrote:
> Hi Andrew,
> I was playing also with the package you are using (cwb-ccc
> <https://github.com/ausgerechnet/cwb-ccc> pretty awesome work by Philipp)
> to run the commands. Checking the class Corpus
> <https://github.com/ausgerechnet/cwb-ccc/blob/9809dc16c8aaae924b5dcdcee1e5035465a82994/ccc/cwb.py#L22>,
> there's a method query
> <https://github.com/ausgerechnet/cwb-ccc/blob/9809dc16c8aaae924b5dcdcee1e5035465a82994/ccc/cwb.py#L505>
> which uses the method df_node_from_query
> <https://github.com/ausgerechnet/cwb-ccc/blob/9809dc16c8aaae924b5dcdcee1e5035465a82994/ccc/cwb.py#L217>
> and there in line 253 you will see the call
> <https://github.com/ausgerechnet/cwb-ccc/blob/9809dc16c8aaae924b5dcdcee1e5035465a82994/ccc/cwb.py#L253>
> to the Dump
> <https://github.com/ausgerechnet/cwb-ccc/blob/9809dc16c8aaae924b5dcdcee1e5035465a82994/ccc/cqp_interface.py#L224>
> method of the API, check line 257
> <https://github.com/ausgerechnet/cwb-ccc/blob/9809dc16c8aaae924b5dcdcee1e5035465a82994/ccc/cqp_interface.py#L257>
> you can see how the results become a dataframe object in memory instead of
> being serialized to disk.
> Cheers,
> --
> José Manuel Martínez Martínez
> https://chozelinek.github.io
>
>
> On Wed, Mar 25, 2020 at 3:22 PM Andrew Nelson <an.linguist at gmail.com>
> wrote:
>
>> Hello,
>>
>> The tool I'm building has several features and I'm writing everything in
>> python. When I was writing the initial program I couldn't figure out how to
>> feed the CQP results to a dataframe, so to just get it working I wrote it
>> to a file, then read the data from the file to a dataframe to send the DF
>> to html. I'm using a cqp_interface.py wrapper to run the CQP commands (found
>> here
>> <https://github.com/ausgerechnet/cwb-ccc/blob/master/ccc/cqp_interface.py>).
>> Can anyone here help explain how I should go about passing the results from
>> cqp to python so I don't have to do this extra step of saving to a file? My
>> largest corpus is over 800 million tokens, so saving and loading results
>> just slows things down more.
>>
>> Thanks for any help!
>>
>> Best regards,
>>
>> Andrew
>>
> _______________________________________________
>> CWB mailing list
>> CWB at sslmit.unibo.it
>> http://liste.sslmit.unibo.it/mailman/listinfo/cwb
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 431881B6-6629-46A5-9E7D-A86747958063.png
Type: image/png
Size: 92033 bytes
Desc: not available
URL: <http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20200325/ae65503f/attachment-0001.png>