[CWB] [cwb:feature-requests] #4 Matching against word lists in a more flexible way

Andrew Hardie andrewhardie at users.sf.net
Fri Jan 30 18:51:39 CET 2015


- **status**: open --> wont-fix
- **Group**:  --> TODO-3.5
- **Comment**:

This would be nice but is too low a priority to remain on the list for now.



---

**[feature-requests:#4] Matching against word lists in a more flexible way**

**Status:** wont-fix
**Group:** TODO-3.5
**Created:** Fri Jan 26, 2007 02:57 PM UTC by Manuel Kountz
**Last Updated:** Fri Jan 26, 2007 02:58 PM UTC
**Owner:** nobody

Suppose someone wants to query for concatenated tokens such as "vor+zu+gehen", where the prefix and the verb stem each come from a word list, while the "zu" is a fixed part of the query. CQP does not seem to support matching a token that is formed from an element of a word list plus some fixed part, or one that is formed by combining elements of two word lists (other than through the API).

One solution would be to always treat the word list as a regular expression: compile it into a transducer that is the disjunction of all elements in the list, and then combine that transducer with another one that implements the regular expression from the query. (AFAIK this is the way SFST treats lexicon files.) This should also speed up word list matching, which is said to be comparatively slow.
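
To make the idea concrete, here is a minimal sketch in Python rather than in CQP or SFST. It turns each word list into a regex alternation and concatenates the pieces with the fixed "zu". The word lists, the helper name `wordlist_to_pattern` and the sample tokens are all made up for illustration, and Python's `re` is a backtracking matcher rather than a compiled transducer, so this only shows the matching behaviour and says nothing about the speed-up.

```python
import re

def wordlist_to_pattern(words):
    """Turn a word list into a regex alternation (a disjunction of all entries)."""
    # Sort longest first so a longer entry is tried before any entry that is its prefix.
    alternatives = sorted(set(words), key=lambda w: (-len(w), w))
    # re.escape keeps list entries literal even if they contain regex metacharacters.
    return "(?:" + "|".join(re.escape(w) for w in alternatives) + ")"

# Hypothetical word lists, standing in for the lists the query would refer to.
prefixes = ["vor", "an", "auf", "ab"]
stems = ["gehen", "nehmen", "geben"]

# prefix list + fixed "zu" + stem list, combined into a single pattern.
pattern = re.compile(wordlist_to_pattern(prefixes) + "zu" + wordlist_to_pattern(stems))

# Hypothetical tokens to test against; fullmatch requires the whole token to match.
tokens = ["vorzugehen", "anzunehmen", "zugeben", "vorgehen", "aufzugeben", "abzugehend"]
matches = [t for t in tokens if pattern.fullmatch(t)]
print(matches)
```

With these sample lists, only the tokens that consist of a listed prefix, then "zu", then a listed stem are kept; "zugeben" (no prefix), "vorgehen" (no "zu") and "abzugehend" (trailing material) are rejected.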

