<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 12 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New","serif";
        color:black;}
span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:Consolas;
        color:black;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Verdana","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body bgcolor="white" lang="EN-GB" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">Hi Josep,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">The issue here is that the Icelandic corpus on Ray’s server has been installed as if it had been tagged by the Lancaster tagger combination of CLAWS + USAS
(which uses the CLAWS7 tagset), whereas in fact it hasn’t. It couldn’t have been, in fact, since C7 is a tagset for English, not Icelandic.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">This is my fault, indirectly. Way back when CQPweb was only used here at Lancaster, corpus installation had to be done manually, which was a very time-consuming
process. To speed things up, I created the indexing web-forms, which have two settings for p-attributes: “default” i.e. assume it has been tagged by CLAWS and USAS, or “custom” i.e. specify the p-attributes yourself. In retrospect this was clearly the Wrong
Thing, as nowhere else but Lancaster is CLAWS+USAS the “default”, making it too easy for superusers elsewhere to do the wrong thing in the web forms. I
<i>am</i> going to replace this system with something more site-neutral, when I get the time....<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">Anyway, the upshot: if you leave “default” specified when indexing a corpus, then CQPweb will believe it has CLAWS7 tags and USAS semantic tags, even if it
doesn’t. The way to get around this is to ignore what CQPweb says the tags are and to look at what they really are (e.g. by going to the frequency list page and viewing a frequency list of the part-of-speech tag attribute):
<o:p></o:p></span></p>
<p class="MsoNormal"><a href="http://124.193.83.252/cqp/IcePaHC/freqlist.php?flTable=__entire_corpus&flAtt=pos&flFilterType=begin&flFilterString=&flFreqLimit1=&flFreqLimit2=&pp=50&flOrder=desc&uT=y">http://124.193.83.252/cqp/IcePaHC/freqlist.php?flTable=__entire_corpus&flAtt=pos&flFilterType=begin&flFilterString=&flFreqLimit1=&flFreqLimit2=&pp=50&flOrder=desc&uT=y</a><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p></o:p></span></p>
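<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">If you want to do the same check for another attribute, the link above can be rebuilt from its parts. This is only a sketch: the parameter names and values are copied from that link rather than from any documented interface, and the function name <b>freqlist_url</b> and its arguments are made up for illustration.<o:p></o:p></span></p>
<pre>
```python
from urllib.parse import urlencode


def freqlist_url(corpus_base, attribute="pos", per_page=50):
    """Rebuild a frequency-list link like the one above for one attribute.

    Assumption: the query parameters are simply those seen in the link;
    their exact meaning is inferred from their names, not documented here.
    """
    params = {
        "flTable": "__entire_corpus",  # whole corpus, as in the link
        "flAtt": attribute,            # attribute to count, e.g. "pos"
        "flFilterType": "begin",
        "flFilterString": "",          # empty: no filtering of items
        "flFreqLimit1": "",
        "flFreqLimit2": "",
        "pp": str(per_page),           # rows per page
        "flOrder": "desc",             # most frequent first
        "uT": "y",
    }
    return corpus_base + "/freqlist.php?" + urlencode(params)


url = freqlist_url("http://124.193.83.252/cqp/IcePaHC")
print(url)
```
</pre>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">With the defaults this reproduces the link above; passing a different <b>attribute</b> would, on the same assumptions, list the values of that attribute instead.<o:p></o:p></span></p>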
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">Once you know what tags to use, the simple query syntax
<i>will</i> work. (I just tried <b>_Q-A</b>, for instance, and it works. Not that I have any idea what Q-A means in this tagset!)<o:p></o:p></span></p>
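<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">To make the tag-only form concrete, here is a small sketch that builds the simple query for a tag and what I would expect to be roughly its CQP-syntax counterpart. It assumes the tag attribute is called <b>pos</b> (as in the frequency list link) and only handles a literal tag with no wildcards; the helper name <b>tag_only_query</b> is hypothetical.<o:p></o:p></span></p>
<pre>
```python
def tag_only_query(tag):
    """Return (simple_query, approximate_cqp_query) for a literal tag.

    Assumptions: the tag attribute is named "pos", and the tag contains
    no wildcards, quotes or whitespace, so no escaping is attempted.
    """
    if not tag or '"' in tag or any(ch.isspace() for ch in tag):
        raise ValueError("tag must be non-empty, with no quotes or spaces")
    simple = "_" + tag              # simple query: any word with this tag
    cqp = '[pos="' + tag + '"]'     # rough CQP-syntax equivalent
    return simple, cqp


simple, cqp = tag_only_query("Q-A")
print(simple)
print(cqp)
```
</pre>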
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">“show +pos” doesn’t work because the interface only allows
<i>queries</i> to be specified by the user. Other CQP commands are blocked. (In fact, CQPweb
<i>always</i> uses show +pos or equivalent, but the tags are rendered in the tooltip that pops over the central link of a concordance, not in the main concordance itself.)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">best<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D">Andrew.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext"> cwb-bounces@sslmit.unibo.it
[mailto:cwb-bounces@sslmit.unibo.it] <b>On Behalf Of </b>Josep M. Fontana<br>
<b>Sent:</b> 25 October 2012 12:04<br>
<b>To:</b> cwb@sslmit.unibo.it<br>
<b>Subject:</b> Re: [CWB] Announcement: Another CWB/CQPweb setup in China<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Hi,<br>
<br>
I am a little (or quite) confused about the syntax of CQPweb queries (simple query language). I went to the wonderful resource Ray Wu has made available so that I could see how it works since we are in the process of installing CQPweb as an interface for our
corpora. I wasn't able to complete any search using the simple query language, though. I'm sure it is something very simple that I am missing.