Sorry, I forgot to mention my version.<br><br>Version: 3.0.2 (I downloaded it when 3.0 was not available)<br><br>When I searched, I saw something similar, but it was reported for very large corpora, whereas I get this error even with a really tiny one.<br>
<br>Thanks<br>Eva<br><br>
Message: 2<br>
Date: Wed, 10 Oct 2012 11:55:08 +0000<br>
From: "Hardie, Andrew" <<a href="mailto:a.hardie@lancaster.ac.uk">a.hardie@lancaster.ac.uk</a>><br>
To: Open source development of the Corpus WorkBench<br>
<<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a>><br>
Subject: Re: [CWB] Huffman code error<br>
<br>
I have the feeling this bug has come up before - can you check your version? (cqp -v)<br>
<br>
thanks<br>
<br>
Andrew.<br><br><div class="gmail_quote">2012/10/10 BOFÍAS ALBERCH, EVA <span dir="ltr"><<a href="mailto:eva.bofias@upf.edu" target="_blank">eva.bofias@upf.edu</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi, <br><br>I have an error that I am not able to solve. I'm trying to build a Latin corpus, but I get this error:<br><br>Error: Huffman codes too long (32 bits, current maximum is 31 bits).<br> Please contact the CWB development team for assistance.<br>
<br>I get this error when trying to build a 40-word corpus (I cut it down to see if I could isolate the error; with 39 words I do not get the error):<br><br>-----------<br><doc type="CHRISTIAN_LATIN" title="Abelard"><br>
<s><br>PETRUS Petrus N:nom<br>ABAELARDUS UNKNOUN ADJ<br>( ( PUN<br>1079-1142 card ADJ:NUM<br>) ) PUN<br>ABAELARDI UNKNOUN N:voc<br>AD UNKNOUN N:abl<br>AMICUM amicus ADJ<br>
SUUM sus N:gen<br>CONSOLATORIA consolatorius ADJ<br>Sepe sepes N:dat<br>humanos humanus ADJ<br>affectus affectus N:nom<br>aut aut CC<br>provocant provoco V:IND<br>aut aut CC<br>
mittigant mi V:IND<br>amplius ample ADV<br>exempla exemplum N:nom<br>quam qui REL<br>verba verbum N:nom<br>. . SENT<br></s><br><s><br>Unde unde ADV<br>post post PREP<br>
nonnullam nonnullus ADJ<br>sermonis sermo N:gen<br>ad ad PREP<br>habiti habeo V:PTC<br>consolationem consolatio N:acc<br>, , PUN<br>de de PREP<br>ipsis ipse DET<br>calamitatum calamitas N:gen<br>
mearum meus POSS<br>experimentis experimentum N:abl<br></s><br></doc><br><br>-----------------<br>These are the attributes I use to describe the corpus:<br><br>cat $SOURCEFILE | /usr/local/cwb-3.4.1/bin/cwb-encode -c utf8 -d $DATADIR -R $REGDIR/$CORPUSNAME -xsB -P lema -P pos -V s -S doc:0+type+title -S not:0+text<br>
<br>Thanks<br><br>Eva Bofias<br>
</blockquote></div><br>