<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <font size="+1">Hi<font size="+1">, Pavel:<br>
        <font size="+1">My quick-and-dirty way is to<font size="+1"> run a
            query <font size="+1">with the restrictions for that <font
                size="+1">subcorpus. Take EUROPARL-EN, for example: the element
                speaker has <font size="+1"><font size="+1"><font
                      size="+1"><font size="+1">an attri<font size="+1">bute
                          called language, indicating the source
                          language of th<font size="+1">e tokens
                            contained in that element</font>. If I only
                          want the <font size="+1">tokens <font
                              size="+1">originally spoken in German, I run th<font
                                size="+1">is query:<br>
                                <br>
                                &nbsp;&nbsp;&nbsp; []<font size="+1"> <font size="+1">::
                                    ma<font size="+1">tch.speaker<font
                                        size="+1">_language="<font
                                          size="+1">DE</font>";<br>
                                        <font size="+1"><br>
                                          If you then run<font size="+1">:<br>
                                            <font size="+1"><br>
                                              &nbsp;&nbsp;&nbsp; size Last<font
                                                size="+1">;<br>
                                                <font size="+1"><br>
                                                  You get the size in<font
                                                    size="+1"> tok<font
                                                      size="+1">ens<font
                                                        size="+1">, in
                                                        this case
                                                        5532412.<br>
                                                      </font><br>
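                                                      As a small variation on the same idea, the result can
                                                      be given a name so that it survives later queries,
                                                      which overwrite Last. The name German below is only
                                                      an illustration (named query results should begin
                                                      with a capital letter):<br>
                                                      <br>
                                                      &nbsp;&nbsp;&nbsp; German = [] :: match.speaker_language="DE";<br>
                                                      &nbsp;&nbsp;&nbsp; size German;<br>
                                                      <br>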
                                                      <font size="+1">When
                                                        I want to
                                                        calculate the
                                                        same but for all
                                                        the subc<font
                                                          size="+1">orpora
                                                          at once<font
                                                          size="+1"> (in
                                                          <font
                                                          size="+1">my
                                                          case all subc<font
                                                          size="+1">orpora
                                                          according to
                                                          the source
                                                          language)</font>:<br>
                                                          <br>
                                                          <font
                                                          size="+1">&nbsp;&nbsp;&nbsp;
                                                          []<font
                                                          size="+1">;<br>
                                                          <br>
                                                          <font
                                                          size="+1">&nbsp;&nbsp;&nbsp;
                                                          group<font
                                                          size="+1">
                                                          Last match <font
                                                          size="+1">speaker_language;<br>
                                                          <br>
                                                          <font
                                                          size="+1">Then
                                                          you get a
                                                          table <font
                                                          size="+1">similar
                                                          to:<br>
                                                          <br>
                                                          <font
                                                          size="+1">DE&nbsp;&nbsp;&nbsp;
                                                          5532412<br>
                                                          FR&nbsp;&nbsp;&nbsp; 4921250<br>
                                                          NL&nbsp;&nbsp;&nbsp; 3003754<br>
                                                          ES&nbsp;&nbsp;&nbsp; 2772929<br>
                                                          IT&nbsp;&nbsp;&nbsp; 2407213<br>
                                                          PT&nbsp;&nbsp;&nbsp; 1665839<br>
                                                          EL&nbsp;&nbsp;&nbsp; 1382710<br>
                                                          SV&nbsp;&nbsp;&nbsp; 1378828<br>
                                                          DA&nbsp;&nbsp;&nbsp; 698575<br>
                                                          FI&nbsp;&nbsp;&nbsp; 571006<br>
                                                          PL&nbsp;&nbsp;&nbsp; 363083<br>
                                                          <font
                                                          size="+1">...&nbsp;&nbsp;&nbsp;
                                                          ...<br>
                                                          <br>
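                                                          To keep the table for later, I believe the group
                                                          output can be redirected to a file in the same way
                                                          as for cat. A sketch, assuming the speaker_language
                                                          attribute from the first query and a made-up file
                                                          name sizes.txt:<br>
                                                          <br>
                                                          &nbsp;&nbsp;&nbsp; group Last match speaker_language &gt; "sizes.txt";<br>
                                                          <br>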
                                                          <font
                                                          size="+1">I
                                                          hope it hel<font
                                                          size="+1">ps<font
                                                          size="+1"><font
                                                          size="+1">!</font></font></font></font><br>
                                                          </font></font></font></font><br>
                                                          </font></font></font></font></font></font></font></font></font></font></font></font></font></font></font></font></font></font></font></font>Best,<br>
                                <br>
                                <font size="+1">jmm</font><br>
                                <br>
                              </font></font></font></font></font></font></font></font></font></font></font></font></font></font>
    <div class="moz-cite-prefix">On 15/04/13 20:53, Pavel Vond&#345;i&#269;ka
      wrote:<br>
    </div>
    <blockquote cite="mid:1507813.WnCAretq9Z@platyz" type="cite">
      <pre wrap="">Excuse me, but is there any way to get the size of a subcorpus in tokens? 
Somehow I cannot find such a basic thing, sorry.

Thanks,
Pavel

</pre>
    </blockquote>
    <br>
  </body>
</html>