<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <font size="+1">Hi, <font size="+1">Stefan,<br>

        <br>

        <font size="+1">thank you for the an<font size="+1">swer and the

            tips<font size="+1">!<br>

              <br>

            </font></font></font></font></font><font size="+1"><font

        size="+1"><font size="+1"><font size="+1"><font size="+1"><font

                size="+1">Just one more question, if <font size="+1">I

                  <font size="+1">had</font> subjects who don't know any

                  foreign language, should I write<font size="+1">...</font><br>

                  <br>

                  <font size="+1">&nbsp;&nbsp;&nbsp; foreign_languages="||"</font><br>

                  <font size="+1"><font size="+1"><br>

                      I assume<font size="+1">,</font> by analogy, that

                      should be the way to do it.<br>

                      <br>

                      <font size="+1">Or would it be enough <font

                          size="+1">to write<br>

                          <br>

                          <font size="+1">&nbsp;&nbsp;&nbsp; </font>foreign_languages=""<br>

                          <br>

                          <font size="+1">Best,<br>

                            <br>

                            <font size="+1">jmm</font><br>

                          </font></font></font></font></font></font></font></font><br>

          </font></font></font></font>

    <div class="moz-cite-prefix">On 21/04/13 13:39, Stefan Evert

      wrote:<br>

    </div>

    <blockquote

      cite="mid:91052F11-9C12-48DA-8789-B7152925AC63@collocations.de"

      type="cite">

      <blockquote type="cite">

        <pre wrap="">I'm reading the corpus encoding tutorial, and in section 5 I found interesting material about feature sets for positional attributes. I am wondering whether feature sets can also be used with structural attributes.

</pre>

      </blockquote>

      <pre wrap="">

Yes.

</pre>

      <blockquote type="cite">

        <pre wrap="">Say that in my corpus I've collected information about the speakers, and some of them can speak more than one foreign language. I would like to have a structural attribute like

foreign_languages="ES|PT|IT"

for each text produced by that particular speaker.

</pre>

      </blockquote>

      <pre wrap="">

Simply encode them in feature set format as you would for positional attributes.  In your case, you need to add leading and trailing "|" separators as specified in the tutorial, e.g.

        &lt;speaker foreign_languages="|ES|PT|IT|"&gt;

        ...

        &lt;/speaker&gt;

and declare the foreign_languages XML attribute to be set-valued (cf. "cwb-encode -h"):

        cwb-encode .... -S speaker:0+foreign_languages/

(the trailing slash marks foreign_languages as a set-valued attribute).  cwb-encode will validate the set notation of attribute values and re-order the set elements alphabetically (keep in mind that sets are unordered, so you cannot specify first, second and third foreign language in this way).

You will then be able to restrict searches to speakers who know Portuguese e.g. with a global constraint such as

        ... query ... :: match.speaker_foreign_languages contains "PT";

Hope this helps,

Stefan

</pre>

    </blockquote>
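    <div>For reference, here is a minimal Python sketch of the set notation described in the quoted reply: leading and trailing "|" separators, with the elements de-duplicated and sorted. The helper name to_feature_set is hypothetical and not part of CWB. Writing an empty set as a single "|" is an assumption based on the tutorial's notation, and the sort order cwb-encode applies may differ from Python's for non-ASCII values.<br>
    </div>
    <pre wrap="">
```python
def to_feature_set(values):
    """Format an iterable of strings in feature-set notation.

    Hypothetical helper for preparing attribute values before encoding;
    it is not part of CWB itself.
    """
    # Sets are unordered, so drop duplicates and sort the elements.
    items = sorted(set(values))
    for item in items:
        # An element may not be empty or contain the separator character.
        if not item or "|" in item:
            raise ValueError("invalid set element: %r" % (item,))
    # Leading and trailing separators; with no elements this yields a
    # single "|" (assumed here to denote the empty set).
    return "|" + "".join(item + "|" for item in items)


if __name__ == "__main__":
    print(to_feature_set(["PT", "ES", "IT"]))
    print(to_feature_set([]))
```
</pre>
    <div>The resulting string would go into the foreign_languages attribute of each speaker region before running cwb-encode.<br>
    </div>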

    <br>

  </body>

</html>