Data Source Tutorial

Appendix B: Data Parser

 

 

First we need to extract the value of the title attribute using another GetTag rule (which, again, is inserted as a child of the first rule). This rule extracts the text between title=" (including the opening quote character) and the closing quote character:

<ParsingRule type="GetTag" source="DaySource" result="Condition">
  <StartTag>title="</StartTag>
  <EndTag>"</EndTag>
</ParsingRule>
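To make the effect of this rule concrete, here is a minimal Python sketch of what a GetTag-style extraction does, based only on the behavior described above. The function name `get_tag`, the empty-string result when nothing matches, and the sample table cell are all assumptions made for illustration; they are not taken from the product.

```python
def get_tag(source: str, start_tag: str, end_tag: str) -> str:
    """Return the text between the first start_tag and the next end_tag."""
    start = source.find(start_tag)
    if start == -1:
        return ""  # assumption: no match yields an empty result
    start += len(start_tag)
    end = source.find(end_tag, start)
    if end == -1:
        return ""  # assumption: an unterminated match yields an empty result
    return source[start:end]


# Hypothetical table cell; the markup of the real page may differ.
day_source = (
    '<td><img src="rain.gif" title="Heavy Rain Chance for Rain 80%">'
    '<br>Hi <font color="#FF0000">81°F</font></td>'
)

# Same parameters as the rule: StartTag is title=" and EndTag is "
condition = get_tag(day_source, 'title="', '"')
print(condition)  # Heavy Rain Chance for Rain 80%
```

Because the search for the end tag begins only after the start tag has been consumed, the quote characters that appear earlier in the cell (around the src value) do not interfere with the match.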

Then we remove the "Chance for" string, and any text that follows it, using the TrimFromStart rule:

<ParsingRule type="TrimFromStart" source="Condition" result="Condition">
  <SearchText>Chance for</SearchText>
</ParsingRule>

This leaves us with the desired text, but with a trailing space character (e.g. "Heavy Rain "). We remove this extra space with a Trim rule:

<ParsingRule type="Trim" source="Condition" result="Condition" />
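The two clean-up steps can be sketched in Python as follows. This is an approximation of the behavior described in the text, not the parser's actual implementation: the function names are hypothetical, the input string is an invented title value, and it is an assumption that Trim strips whitespace from both ends and that text without a match passes through unchanged.

```python
def trim_from_start(text: str, search_text: str) -> str:
    """Drop search_text and everything after it; keep what comes before."""
    index = text.find(search_text)
    if index == -1:
        return text  # assumption: text without a match passes through unchanged
    return text[:index]


def trim(text: str) -> str:
    """Strip leading and trailing whitespace (assumed behavior of Trim)."""
    return text.strip()


condition = "Heavy Rain Chance for Rain 80%"  # hypothetical title value

after_trim_from_start = trim_from_start(condition, "Chance for")
print(repr(after_trim_from_start))  # 'Heavy Rain '

condition = trim(after_trim_from_start)
print(repr(condition))  # 'Heavy Rain'
```

The intermediate value shows where the trailing space comes from: it is the space that separated "Heavy Rain" from "Chance for" in the original attribute.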

Why didn't we just use a <space> token in the SearchText parameter of the TrimFromStart rule (e.g. <SearchText><space>Chance for</SearchText>)? Because TrimFromStart's SearchText parameter does not support the special "tag" syntax, so it doesn't interpret the <space> token as a space character.

Now let's turn to the temperature. In each table cell, the temperature string is always found between a <br> tag and a closing </font> tag (e.g. <br>Hi <font color="#FF0000">81°F</font>). So we use those tags in a GetTag rule to extract the temperature:

<ParsingRule type="GetTag" source="DaySource" result="Temp">
  <StartTag><br></StartTag>
  <EndTag></font></EndTag>
</ParsingRule>

This gets us the temperature, but with the string <font color="#FF0000"> embedded inside. We want to keep the "Hi" or "Lo" part, so we only need to remove that opening <font …> tag. If you look at the HTML code, you'll see that "Hi" temperatures get a font color of #FF0000 while "Lo" temperatures get #0033CC. Since we want to remove the font tag regardless of which color is specified, we use a ReplaceTag rule to replace everything from "<font" through ">" with empty text:

<ParsingRule type="ReplaceTag" source="Temp" result="Temp">
  <StartTag><font</StartTag>
  <EndTag>></EndTag>
  <NewText/>
</ParsingRule>
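As a rough illustration of why matching from "<font" to ">" handles both colors, the following Python sketch models a ReplaceTag-style substitution. The function name `replace_tag` and the pass-through behavior when nothing matches are assumptions. The input strings stand for the Temp values the preceding GetTag rule would produce, and the "Lo" value of 62°F is invented for the example.

```python
def replace_tag(text: str, start_tag: str, end_tag: str, new_text: str = "") -> str:
    """Replace the first span from start_tag through end_tag (inclusive) with new_text."""
    start = text.find(start_tag)
    if start == -1:
        return text  # assumption: text without a match passes through unchanged
    end = text.find(end_tag, start + len(start_tag))
    if end == -1:
        return text
    return text[:start] + new_text + text[end + len(end_tag):]


# Temp values as they might look after the GetTag step (hypothetical data).
temps = [
    'Hi <font color="#FF0000">81°F',
    'Lo <font color="#0033CC">62°F',
]

# An empty replacement mirrors the empty <NewText/> parameter.
cleaned = [replace_tag(temp, "<font", ">") for temp in temps]
print(cleaned)  # ['Hi 81°F', 'Lo 62°F']
```

Since the color value sits between the two delimiters, it is discarded along with the rest of the tag, so a single rule covers both the "Hi" and "Lo" cells.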

Polycom, Inc.
