Need a hand with a Regex in C# that spans multiple rows
I'm trying to parse an HTML page that has table rows in it. I need all of the table cells within each table row.
Here's a sample of the HTML I'm trying to parse:
<tr style="font-size:8pt;"> <td style="font-size:8pt;">1545644656</td> <td style="font-size:8pt;">billy</td> <td style="font-size:8pt;">johnson</td> <td style="font-size:8pt;">def</td> <td style="font-size:8pt;"></td> <td style="font-size:8pt;">1134 main st</td> <td style="font-size:8pt;"></td> <td style="font-size:8pt;">anytown</td> <td style="font-size:8pt;">pa</td> <td style="font-size:8pt;">05405</td> </tr>
And here is the regex I'm using to get all of the stuff between the tr start and the tr end:
Regex exp = new Regex("<tr style=\"font-size:8pt;\">(.*?)</tr>", RegexOptions.IgnoreCase | RegexOptions.Multiline);
I'm using a foreach loop to go over all of the matches (there are multiple rows):
foreach (Match mtch in exp.Matches(browser.Html))
But it's not matching anything. I had this exact same code working on the site before they added new lines (\n), when the row was one single long string. Now it doesn't match the multi-line layout they're using.
Any ideas here?
By default, the . wildcard matches any character except \n.
http://msdn.microsoft.com/en-us/library/az24scfc.aspx#character_classes
http://msdn.microsoft.com/en-us/library/yd1hzczs.aspx
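I believe you need RegexOptions.Singleline instead. RegexOptions.Multiline only changes how ^ and $ behave (they match at the start and end of each line); it does not let . cross a line break.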
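Here is a minimal sketch of the difference, assuming a row that is split across lines the way the question describes. The sample string, the RowDemo class, and the CountRows and Cells helpers are made up for illustration; only the pattern and the two RegexOptions values come from the question and the suggestion above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RowDemo
{
    // Pattern taken from the question.
    public const string RowPattern = "<tr style=\"font-size:8pt;\">(.*?)</tr>";

    // Hypothetical sample: one table row broken across several lines.
    public const string Html =
        "<tr style=\"font-size:8pt;\">\n" +
        "<td style=\"font-size:8pt;\">billy</td>\n" +
        "<td style=\"font-size:8pt;\">johnson</td>\n" +
        "</tr>";

    // Counts how many rows the pattern finds with the given options.
    public static int CountRows(string html, RegexOptions options)
    {
        return new Regex(RowPattern, options).Matches(html).Count;
    }

    // Returns the text of every cell, using Singleline so . can match \n.
    public static List<string> Cells(string html)
    {
        var cells = new List<string>();
        var rows = new Regex(RowPattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        var cell = new Regex("<td[^>]*>(.*?)</td>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
        foreach (Match row in rows.Matches(html))
        {
            foreach (Match c in cell.Matches(row.Groups[1].Value))
            {
                cells.Add(c.Groups[1].Value);
            }
        }
        return cells;
    }

    public static void Main()
    {
        // Multiline: . still stops at \n, so the row is not found.
        Console.WriteLine(CountRows(Html, RegexOptions.IgnoreCase | RegexOptions.Multiline));
        // Singleline: . matches \n as well, so the row is found.
        Console.WriteLine(CountRows(Html, RegexOptions.IgnoreCase | RegexOptions.Singleline));
        Console.WriteLine(string.Join(", ", Cells(Html)));
    }
}
```

With this sample, the Multiline version should find no rows and the Singleline version should find one. The lazy (.*?) keeps each match from running past the first closing </tr>, which matters once . is allowed to cross line breaks.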