| Posted: 2007-01-05 22:18:38 Original poster |
I have an HTML document like this:

<a href=""></a>
<a href="http://classad.163.com/html/area/110/index.html">陕</a>
<a href="http://classad.163.com/html/area/341/index.html">宁</a>
<a href="http://classad.163.com/html/area/577/index.html">藏</a>
<a href="http://classad.163.com/html/area/325/index.html">宜昌</a>
<a href="http://classad.163.com/html/area/31/index.html">郑州</a>
<a href=""></a>
<a href="http://classad.163.com/html/area/254/index.html">南通</a>
<a href=""></a>
<a href="http://classad.163.com/html/area/313/index.html">荆门</a>
<a href=""></a>
<a href="http://classad.163.com/html/area/81/index.html">日照</a>
<a href="http://classad.163.com/html/area/36/index.html">洛阳</a>
<a href="http://classad.163.com/html/area/393/index.html">朝阳区</a>
<a href=""></a>

With this regular expression:

<a\s+href\s*=\s*["|']?(?<uri>[^'">]*)["|']?[^<>]*>\s*(<[^<>]+>)*(?<title>[^<>]*)(<[^<>]+>)*\s*</a>

every link matches, i.e. I get 15 results (I ran it in a regex testing tool). But the code I wrote only finds 11. My code is:

string htmlregexepression = @"<a\s+href\s*=\s*[""|']?(?<uri>[^'"">]*)[""|']?[^<>]*>\s*(<[^<>]+>)*(?<title>[^<>]*)(<[^<>]+>)*\s*</a>";
Regex linksexpression = new Regex(htmlregexepression, RegexOptions.Multiline | RegexOptions.IgnoreCase);
MatchCollection matchs = linksexpression.Matches(pagehtmlcontent);

Did I make a mistake when converting the pattern into htmlregexepression? This is urgent, so any help from the experts would be greatly appreciated. Many thanks! |
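A note for anyone trying to reproduce this: in a C# verbatim string each "" stands for one literal double quote, so the pattern in the code should reach the regex engine unchanged. RegexOptions.Multiline only changes how ^ and $ behave, and this pattern uses neither. The following is a minimal sketch for checking the match count against a known input. The class name LinkMatchSketch and the shortened four-link sample are assumptions made for illustration and are not part of the original code. If it reports the expected count, the difference more likely comes from what pagehtmlcontent actually contains at run time than from the escaping.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical test harness; only the pattern and variable names come from the post.
class LinkMatchSketch
{
    static void Main()
    {
        // Shortened sample: two empty links and two real ones (assumed input).
        string pagehtmlcontent =
            "<a href=\"\"></a> " +
            "<a href=\"http://classad.163.com/html/area/110/index.html\">陕</a> " +
            "<a href=\"http://classad.163.com/html/area/325/index.html\">宜昌</a> " +
            "<a href=\"\"></a>";

        // Verbatim string: "" inside @"..." is a single literal double quote.
        string htmlregexepression =
            @"<a\s+href\s*=\s*[""|']?(?<uri>[^'"">]*)[""|']?[^<>]*>\s*(<[^<>]+>)*(?<title>[^<>]*)(<[^<>]+>)*\s*</a>";

        Regex linksexpression = new Regex(htmlregexepression, RegexOptions.IgnoreCase);
        MatchCollection matchs = linksexpression.Matches(pagehtmlcontent);

        // All four links in the sample match, so this prints 4.
        Console.WriteLine(matchs.Count);

        // Print the named groups so empty links are easy to spot.
        foreach (Match m in matchs)
        {
            Console.WriteLine("uri=[{0}] title=[{1}]",
                m.Groups["uri"].Value,
                m.Groups["title"].Value.Trim());
        }
    }
}
```

Running the same loop over the real pagehtmlcontent and comparing the printed uri values with the links in the page source should show which four links are being skipped.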