C#: Removing all tags except the captured groups using regex grouping constructs
Keywords: remove, regular expression | Updated: 2023-09-27 18:35:47
I have a text that contains many tags, and I want to remove all of them except a few specific ones. Here is a simple example:
<first>This <em>is</em> an <strong>testing</strong> data</first><second> testing <b>data</b> second</second><third>testing <i>data</i> third </third>
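I want to remove all tags except a few inline ones such as "em, i, b" (the attempt further down also keeps "strong" and "u"). The output text should be: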
This <em>is</em> an <strong>testing</strong> data testing <b>data</b> secondtesting <i>data</i> third
How can I do this with a regular expression?
This is what I have tried:
sampleStr = Regex.Replace(sampleStr , "<(?!(strong|em|u|i|b))>", "");
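But it doesn't work.
You were very close. Your pattern has nothing between the lookahead and the ">" to consume the tag name, so it can only ever match a literal "<>", and the lookahead does not allow for the "/" of a closing tag. Add an optional "/" inside the lookahead and "[^>]+" after it: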
<(?!/?(?:strong|em|i|u|b))[^>]+>
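Applied to the sample from the question, the pattern above can be used roughly as follows. This is a minimal sketch; the class name TagStripper and the method name RemoveOtherTags are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Pattern from the answer: match any tag whose name does not start with
    // one of the kept names (optionally preceded by "/").
    private const string OtherTags = @"<(?!/?(?:strong|em|i|u|b))[^>]+>";

    // Hypothetical helper: replaces every matched tag with an empty string.
    public static string RemoveOtherTags(string input)
    {
        return Regex.Replace(input, OtherTags, "");
    }

    public static void Main()
    {
        string sampleStr =
            "<first>This <em>is</em> an <strong>testing</strong> data</first>" +
            "<second> testing <b>data</b> second</second>" +
            "<third>testing <i>data</i> third </third>";

        // <first>, <second>, <third> and their closing tags are removed;
        // <em>, <strong>, <b> and <i> stay in place.
        Console.WriteLine(RemoveOtherTags(sampleStr));
    }
}
```

Only the tags themselves are removed; the text between them is left untouched, which is why "second" and "testing" end up adjacent in the result.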
Here is a "formal" explanation:
Remove Other Tags
<(?!/?(?:strong|em|i|u|b))[^>]+>
Match the character “<” literally «<»
Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!/?(?:strong|em|i|u|b))»
   Match the character “/” literally «/?»
      Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match the regular expression below «(?:strong|em|i|u|b)»
      Match either the regular expression below (attempting the next alternative only if this one fails) «strong»
         Match the characters “strong” literally «strong»
      Or match regular expression number 2 below (attempting the next alternative only if this one fails) «em»
         Match the characters “em” literally «em»
      Or match regular expression number 3 below (attempting the next alternative only if this one fails) «i»
         Match the character “i” literally «i»
      Or match regular expression number 4 below (attempting the next alternative only if this one fails) «u»
         Match the character “u” literally «u»
      Or match regular expression number 5 below (the entire group fails if this one fails to match) «b»
         Match the character “b” literally «b»
Match any character that is NOT a “>” «[^>]+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “>” literally «>»
Created with RegexBuddy
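One caveat that follows from the breakdown above: the lookahead only checks how the tag name starts, so tags such as <br>, <body> or <img> begin with "b" or "i" and would be kept as well. If that matters for your input, a possible variation (an assumption on my part, not something from the original answer) is to put a word boundary \b after the alternation and, optionally, to ignore case. The names StrictTagStripper and RemoveOtherTags are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictTagStripper
{
    // Hypothetical stricter variant: \b requires the kept name to end right
    // there, so <b> is kept while <br> or <body> is no longer mistaken for it.
    // RegexOptions.IgnoreCase is an extra assumption so that <EM> counts as <em>.
    private static readonly Regex OtherTags = new Regex(
        @"<(?!/?(?:strong|em|i|u|b)\b)[^>]+>",
        RegexOptions.IgnoreCase);

    public static string RemoveOtherTags(string input)
    {
        return OtherTags.Replace(input, "");
    }

    public static void Main()
    {
        string html = "<body>line<br>one <b>bold</b> and <i>italic</i></body>";

        // <body> and <br> are removed here; <b> and <i> are kept.
        Console.WriteLine(RemoveOtherTags(html));
    }
}
```

This is still a pattern-based shortcut rather than real HTML parsing; for arbitrary markup (comments, attributes containing ">", and so on) a proper parser is likely the safer choice.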