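C#: Replace the first plain-text character in an HTML string
Keywords: first, character, replace, HTML, string | Updated: 2023-09-27 18:10:24
What I want to do is wrap the first text character of an HTML string in my own custom style. Unfortunately, I can't find a generic approach that works for all of my examples.
Consider the following possible HTML strings: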
string str1 = "hello world";
string str2 = "<p><div>hello</div> world <div>some text</div></p>";
string str3 = "<p>hello <span>world</span></p>";
string str4 = "<p><a href=\"#\">h</a>hello world</p>";
string str5 = "<p>hello world <div>some text</div></p>";
The results should be:
str1 = "<span class=\"my-style\">h</span>ello world";
str2 = "<p><div><span class=\"my-style\">h</span>ello</div> world <div>some text</div></p>";
str3 = "<p><span class=\"my-style\">h</span>ello <span>world</span></p>";
str4 = "<p><a href=\"#\"><span class=\"my-style\">h</span></a>hello world</p>";
str5 = "<p><span class=\"my-style\">h</span>ello world <div>some text</div></p>";
In each result, the first letter "h" has been changed to <span class="my-style">h</span>.
Can anyone help me?
You can use the following two methods. The first extracts the first word of the inner text:
// Required namespaces: System.Linq (FirstOrDefault) and System.Text.RegularExpressions (Regex)
private static string ExtractHtmlInnerTextFirstWord(string htmlText)
{
    // Match any HTML tag (opening or closing)
    // followed by any run of whitespace;
    // Singleline lets '.' match newlines, so the HTML is treated as one line.
    Regex regex = new Regex("(<.*?>\\s*)+", RegexOptions.Singleline);
    // Replace each run of tags (and trailing whitespace) with a single space,
    // trim the leading and trailing spaces, then take the first word.
    string resultText = regex.Replace(htmlText, " ").Trim().Split(' ').FirstOrDefault();
    return resultText;
}
Note: credit goes to http://www.codeproject.com/Tips/477066/Extract-inner-text-from-HTML-using-Regex
Then replace the first word with the edited value (this method also calls ExtractHtmlInnerTextFirstWord):
private static string ReplaceHtmlInnerText(string htmlText)
{
    // Get first word.
    string firstWord = ExtractHtmlInnerTextFirstWord(htmlText);
    // Nothing to wrap if the input contains no text.
    if (string.IsNullOrEmpty(firstWord)) return htmlText;
    // Add span around the first character only (not every occurrence of that character in the word).
    string replacedFirstWord = "<span class=\"my-style\">" + firstWord[0] + "</span>" + firstWord.Substring(1);
    // Replace only the first occurrence of the word.
    var regex = new Regex(Regex.Escape(firstWord));
    string replacedText = regex.Replace(htmlText, replacedFirstWord, 1);
    return replacedText;
}
You can call this method as follows:
private static void Main(string[] args)
{
    string str1 = "hello world";
    string str2 = "<p><div>hello</div> world <div>some text</div></p>";
    Console.WriteLine("Original: " + str1);
    Console.WriteLine("Edited value: " + ReplaceHtmlInnerText(str1));
    Console.WriteLine("Original: " + str2);
    Console.WriteLine("Edited value: " + ReplaceHtmlInnerText(str2));
    Console.Read();
}
Output:
Original: hello world
Edited value: <span class="my-style">h</span>ello world
Original: <p><div>hello</div> world <div>some text</div></p>
Edited value: <p><div><span class="my-style">h</span>ello</div> world <div>some text</div></p>
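One caveat with the approach above: ReplaceHtmlInnerText searches the raw HTML for the first occurrence of the extracted word, so the match can land inside a tag. For str4 from the question the first word is "h", and the first "h" in the markup is the one in "href". A possible alternative is to scan the string and skip everything between '<' and '>'. The sketch below is not from the original answer; the class name FirstCharStyler and the method WrapFirstTextChar are hypothetical, and it assumes simple markup (no '>' inside attribute values, no comments, scripts, or character entities such as &amp;).

```csharp
using System;

public static class FirstCharStyler
{
    // Hypothetical helper: wraps the first non-whitespace character that
    // lies outside any <...> tag in a span with the given CSS class.
    public static string WrapFirstTextChar(string html, string cssClass = "my-style")
    {
        if (string.IsNullOrEmpty(html)) return html;

        bool insideTag = false;
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (insideTag)
            {
                // Assumption: '>' always closes the current tag.
                if (c == '>') insideTag = false;
            }
            else if (c == '<')
            {
                insideTag = true;
            }
            else if (!char.IsWhiteSpace(c))
            {
                // First visible text character found: wrap it and stop.
                return html.Substring(0, i)
                     + "<span class=\"" + cssClass + "\">" + c + "</span>"
                     + html.Substring(i + 1);
            }
        }
        return html; // No text outside tags, so return the input unchanged.
    }

    public static void Main()
    {
        // prints: <span class="my-style">h</span>ello world
        Console.WriteLine(WrapFirstTextChar("hello world"));
        // prints: <p><a href="#"><span class="my-style">h</span></a>hello world</p>
        Console.WriteLine(WrapFirstTextChar("<p><a href=\"#\">h</a>hello world</p>"));
    }
}
```

For anything beyond simple fragments like the five examples, an HTML parser is likely a safer choice than character scanning or regular expressions.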
The ::first-letter selector can help you do this with CSS instead: http://www.w3schools.com/cssref/sel_firstletter.asp