C#: Replace the first plain-text character in an HTML string

Keywords: first, character, replace, HTML, string | Updated: 2023-09-27 18:10:24

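I want to wrap the first character of the text in an HTML string in my own custom style. Unfortunately, I have not found a general approach that works for all of my examples.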

Consider the following possible HTML strings:

string str1 = "hello world";
string str2 = "<p><div>hello</div> world <div>some text</div></p>";
string str3 = "<p>hello <span>world</span></p>";
string str4 = "<p><a href=\"#\">h</a>hello world</p>";
string str5 = "<p>hello world <div>some text</div></p>";

The results should be:

str1 = "<span class=\"my-style\">h</span>ello world";
str2 = "<p><div><span class=\"my-style\">h</span>ello</div> world <div>some text</div></p>";
str3 = "<p><span class=\"my-style\">h</span>ello <span>world</span></p>";
str4 = "<p><a href=\"#\"><span class=\"my-style\">h</span></a>hello world</p>";
str5 = "<p><span class=\"my-style\">h</span>ello world <div>some text</div></p>";

In each result, the first letter "h" has been replaced with <span class="my-style">h</span>.

Can anyone help me?


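You can use the following two methods. The code needs using directives for System.Linq and System.Text.RegularExpressions. The first method extracts the first word of the inner text: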

private static string ExtractHtmlInnerTextFirstWord(string htmlText)
{
    //Match any Html tag (opening or closing tags) 
    // followed by any successive whitespaces
    //consider the Html text as a single line
    Regex regex = new Regex("(<.*?>\\s*)+", RegexOptions.Singleline);
    // replace all html tags (and consecutive whitespaces) by spaces
    // trim the first and last space
    string resultText = regex.Replace(htmlText, " ").Trim().Split(' ').FirstOrDefault();
    return resultText;
}

Note: credit goes to http://www.codeproject.com/Tips/477066/Extract-inner-text-from-HTML-using-Regex
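To make the regex step easier to follow, here is a small sketch that prints the intermediate string before the first word is picked. The class name ExtractionDemo and the helper StripTags are hypothetical names introduced for illustration only; the pattern itself is the one used above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ExtractionDemo
{
    // Hypothetical helper: replaces every run of tags (plus trailing
    // whitespace) with a single space, then trims both ends.
    public static string StripTags(string html)
    {
        var tagRun = new Regex("(<.*?>\\s*)+", RegexOptions.Singleline);
        return tagRun.Replace(html, " ").Trim();
    }

    public static void Main()
    {
        string html = "<p><div>hello</div> world <div>some text</div></p>";
        string text = StripTags(html);

        // Brackets make the remaining spaces visible.
        Console.WriteLine("[" + text + "]");   // [hello world  some text]

        // Split(' ') may yield empty entries further along,
        // but the first entry is still the first word.
        Console.WriteLine(text.Split(' ').FirstOrDefault());   // hello
    }
}
```

Note the double space between "world" and "some": the space that follows "world" in the source is kept, and the tag after it becomes a second space.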

Then replace the first word with the edited value (this method also calls ExtractHtmlInnerTextFirstWord):

private static string ReplaceHtmlInnerText(string htmlText)
{
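    // Get first word.
    string firstWord = ExtractHtmlInnerTextFirstWord(htmlText);
    // Nothing to wrap if the HTML contains no text.
    if (string.IsNullOrEmpty(firstWord)) return htmlText;
    // Add span around the first character of the first word only
    // (string.Replace would also wrap later occurrences of that character).
    string replacedFirstWord = "<span class=\"my-style\">" + firstWord[0] + "</span>" + firstWord.Substring(1);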
    // Replace only first occurrence of word.
    var regex = new Regex(Regex.Escape(firstWord));
    string replacedText = regex.Replace(htmlText, replacedFirstWord, 1);
    return replacedText;
}

You can call this method as follows:

private static void Main(string[] args)
{
    string str1 = "hello world";
    string str2 = "<p><div>hello</div> world <div>some text</div></p>";
    Console.WriteLine("Original: " + str1);
    Console.WriteLine("Edited value: " + ReplaceHtmlInnerText(str1));
    Console.WriteLine("Original: " + str2);
    Console.WriteLine("Edited value: " + ReplaceHtmlInnerText(str2));
    Console.Read();
}

Output:

Original: hello world 
Edited value: <span class="my-style">h</span>ello world 
Original: <p><div>hello</div> world <div>some text</div></p> 
Edited value: <p><div><span class="my-style">h</span>ello</div> world <div>some text</div></p>
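One caveat: ReplaceHtmlInnerText searches for the first word in the raw HTML, so the match can land inside a tag. For str4 the first word is "h", and the first "h" in the string is the one in the href attribute name. A possible alternative is to scan the string once, skip everything between "<" and ">", and wrap the first non-whitespace character found outside a tag. The sketch below is only an illustration under stated assumptions: the names FirstLetterStyler and WrapFirstTextChar are hypothetical, and it assumes simple markup with no ">" inside attribute values, no comments or script/style blocks, and no character entities or surrogate pairs at the start of the text. For arbitrary HTML, a real HTML parser would be the safer choice.

```csharp
using System;

public static class FirstLetterStyler
{
    // Hypothetical helper (not part of the original answer): wraps the first
    // non-whitespace character that lies outside any tag in a styled span.
    public static string WrapFirstTextChar(string html, string cssClass = "my-style")
    {
        if (string.IsNullOrEmpty(html)) return html;

        bool insideTag = false;
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];

            if (insideTag)
            {
                // Assumption: '>' always closes the current tag.
                if (c == '>') insideTag = false;
                continue;
            }
            if (c == '<') { insideTag = true; continue; }
            if (char.IsWhiteSpace(c)) continue;

            // First visible text character: splice the span around it.
            return html.Substring(0, i)
                 + "<span class=\"" + cssClass + "\">" + c + "</span>"
                 + html.Substring(i + 1);
        }

        // No text outside tags: return the input unchanged.
        return html;
    }

    public static void Main()
    {
        Console.WriteLine(WrapFirstTextChar("hello world"));
        // <span class="my-style">h</span>ello world

        Console.WriteLine(WrapFirstTextChar("<p><a href=\"#\">h</a>hello world</p>"));
        // <p><a href="#"><span class="my-style">h</span></a>hello world</p>
    }
}
```

Because the scan works on positions rather than on a search for the word, the same character appearing earlier inside a tag name or attribute does not affect the result.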

The :first-letter selector can help you do this with CSS. http://www.w3schools.com/cssref/sel_firstletter.asp