Converting to sentence case using Regex and C#

Keywords: sentence, case, convert, Regex, using | Updated: 2023-09-27 18:07:37

I am using the code below to convert a string to sentence case.

var sentenceRegex = new Regex(@"(^[a-z])|[?!.:;]\s+(.)", RegexOptions.ExplicitCapture);
var result = sentenceRegex.Replace(toConvert.ToLower(), s => s.Value.ToUpper());

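However, it fails when a sentence starts with an HTML tag, as in the example below.

I want to skip the HTML tags and convert only the text to sentence case. Current text: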

<BOLD_HTML_TAG>lorem ipsum is simply dummy</BOLD_HTML_TAG> text of the printing and typesetting industry.
<PARAGRAPH_TAG>LOREM ipsum has been the industry's standard dummy
text ever since the 1500s</PARAGRAPH_TAG>.

After conversion to sentence case, the output should be:

<BOLD_HTML_TAG>Lorem ipsum is simply dummy</BOLD_HTML_TAG> text of the
printing and typesetting industry. <PARAGRAPH_TAG>Lorem ipsum has been
the industry's standard dummy text ever since the
1500s</PARAGRAPH_TAG>.

Can anyone help me with a regular expression that ignores the HTML tags in the string (without removing them) and converts the text to sentence case?

It may not be pretty, but it works ;)

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string toConvert = "<BOLD_HTML_TAG>lorem ipsum is simply dummy</BOLD_HTML_TAG> text of the printing and typesetting industry." +
                "<PARAGRAPH_TAG>LOREM ipsum has been the industry's standard dummy " +
                "text ever since the 1500s</PARAGRAPH_TAG>.";
        // Match the text between an opening <tag> and its matching closing </tag>.
        var sentenceRegex = new Regex(@"(?<=<(?<tag>\w+)>).*?(?=</\k<tag>>)", RegexOptions.ExplicitCapture);
        // Uppercase the first character of each match and lowercase the rest;
        // an empty match (for example <b></b>) is returned unchanged.
        var result = sentenceRegex.Replace(toConvert, s => s.Value.Length == 0 ? s.Value : s.Value.Substring(0, 1).ToUpper() + s.Value.Substring(1).ToLower());
        Console.WriteLine(toConvert + "\r\n" + result);
    }
}

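The regex matches the tags with a named group inside a lookbehind and a backreference to that group inside a lookahead, so only the text between an opening tag and its matching closing tag is captured. The replacement then uppercases the first letter of that text and lowercases the rest. Text outside the tags is left unchanged.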
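
If the text outside the tags also needs sentence casing, as in the expected output from the question, one possible alternative is to split the input into tags and text and walk the text character by character. This is only a rough sketch, not the method from the answer above: the names SentenceCaser, ToSentenceCase and SentenceCaseDemo are made up for illustration, it assumes that anything of the form <...> is a tag to be copied verbatim, and it reuses the terminator set ? ! . : ; from the question's regex.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class SentenceCaser
{
    // Assumption: anything shaped like <...> is a tag. A stray '<' is treated as text.
    private static readonly Regex TagOrText = new Regex(@"<[^>]+>|[^<]+|<");

    // Same sentence terminators as the character class in the question's regex.
    private const string Terminators = "?!.:;";

    public static string ToSentenceCase(string input)
    {
        var output = new StringBuilder(input.Length);
        bool capitalizeNext = true;    // the start of the input starts a sentence
        bool afterTerminator = false;  // the last text character was a terminator

        foreach (Match token in TagOrText.Matches(input))
        {
            string value = token.Value;

            // Tags are copied as-is and do not reset the sentence state.
            if (value.Length > 1 && value[0] == '<')
            {
                output.Append(value);
                continue;
            }

            foreach (char c in value)
            {
                if (char.IsLetter(c))
                {
                    output.Append(capitalizeNext ? char.ToUpperInvariant(c) : char.ToLowerInvariant(c));
                    capitalizeNext = false;
                    afterTerminator = false;
                }
                else
                {
                    output.Append(c);
                    if (Terminators.IndexOf(c) >= 0)
                    {
                        afterTerminator = true;
                    }
                    else if (char.IsWhiteSpace(c))
                    {
                        // A terminator followed by whitespace starts a new sentence.
                        capitalizeNext |= afterTerminator;
                        afterTerminator = false;
                    }
                    else
                    {
                        afterTerminator = false;
                        // A sentence that starts with a digit has no letter to capitalize.
                        if (char.IsDigit(c)) capitalizeNext = false;
                    }
                }
            }
        }
        return output.ToString();
    }
}

public class SentenceCaseDemo
{
    public static void Main()
    {
        string toConvert = "<BOLD_HTML_TAG>lorem ipsum is simply dummy</BOLD_HTML_TAG> text of the printing and typesetting industry. " +
                "<PARAGRAPH_TAG>LOREM ipsum has been the industry's standard dummy " +
                "text ever since the 1500s</PARAGRAPH_TAG>.";
        Console.WriteLine(SentenceCaser.ToSentenceCase(toConvert));
    }
}
```

Because the sentence state is carried across tags, a terminator just before a closing tag still causes the next letter after the following whitespace to be capitalized. The sketch would misread text such as "a < b and c > d" as containing a tag, so real HTML is probably better handled with an HTML parser.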