C#: fastest way to remove extra white spaces

Keywords: method, whitespace, remove, extra | Updated: 2023-09-27 18:00:07

What is the fastest way to replace extra white spaces with a single white space?
For example, from

foo      bar 

to

foo bar

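The fastest way? Iterate over the string and build a second copy in a StringBuilder character by character, copying only one space for each group of spaces.

The easier-to-type Replace variants will create a bucket load of extra strings (or waste time building the regex DFA).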
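The approach described above might look roughly like the following. This is only a minimal sketch, not the code that was actually benchmarked: the class and method names (SpaceCollapser, CollapseSpaces) are made up for illustration, and it assumes that only the plain space character ' ' counts as a space.

```csharp
using System;
using System.Text;

public static class SpaceCollapser
{
    // Hypothetical helper: copies the input into a StringBuilder,
    // keeping only the first space of every run of spaces.
    public static string CollapseSpaces(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var sb = new StringBuilder(input.Length);
        bool previousWasSpace = false;
        foreach (char c in input)
        {
            bool isSpace = c == ' ';
            // Skip a space only when the previous character was also a space.
            if (!(isSpace && previousWasSpace)) sb.Append(c);
            previousWasSpace = isSpace;
        }
        return sb.ToString();
    }
}
```

Note that this sketch does not trim: a single leading or trailing space survives, so call Trim() on the result if that matters.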

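Edit with comparison results:

Using http://ideone.com/NV6EzU with n=50 (I had to reduce it on ideone because it took so long that they had to kill my process), I get:

Regex: 7771ms.

StringBuilder: 894ms.

As expected, Regex is horribly inefficient for something this simple.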

You could use a regex:

static readonly Regex trimmer = new Regex(@"\s\s+");
s = trimmer.Replace(s, " ");

To improve performance, pass RegexOptions.Compiled.
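For illustration, a compiled variant could be set up as in the sketch below. The wrapper class and method names (RegexCollapser, Collapse) are assumptions made for this example. The pattern is the same \s\s+ as in the snippet above, so a lone whitespace character such as a single tab is left untouched.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCollapser
{
    // Built once and reused; RegexOptions.Compiled trades startup cost for faster matching.
    private static readonly Regex Trimmer = new Regex(@"\s\s+", RegexOptions.Compiled);

    // Hypothetical wrapper: replaces every run of two or more whitespace characters with one space.
    public static string Collapse(string s)
    {
        return Trimmer.Replace(s, " ");
    }
}
```

Whether the compiled option pays off depends on how often the regex is used, so it is worth measuring on your own data.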

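A bit late, but I have done some benchmarking to find the fastest way to remove extra whitespace. If there are any faster answers, I'd love to add them.

Results:

  1. NormalizeWhiteSpaceForLoop: 156 ms (by me, based on my answer about removing all whitespace)
  2. NormalizeWhiteSpace: 267 ms (by Alex K.)
  3. RegexCompiled: 1950 ms (by SLaks)
  4. Regex: 2261 ms (by SLaks)

Code: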

public class RemoveExtraWhitespaces
{
    public static string WithRegex(string text)
    {
        return Regex.Replace(text, @"\s+", " ");
    }
    public static string WithRegexCompiled(Regex compiledRegex, string text)
    {
        return compiledRegex.Replace(text, " ");
    }
    public static string NormalizeWhiteSpace(string input)
    {
        if (string.IsNullOrEmpty(input))
            return string.Empty;
        int current = 0;
        char[] output = new char[input.Length];
        bool skipped = false;
        foreach (char c in input.ToCharArray())
        {
            if (char.IsWhiteSpace(c))
            {
                if (!skipped)
                {
                    if (current > 0)
                        output[current++] = ' ';
                    skipped = true;
                }
            }
            else
            {
                skipped = false;
                output[current++] = c;
            }
        }
        return new string(output, 0, current);
    }
    public static string NormalizeWhiteSpaceForLoop(string input)
    {
        int len = input.Length,
            index = 0,
            i = 0;
        var src = input.ToCharArray();
        bool skip = false;
        char ch;
        for (; i < len; i++)
        {
            ch = src[i];
            switch (ch)
            {
                case '\u0020':
                case '\u00A0':
                case '\u1680':
                case '\u2000':
                case '\u2001':
                case '\u2002':
                case '\u2003':
                case '\u2004':
                case '\u2005':
                case '\u2006':
                case '\u2007':
                case '\u2008':
                case '\u2009':
                case '\u200A':
                case '\u202F':
                case '\u205F':
                case '\u3000':
                case '\u2028':
                case '\u2029':
                case '\u0009':
                case '\u000A':
                case '\u000B':
                case '\u000C':
                case '\u000D':
                case '\u0085':
                    if (skip) continue;
                    src[index++] = ch;
                    skip = true;
                    continue;
                default:
                    skip = false;
                    src[index++] = ch;
                continue;
            }
        }
        return new string(src, 0, index);
    }
}

Tests:

[TestFixture]
public class RemoveExtraWhitespacesTest
{
    private const string _text = "foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                    
 moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  
bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo foo                  bar                  foobar                     moo ";
    private const string _expected = "foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo foo bar foobar moo ";
    private const int _iterations = 10000;
    [Test]
    public void Regex()
    {
        var result = TimeAction("Regex", () => RemoveExtraWhitespaces.WithRegex(_text));
        Assert.AreEqual(_expected, result);
    }
    [Test]
    public void RegexCompiled()
    {
        var compiledRegex = new Regex(@"\s+", RegexOptions.Compiled);
        var result = TimeAction("RegexCompiled", () => RemoveExtraWhitespaces.WithRegexCompiled(compiledRegex, _text));
        Assert.AreEqual(_expected, result);
    }
    [Test]
    public void NormalizeWhiteSpace()
    {
        var result = TimeAction("NormalizeWhiteSpace", () => RemoveExtraWhitespaces.NormalizeWhiteSpace(_text));
        Assert.AreEqual(_expected, result);
    }
    [Test]
    public void NormalizeWhiteSpaceForLoop()
    {
        var result = TimeAction("NormalizeWhiteSpaceForLoop", () => RemoveExtraWhitespaces.NormalizeWhiteSpaceForLoop(_text));
        Assert.AreEqual(_expected, result);
    }
    public string TimeAction(string name, Func<string> func)
    {
        var timer = Stopwatch.StartNew();
        string result = string.Empty;
        for (int i = 0; i < _iterations; i++)
        {
            result = func();
        }
        timer.Stop();
        Console.WriteLine(string.Format("{0}: {1} ms", name, timer.ElapsedMilliseconds));
        return result;
    }
}
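
string q = " Hello     how are   you           doing?";
string a = String.Join(" ", q.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries));

I use the methods below. They handle all whitespace characters rather than only spaces, trim leading and trailing whitespace, remove extra whitespace, and replace every whitespace run with a space character (so we end up with a uniform space separator). And these methods are fast.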

public static String CompactWhitespaces( String s )
{
    StringBuilder sb = new StringBuilder( s );
    CompactWhitespaces( sb );
    return sb.ToString();
}
public static void CompactWhitespaces( StringBuilder sb )
{
    if( sb.Length == 0 )
        return;
    // set [start] to first not-whitespace char or to sb.Length
    int start = 0;
    while( start < sb.Length )
    {
        if( Char.IsWhiteSpace( sb[ start ] ) )
            start++;
        else 
            break;
    }
    // if [sb] has only whitespaces, then return empty string
    if( start == sb.Length )
    {
        sb.Length = 0;
        return;
    }
    // set [end] to last not-whitespace char
    int end = sb.Length - 1;
    while( end >= 0 )
    {
        if( Char.IsWhiteSpace( sb[ end ] ) )
            end--;
        else 
            break;
    }
    // compact string
    int dest = 0;
    bool previousIsWhitespace = false;
    for( int i = start; i <= end; i++ )
    {
        if( Char.IsWhiteSpace( sb[ i ] ) )
        {
            if( !previousIsWhitespace )
            {
                previousIsWhitespace = true;
                sb[ dest ] = ' ';
                dest++;
            }
        }
        else
        {
            previousIsWhitespace = false;
            sb[ dest ] = sb[ i ];
            dest++;
        }
    }
    sb.Length = dest;
}
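
string text = "foo       bar";
text = Regex.Replace(text, @"\s+", " ");
// text = "foo bar"

This solution works with spaces, tabs, and newlines. If you only want spaces, replace '\s' with ' '.

I needed one of these for larger strings and came up with the routine below.

Any consecutive whitespace (including tabs and newlines) is replaced with whatever is in normalizeTo. Leading and trailing whitespace is removed.

It is around 8 times faster than a RegEx on my 5k->5mil char strings.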

internal static string NormalizeWhiteSpace(string input, char normalizeTo = ' ')
{
    if (string.IsNullOrEmpty(input))
        return string.Empty;
    int current = 0;
    char[] output = new char[input.Length];
    bool skipped = false;
    foreach (char c in input.ToCharArray())
    {
        if (char.IsWhiteSpace(c))
        {
            if (!skipped)
            {
                if (current > 0)
                    output[current++] = normalizeTo;
                skipped = true;
            }
        }
        else
        {
            skipped = false;
            output[current++] = c;
        }
    }
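    // Drop the trailing separator; the current > 0 guard covers all-whitespace input,
    // where nothing was written and current - 1 would be -1.
    return new string(output, 0, skipped && current > 0 ? current - 1 : current);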
}
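
string yourWord = "beep boop    baap beep   boop    baap             beep";
// Assumes the input never contains '|', which is used here as a temporary marker.
yourWord = yourWord.Replace("  ", " |").Replace("| ", "").Replace("|", "");

Two approaches:

  1. Remove the extra whitespace substrings
  2. Loop over the characters of the original string, as Blindy suggested

Here is the best balance of performance and readability I have found (timed using runs of 100,000 iterations). Sometimes this tests faster than a less legible version, and it is at most 5% slower. On my small test string, regex takes 4.24 times as long.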

public static string RemoveExtraWhitespace(string str)
    {
        var sb = new StringBuilder();
        var prevIsWhitespace = false;
        foreach (var ch in str)
        {
            var isWhitespace = char.IsWhiteSpace(ch);
            if (prevIsWhitespace && isWhitespace)
            {
                continue;
            }
            sb.Append(ch);
            prevIsWhitespace = isWhitespace;
        }
        return sb.ToString();
    }

It's not fast, but if simplicity helps, this works:

while (text.Contains("  ")) text=text.Replace("  ", " ");

I tried using an array, and also a version without if.

Results

PS C:\dev\Spaces> dotnet run -c release
// .NETCoreApp,Version=v3.0
Seed=7, n=20, s.Length=2828670
Regex by SLaks            1407ms, len=996757
StringBuilder by Blindy    154ms, len=996757
Array                      130ms, len=996757
NoIf                        91ms, len=996757
All match!

Methods

private static string WithNoIf(string s)
{
    var dst = new char[s.Length];
    uint end = 0;
    char prev = char.MinValue;
    for (int k = 0; k < s.Length; ++k)
    {
        var c = s[k];
        dst[end] = c;
        // We'll move forward if the current character is not ' ' or if the previous character is not ' '.
        // To avoid 'if', take the differences for c and prev, then use bitwise operations to get
        // 0 if x is 0 or 1 if x is non-zero.
        // OR (rather than +) combines the two differences so that they cannot cancel out to 0.
        uint x = (uint)(' ' - c) | (uint)(' ' - prev); // non-zero if either difference is non-zero
        end += ((x | (~x + 1)) >> 31) & 1; // https://stackoverflow.com/questions/3912112/check-if-a-number-is-non-zero-using-bitwise-operators-in-c by ruslik
        prev = c;
    }
    return new string(dst, 0, (int)end);
}
private static string WithArray(string s)
{
    var dst = new char[s.Length];
    int end = 0;
    char prev = char.MinValue;
    for (int k = 0; k < s.Length; ++k)
    {
        char c = s[k];
        if (c != ' ' || prev != ' ') dst[end++] = c;
        prev = c;
    }
    return new string(dst, 0, end);
}

Test code

public static void Main()
{
    const int n = 20;
    const int seed = 7;
    string s = GetTestString(seed);
    var fs = new (string Name, Func<string, string> Func)[]{
        ("Regex by SLaks", WithRegex),
        ("StringBuilder by Blindy", WithSb),
        ("Array", WithArray),
        ("NoIf", WithNoIf),
    };
    Console.WriteLine($"Seed={seed}, n={n}, s.Length={s.Length}");
    var d = new Dictionary<string, string>(); // method, result
    var sw = new Stopwatch();
    foreach (var f in fs)
    {
        sw.Restart();
        var r = "";
        for( int i = 0; i < n; i++) r = f.Func(s);
        sw.Stop();
        d[f.Name] = r;
        Console.WriteLine($"{f.Name,-25} {sw.ElapsedMilliseconds,4}ms, len={r.Length}");
    }
    Console.WriteLine(d.Values.All( v => v == d.Values.First()) ? "All match!" : "Not all match! BAD");
}
private static string GetTestString(int seed)
{
    // by blindy from https://stackoverflow.com/questions/6442421/c-sharp-fastest-way-to-remove-extra-white-spaces
    var rng = new Random(seed);
    // random 1mb+ string (it's slow enough...)
    StringBuilder ssb = new StringBuilder(1 * 1024 * 1024);
    for (int i = 0; i < 1 * 1024 * 1024; ++i)
        if (rng.Next(5) == 0)
            ssb.Append(new string(' ', rng.Next(20)));
        else
            ssb.Append((char)(rng.Next(128 - 32) + 32));
    string s = ssb.ToString();
    return s;
}

This code works fine. I haven't measured the performance, though.

string text = "   hello    -  world,  here   we go  !!!    a  bc    ";
string result = string.Join(" ", text.Split().Where(x => x != ""));
// Output
// "hello - world, here we go !!! a bc"

Try this:

System.Text.RegularExpressions.Regex.Replace(input, @"\s+", " ");

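Some requirements in this question are unclear and deserve some thought.

  1. Do you want a single leading or trailing whitespace character?
  2. When you replace all whitespace with a single character, do you want that character to be consistent? (i.e., many of these solutions would replace \t\t with \t and '  ' with ' ')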
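To make the two questions above concrete, here is a small sketch that contrasts the behaviours. The class and method names (WhitespaceChoices, ToSingleSpaces, KeepFirstOfRun) are invented for this illustration and do not come from any of the answers.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class WhitespaceChoices
{
    // Consistent replacement: every whitespace run becomes exactly one ' '.
    public static string ToSingleSpaces(string s)
    {
        return Regex.Replace(s, @"\s+", " ");
    }

    // Inconsistent replacement: keeps whichever character starts each whitespace run,
    // so a run of tabs collapses to a tab rather than to a space.
    public static string KeepFirstOfRun(string s)
    {
        var sb = new StringBuilder(s.Length);
        bool previousWasWhitespace = false;
        foreach (char c in s)
        {
            bool isWhitespace = char.IsWhiteSpace(c);
            if (!(isWhitespace && previousWasWhitespace)) sb.Append(c);
            previousWasWhitespace = isWhitespace;
        }
        return sb.ToString();
    }
}
```

For the input "\t\tfoo  \tbar ", ToSingleSpaces gives " foo bar ", KeepFirstOfRun gives "\tfoo bar ", and calling Trim() on the first result gives "foo bar". Which of these is "right" depends on the answers to the two questions.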

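Here is a fairly efficient version that replaces all whitespace with a single space and removes any leading and trailing whitespace before the for loop.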

  public static string WhiteSpaceToSingleSpaces(string input)
  {
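    // Trim first so that the length check and input[0] below see the trimmed text.
    input = input.Trim();
    if (input.Length < 2) 
        return input;
    StringBuilder sb = new StringBuilder(input.Length);
    // After Trim() the first character is not whitespace, so copy it unchanged
    // (the loop below starts at index 1).
    sb.Append(input[0]);
    bool lastCharWhiteSpace = false;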
    for (int i = 1; i < input.Length; i++)
    {
        bool whiteSpace = char.IsWhiteSpace(input[i]);
        //Skip duplicate whitespace characters
        if (whiteSpace && lastCharWhiteSpace)
            continue;
        //Replace all whitespace with a single space.
        if (whiteSpace)
            sb.Append(' ');
        else
            sb.Append(input[i]);
        //Keep track of the last character's whitespace status
        lastCharWhiteSpace = whiteSpace;
    }
    return sb.ToString();
  }

I don't know whether this is the fastest way, but I use it and it works for me:

    /// <summary>
    /// Remove all extra spaces and tabs between words in the specified string!
    /// </summary>
    /// <param name="str">The specified string.</param>
    public static string RemoveExtraSpaces(string str)
    {
        str = str.Trim();
        StringBuilder sb = new StringBuilder();
        bool space = false;
        foreach (char c in str)
        {
            if (char.IsWhiteSpace(c) || c == (char)9) { space = true; }
            else { if (space) { sb.Append(' '); }; sb.Append(c); space = false; };
        }
        return sb.ToString();
    }

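It's funny, but on my PC the method below is just as fast as Sergey Povalyaev's StringBuilder approach (about 282 ms for 1000 repetitions on 10k source strings). I'm not sure about memory usage, though.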

string RemoveExtraWhiteSpace(string src, char[] wsChars){
   return string.Join(" ",src.Split(wsChars, StringSplitOptions.RemoveEmptyEntries));
}

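Obviously, it works with any characters, not just whitespace.

Although this is not what the OP asked for, if what you really need is to replace specific consecutive characters in a string with a single instance, you can use this relatively efficient method: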

    string RemoveDuplicateChars(string src, char[] dupes){  
        var sd = (char[])dupes.Clone();  
        Array.Sort(sd);
        var res = new StringBuilder(src.Length);
        for(int i = 0; i<src.Length; i++){
            if( i==0 || src[i]!=src[i-1] || Array.BinarySearch(sd,src[i])<0){
                res.Append(src[i]); 
            }
        }
        return res.ToString();
    }

    public string GetCorrectString(string IncorrectString)
    {
        string[] strarray = IncorrectString.Split(' ');
        var sb = new StringBuilder();
        foreach (var str in strarray)
        {
            if (str != string.Empty)
            {
                sb.Append(str).Append(' ');
            }
        }
        return sb.ToString().Trim();
    }

I just whipped this up and haven't tested it yet. But I think it's elegant, and it avoids regex:

    /// <summary>
    /// Removes extra white space.
    /// </summary>
    /// <param name="s">
    /// The string
    /// </param>
    /// <returns>
    /// The string, with only single white-space groupings. 
    /// </returns>
    public static string RemoveExtraWhiteSpace(this string s)
    {
        if (s.Length == 0)
        {
            return string.Empty;
        }
        var stringBuilder = new StringBuilder();
        var whiteSpaceCount = 0;
        foreach (var character in s)
        {
            if (char.IsWhiteSpace(character))
            {
                whiteSpaceCount++;
            }
            else
            {
                whiteSpaceCount = 0;
            }
            if (whiteSpaceCount > 1)
            {
                continue;
            }
            stringBuilder.Append(character);
        }
        return stringBuilder.ToString();
    }

Am I missing something here? I came up with this:

// Input: "HELLO     BEAUTIFUL       WORLD!"
private string NormalizeWhitespace(string inputStr)
{
    // First split the string on the spaces but exclude the spaces themselves
    // Using the input string the length of the array will be 3. If the spaces
    // were not filtered out they would be included in the array
    var splitParts = inputStr.Split(' ').Where(x => x != "").ToArray();
    // The result string has to be declared before it is appended to.
    var retVal = string.Empty;
   // Now iterate over the parts in the array and add them to the return
   // string. If the current part is not the last part, add a space after.
   for (int i = 0; i < splitParts.Count(); i++)
   {
        retVal += splitParts[i];
        if (i != splitParts.Count() - 1)
        {
            retVal += " ";
        }
   }
    return retVal;
}
// Would return "HELLO BEAUTIFUL WORLD!"

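I know I'm creating a second string here to return it, as well as creating the splitParts array. I just figured this is pretty straightforward. Maybe I'm not taking some potential scenarios into account.

I know this is really old, but the easiest way to compact whitespace (replace any recurring whitespace character with a single "space" character) is as follows: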

    public static string CompactWhitespace(string astring)
    {
        if (!string.IsNullOrEmpty(astring))
        {
            bool found = false;
            StringBuilder buff = new StringBuilder();
            foreach (char chr in astring.Trim())
            {
                if (char.IsWhiteSpace(chr))
                {
                    if (found)
                    {
                        continue;
                    }
                    found = true;
                    buff.Append(' ');
                }
                else
                {
                    if (found)
                    {
                        found = false;
                    }
                    buff.Append(chr);
                }
            }
            return buff.ToString();
        }
        return string.Empty;
    }

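I'm not very familiar with C#, so my code is neither elegant nor the most efficient. I came here looking for an answer that fits my use case, but I couldn't find one (or couldn't figure one out).

For my use case, I needed to normalize all whitespace (WS: {space, tab, cr lf}) under the following conditions:

  • WS can come in any combination
  • A sequence of WS is replaced with the most significant WS in it
  • In some cases tabs need to be preserved (a tab-separated file, for example, in which case repeated tabs also need to be preserved). But in most cases they have to be converted to spaces

So here is a sample input and the expected output (disclaimer: my code has been tested only against this example)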


        Every night    in my            dreams  I see you, I feel you
    That's how    I know you go on
Far across the  distance and            places between us   

You            have                 come                    to show you go on

is converted to

Every night in my dreams I see you, I feel you
That's how I know you go on
Far across the distance and places between us
You have come to show you go on

Here is my code

using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main(string text)
    {
        bool preserveTabs = false;
        //[Step 1]: Clean up white spaces around the text
        text = text.Trim();
        //Console.Write("\nTrim\n======\n" + text);
        //[Step 2]: Reduce repeated spaces to single space. 
        text = Regex.Replace(text, @" +", " ");
        // Console.Write("\nNo repeated spaces\n======\n" + text);
        //[Step 3]: Handle tabs. Tabs need to be treated with care because 
        //in some files tabs have special meaning (e.g. tab-separated files)
        if(preserveTabs)
        {
            text = Regex.Replace(text, @" *\t *", "\t");
        }
        else
        {
            text = Regex.Replace(text, @"[ \t]+", " ");
        }
        //Console.Write("\nTabs preserved\n======\n" + text);
        //[Step 4]: Reduce repeated new lines (and other white spaces around them)
                  //into a single new line.
        text = Regex.Replace(text, @"([\t ]*(\n)+[\t ]*)+", "\n");
        Console.Write("\nClean New Lines\n======\n" + text);    
    }
}

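See this code in action here: https://dotnetfiddle.net/eupjIU

Here is an adjusted version of a well-known algorithm, in this case for comparing "similar" strings: it is case-insensitive, does not care about multiple spaces, and tolerates NULLs as well. Do not trust benchmarks: this one was put into a data-comparison-intensive task on approx. 1/4 GB of data, and the speedup of the whole action was close to 100% (the commented-out part vs. this algorithm, 5/10 min). Some of the other approaches here differed less, by around 30%. I would say that building the best algorithm requires going down to the disassembly and checking what the compiler does in both release and debug builds. There is also a half-as-simple full trim posted as an answer to a similar (C) question, which is still case-sensitive.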

public static bool Differs(string srcA, string srcB)
{
    //return string.Join(" ", (a?.ToString()??String.Empty).ToUpperInvariant().Split(new char[0], StringSplitOptions.RemoveEmptyEntries).ToList().Select(x => x.Trim()))
    //    != string.Join(" ", (b?.ToString()??String.Empty).ToUpperInvariant().Split(new char[0], StringSplitOptions.RemoveEmptyEntries).ToList().Select(x => x.Trim()));
    if (srcA == null) { if (srcB == null) return false; else srcA = String.Empty; } // A == null + B == null same or change A to empty string
    if (srcB == null) { if (srcA == null) return false; else srcB = String.Empty; }
    int dstIdxA = srcA.Length, dstIdxB = srcB.Length; // are there any remaining (front) chars in a string ?
    int planSpaceA = 0, planSpaceB = 0; // state automaton 1 after non-WS, 2 after WS
    bool validA, validB; // are there any remaining (front) chars in a array ?
    char chA = '\0', chB = '\0';
spaceLoopA:
        if (validA = (dstIdxA > 0)) {
            chA = srcA[--dstIdxA];
            switch (chA) {
                case '!': case '"': case '#': case '$': case '%': case '&': case '\'': case '(': case ')': case '*': case '+': case ',': case '-':
                case '.': case '/': case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': case ':':
                case ';': case '<': case '=': case '>': case '?': case '@': case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G':
                case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T':
                case 'U': case 'V': case 'W': case 'X': case 'Y': case 'Z': case '[': case '\\': case ']': case '^': case '_': case '`': // a-z are mapped to upper case below via & ~0x20
                case '{': case '|': case '}': case '~':
                    break; // ASCII except lowercase
                case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': case 'g': case 'h': case 'i':
                case 'j': case 'k': case 'l': case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
                case 's': case 't': case 'u': case 'v': case 'w': case 'x': case 'y': case 'z':
                    chA = (Char)(chA & ~0x20);
                    break;
                case '\u0020': case '\u00A0': case '\u1680': case '\u2000': case '\u2001':
                case '\u2002': case '\u2003': case '\u2004': case '\u2005': case '\u2006':
                case '\u2007': case '\u2008': case '\u2009': case '\u200A': case '\u202F':
                case '\u205F': case '\u3000': case '\u2028': case '\u2029': case '\u0009':
                case '\u000A': case '\u000B': case '\u000C': case '\u000D': case '\u0085':
                    if (planSpaceA == 1) planSpaceA = 2; // cycle here to address multiple WS before non-WS part
                    goto spaceLoopA;
                default:
                    chA = Char.ToUpper(chA);
                    break;
        }}
spaceLoopB:
        if (validB = (dstIdxB > 0)) { // 2nd string / same logic
            chB = srcB[--dstIdxB];
            switch (chB) {
                case '!': case '"': case '#': case '$': case '%': case '&': case '\'': case '(': case ')': case '*': case '+': case ',': case '-':
                case '.': case '/': case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': case ':':
                case ';': case '<': case '=': case '>': case '?': case '@': case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G':
                case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T':
                case 'U': case 'V': case 'W': case 'X': case 'Y': case 'Z': case '[': case '\\': case ']': case '^': case '_': case '`': // a-z are mapped to upper case below via & ~0x20
                    break;
                case '{': case '|': case '}': case '~':
                    break; // ASCII except lowercase
                case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': case 'g': case 'h': case 'i':
                case 'j': case 'k': case 'l': case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
                case 's': case 't': case 'u': case 'v': case 'w': case 'x': case 'y': case 'z':
                    chB = (Char)(chB & ~0x20);
                    break;
                case '\u0020': case '\u00A0': case '\u1680': case '\u2000': case '\u2001':
                case '\u2002': case '\u2003': case '\u2004': case '\u2005': case '\u2006':
                case '\u2007': case '\u2008': case '\u2009': case '\u200A': case '\u202F':
                case '\u205F': case '\u3000': case '\u2028': case '\u2029': case '\u0009':
                case '\u000A': case '\u000B': case '\u000C': case '\u000D': case '\u0085':
                    if (planSpaceB == 1) planSpaceB = 2;
                goto spaceLoopB;
                default:
                    chB = Char.ToUpper(chB);
                    break;
        }}
        if (planSpaceA != planSpaceB) return true; // both should/not have space now (0 init / 1 last non-WS / 2 last was WS)
        if (validA) { // some (non-WS) in A still
            if (validB) {
            if (chA != chB) return true; // both have another char to compare, are they different ?
            } else return true; // not in B not - they are different
        } else { // A done, current last pair equal => continue 2 never ending loop till B end (by WS only to be same)
            if (!validB) return false; // done and end-up here without leaving by difference => both are same except some WSs arround
            else return true; // A done, but non-WS remains in B - different
        }  // A done, B had no non-WS or non + WS last follow - never ending loop continue
        planSpaceA = 1; planSpaceB = 1;
        goto spaceLoopA; // performs better
    }
}

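You could use indexOf to first find where a whitespace sequence starts, then use the replace method to change the whitespace to "". From there, you can use the index you obtained and place a single whitespace character at that position.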
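One possible reading of this suggestion is sketched below. It is an interpretation rather than the answerer's code: the names (IndexOfCollapser, CollapseSpaces) are hypothetical, it handles only the plain space character, and it uses String.Remove to cut the surplus spaces instead of a replace-then-insert step. Because every Remove call copies the string, it is not expected to be fast on large inputs.

```csharp
using System;

public static class IndexOfCollapser
{
    // Hypothetical helper: finds each run of two or more spaces with IndexOf
    // and removes everything after the first space of the run.
    public static string CollapseSpaces(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        int start = s.IndexOf("  ", StringComparison.Ordinal);
        while (start >= 0)
        {
            // Walk to the end of this run of spaces.
            int end = start;
            while (end < s.Length && s[end] == ' ') end++;

            // Keep the space at 'start' and remove the remaining (end - start - 1) spaces.
            s = s.Remove(start + 1, end - start - 1);

            // Continue searching after the space that was kept.
            start = s.IndexOf("  ", start + 1, StringComparison.Ordinal);
        }
        return s;
    }
}
```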

For those who just want to copy-paste and move on:

    private string RemoveExcessiveWhitespace(string value)
    {
        if (value == null) { return null; }
        var builder = new StringBuilder();
        var ignoreWhitespace = false;
        foreach (var c in value)
        {
            if (!ignoreWhitespace || c != ' ')
            {
                builder.Append(c);
            }
            ignoreWhitespace = c == ' ';
        }
        return builder.ToString();
    }

My version (improved from Stian's answer). It should be very fast.

public static string TrimAllExtraWhiteSpaces(this string input)
{
    if (string.IsNullOrEmpty(input))
    {
        return input;
    }
    var current = 0;
    char[] output = new char[input.Length];
    var charArray = input.ToCharArray();
    for (var i = 0; i < charArray.Length; i++)
    {
        if (!char.IsWhiteSpace(charArray[i]))
        {
            if (current > 0 && i > 0 && char.IsWhiteSpace(charArray[i - 1]))
            {
                output[current++] = ' ';
            }
            output[current++] = charArray[i];
        }
    }
    return new string(output, 0, current);
}

No need for complex code! Here is a simple piece of code that removes any duplicates:

public static String RemoveCharOccurence(String s, char[] remove)
{
    String s1 = s;
    foreach(char c in remove)
    {
        s1 = RemoveCharOccurence(s1, c);
    }
    return s1;
}
public static String RemoveCharOccurence(String s, char remove)
{
    StringBuilder sb = new StringBuilder(s.Length);
    Boolean removeNextIfMatch = false;
    foreach(char c in s)
    {
        if(c == remove)
        {
            if(removeNextIfMatch)
                continue;
            else
                removeNextIfMatch = true;
        }
        else
            removeNextIfMatch = false;
        sb.Append(c);
    }
    return sb.ToString();
}

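It's very simple: just use Regex.Replace (the plain String.Replace method does not interpret regular expressions, so it cannot match \s+):

string words = "Hello     world!";
words = Regex.Replace(words, "\\s+", " ");

Output >> "Hello world!"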

The simplest way I can think of:

Text = Text.Replace("  ", " ").Replace("  ", " ");
// Replace 2 spaces with 1 space, twice (this fully collapses only runs of up to four spaces)