"'\f', hexadecimal value 0x0C, is an invalid character" - can't clean it

Last updated: 2023-09-27 18:12:59

I'm getting the dreaded "'\f', hexadecimal value 0x0C, is an invalid character" error when trying to save my XDocument content to a file.

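I have googled it, and I have tried to clean the document by converting the XDocument to a string and stripping the non-ASCII characters from it (see below). I copied this method from another poster.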

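Now the exception is thrown when I try to pass the string to the cleaning method. It doesn't seem to want to convert to a string. I also tried writing the output to a text file so I could see what the problem character is, but that threw the exception as well. Any ideas?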

public void combineContentXmlWithS1000Dtemplate(XElement content)
{
    XDocument XDoc = XDocument.Load(GlobalVars.pathToDMshells + "\\descript.xml");
    content.Descendants("para").Where(e => string.IsNullOrEmpty(e.Value)).Remove(); // remove all empty para elements
    XDoc.XPathSelectElement("/dmodule/content").Add(content); // adds the new tree to the S1000D template XML
    writeHeaderData(XDoc);
    //System.IO.File.WriteAllText(GlobalVars.pathToOutput + "\\Log.text", XDoc.ToString()); // It even threw exception here
    string cleanedXML = CleanInvalidXmlChars(XDoc.ToString()); // clean the doc of non ascii characters
    XDocument FinishedDM = XDocument.Parse(cleanedXML);
    saveMyS1000Dfile(FinishedDM);
}
public static string CleanInvalidXmlChars(string StrInput)
{
    //Returns same value if the value is empty.
    if (string.IsNullOrWhiteSpace(StrInput))
    {
        return StrInput;
    }
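    // From the XML 1.0 spec, valid chars:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    // i.e. any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.
    // In .NET regex, \x takes exactly two hex digits, so the larger bounds use \uXXXX.
    // Surrogate code units are let through so characters above #xFFFF survive
    // (lone surrogates are therefore not filtered).
    string RegularExp = @"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\uD800-\uDFFF]";
    return Regex.Replace(StrInput, RegularExp, String.Empty);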
}

In the end, this worked for me (replacing the form feeds in the Word document with paragraph marks):

object missing = System.Reflection.Missing.Value;
object findText = "\f";       // form feed (0x0C)
object replaceText = "^p^p";  // two Word paragraph marks
currentDocument.Range().Find.Execute(ref findText,
    true, true, true, ref missing, ref missing, ref missing,
    ref missing, ref missing, ref replaceText, Word.WdReplace.wdReplaceAll,
    ref missing, ref missing, ref missing, ref missing);
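
If fixing the text at the source is not an option, another approach is to clean the values inside the tree before anything serializes it. The cleaning in the question runs on the result of XDoc.ToString(), but ToString() is already a serialization step and is where the exception is raised, so the cleaner never gets a string to work on. The following is only a rough sketch under that assumption: XmlSanitizer, StripInvalidXmlChars and CleanInPlace are hypothetical names, not part of the original code, and comments and processing instructions are not handled. It relies on XmlConvert.IsXmlChar and XmlConvert.IsXmlSurrogatePair from System.Xml (.NET 4.0 and later).

```csharp
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper class (not from the original post).
public static class XmlSanitizer
{
    // Returns the input with every character that XML 1.0 forbids removed.
    public static string StripInvalidXmlChars(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }

        var sb = new StringBuilder(value.Length);
        for (int i = 0; i < value.Length; i++)
        {
            char c = value[i];
            if (XmlConvert.IsXmlChar(c))
            {
                // Ordinary valid character (IsXmlChar is false for surrogates).
                sb.Append(c);
            }
            else if (i + 1 < value.Length
                     && XmlConvert.IsXmlSurrogatePair(value[i + 1], c))
            {
                // Valid high/low surrogate pair: keep both code units.
                sb.Append(c).Append(value[i + 1]);
                i++;
            }
            // Anything else (e.g. '\f', 0x0C) is dropped.
        }
        return sb.ToString();
    }

    // Cleans text nodes (including CDATA) and attribute values in place,
    // so that a later ToString() or Save() does not hit an invalid character.
    public static void CleanInPlace(XElement root)
    {
        foreach (XText text in root.DescendantNodes().OfType<XText>().ToList())
        {
            text.Value = StripInvalidXmlChars(text.Value);
        }

        foreach (XAttribute attr in root.DescendantsAndSelf().Attributes().ToList())
        {
            attr.Value = StripInvalidXmlChars(attr.Value);
        }
    }
}
```

In the question's method, this would presumably be called as XmlSanitizer.CleanInPlace(content) before content is added to XDoc, or on XDoc.Root before saving. Note that it deletes the form feeds rather than turning them into paragraph breaks, which differs from the Word replacement above.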