Office interop: encoding argument has no effect when reading a Word document in C#

Keywords: Word, document, interop, encoding, Office | Updated: 2023-09-27 17:50:32

I am trying to read Unicode characters from a Word document, but I get question marks (????) instead.

Here is my code:

    Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
    object miss = System.Reflection.Missing.Value;
    object enc = Microsoft.Office.Core.MsoEncoding.msoEncodingEUCJapanese;
    object path = @"C:\Users\file.doc";
    object readOnly = true;
    // Encoding is the 11th argument of Documents.Open.
    Microsoft.Office.Interop.Word.Document docs = word.Documents.Open(ref path, ref miss, ref readOnly, ref miss, ref miss,
        ref miss, ref miss, ref miss, ref miss, ref miss, ref enc, ref miss, ref miss, ref miss, ref miss, ref miss);
    string totaltext = "";
    // The Paragraphs collection is 1-based, hence i + 1.
    for (int i = 0; i < docs.Paragraphs.Count; i++)
    {
        totaltext += " \r\n " + docs.Paragraphs[i + 1].Range.Text.ToString();
        Console.WriteLine(totaltext);
    }
    // Console.WriteLine(totaltext);
    docs.Close();
    word.Quit();


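Given the comments, it sounds like the problem is most likely just Console.WriteLine: the text is read correctly, but the console cannot display it.

Try writing to a file instead: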

// This will use Encoding.UTF8 by default.
using (var writer = File.CreateText("test.txt"))
{
    for (int i = 0; i < docs.Paragraphs.Count; i++)
    {
        writer.WriteLine(docs.Paragraphs[i + 1].Range.Text.ToString());
    }
}

Then open the file in Notepad, specifying UTF-8 as the encoding. I think you will see everything displayed correctly.
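
To illustrate why the console may be the culprit, here is a minimal sketch that does not need Word at all. It assumes the console's output encoding cannot represent the characters, which is simulated here with Encoding.ASCII; the class name ConsoleEncodingDemo, the helper RoundTrip, and the sample string are made up for this illustration and do not come from the question. Setting Console.OutputEncoding to UTF-8 may help, but whether the glyphs actually render still depends on the console font.

```csharp
using System;
using System.Text;

public static class ConsoleEncodingDemo
{
    // Hypothetical helper: pushes text through an encoding and back,
    // roughly what happens when the console encodes its output.
    public static string RoundTrip(string text, Encoding encoding)
    {
        return encoding.GetString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        // Stand-in for text read from the document (assumption).
        string sample = "日本語";

        // ASCII cannot represent these characters, so each one is
        // replaced by '?', which matches the symptom in the question.
        Console.WriteLine(RoundTrip(sample, Encoding.ASCII));

        // UTF-8 can represent them; ask the console to use it.
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(RoundTrip(sample, Encoding.UTF8));
    }
}
```

If the second line still shows boxes or question marks, the string itself is probably fine and only the console font is lacking, which is why checking the output through a UTF-8 file is the more reliable test.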