Non-English fonts not displayed in exported PDF

Updated: 2024-09-24 10:16:38

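I am using the source code below to convert HTML to PDF with wkhtmltopdf.exe. The problem is that the PDF shows "?" in place of all non-English characters, such as Chinese, Japanese, Russian, and Arabic. When the output is HTML, the characters display correctly. I tried setting different encodings for the HTML (utf-8, utf-16, gb2312), but the PDF still does not render non-English languages.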

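I read on the wkhtmltopdf forums about installing Chinese fonts on the server, but those instructions do not seem to apply to a Windows server environment. Besides, the fonts appear to be available on the server already, since the HTML renders correctly?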

Any ideas on how to get this working?

Code:

private void WritePDF(string html)
    {
        string inFileName,
                outFileName,
                tempPath;
        Process p;
        System.IO.StreamWriter stdin;
        ProcessStartInfo psi = new ProcessStartInfo();

        tempPath = Request.PhysicalApplicationPath 
            + ConfigurationManager.AppSettings[Constants.AppSettings.ExportToPdfTempFolder];
        inFileName = Session.SessionID + ".htm";
        outFileName = Session.SessionID + ".pdf";
        // run the conversion utility
        psi.UseShellExecute = false;
        psi.FileName = Server.MapPath(ConfigurationManager.AppSettings[Constants.AppSettings.ExportToPdfExecutablePath]);
        psi.CreateNoWindow = true;
        psi.RedirectStandardInput = true;
        psi.RedirectStandardOutput = true;
        psi.RedirectStandardError = true;
        //psi.StandardOutputEncoding = System.Text.Encoding.gb;
        // note that we tell wkhtmltopdf to be quiet and not run scripts
        // NOTE: I couldn't figure out a way to get both stdin and stdout redirected so we have to write to a file and then clean up afterwards
        psi.Arguments = "-q -n - " + tempPath + outFileName;
        p = Process.Start(psi);
        try
        {
            stdin = p.StandardInput;
            stdin.AutoFlush = true;
            stdin.Write(html);
            stdin.Close();
            if (p.WaitForExit(15000))
            {
                // NOTE: the application hangs when we use WriteFile (due to the Delete below?); this works
                Response.BinaryWrite(System.IO.File.ReadAllBytes(tempPath + outFileName));
            }
        }
        finally
        {
            p.Close();
            p.Dispose();
        }
        // delete the pdf
        System.IO.File.Delete(tempPath + outFileName);
    }


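wkhtmltopdf can definitely render non-English characters such as Chinese, Japanese, Russian, and Arabic. In most cases they fail to display because the HTML template lacks a meta tag with an appropriate charset definition. By default, .NET text writers such as StreamWriter use UTF-8 encoding, in which case the HTML template should contain the following meta tag: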

<meta http-equiv="content-type" content="text/html; charset=utf-8" />
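A minimal sketch of how one might make sure the tag is present before handing the HTML to wkhtmltopdf. The class and method names (`HtmlCharset`, `EnsureUtf8Meta`) are hypothetical, and the check is deliberately simple: it only detects whether some charset is declared, not whether that charset is UTF-8.

```csharp
using System.Text.RegularExpressions;

public static class HtmlCharset
{
    // Matches <meta charset="..."> as well as the http-equiv form,
    // where "charset=" appears inside the content attribute.
    private static readonly Regex CharsetMeta =
        new Regex(@"<meta[^>]*charset\s*=", RegexOptions.IgnoreCase);

    // Matches an opening <head> tag (with or without attributes), but not <header>.
    private static readonly Regex HeadOpen =
        new Regex(@"<head(\s[^>]*)?>", RegexOptions.IgnoreCase);

    private const string MetaTag =
        "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\" />";

    // Returns the HTML unchanged if it already declares a charset;
    // otherwise inserts the UTF-8 meta tag right after <head>.
    public static string EnsureUtf8Meta(string html)
    {
        if (CharsetMeta.IsMatch(html))
        {
            return html;
        }

        Match head = HeadOpen.Match(html);
        if (head.Success)
        {
            return html.Insert(head.Index + head.Length, MetaTag);
        }

        // No <head> element found: fall back to prepending the tag.
        return MetaTag + html;
    }
}
```

In the question's code this would be called as `stdin.Write(HtmlCharset.EnsureUtf8Meta(html));`. The meta tag only tells the renderer how to decode the bytes, so the bytes themselves still have to be UTF-8.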

By the way, instead of invoking wkhtmltopdf directly, you can use a .NET wrapper such as NReco.PdfGenerator (I am the author of this library).

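Make sure your font supports these characters and that your source is UTF-8, and it should work. I have tested wkhtmltopdf with Korean, Chinese, Polish, and various other characters, and it has always worked. See my answer to another similar question: https://stackoverflow.com/a/11862584/694325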

I write my HTML source to a file, but other than that my PDF generation is very similar to yours. I would check that everything is UTF-8 throughout.

using (TextWriter tw = new StreamWriter(path, false, System.Text.Encoding.UTF8))
{
    tw.WriteLine(contents);
}

PDFs generated from a source written like this seem to work without any problems.
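To make the "UTF-8 everywhere" check concrete, here is a hedged sketch of two ways to get UTF-8 bytes to wkhtmltopdf. The class and method names (`Utf8PdfInput`, `WriteHtmlFile`, `BuildStartInfo`, `WriteUtf8`) and the path parameters are hypothetical. One assumption to verify in your own environment is that the `StreamWriter` returned by `Process.StandardInput` may use the console code page rather than UTF-8. If it does, characters outside that code page would be written as "?", which would match the symptom in the question.

```csharp
using System.Diagnostics;
using System.IO;
using System.Text;

public static class Utf8PdfInput
{
    // Option 1: write the HTML to a file as UTF-8 and pass the path to wkhtmltopdf.
    public static void WriteHtmlFile(string path, string html)
    {
        File.WriteAllText(path, html, Encoding.UTF8);
    }

    // Builds the start info for a file-to-file conversion.
    // exePath, inPath and outPath are placeholders supplied by the caller.
    public static ProcessStartInfo BuildStartInfo(string exePath, string inPath, string outPath)
    {
        return new ProcessStartInfo
        {
            FileName = exePath,
            // -q: quiet, -n: disable JavaScript (both as in the question);
            // --encoding sets wkhtmltopdf's default text encoding for input.
            Arguments = "-q -n --encoding utf-8 \"" + inPath + "\" \"" + outPath + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Option 2: keep piping through stdin, but write raw UTF-8 bytes to the
    // underlying stream (process.StandardInput.BaseStream) instead of using
    // the StreamWriter, whose encoding is chosen by the runtime.
    public static void WriteUtf8(Stream stdinBaseStream, string html)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(html);
        stdinBaseStream.Write(bytes, 0, bytes.Length);
        stdinBaseStream.Flush();
    }
}
```

With the question's code, Option 2 would replace `stdin.Write(html)` with `Utf8PdfInput.WriteUtf8(p.StandardInput.BaseStream, html)` followed by `p.StandardInput.Close()`. Option 1 avoids stdin entirely, at the cost of one more temporary file to delete.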