How to convert an HTML file to PDF in C# using WkHTMLToSharp / wkhtmltopdf

Keywords: HTML wkhtmltopdf file convert PDF WkHTMLToSharp | Updated: 2023-09-27 18:02:43

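I am generating an HTML file dynamically, and I would like to create a PDF from the final file. I use the following code to generate the HTML file: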

    public static void WriteHTML(string cFile, List<Movie> mList)
    {
        int lineID = 0;
        string strHeader, strMovie, strGenre, tmpGenre = null;
        string strPDF = null;
        // initiates streamwriter for catalog output file
        FileStream fs = new FileStream(cFile, FileMode.Create);
        StreamWriter catalog = new StreamWriter(fs);
        strHeader = "<style type=\"text/css\">\r\n" + "<!--\r\n" + "tr#odd {\r\n" + "   background-color:#e2e2e2;\r\n" + "  vertical-align:top;\r\n" + "}\r\n" + "\r\n" + "tr#even {\r\n" + "   vertical-align:top;\r\n" + "}\r\n" + "div#title {\r\n" + "  font-size:16px;\r\n" + "    font-weight:bold;\r\n" + "}\r\n" + "\r\n" + "div#mpaa {\r\n" + "    font-size:10px;\r\n" + "}\r\n" + "\r\n" + "div#genre {\r\n" + " font-size:12px;\r\n" + "    font-style:italic;\r\n" + "}\r\n" + "\r\n" + "div#plot {\r\n" + "   height: 63px;\r\n" + "  font-size:12px;\r\n" + "    overflow:hidden;\r\n" + "}\r\n" + "-->\r\n" + "</style>\r\n" + "\r\n" + "<html>\r\n" + "    <body>\r\n" + "     <table>\r\n";
        catalog.WriteLine(strHeader);
        strPDF = strHeader;
        foreach (Movie m in mList)
        {
            tmpGenre = null;
            strMovie = lineID == 0 ? "          <tr id=\"odd\" style=\"page-break-inside:avoid\">\r\n" : "          <tr id=\"even\" style=\"page-break-inside:avoid\">\r\n";
            catalog.WriteLine(strMovie);
            strPDF += strMovie;
            foreach (string genre in m.Genres)
                tmpGenre += ", <a href=\"" + genre + ".html\" target=\"_blank\">" + genre + "</a>";
            strGenre = tmpGenre != null ? tmpGenre.Substring(2) : null;
            strMovie = "                <td>\r\n" + "                   <img src=\".\\images\\" + m.ImageFile + "\" width=\"75\" height=\"110\">\r\n" + "               </td>\r\n" + "              <td>\r\n" + "                   <div id=\"title\">" + m.Title + "</div>\r\n" + "                    <div id=\"mpaa\">" + m.Certification + " " + m.MPAA + "</div>\r\n" + "                  <div id=\"genre\">" + strGenre + "</div>\r\n" + "                   <div id=\"plot\">" + m.Plot + "</div>\r\n" + "              </td>\r\n" + "          </tr>\r\n";
            catalog.WriteLine(strMovie);
            strPDF += strMovie;
            lineID = lineID == 0 ? 1 : 0;
        }
        string closingHTML = "      </table>\r\n" + "   </body>\r\n" + "</html>";
        catalog.WriteLine(closingHTML);
        strPDF += closingHTML;
        WritePDF(strPDF, cFile + ".PDF");
        catalog.Close();
    }

Once that is done, I want to call the following function to generate the PDF file:

public static void WritePDF(string cFile, string pdfFile)
{
    WkHtmlToPdfConverter w = new WkHtmlToPdfConverter();
    byte[] strHTML = w.Convert(cFile);
    File.WriteAllBytes(pdfFile, strHTML);
    w.Dispose();
}

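I found that the .Convert function converts HTML code to PDF, not a file. Second, when I pass the HTML code in directly, the images do not appear in the PDF. I know there is an issue with .gif files, but these are all .jpg files.

I have read a lot about how good wkhtmltopdf is, and the guy who wrote WkHTMLToSharp posted his project on SO, but I am disappointed by the lack of documentation for it.

I want to be able to pass in a file to convert, change the margins (I know this is possible, I just need to figure out the correct settings), have it convert images correctly, and most importantly, not break my items across multiple pages (support "page-break-inside:avoid" or something similar).

I would love to see how others are using it!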


I have written a sample showing how to create a PDF from HTML. I just updated it so that it also prints images.

https://github.com/hmadrigal/playground-dotnet/tree/master/MsDotNet.PdfGeneration

(I explain most of the project in my blog post: https://hmadrigal.wordpress.com/2015/10/16/creating-pdf-reports-from-html-using-dotliquid-markup-for-templates-and-wkhtmltoxsharp-for-printing-pdf/)

Basically you have two options:

1: Use file:// and the full path to the file.

<img alt="profile" src="{{ employee.PorfileFileName | Prepend: "Assets\ProfileImage\" | ToLocalPath  }}" />

2: Use a data URI (https://en.wikipedia.org/wiki/Data_URI_scheme).

<img alt="profile" src="data:image/png;base64,{{ employee.PorfileFileName | Prepend: "Assets\ProfileImage\" | ToLocalPath | ToBase64 }}" />
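
If you build the HTML in plain C# rather than through a template, both options can be sketched with the base class library alone. This is only a rough sketch: the class and method names (`ImageSources`, `ToFileUri`, `ToDataUri`) are made up for illustration and are not part of WkHtmlToXSharp or the linked project.

```csharp
using System;
using System.IO;

public static class ImageSources
{
    // Option 1: turn a local (possibly relative) path into an absolute file:// URI.
    public static string ToFileUri(string path)
    {
        return new Uri(Path.GetFullPath(path)).AbsoluteUri;
    }

    // Option 2: embed the image bytes as a data URI.
    // mimeType must match the real image format, e.g. "image/jpeg".
    public static string ToDataUri(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }
}
```

Either result can be concatenated into the `src` attribute, for example `"<img src=\"" + ImageSources.ToFileUri(Path.Combine("images", "poster.jpg")) + "\">"` (the file name here is a placeholder). The data URI makes the HTML larger, but it no longer depends on where the converter looks for relative paths.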

Cheers, Herb

Use WkHtmlToXSharp.

Download the latest DLL from GitHub.

public static string ConvertHTMLtoPDF(string htmlFullPath, string pageSize, string orientation)
{
   string pdfUrl = htmlFullPath.Replace(".html", ".pdf");
   try
   {
       #region USING WkHtmlToXSharp.dll
       //IHtmlToPdfConverter converter = new WkHtmlToPdfConverter();
       IHtmlToPdfConverter converter = new MultiplexingConverter();
       converter.GlobalSettings.Margin.Top = "0cm";
       converter.GlobalSettings.Margin.Bottom = "0cm";
       converter.GlobalSettings.Margin.Left = "0cm";
       converter.GlobalSettings.Margin.Right = "0cm";
       converter.GlobalSettings.Orientation = (PdfOrientation)Enum.Parse(typeof(PdfOrientation), orientation);
       if (!string.IsNullOrEmpty(pageSize))
           converter.GlobalSettings.Size.PageSize = (PdfPageSize)Enum.Parse(typeof(PdfPageSize), pageSize);
       converter.ObjectSettings.Page = htmlFullPath;
       converter.ObjectSettings.Web.EnablePlugins = true;
       converter.ObjectSettings.Web.EnableJavascript = true;
       converter.ObjectSettings.Web.Background = true;
       converter.ObjectSettings.Web.LoadImages = true;
       converter.ObjectSettings.Load.LoadErrorHandling = LoadErrorHandlingType.ignore;
       Byte[] bufferPDF = converter.Convert();
       System.IO.File.WriteAllBytes(pdfUrl, bufferPDF);
       converter.Dispose();
       #endregion
   }
   catch (Exception ex)
   {
       throw new Exception(ex.Message, ex);
   }
   return pdfUrl;
}

You can use Spire.Pdf.

This component can convert HTML to PDF.

 PdfDocument pdfdoc = new PdfDocument();
 pdfdoc.LoadFromHTML(fileFullName, true, true, true);
 //String url = "http://www.e-iceblue.com/";
 //pdfdoc.LoadFromHTML(url, false, true, true);
 pdfdoc.SaveToFile("FromHTML.pdf");

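We also use wkhtmltopdf and are able to render images correctly. However, image rendering is disabled by default.

You have to specify these options on your converter instance: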

var wk = _GetConverter();
wk.GlobalSettings.Margin.Top = "20mm";
wk.GlobalSettings.Margin.Bottom = "10mm";
wk.GlobalSettings.Margin.Left = "10mm";
wk.GlobalSettings.Margin.Right = "10mm";
wk.GlobalSettings.Size.PaperSize = PdfPaperSize.A4;
wk.ObjectSettings.Web.PrintMediaType = true;
wk.ObjectSettings.Web.LoadImages = true;
wk.ObjectSettings.Web.EnablePlugins = false;
wk.ObjectSettings.Web.EnableJavascript = true;
result = wk.Convert(htmlContent);
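
For comparison, roughly the same margins and paper size can be passed to the wkhtmltopdf command-line tool itself, which takes an HTML file path directly. The sketch below is not the wrapper API used above. It assumes wkhtmltopdf is installed and that you supply the path to its executable; `WkHtmlToPdfCli`, `BuildArguments`, and `Convert` are hypothetical names.

```csharp
using System.Diagnostics;

public static class WkHtmlToPdfCli
{
    // Builds the argument string. The flags are wkhtmltopdf command-line options;
    // the values mirror the settings shown above.
    public static string BuildArguments(string htmlPath, string pdfPath)
    {
        return "--page-size A4 --margin-top 20mm --margin-bottom 10mm " +
               "--margin-left 10mm --margin-right 10mm --print-media-type " +
               "\"" + htmlPath + "\" \"" + pdfPath + "\"";
    }

    // Runs wkhtmltopdf and returns its exit code (0 means success).
    // exePath is an assumption: wherever the tool is installed on your machine.
    public static int Convert(string exePath, string htmlPath, string pdfPath)
    {
        var startInfo = new ProcessStartInfo(exePath, BuildArguments(htmlPath, pdfPath))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Because the tool reads the HTML from disk, relative image paths are resolved against the HTML file's location instead of an in-memory string.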