Generate PDF based on HTML code (iTextSharp, PDFSharp?)

Keywords: PDFSharp, iTextSharp, PDF, HTML, code, generate | Updated: 2023-09-27 18:10:06

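Can PDFsharp, like iTextSharp, generate PDF files *taking HTML formatting into account*? (bold (strong), spacing (br), and so on)

Previously I used iTextSharp and handled it roughly like this (code below):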

 // Needs: System.IO, System.Web, iTextSharp.text, iTextSharp.text.pdf,
 // iTextSharp.text.html.simpleparser (for HTMLWorker)
 string encodingMetaTag = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />";
 string htmlCode = "text <div> <b> bold </b> or <u> underlined </u> </div>";
 var sr = new StringReader(encodingMetaTag + htmlCode);
 var pdfDoc = new Document(PageSize.A4, 10f, 10f, 10f, 0f);
 var htmlparser = new HTMLWorker(pdfDoc);
 // Stream the PDF straight into the HTTP response
 PdfWriter.GetInstance(pdfDoc, HttpContext.Current.Response.OutputStream);
 pdfDoc.Open();
 htmlparser.Parse(sr);
 pdfDoc.Close();

The HTMLWorker class object took the HTML markup and turned it into the PDF document. So what about PDFsharp? Does PDFsharp have a similar solution?

I know this question is old, but here is a clean way to do it...

You can use HtmlRenderer together with PDFsharp to accomplish this:

Bitmap bitmap = new Bitmap(1200, 1800);
Graphics g = Graphics.FromImage(bitmap);
HtmlRenderer.HtmlContainer c = new HtmlRenderer.HtmlContainer();
c.SetHtml("<html><body style='font-size:20px'>Whatever</body></html>");
c.PerformPaint(g);
PdfDocument doc = new PdfDocument();
PdfPage page = new PdfPage();
XImage img = XImage.FromGdiPlusImage(bitmap);
doc.Pages.Add(page);
XGraphics xgr = XGraphics.FromPdfPage(doc.Pages[0]);
xgr.DrawImage(img, 0, 0);
doc.Save(@"C:\test.pdf");
doc.Close();
        

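Some people report that the final image looks a bit blurry, apparently because of automatic anti-aliasing. Here is a post on how to fix that: http://forum.pdfsharp.com/viewtopic.php?f=2&t=1811&start=0

No, PDFsharp currently does not include code to parse HTML files.

Old question, but none of the above worked for me. Then I tried HtmlRenderer's GeneratePdf method combined with PDFsharp. Hope it helps: you must install a NuGet package named HtmlRenderer.PdfSharp.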

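var doc = TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf("Your html in a string", PageSize.A4);
  // GeneratePdf has already rendered the HTML into doc at this point.
  // NOTE: `bitmap` is not defined in this snippet; the lines below only compile if
  // it is a System.Drawing.Bitmap you rendered yourself, as in the earlier answer.
  PdfPage page = new PdfPage();
  XImage img = XImage.FromGdiPlusImage(bitmap);
  doc.Pages.Add(page);
  XGraphics xgr = XGraphics.FromPdfPage(doc.Pages[0]);
  xgr.DrawImage(img, 0, 0);
  doc.Save(Server.MapPath("test.pdf"));
  doc.Close();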

If you only want a specific HTML string written to the PDF, but not the rest, you can use the HtmlContainer from TheArtOfDev HtmlRenderer. This snippet uses v1.5.1

using System.IO;
using PdfSharp.Pdf;
using PdfSharp;
using PdfSharp.Drawing;
using TheArtOfDev.HtmlRenderer.PdfSharp;
//create a pdf document
using (PdfDocument doc = new PdfDocument())
{
    doc.Info.Title = "StackOverflow Demo PDF";
    //add a page
    PdfPage page = doc.AddPage();
    page.Size = PageSize.A4;
    //fonts and styles
    XFont font = new XFont("Arial", 10, XFontStyle.Regular);
    XSolidBrush brush = new XSolidBrush(XColor.FromArgb(0, 0, 0));
    using (XGraphics gfx = XGraphics.FromPdfPage(page))
    {
        //write a normal string
        gfx.DrawString("A normal string written to the PDF.", font, brush, new XRect(15, 15, page.Width, page.Height), XStringFormats.TopLeft);
        //write the html string to the pdf
        using (var container = new HtmlContainer())
        {
            var pageSize = new XSize(page.Width, page.Height);
            container.Location = new XPoint(15,  45);
            container.MaxSize = pageSize;
            container.PageSize = pageSize;
            container.SetHtml("This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br><br><a href=\"http://www.google.nl\">www.google.nl</a>");
            using (var measure = XGraphics.CreateMeasureContext(pageSize, XGraphicsUnit.Point, XPageDirection.Downwards))
            {
                container.PerformLayout(measure);
            }
            gfx.IntersectClip(new XRect(0, 0, page.Width, page.Height));
            container.PerformPaint(gfx);
        }
    }
    //write the pdf to a byte array to serve as download, attach to an email etc.
    byte[] bin;
    using (MemoryStream stream = new MemoryStream())
    {
        doc.Save(stream, false);
        bin = stream.ToArray();
    }
}

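In a project I developed last year, I used wkhtmltopdf (http://wkhtmltopdf.org/) to generate PDFs from HTML; I then read the resulting file and returned it to the user.

It worked well for me, and it might be a good idea for you.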
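
The answer above gives no code, so here is a minimal sketch of how that workflow might look in C#. It assumes the wkhtmltopdf executable is installed and reachable on PATH; the class name `WkHtmlToPdfSketch` and its methods are hypothetical helpers invented for illustration, not part of any library. It writes the HTML to a temporary file, runs `wkhtmltopdf <input> <output>`, and reads the PDF back as bytes.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class WkHtmlToPdfSketch
{
    // Hypothetical helper: builds the command line "wkhtmltopdf --quiet <in> <out>".
    public static string BuildArguments(string inputPath, string outputPath)
    {
        return $"--quiet \"{inputPath}\" \"{outputPath}\"";
    }

    // Hypothetical helper: converts an HTML string to PDF bytes.
    // Assumes the wkhtmltopdf executable can be started as `exePath`.
    public static byte[] Convert(string html, string exePath = "wkhtmltopdf")
    {
        // Work in a private temp directory so concurrent calls do not collide.
        string dir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(dir);
        string input = Path.Combine(dir, "input.html");
        string output = Path.Combine(dir, "output.pdf");
        try
        {
            File.WriteAllText(input, html);
            var psi = new ProcessStartInfo
            {
                FileName = exePath,
                Arguments = BuildArguments(input, output),
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (Process process = Process.Start(psi))
            {
                process.WaitForExit();
                if (process.ExitCode != 0)
                {
                    throw new InvalidOperationException(
                        "wkhtmltopdf exited with code " + process.ExitCode);
                }
            }
            // Read the generated file so it can be returned to the user.
            return File.ReadAllBytes(output);
        }
        finally
        {
            // Remove the temp directory and both files.
            Directory.Delete(dir, true);
        }
    }
}
```

The returned byte array could then be written to the HTTP response or attached to an email. Treating every non-zero exit code as a failure is a design choice of this sketch and may be too strict for pages with unreachable resources.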

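I know this is a very old question, but I realized that nobody has said there actually is an accurate way to render HTML to PDF. From my testing, I found that you need the code below to do it successfully.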

Bitmap bitmap = new Bitmap(790, 1800);
Graphics g = Graphics.FromImage(bitmap);
XGraphics xg = XGraphics.FromGraphics(g, new XSize(bitmap.Width, bitmap.Height));
TheArtOfDev.HtmlRenderer.PdfSharp.HtmlContainer c = new TheArtOfDev.HtmlRenderer.PdfSharp.HtmlContainer();
c.SetHtml("Your html in a string here");
PdfDocument pdf = new PdfDocument();
PdfPage page = new PdfPage();
XImage img = XImage.FromGdiPlusImage(bitmap);
pdf.Pages.Add(page);
XGraphics xgr = XGraphics.FromPdfPage(pdf.Pages[0]);
c.PerformLayout(xgr);
c.PerformPaint(xgr);
xgr.DrawImage(img, 0, 0);
pdf.Save("test.pdf");

There is another way, but you may run into sizing issues.

PdfDocument pdf = PdfGenerator.GeneratePdf(text, PageSize.A4);
pdf.Save("test.pdf");

Have you heard of this? I may be very late to answer, but I think it helps. It is very simple and works well.

var htmlContent = String.Format("<body>Hello world: {0}</body>", 
        DateTime.Now);
var htmlToPdf = new NReco.PdfGenerator.HtmlToPdfConverter();
var pdfBytes = htmlToPdf.GeneratePdf(htmlContent);

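Edit: I came here with the question of converting HTML code to PDF using 'PDFSharp' and found that 'PDFSharp' cannot do it. Then I discovered NReco, which worked for me, so I figured it might help people like me.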

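HTML Renderer for PDF using PdfSharp can generate a PDF from HTML

  1. as an image, or
  2. as text

before inserting it into the PDF.

To render as an image, refer to the code in Diego's answer.

To render as text, refer to the code below: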

static void Main(string[] args)
{
    string html = File.ReadAllText(@"C:\Temp\Test.html");
    PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4, 20, null, OnStylesheetLoad, OnImageLoadPdfSharp);
    pdf.Save(@"C:\Temp\Test.pdf");
}
public static void OnImageLoadPdfSharp(object sender, HtmlImageLoadEventArgs e)
{
    var imgObj = Image.FromFile(@"C:\Temp\Test.png");
    e.Callback(XImage.FromGdiPlusImage(imgObj));    
}
public static void OnStylesheetLoad(object sender, HtmlStylesheetLoadEventArgs e)
{
    e.SetStyleSheet = @"h1, h2, h3 { color: navy; font-weight:normal; }";
}

HTML code

<html>
    <head>
        <title></title>
        <link rel="Stylesheet" href="StyleSheet" />      
    </head>
    <body>
        <h1>Images
            <img src="ImageIcon" />
        </h1>
    </body>
</html>

Unfortunately, HtmlRenderer is not a library suited for use in .NET 5.0 projects:

System.IO.FileLoadException: 'Could not load file or assembly 'HtmlRenderer,
Version=1.5.0.6, Culture=neutral, PublicKeyToken=null'. The located assembly's 
manifest definition does not match the assembly reference. (0x80131040)'

Also, I found that the dependent package HtmlRenderer.PdfSharp comes with the following warning message:

Package 'HtmlRenderer.PdfSharp 1.5.0.6' was restored using 
'.NETFramework,Version=v4.6.1, .NETFramework,Version=v4.6.2, 
.NETFramework,Version=v4.7, .NETFramework,Version=v4.7.1, 
.NETFramework,Version=v4.7.2, .NETFramework,Version=v4.8' instead of the project 
target framework 'net5.0'. This package may not be fully compatible with your project.

By the way, I managed to render HTML to PDF using another library, IronPDF:

License.LicenseKey = "license key";
var renderer = new ChromePdfRenderer();
PdfDocument pdf = await renderer.RenderHtmlAsPdfAsync(yourHtml);
pdf.SaveAs("your html as pdf.pdf");

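The line with License.LicenseKey is not required and you can remove it, but your PDF will then be generated with an IronPDF watermark at the end of every page. However, IronPDF offers a key for a trial license.

After a long struggle, I succeeded with Polybioz.HtmlRenderer.PdfSharp.Core

=> HtmlRenderer.PdfSharp.Core is a partial port of HtmlRenderer.PdfSharp for .NET Core

It works in .NET 5.0 :)

My solution, taking direct inspiration from VDWWD and others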

        // `document` is assumed to be an existing PdfDocument created elsewhere.
        PdfOutline outline = new PdfOutline();
        PdfPage page = document.AddPage();
        XGraphics gfx_ = XGraphics.FromPdfPage(page);
        using (var container = new HtmlContainer())
        {
            var pageSize = new XSize(page.Width, page.Height);
            var x = 5;
            var y = 100;

            container.Location = new XPoint(x, y);
            container.MaxSize = pageSize;
            container.PageSize = pageSize;

            const string latinstuff =
            "Facin exeraessisit la consenim iureet dignibh eu <b>facilluptat</b> vercil dunt autpat. " +
            "Ecte magna faccum dolor sequisc iliquat, quat, quipiss equipit accummy niate magna " +
            "facil iure eraesequis am velit, quat atis dolore dolent luptat nulla adio odipissectet " +
            "lan venis do essequatio conulla facillandrem <u>zzriusci</u> bla ad minim inis nim velit eugait " +
            "aut aut lor at ilit ut nulla ate te eugait alit augiamet ad magnim iurem il eu feuissi.\n" +
            "Guer sequis duis eu feugait luptat lum adiamet, si tate dolore mod eu facidunt adignisl in " +
            "henim dolorem nulla faccum vel inis dolutpatum iusto od min ex euis adio exer sed del " +
            "dolor ing enit veniamcon vullutat praestrud molenis ciduisim doloborem ipit nulla consequisi.\n" +
            "Nos adit pratetu eriurem delestie del ut lumsandreet nis exerilisit wis nos alit venit praestrud " +
            "dolor sum volore facidui blaor erillaortis ad ea augue corem dunt nis  iustinciduis euisi.\n" +
            "Ut ulputate volore min ut nulpute dolobor sequism olorperilit autatie modit wisl illuptat dolore " +
            "min ut in ute doloboreet ip ex et am dunt at.";

            //container.SetHtml("This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br><br><a href=\"http://www.google.nl\">www.google.nl</a>");
            string text = "This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br>" +
                $"<br><a href=\"http://www.google.nl\">www.google.nl</a>{DateTime.Now.ToLongTimeString()}";
            text += latinstuff;
            text += "<p style=\"text-align:center;\">" + latinstuff + "</p>";
            text += "<p style=\"text-align:justify;\">" + latinstuff + "</p>";
            text += "<div width=\"70mm\" style=\"text-align:justify;\"><p>" + latinstuff + "</p></div>";
            container.SetHtml(text);
            using (var measure = XGraphics.CreateMeasureContext(pageSize, XGraphicsUnit.Point, XPageDirection.Downwards))
            {
                container.PerformLayout(measure);
            }
            gfx_.IntersectClip(new XRect(0, 0, page.Width + 400, page.Height));
            container.PerformLayout(gfx_);
            container.PerformPaint(gfx_);
        }

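The only remaining issue is that bold is not rendered ("facilluptat" in the Latin text should be bold; see the screenshot of my page).

I recommend NReco.PdfGenerator, because it has both free and paid licenses and is easy to install from NuGet.

Home page: https://www.nrecosite.com/pdf_generator_net.aspx

Documentation: https://www.nrecosite.com/doc/NReco.PdfGenerator/

If you want to create a PDF from an HTML file, try: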

String html = File.ReadAllText("main.html");
var htmlToPdf = new NReco.PdfGenerator.HtmlToPdfConverter();
htmlToPdf.GeneratePdf(html, null, "C:/Users/Tmp/Desktop/mapa.pdf");