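Generating a PDF from HTML code (iTextSharp, PDFSharp?)
Keywords: PDFSharp, iTextSharp, PDF, HTML, code generation | Last updated: 2023-09-27 18:10:06
Can the PDFSharp library, like iTextSharp, generate PDF files *taking HTML formatting into account* (bold (strong), line breaks (br), and so on)?
Previously I used iTextSharp and handled it roughly like this (code below):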
string encodingMetaTag = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />";
string htmlCode = "text <div><b>bold</b> or <u>underlined</u></div>";
var sr = new StringReader(encodingMetaTag + htmlCode);
var pdfDoc = new Document(PageSize.A4, 10f, 10f, 10f, 0f);
var htmlparser = new HTMLWorker(pdfDoc);
PdfWriter.GetInstance(pdfDoc, HttpContext.Current.Response.OutputStream);
pdfDoc.Open();
htmlparser.Parse(sr);
pdfDoc.Close();
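Here the HTML markup is handed to an HTMLWorker object, which renders it into the PDF document. What about PDFSharp? Does PDFSharp have a similar solution?
I know this question is old, but here is a clean way to do it...
You can use HtmlRenderer together with PDFSharp to accomplish this: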
Bitmap bitmap = new Bitmap(1200, 1800);
Graphics g = Graphics.FromImage(bitmap);
HtmlRenderer.HtmlContainer c = new HtmlRenderer.HtmlContainer();
c.SetHtml("<html><body style='font-size:20px'>Whatever</body></html>");
c.PerformPaint(g);
PdfDocument doc = new PdfDocument();
PdfPage page = new PdfPage();
XImage img = XImage.FromGdiPlusImage(bitmap);
doc.Pages.Add(page);
XGraphics xgr = XGraphics.FromPdfPage(doc.Pages[0]);
xgr.DrawImage(img, 0, 0);
doc.Save(@"C:\test.pdf");
doc.Close();
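Some people have reported that the final image looks a bit blurry, apparently because of automatic anti-aliasing. Here is a forum post on how to fix that: http://forum.pdfsharp.com/viewtopic.php?f=2&t=1811&start=0
No, PDFsharp currently does not include code for parsing HTML files.
Old question, but none of the above worked for me. I then tried HtmlRenderer's GeneratePdf method in combination with PDFsharp. Hope it helps: you must install the NuGet package named HtmlRenderer.PdfSharp.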
var doc = TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf("Your html in a string", PageSize.A4);
// GeneratePdf already lays the HTML out on pages, so the leftover bitmap drawing code (which used an undefined variable) is not needed
doc.Save(Server.MapPath("test.pdf"));
doc.Close();
If you only want a specific HTML string written to the PDF, and not the rest of the content, you can use the HtmlContainer from TheArtOfDev HtmlRenderer. This snippet uses v1.5.1.
using System.IO;
using PdfSharp;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using TheArtOfDev.HtmlRenderer.PdfSharp;
// create a PDF document
using (PdfDocument doc = new PdfDocument())
{
    doc.Info.Title = "StackOverflow Demo PDF";
    // add a page
    PdfPage page = doc.AddPage();
    page.Size = PageSize.A4;
    // fonts and styles
    XFont font = new XFont("Arial", 10, XFontStyle.Regular);
    XSolidBrush brush = new XSolidBrush(XColor.FromArgb(0, 0, 0));
    using (XGraphics gfx = XGraphics.FromPdfPage(page))
    {
        // write a normal string
        gfx.DrawString("A normal string written to the PDF.", font, brush, new XRect(15, 15, page.Width, page.Height), XStringFormats.TopLeft);
        // write the HTML string to the PDF
        using (var container = new HtmlContainer())
        {
            var pageSize = new XSize(page.Width, page.Height);
            container.Location = new XPoint(15, 45);
            container.MaxSize = pageSize;
            container.PageSize = pageSize;
            container.SetHtml("This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br><br><a href=\"http://www.google.nl\">www.google.nl</a>");
            // lay the HTML out against a measuring context before painting
            using (var measure = XGraphics.CreateMeasureContext(pageSize, XGraphicsUnit.Point, XPageDirection.Downwards))
            {
                container.PerformLayout(measure);
            }
            gfx.IntersectClip(new XRect(0, 0, page.Width, page.Height));
            container.PerformPaint(gfx);
        }
    }
    // write the PDF to a byte array to serve as a download, attach to an email, etc.
    byte[] bin;
    using (MemoryStream stream = new MemoryStream())
    {
        doc.Save(stream, false);
        bin = stream.ToArray();
    }
}
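In a project I worked on last year, I used wkhtmltopdf (http://wkhtmltopdf.org/) to generate the PDF from HTML; I then read the resulting file and returned it to the user.
That worked well for me, and it may be a good option for you too.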
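The answer above gives no code, so here is a minimal sketch of that workflow: write the HTML to a temporary file, run wkhtmltopdf on it, then read the PDF back as bytes. It assumes the wkhtmltopdf executable is installed and reachable on the PATH; the class name WkHtmlToPdf and its two methods are hypothetical helpers made up for this illustration, not part of any library.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper class, named here for illustration only.
public static class WkHtmlToPdf
{
    // Builds the command line "<input.html>" "<output.pdf>", quoting both
    // paths so that spaces in directory names do not split the arguments.
    public static string BuildArguments(string inputHtmlPath, string outputPdfPath)
    {
        return "\"" + inputHtmlPath + "\" \"" + outputPdfPath + "\"";
    }

    // Converts an HTML string to PDF bytes by shelling out to wkhtmltopdf.
    // Assumption: the "wkhtmltopdf" executable is on the PATH.
    public static byte[] Convert(string html)
    {
        string input = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".html");
        string output = Path.ChangeExtension(input, ".pdf");
        File.WriteAllText(input, html);
        try
        {
            var startInfo = new ProcessStartInfo
            {
                FileName = "wkhtmltopdf",
                Arguments = BuildArguments(input, output),
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (Process process = Process.Start(startInfo))
            {
                process.WaitForExit();
                if (process.ExitCode != 0)
                {
                    throw new InvalidOperationException(
                        "wkhtmltopdf exited with code " + process.ExitCode);
                }
            }
            // Read the generated file so it can be returned to the user.
            return File.ReadAllBytes(output);
        }
        finally
        {
            // Clean up the temporary files whether or not the conversion succeeded.
            File.Delete(input);
            File.Delete(output);
        }
    }
}
```

In an ASP.NET handler the returned bytes could then be written to the response with a content type of application/pdf. Going through temporary files keeps the sketch simple; it is one possible design, not necessarily what the original poster did.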
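I know this is a very old question, but I realized that nobody has mentioned that there actually is an accurate way to render HTML to PDF. Based on my tests, you need the following code to do it successfully.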
Bitmap bitmap = new Bitmap(790, 1800);
Graphics g = Graphics.FromImage(bitmap);
XGraphics xg = XGraphics.FromGraphics(g, new XSize(bitmap.Width, bitmap.Height));
TheArtOfDev.HtmlRenderer.PdfSharp.HtmlContainer c = new TheArtOfDev.HtmlRenderer.PdfSharp.HtmlContainer();
c.SetHtml("Your html in a string here");
PdfDocument pdf = new PdfDocument();
PdfPage page = new PdfPage();
XImage img = XImage.FromGdiPlusImage(bitmap);
pdf.Pages.Add(page);
XGraphics xgr = XGraphics.FromPdfPage(pdf.Pages[0]);
c.PerformLayout(xgr);
c.PerformPaint(xgr);
xgr.DrawImage(img, 0, 0);
pdf.Save("test.pdf");
There is another way, but you may run into sizing problems with it.
PdfDocument pdf = PdfGenerator.GeneratePdf(text, PageSize.A4);
pdf.Save("test.pdf");
Have you heard of this one? I may be very late to answer, but I think it helps. It is very simple and works well.
var htmlContent = String.Format("<body>Hello world: {0}</body>",
DateTime.Now);
var htmlToPdf = new NReco.PdfGenerator.HtmlToPdfConverter();
var pdfBytes = htmlToPdf.GeneratePdf(htmlContent);
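Edit: I came here with the question of converting HTML code to PDF using 'PDFSharp' and found that 'PDFSharp' cannot do it. I then found NReco, which worked for me, so I thought it might help people like me.
HTML Renderer for PDF using PdfSharp can generate a PDF from HTML, with the content inserted into the PDF
- as an image, or
- as text.
To render as an image, refer to the code in Diego's answer.
To render as text, refer to the code below: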
static void Main(string[] args)
{
    string html = File.ReadAllText(@"C:\Temp\Test.html");
    PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4, 20, null, OnStylesheetLoad, OnImageLoadPdfSharp);
    pdf.Save(@"C:\Temp\Test.pdf");
}
// supplies the image for every <img> element in the HTML
public static void OnImageLoadPdfSharp(object sender, HtmlImageLoadEventArgs e)
{
    var imgObj = Image.FromFile(@"C:\Temp\Test.png");
    e.Callback(XImage.FromGdiPlusImage(imgObj));
}
// supplies the CSS for every linked stylesheet in the HTML
public static void OnStylesheetLoad(object sender, HtmlStylesheetLoadEventArgs e)
{
    e.SetStyleSheet = @"h1, h2, h3 { color: navy; font-weight:normal; }";
}
HTML code:
<html>
<head>
<title></title>
<link rel="Stylesheet" href="StyleSheet" />
</head>
<body>
<h1>Images
<img src="ImageIcon" />
</h1>
</body>
</html>
Unfortunately, HtmlRenderer is not a suitable library for .NET 5.0 projects:
System.IO.FileLoadException: 'Could not load file or assembly 'HtmlRenderer,
Version=1.5.0.6, Culture=neutral, PublicKeyToken=null'. The located assembly's
manifest definition does not match the assembly reference. (0x80131040)'
In addition, I found that the dependent package HtmlRenderer.PdfSharp produces the following warning:
Package 'HtmlRenderer.PdfSharp 1.5.0.6' was restored using
'.NETFramework,Version=v4.6.1, .NETFramework,Version=v4.6.2,
.NETFramework,Version=v4.7, .NETFramework,Version=v4.7.1,
.NETFramework,Version=v4.7.2, .NETFramework,Version=v4.8' instead of the project
target framework 'net5.0'. This package may not be fully compatible with your project.
By the way, I managed to render HTML to PDF using another library, IronPDF:
License.LicenseKey = "license key";
var renderer = new ChromePdfRenderer();
PdfDocument pdf = await renderer.RenderHtmlAsPdfAsync(yourHtml);
pdf.SaveAs("your html as pdf.pdf");
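The line with License.LicenseKey is not required and you can remove it, but then the PDF will be generated with an IronPDF watermark at the end of each page. IronPDF does, however, offer a key for a trial license.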
After a long struggle, I succeeded using Polybioz.HtmlRenderer.PdfSharp.Core
=> HtmlRenderer.PdfSharp.Core is a partial port of HtmlRenderer.PdfSharp for .NET Core.
It works on .NET 5.0 :)
My solution, directly inspired by VDWWD and others:
PdfPage page = new PdfPage();
PdfOutline outline = new PdfOutline();
page = document.AddPage();
XGraphics gfx_ = XGraphics.FromPdfPage(page);
using (var container = new HtmlContainer())
{
var pageSize = new XSize(page.Width, page.Height);
var x = 5;
var y = 100;
container.Location = new XPoint(x, y);
container.MaxSize = pageSize;
container.PageSize = pageSize;
const string latinstuff =
"Facin exeraessisit la consenim iureet dignibh eu <b>facilluptat</b> vercil dunt autpat. " +
"Ecte magna faccum dolor sequisc iliquat, quat, quipiss equipit accummy niate magna " +
"facil iure eraesequis am velit, quat atis dolore dolent luptat nulla adio odipissectet " +
"lan venis do essequatio conulla facillandrem <u>zzriusci</u> bla ad minim inis nim velit eugait " +
"aut aut lor at ilit ut nulla ate te eugait alit augiamet ad magnim iurem il eu feuissi.\n" +
"Guer sequis duis eu feugait luptat lum adiamet, si tate dolore mod eu facidunt adignisl in " +
"henim dolorem nulla faccum vel inis dolutpatum iusto od min ex euis adio exer sed del " +
"dolor ing enit veniamcon vullutat praestrud molenis ciduisim doloborem ipit nulla consequisi.\n" +
"Nos adit pratetu eriurem delestie del ut lumsandreet nis exerilisit wis nos alit venit praestrud " +
"dolor sum volore facidui blaor erillaortis ad ea augue corem dunt nis iustinciduis euisi.\n" +
"Ut ulputate volore min ut nulpute dolobor sequism olorperilit autatie modit wisl illuptat dolore " +
"min ut in ute doloboreet ip ex et am dunt at.";
//container.SetHtml("This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br><br><a href=\"http://www.google.nl\">www.google.nl</a>");
string text = "This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br>" +
$"<br><a href=\"http://www.google.nl\">www.google.nl</a>{DateTime.Now.ToLongTimeString()}";
text += latinstuff;
text += "<p style=\"text-align:center;\">" + latinstuff + "</p>";
text += "<p style=\"text-align:justify;\">" + latinstuff + "</p>";
text += "<div width=\"70mm\" style=\"text-align:justify;\"><p>" + latinstuff + "</p></div>";
container.SetHtml(text);
using (var measure = XGraphics.CreateMeasureContext(pageSize, XGraphicsUnit.Point, XPageDirection.Downwards))
{
container.PerformLayout(measure);
}
gfx_.IntersectClip(new XRect(0, 0, page.Width + 400, page.Height));
container.PerformLayout(gfx_);
container.PerformPaint(gfx_);
}
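The only remaining problem is that bold text is not rendered ("facilluptat" in the Latin text should be bold; see the screenshot of my page).
I recommend NReco.PdfGenerator: it offers both free and paid licenses, and it is easy to install from NuGet.
Home page: https://www.nrecosite.com/pdf_generator_net.aspx
Documentation: https://www.nrecosite.com/doc/NReco.PdfGenerator/
If you want to create a PDF from an HTML file, try: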
String html = File.ReadAllText("main.html");
var htmlToPdf = new NReco.PdfGenerator.HtmlToPdfConverter();
htmlToPdf.GeneratePdf(html, null, "C:/Users/Tmp/Desktop/mapa.pdf");