Extracting pages in memory with iTextSharp

Keywords: iTextSharp, in-memory extraction | Updated: 2023-09-27 17:59:48

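Can anyone help me with this? How can I extract some pages from a PDF and return them as a byte array or a stream, without using a physical file as the output?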

Here is how I do it with a file stream:

public static void ExtractPages(string sourcePdfPath, string outputPdfPath, int startPage, int endPage)
{
    PdfReader reader = null;
    Document sourceDocument = null;
    PdfCopy pdfCopyProvider = null;
    PdfImportedPage importedPage = null;
    try
    {
        reader = new PdfReader(sourcePdfPath);
        sourceDocument = new Document(reader.GetPageSizeWithRotation(startPage));
        pdfCopyProvider = new PdfCopy(sourceDocument, new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create));
        sourceDocument.Open();
        for (int i = startPage; i <= endPage; i++)
        {
            importedPage = pdfCopyProvider.GetImportedPage(reader, i);
            pdfCopyProvider.AddPage(importedPage);
        }
        sourceDocument.Close();
        reader.Close();
    }
    catch (Exception)
    {
        // Rethrow with "throw;" so the original stack trace is preserved
        // ("throw ex;" would reset it).
        throw;
    }
}

I need something like this:

public static byte[] ExtractPages(string sourcePdfPath, int startPage, int endPage)
{
....
    return byte[];
}


Replace new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create) with a MemoryStream and return its contents.

Something like this (untested, but it should work):

// Requires: using System.IO; using iTextSharp.text; using iTextSharp.text.pdf;
public static byte[] ExtractPages(string sourcePdfPath, int startPage, int endPage)
{
    MemoryStream target = new MemoryStream();
    PdfReader reader = new PdfReader(sourcePdfPath);
    Document sourceDocument = new Document(reader.GetPageSizeWithRotation(startPage));
    PdfCopy pdfCopyProvider = new PdfCopy(sourceDocument, target);
    sourceDocument.Open();
    for (int i = startPage; i <= endPage; i++)
    {
        PdfImportedPage importedPage = pdfCopyProvider.GetImportedPage(reader, i);
        pdfCopyProvider.AddPage(importedPage);
    }
    // Closing the document finishes writing the PDF into target.
    sourceDocument.Close();
    reader.Close();
    // ToArray() copies out everything written; it still works after the stream is closed.
    return target.ToArray();
}
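
One detail the code above relies on: sourceDocument.Close() is called before target.ToArray(). Assuming the writer closes its output stream when the document is closed (which I believe is iTextSharp's default, but treat that as an assumption), the MemoryStream is already closed by the time the bytes are read. That is fine, because MemoryStream.ToArray() is documented to work on a closed stream. The sketch below shows only that behaviour, using nothing but the .NET base library; MemoryStreamDemo and WriteThenClose are hypothetical names for illustration, and BinaryWriter stands in for the PDF writer.

```csharp
using System;
using System.IO;

public static class MemoryStreamDemo
{
    // Writes four bytes through a writer that closes the stream on dispose,
    // then reads the buffer back from the already-closed MemoryStream.
    public static byte[] WriteThenClose()
    {
        MemoryStream target = new MemoryStream();
        using (BinaryWriter writer = new BinaryWriter(target))
        {
            // The ASCII bytes for "%PDF", written raw (no length prefix).
            writer.Write(new byte[] { 0x25, 0x50, 0x44, 0x46 });
        } // Disposing the writer also closes target.

        // ToArray() returns a copy of the written bytes even though target is closed.
        return target.ToArray();
    }
}
```

By contrast, reading from the closed stream (for example via target.Read or target.Position) would throw an ObjectDisposedException, so returning the byte array is the safer choice than returning the stream itself unless the writer is told to leave the stream open.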