iTextSharp 5.5.9 Redaction - Object Reference not found

Keywords: Reference not found, Object, Redaction, iTextSharp | Last updated: 2023-09-27 18:03:07

I took the code from "How to redact a large rectangle of a PDF by iTextSharp?"

and produced:

    iTextSharp.text.pdf.PdfReader reader;
    reader = new iTextSharp.text.pdf.PdfReader(new System.IO.FileStream(txtPDFFile.Text, System.IO.FileMode.Open));
    string path = System.IO.Path.GetDirectoryName(txtPDFFile.Text);
    System.IO.Stream fsOut = new System.IO.FileStream(System.IO.Path.Combine(path,"redacted.pdf"), System.IO.FileMode.OpenOrCreate);
    iTextSharp.text.pdf.PdfStamper stamper = new iTextSharp.text.pdf.PdfStamper(reader, fsOut);
    List<iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpLocation> cleanUpLocations = new List<iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpLocation>();
    cleanUpLocations.Add(new iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpLocation(1, new iTextSharp.text.Rectangle(77f, 77f, 200f, 200f), iTextSharp.text.BaseColor.GRAY));
    iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor cleaner = new iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor(cleanUpLocations, stamper);
    cleaner.CleanUp();
    stamper.Close();
    reader.Close();

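From the linked article I understood that the output file must be different from the input file, which is what I am doing.

But at cleaner.CleanUp() I get an Object Reference not found: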

   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.FormXObjectDoHandler.HandleXObject(PdfContentStreamProcessor processor, PdfStream stream, PdfIndirectReference refi)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.DisplayXObject(PdfName xobjectName)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.Do.Invoke(PdfContentStreamProcessor processor, PdfLiteral oper, List`1 operands)
   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
   at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)
   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.CleanUpPage(Int32 pageNum, IList`1 cleanUpLocations)
   at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.CleanUp()
   at Com.EDS.DocSol.PDFExtract.PDFExtractForm.btnRedaction_Click(Object sender, EventArgs e) in D:\Users\me\Code\PDFExtract\PDFExtract\PDFExtractForm.cs:line 106
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
   at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
   at System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
   at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
   at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
   at System.Windows.Forms.Application.Run(Form mainForm)
   at Com.EDS.DocSol.PDFExtract.Program.Main(String[] args) in D:\Users\me\Code\PDFExtract\PDFExtract\Program.cs:line 140
   at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

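I don't understand why. I did not change the rectangle, and I am not sure whether there is actually anything at that location. I also have some code that adds an annotation first and then tries to apply it, but it runs into the same object reference error.

In the code above: do I need to create a redaction annotation first before applying it, or does this code both select the box I want redacted and apply the redaction in one go?

The rectangle I actually want (it is an address block) is: iTextSharp.text.Rectangle(45, 650, 200, 750);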


Regarding the OP's original observation, the Object Reference not found at:

at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.FormXObjectDoHandler.HandleXObject(PdfContentStreamProcessor processor, PdfStream stream, PdfIndirectReference refi)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.DisplayXObject(PdfName xobjectName)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.Do.Invoke(PdfContentStreamProcessor processor, PdfLiteral oper, List`1 operands)
at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpContentOperator.Invoke(PdfContentStreamProcessor pdfContentStreamProcessor, PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.InvokeOperator(PdfLiteral oper, List`1 operands)
at iTextSharp.text.pdf.parser.PdfContentStreamProcessor.ProcessContent(Byte[] contentBytes, PdfDictionary resources)

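This happens while the cleanup processor parses the content stream of a form xobject: some instructions in that content stream seem to be invalid (most likely invalid instruction parameters), i.e. the PDF is most likely simply broken.

This behavior could not be reproduced with the sanitized version of the OP's document. In particular, the sanitized version contains only a single form xobject on each page, and it does not intersect the redaction area. When the redaction area is extended to partially intersect the form xobject, an exception does occur, but it is a different one, which clearly indicates that System.Drawing.Graphics.FromImage cannot handle the format of the bitmap image shown in the form xobject.

Thus, the invalid form xobject content seems to have been removed during sanitization. To harden the cleanup code against the issue at hand, the original document would be required.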


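In the comments the OP indicated that he also tried to invoke the cleanup in another way: by adding a redaction annotation to the PDF and then calling the cleanup without any PdfCleanUpLocation. He added the redaction annotation like this: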

PdfReader reader = new PdfReader(new FileStream(txtPDFFile.Text, FileMode.Open));
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(txtPDFFile.Text + ".pdf", FileMode.OpenOrCreate)))
{
    // Add the annotations
    int page = 1;
    Rectangle rect = new Rectangle(45, 650, 200, 750);
    PdfAnnotation annotation = new PdfAnnotation(stamper.Writer, rect);
    annotation.Put(PdfName.SUBTYPE, new PdfName("Redact"));
    stamper.AddAnnotation(annotation, page);
} //Using

The cleanup now also runs into an Object Reference not found, but this time at

at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.ExtractLocationsFromRedactAnnot(Int32 page, Int32 annotIndex, PdfDictionary annotDict)
at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.ExtractLocationsFromRedactAnnots(Int32 page, PdfDictionary pageDict)
at iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup.PdfCleanUpProcessor.ExtractLocationsFromRedactAnnots()

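In this case the cause is a bug in the cleanup code. Redaction annotations, e.g. those generated by Adobe Reader, often contain an additional QuadPoints entry holding a number of quadrilaterals that specify the regions inside the annotation rectangle that are actually to be redacted; if this entry is absent, the whole rectangle is redacted.

iTextSharp has this code for that situation: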

PdfArray quadPoints = annotDict.GetAsArray(PdfName.QUADPOINTS);
if (quadPoints.Size != 0) {
    markedRectangles.AddRange(TranslateQuadPointsToRectangles(quadPoints));
} else { 
    ... add a range for the annotation rectangle ...
}

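Unfortunately, if the annotation has no QuadPoints entry, annotDict.GetAsArray returns null, and the evaluation of quadPoints.Size fails with an exception. The condition should be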

if (quadPoints != null && quadPoints.Size != 0) {

The OP can work around this by adding a QuadPoints entry with an empty array to his redaction annotation:

...
annotation.Put(PdfName.SUBTYPE, new PdfName("Redact"));
annotation.Put(PdfName.QUADPOINTS, new PdfArray()); // <<<<<<<<
stamper.AddAnnotation(annotation, page);
...

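Note: this is merely a workaround for an iTextSharp bug and should not be used if the annotated PDF is meant for other uses as well. Strictly speaking, an empty QuadPoints entry would mean that nothing is to be redacted.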
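
The intended decision logic, with the corrected null guard, can be sketched without iTextSharp. This is only an illustration under assumptions: `Box` and `RedactRegions` are hypothetical names, QuadPoints are modelled as a plain float array with eight numbers (four x,y corners) per quadrilateral, and reducing each quadrilateral to its bounding box is an assumed simplification, not necessarily what iTextSharp's TranslateQuadPointsToRectangles does.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a PDF rectangle; not an iTextSharp type.
public struct Box
{
    public float Llx, Lly, Urx, Ury;
    public Box(float llx, float lly, float urx, float ury)
    {
        Llx = llx; Lly = lly; Urx = urx; Ury = ury;
    }
}

public static class RedactRegions
{
    // quadPoints is null when the annotation has no QuadPoints entry.
    public static List<Box> Resolve(float[] quadPoints, Box annotationRect)
    {
        var regions = new List<Box>();
        // Check for null BEFORE touching the length; the missing null
        // check is what made the original condition throw.
        if (quadPoints != null && quadPoints.Length != 0)
        {
            // Each complete group of 8 numbers is one quadrilateral;
            // an incomplete trailing group is ignored.
            for (int i = 0; i + 7 < quadPoints.Length; i += 8)
            {
                float minX = float.MaxValue, minY = float.MaxValue;
                float maxX = float.MinValue, maxY = float.MinValue;
                for (int j = 0; j < 8; j += 2)
                {
                    minX = Math.Min(minX, quadPoints[i + j]);
                    maxX = Math.Max(maxX, quadPoints[i + j]);
                    minY = Math.Min(minY, quadPoints[i + j + 1]);
                    maxY = Math.Max(maxY, quadPoints[i + j + 1]);
                }
                regions.Add(new Box(minX, minY, maxX, maxY));
            }
        }
        else
        {
            // No usable QuadPoints: fall back to the annotation rectangle.
            regions.Add(annotationRect);
        }
        return regions;
    }
}
```

With the OP's rectangle, `RedactRegions.Resolve(null, new Box(45f, 650f, 200f, 750f))` yields that single rectangle instead of throwing, and passing an empty array takes the same fallback branch, which is why the empty-array workaround has the desired effect.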


As an aside, there is an issue in the OP's code: when creating the file stream for the PdfStamper, he uses FileMode.OpenOrCreate:

System.IO.Stream fsOut = new System.IO.FileStream(System.IO.Path.Combine(path,"redacted.pdf"), System.IO.FileMode.OpenOrCreate);

using (PdfStamper stamper = new PdfStamper(reader, new FileStream(txtPDFFile.Text + ".pdf", FileMode.OpenOrCreate)))

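If a file of that name already exists and is longer than the new PDF, the result keeps the size of the old file, with old content remaining in the extra space. In other words, the new file effectively has trailing garbage, which makes it an invalid PDF that e.g. Adobe Reader offers to repair.

In general one should use FileMode.Create instead. From its documentation:

FileMode.Create is equivalent to requesting that if the file does not exist, use System.IO.FileMode.CreateNew; otherwise, use System.IO.FileMode.Truncate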
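
The difference is easy to observe with plain System.IO, independent of iTextSharp. The following is a minimal sketch; the class and method names (`FileModeDemo`, `LengthAfterRewrite`) are hypothetical, and it uses a temporary file of zero bytes in place of a real PDF.

```csharp
using System;
using System.IO;

public static class FileModeDemo
{
    // Creates a temp file of oldSize bytes, rewrites it from the start with
    // newSize bytes using the given mode, and returns the final file length.
    public static long LengthAfterRewrite(FileMode mode, int oldSize, int newSize)
    {
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".bin");
        try
        {
            // Simulate the older, longer output file.
            File.WriteAllBytes(path, new byte[oldSize]);

            // Simulate writing the newer, shorter output over it.
            using (var fs = new FileStream(path, mode, FileAccess.Write))
            {
                fs.Write(new byte[newSize], 0, newSize);
            }
            return new FileInfo(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

`FileModeDemo.LengthAfterRewrite(FileMode.OpenOrCreate, 1000, 100)` returns 1000, because the 900 stale bytes stay in place, while `FileModeDemo.LengthAfterRewrite(FileMode.Create, 1000, 100)` returns 100, because the existing file is truncated first.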