Flattening a PDF that has no /AcroForm fields but does have /Annot widgets

Last updated: 2023-09-27 18:17:30

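I am using iTextSharp 5.5.10 to flatten a PDF. The code below combines the various approaches I found on SO and through Google searches while trying to flatten the PDF linked further down.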

void Process()
{
    PdfReader reader = new PdfReader(mInFileName);
    // 1 type of fields
    AcroFields formFields = reader.AcroFields;
    int countAcroFields = formFields.Fields.Count;
    // 2 another type of fields
    PRAcroForm form = reader.AcroForm;
    List<PRAcroForm.FieldInformation> fieldList = form.Fields;
    // 3 I think this is the same as 2
    PdfDictionary acroForm = reader.Catalog.GetAsDict(PdfName.ACROFORM);
    PdfArray acroFields = (PdfArray)PdfReader.GetPdfObject(acroForm.Get(PdfName.FIELDS), acroForm);
    // 4 another type of fields
    XfaForm xfaForm = formFields.Xfa;
    bool flatten = false;
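    if (countAcroFields > 0)
    {
        // Not reached for this PDF: no AcroFields were found
    }
    else if (fieldList.Count > 0)
    {
        // Not reached for this PDF: no PRAcroForm fields were found
    }
    else if (xfaForm.XfaPresent == true)
    {
        // Not reached for this PDF: no XFA form was found
        ReadXfa(reader);
    }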
    else
    {
        // Yes, there are annotations and this code extracts them but does NOT flatten them
        PdfDictionary page = reader.GetPageN(1);
        PdfArray annotsArray = page.GetAsArray(PdfName.ANNOTS);
        if ( annotsArray == null)
            return;
        else
        {
            List<string> namedFieldToFlatten = new List<string>();
            foreach (var item in annotsArray)
            {
                var annot = (PdfDictionary)PdfReader.GetPdfObject(item);
                var type = PdfReader.GetPdfObject(annot.Get(PdfName.TYPE)).ToString(); // Expected to be /Annot
                var subtype = PdfReader.GetPdfObject(annot.Get(PdfName.SUBTYPE)).ToString(); // Expected to be /Widget
                var fieldType = PdfReader.GetPdfObject(annot.Get(PdfName.FT)).ToString(); // Expected to be /Tx
                if (annot.Get(PdfName.TYPE).Equals(PdfName.ANNOT) &&
                    annot.Get(PdfName.SUBTYPE).Equals(PdfName.WIDGET))
                {
                    if (annot.Get(PdfName.FT).Equals(PdfName.TX))
                    {
                        flatten = true;
                        var textLabel = PdfReader.GetPdfObject(annot.Get(PdfName.T)).ToString(); //Name of textbox field
                        namedFieldToFlatten.Add(textLabel);
                        var fieldValue = PdfReader.GetPdfObject(annot.Get(PdfName.V)).ToString(); //Value of textbox
                        Console.WriteLine($"Found Label={textLabel} Value={fieldValue}");
                    }
                }
            }
            if (flatten == true)
            {
                // Flatten the PDF [11/9/2016 15:10:06]
                string foldername = Path.GetDirectoryName(mInFileName);
                string basename = Path.GetFileNameWithoutExtension(mInFileName);
                string outName = Path.Combine(foldername, $"{basename}_flat.pdf");
                using (var fStream = new FileStream(outName, FileMode.Create))
                {
                    //This totally removes the fields instead of flattening them
                    var stamper = new PdfStamper(reader, fStream) { FormFlattening = true, FreeTextFlattening = true, AnnotationFlattening = true };
                    var stamperForm = stamper.AcroFields;
                    stamperForm.GenerateAppearances = true;
                    foreach (var item in namedFieldToFlatten)
                    {
                        stamper.PartialFormFlattening(item);
                    }
                    stamper.Close();
                    reader.Close();
                }
            }
        }
    }
}

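Any hints on how to convert the "fields" into plain text, the way flattening in Acrobat does?

This is the PDF I want to flatten.

This is the output of the code.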

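The problem is that your form is broken. The PDF does have /Annots, those annotations are widget annotations, and they also carry the entries that belong in a field dictionary, but there is no form in the PDF. Or rather: there is a form, but its /Fields array is empty.

Because the /Fields array is empty, there are no fields to flatten. The widget annotations are removed, and you are left with nothing visible. This is not an iText bug: you need to fix the form before filling it out.

In Java, you can fix the form like this: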

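import com.itextpdf.text.pdf.*;
import java.io.FileOutputStream;

PdfReader reader = new PdfReader(src);
PdfDictionary root = reader.getCatalog();
PdfDictionary form = root.getAsDict(PdfName.ACROFORM);
// The /Fields array exists but is empty in the broken file
PdfArray fields = form.getAsArray(PdfName.FIELDS);
PdfDictionary page;
PdfArray annots;
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
    page = reader.getPageN(i);
    annots = page.getAsArray(PdfName.ANNOTS);
    if (annots == null) continue; // this page has no annotations
    // Register every annotation on the page as a form field
    for (int j = 0; j < annots.size(); j++) {
        fields.add(annots.getAsIndirectObject(j));
    }
}
// Passing the reader through a PdfStamper writes the repaired form to dest
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.close();
reader.close();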

Porting this to C# should be fairly easy. I am not a C# developer, but here is an (untested) attempt at a port:

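using System.IO;
using iTextSharp.text.pdf;

PdfReader reader = new PdfReader(src);
PdfDictionary root = reader.Catalog;
PdfDictionary form = root.GetAsDict(PdfName.ACROFORM);
// The /Fields array exists but is empty in the broken file
PdfArray fields = form.GetAsArray(PdfName.FIELDS);
PdfDictionary page;
PdfArray annots;
for (int i = 1; i <= reader.NumberOfPages; i++) {
    page = reader.GetPageN(i);
    annots = page.GetAsArray(PdfName.ANNOTS);
    if (annots == null) continue; // this page has no annotations
    // Register every annotation on the page as a form field
    for (int j = 0; j < annots.Size; j++) {
        fields.Add(annots.GetAsIndirectObject(j));
    }
}
// Passing the reader through a PdfStamper writes the repaired form to dest
PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create));
stamper.Close();
reader.Close();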

Once you have fixed the form like this, you can flatten it correctly.
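
The repair loops above add every annotation on every page to /Fields. That works for a file whose annotations are all form widgets, but a file that also contains links or comments would probably need a filter. The following is a minimal plain-Java sketch of such a filter, not iText code: the `Annot` class, `isFieldWidget` and `repairFields` are hypothetical names invented for this illustration, and annotations are modelled as simple objects holding an object number, a /Subtype and a /T value. It also makes one simplifying assumption: only widgets that carry their own /T entry are treated as fields, so widgets that inherit their name from a /Parent field are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class WidgetFilterSketch {

    /** Hypothetical stand-in for an annotation dictionary. */
    public static final class Annot {
        final int objNumber;     // indirect object number of the annotation
        final String subtype;    // value of /Subtype, e.g. "Widget" or "Link"
        final String fieldName;  // value of /T, or null when absent

        public Annot(int objNumber, String subtype, String fieldName) {
            this.objNumber = objNumber;
            this.subtype = subtype;
            this.fieldName = fieldName;
        }
    }

    /** True when the annotation is a widget that also carries a field name. */
    public static boolean isFieldWidget(Annot a) {
        return "Widget".equals(a.subtype) && a.fieldName != null;
    }

    /**
     * Returns the object numbers that should be listed in /Fields:
     * the existing entries first, then each qualifying widget once.
     */
    public static List<Integer> repairFields(List<Integer> existingFields,
                                             List<List<Annot>> pages) {
        Set<Integer> fields = new LinkedHashSet<>(existingFields);
        for (List<Annot> annots : pages) {
            if (annots == null) {
                continue; // page without an /Annots array
            }
            for (Annot a : annots) {
                if (isFieldWidget(a)) {
                    fields.add(a.objNumber);
                }
            }
        }
        return new ArrayList<>(fields);
    }

    public static void main(String[] args) {
        List<Annot> page1 = new ArrayList<>();
        page1.add(new Annot(12, "Widget", "firstName"));
        page1.add(new Annot(13, "Link", null));
        page1.add(new Annot(14, "Widget", "lastName"));

        List<List<Annot>> pages = new ArrayList<>();
        pages.add(page1);
        pages.add(null); // second page has no annotations

        // The broken form starts with an empty /Fields array
        System.out.println(repairFields(new ArrayList<>(), pages));
    }
}
```

In the iText loops, the same decision would be made by reading /Subtype and /T from each annotation dictionary before calling `fields.add(...)`. Using an insertion-ordered set keeps any entries already present in /Fields and avoids listing a widget twice.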