扁平化PDF没有/AcroForm字段与/Annot小部件
本文关键字:Annot 小部 AcroForm PDF 没有 扁平化 字段 | 更新日期: 2023-09-27 18:17:30
我正在使用iTextSharp 5.5.10进行扁平化。这段代码代表了在SO和Google搜索中找到的PDF下面的代码的各种努力。
void Process()
{
PdfReader reader = new PdfReader(mInFileName);
// 1 type of fields
AcroFields formFields = reader.AcroFields;
int countAcroFields = formFields.Fields.Count;
// 2 another type of fields
PRAcroForm form = reader.AcroForm;
List<PRAcroForm.FieldInformation> fieldList = form.Fields;
// 3 i think this is the same as 2
PdfDictionary acroForm = reader.Catalog.GetAsDict(PdfName.ACROFORM);
PdfArray acroFields = (PdfArray)PdfReader.GetPdfObject(acroForm.Get(PdfName.FIELDS), acroForm);
// 4 another type of fields
XfaForm xfaForm = formFields.Xfa;
bool flatten = false;
if (countAcroFields > 0)
{
//No fields found
}
else if (fieldList.Count > 0)
{
//No fields found
}
else if (xfaForm.XfaPresent == true)
{
//No xfaForms found
ReadXfa(reader);
}
else
{
// Yes, there are annotations and this code extracts them but does NOT flatten them
PdfDictionary page = reader.GetPageN(1);
PdfArray annotsArray = page.GetAsArray(PdfName.ANNOTS);
if ( annotsArray == null)
return;
else
{
List<string> namedFieldToFlatten = new List<string>();
foreach (var item in annotsArray)
{
var annot = (PdfDictionary)PdfReader.GetPdfObject(item);
var type = PdfReader.GetPdfObject(annot.Get(PdfName.TYPE)).ToString(); //Expecting be /Annot
var subtype = PdfReader.GetPdfObject(annot.Get(PdfName.SUBTYPE)).ToString(); //Expecting be /Widget
var fieldType = PdfReader.GetPdfObject(annot.Get(PdfName.FT)).ToString(); //Expecting be /Tx
if (annot.Get(PdfName.TYPE).Equals(PdfName.ANNOT) &&
annot.Get(PdfName.SUBTYPE).Equals(PdfName.WIDGET))
{
if (annot.Get(PdfName.FT).Equals(PdfName.TX))
{
flatten = true;
var textLabel = PdfReader.GetPdfObject(annot.Get(PdfName.T)).ToString(); //Name of textbox field
namedFieldToFlatten.Add(textLabel);
var fieldValue = PdfReader.GetPdfObject(annot.Get(PdfName.V)).ToString(); //Value of textbox
Console.WriteLine($"Found Label={textLabel} Value={fieldValue}");
}
}
}
if (flatten == true)
{
// Flatten the PDF [11/9/2016 15:10:06]
string foldername = Path.GetDirectoryName(mInFileName);
string basename = Path.GetFileNameWithoutExtension(mInFileName);
string outName = $"{foldername}''{basename}_flat.pdf";
using (var fStream = new FileStream(outName, FileMode.Create))
{
//This totally removes the fields instead of flattening them
var stamper = new PdfStamper(reader, fStream) { FormFlattening = true, FreeTextFlattening = true, AnnotationFlattening = true };
var stamperForm = stamper.AcroFields;
stamperForm.GenerateAppearances = true;
foreach (var item in namedFieldToFlatten)
{
stamper.PartialFormFlattening(item);
}
stamper.Close();
reader.Close();
}
}
}
}
}
关于如何将"字段"转换为文本flatten在Acrobat中有什么提示吗?
这是我想要平面化的PDF。
这是代码的输出
问题是你的表单坏了。它确实有/Annots
,这些注释是小部件注释,这些注释也有用于字段字典中的条目,但是在PDF中没有表单。或者更确切地说:有一个表单,但是/Fields
数组是空的。
由于/Fields
数组为空,所以没有字段需要平化。小部件注释被删除,您将看不到任何内容。这不是一个iText错误:你需要在填写之前修复表单。
在Java中,您可以这样修改表单:
PdfReader reader = new PdfReader(src);
PdfDictionary root = reader.getCatalog();
PdfDictionary form = root.getAsDict(PdfName.ACROFORM);
PdfArray fields = form.getAsArray(PdfName.FIELDS);
PdfDictionary page;
PdfArray annots;
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
page = reader.getPageN(i);
annots = page.getAsArray(PdfName.ANNOTS);
for (int j = 0; j < annots.size(); j++) {
fields.add(annots.getAsIndirectObject(j));
}
}
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.close();
reader.close();
将它移植到c#应该是相当容易的。我不是c#开发人员,但这是一个(未经测试的)尝试做一个端口:
PdfReader reader = new PdfReader(src);
PdfDictionary root = reader.Catalog;
PdfDictionary form = root.GetAsDict(PdfName.ACROFORM);
PdfArray fields = form.GetAsArray(PdfName.FIELDS);
PdfDictionary page;
PdfArray annots;
for (int i = 1; i <= reader.NumberOfPages; i++) {
page = reader.GetPageN(i);
annots = page.GetAsArray(PdfName.ANNOTS);
for (int j = 0; j < annots.Size; j++) {
fields.Add(annots.GetAsIndirectObject(j));
}
}
PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create));
stamper.Close();
reader.Close();
一旦你像这样固定了表单,你就可以正确地把它平铺。