I am trying to take a PDF form, fill it out, and then add it as a page to a single document. This process would be repeated for every set of data associated with one file (which could have anywhere from 100 to 10,000+ records to be turned into a PDF).
I have tried the following:
Filling out the form and importing all pages into a PDF that holds all pages. Then, after all pages are added, flattening and saving the PDF with all pages. This is fast on the filling and importing but very slow on saving and flattening.
Filling out the form and importing 50 pages into one PDF. Once the 50 are flattened and saved (this does not take that long), I then try to merge the PDFs into one. This works, but once we get up into the 1000-page range, the merge starts to take a long time.
Filling out the form and then trying to create a template page from the filled-out form and drawing it on the output PDF. This is fast, but I cannot see how to make the data on the form stay when the PDF template page is created (if this is even possible).
Populating a class that holds all the data that needs to go on the form and also holds the coordinates where the text should be drawn on the PDF. This seems to be effective and fast, but it is time consuming to set up the location of every piece of data that needs to be placed on the PDF.
I believe my answer may be a mix of some of the above, and I was wondering if there is any help that can be given for this issue.
"Filling out the form and importing all pages into a PDF that holds all pages. Then, after all pages are added, flattening and saving the PDF with all pages. This is fast on the filling and importing but very slow on saving and flattening."
Yes, saving and flattening a large PDF document with the PDF library takes a considerable amount of time.
"Filling out the form and importing 50 pages into one PDF. Once the 50 are flattened and saved (this does not take that long), I then try to merge the PDFs into one. This works, but once we get up into the 1000-page range, the merge starts to take a long time."
We can manage memory while merging large PDF documents. Setting the EnableMemoryOptimization property of the PdfLoadedDocument to true reduces memory usage when its instance is closed. If the document has a lot of content, the merge will still take considerable time. Please share the input form document along with the form data so that we can look into improving the performance.
"Filling out the form and then trying to create a template page from the filled-out form and drawing it on the output PDF. This is fast, but I cannot see how to make the data on the form stay when the PDF template page is created (if this is even possible).
Populating a class that holds all the data that needs to go on the form and also holds the coordinates where the text should be drawn on the PDF. This seems to be effective and fast, but it is time consuming to set up the location of every piece of data that needs to be placed on the PDF."
Please share more details about this approach or your exact requirement, such as the complete code snippet, the form template, and the output document, so that we can analyze it on our end and assist you further.
Thanks for the input on the options. We also ended up trying another option, which we are planning to go with because of its processing time.
We are going to take a PDF page that contains the form to be filled out; every field has a location giving the x and y coordinates where its text needs to be drawn. The strings are then drawn onto the template page for each record. This seems to be quick, and saving is fast. After I get the data into an object that represents it, I loop through each page of the template form. I draw the page on the output PDF and then populate the form. Populating the form takes the data in the object, along with the location and page of the form where each value belongs, and draws it on the page. Since the main template is already drawn on the output PDF, I just continue to the next record and repeat until completed.
If there is anything that may be more optimal than the above approach, please let me know.
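For reference, the coordinate-based approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the `FieldPlacement` class, the `CoordinateFill.Build` method, the font choice, and the zero-margin setting are all hypothetical, and the `Syncfusion.Drawing` namespace assumes the .NET Core build of the library (on .NET Framework, `System.Drawing` would supply `PointF` and `SizeF`).

```csharp
using System.Collections.Generic;
using System.IO;
using Syncfusion.Drawing;      // assumption: .NET Core build; System.Drawing on .NET Framework
using Syncfusion.Pdf;
using Syncfusion.Pdf.Graphics;
using Syncfusion.Pdf.Parsing;

// Hypothetical type: one value plus the page and position where it is drawn.
class FieldPlacement
{
    public int PageIndex;      // zero-based page of the template form
    public PointF Location;    // x/y coordinates on that page
    public string Text;        // value to draw
}

static class CoordinateFill
{
    // Hypothetical helper: draws every record onto copies of the template pages.
    public static void Build(string templatePath,
                             IEnumerable<List<FieldPlacement>> records,
                             Stream output)
    {
        using (FileStream templateStream = File.OpenRead(templatePath))
        {
            PdfLoadedDocument templateDoc = new PdfLoadedDocument(templateStream);
            PdfDocument outputPdf = new PdfDocument();
            // Assumption: no margins, so coordinates are measured from the page corner.
            outputPdf.PageSettings.Margins.All = 0;
            PdfFont font = new PdfStandardFont(PdfFontFamily.Helvetica, 10);

            // Create each page template once, outside the record loop.
            List<PdfTemplate> templates = new List<PdfTemplate>();
            List<SizeF> sizes = new List<SizeF>();
            foreach (PdfLoadedPage page in templateDoc.Pages)
            {
                templates.Add(page.CreateTemplate());
                sizes.Add(new SizeF(page.Size.Width, page.Size.Height));
            }

            foreach (List<FieldPlacement> record in records)
            {
                for (int i = 0; i < templates.Count; i++)
                {
                    // Draw the blank form page first, then the text on top of it.
                    PdfPage outPage = outputPdf.Pages.Add();
                    outPage.Graphics.DrawPdfTemplate(templates[i], PointF.Empty, sizes[i]);
                    foreach (FieldPlacement field in record)
                    {
                        if (field.PageIndex == i)
                        {
                            outPage.Graphics.DrawString(field.Text, font,
                                PdfBrushes.Black, field.Location);
                        }
                    }
                }
            }

            outputPdf.Save(output);
            outputPdf.Close(true);
            templateDoc.Close(true);
        }
    }
}
```

Because nothing here goes through form fields, there is no flattening step at all; the cost moves to maintaining the coordinates for each field.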
Regarding the last option, what was tried was filling out a PDF form (this form has many different fields). After the form was filled out (using the form's fields), the code below would run. This code takes the pages from the filled form and makes each one a template page. The create-template function seemed to remove the fields and any data in them, instead of keeping the data without the fields (this was an attempt to get around the slowness of flattening). Once all pages were added, we would save the PDF.
foreach (PdfLoadedPage page in form.Pages)
{
    var pageTemplate = outputPdf.Pages.Add();
    pageTemplate.Graphics.DrawPdfTemplate(page.CreateTemplate(), Syncfusion.Drawing.PointF.Empty, new Syncfusion.Drawing.SizeF(page.Size.Width, page.Size.Height));
}
static void Main(string[] args)
{
    PdfDocument document = new PdfDocument();
    // Fill and flatten the form first, then load the flattened result.
    Stream filledForm = Program.FillPDFForm("form.pdf");
    PdfLoadedDocument ldoc = new PdfLoadedDocument(filledForm);
    // Draw each flattened page onto a new page of the output document.
    foreach (PdfLoadedPage page in ldoc.Pages)
    {
        var pageTemplate = document.Pages.Add();
        pageTemplate.Graphics.DrawPdfTemplate(page.CreateTemplate(), PointF.Empty, new SizeF(page.Size.Width, page.Size.Height));
    }
    document.Save("sample.pdf");
    document.Close(true);
    ldoc.Close(true);
    filledForm.Dispose();
}
private static Stream FillPDFForm(String fileName)
{
    PdfLoadedDocument doc = new PdfLoadedDocument(fileName);
    (doc.Form.Fields[0] as PdfLoadedTextBoxField).Text = "sample";
    // Flatten the form fields when the document is saved.
    doc.Form.Flatten = true;
    MemoryStream stream = new MemoryStream();
    doc.Save(stream);
    doc.Close(true);
    // Rewind so the caller can read the stream from the start.
    stream.Position = 0;
    return stream;
}
I have implemented the above, mixed with the other options. I now create PDFs of 50 pages each, up to the number of pages that I need, and I make sure to flatten and save each PDF to a drive. Then, in another step, I pull each PDF's data into a stream and go through each page, drawing it onto an output PDF as a template. I save and close the output PDF at the end.
This seems to be consistent, but compared to placing the text by coordinates, the PDF file size for the same number of records is much larger. The PDF that was not generated from a form is only 538 KB, whereas the one generated by the new method is 11 MB. Both of these PDFs have 50 pages. This is quite a jump considering I am using the same create-template function. With a PDF that has 299 pages, the difference is 1.7 MB versus 70 MB. Is there a way, after flattening, to remove much of this extra data from the PDFs?
//Disable the incremental update
document.FileStructure.IncrementalUpdate = false;