Merging a large number of PDFs

We have a scenario in which a large number of small documents need to be generated, and then also combined into larger documents. The total number of small documents may be as large as 10,000. Generating the files is no trouble, but trying to create a single PDF that contains all of the documents is causing an out of memory exception before it can complete the merge and save.

Is there an upper limit on the number of PDF files that can be either appended or merged? What would be the best recommended approach using the PDF components to accomplish this type of merge?


5 Replies

GM Geetha M Syncfusion Team March 29, 2012 07:06 AM UTC

Hi Chris,

With the current approach, if you open all 10,000 documents and merge them, a total of 10,000 + 10,000 documents will stay in memory until the final PDF is closed. Instead, we recommend opening one document, merging it, and closing it before proceeding to the next document. This reduces memory usage.

Also, there is no limit to the number of files to be merged.

Please try this and let us know if you have any questions.

Regards,
Geetha



DA dan April 4, 2012 08:54 PM UTC

So if you merge one small document into a large one, it won't use that much memory because it keeps the data on disk?

I create bill/invoice reports that use a very old program to send output to the printer. We use win2pdf to create the PDF, but sometimes it runs out of memory. When your reports are exported to PDF, can the output be kept in a file rather than in memory?

Some of our clients need 20,000-page reports. Would this be doable? Sometimes those 20,000-page reports are 5 merged documents.



GM Geetha M Syncfusion Team April 5, 2012 06:20 AM UTC

Hi Dan,

Thank you for the details.

Here is the code snippet to merge and close the files:

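using System.Diagnostics;
using System.IO;
using Syncfusion.Pdf.Parsing;

string[] files = Directory.GetFiles(folder);

// Start from the first file; every later file is appended to it.
PdfLoadedDocument document = new PdfLoadedDocument(files[0]);

// Loop over all remaining files (the original sample was hard-coded to 5 files).
for (int i = 1; i < files.Length; i++)
{
    PdfLoadedDocument loadedDocument = new PdfLoadedDocument(files[i]);
    document.Append(loadedDocument);

    // Save and close both documents so neither stays in memory between iterations.
    document.Save("Sample.pdf");
    document.Close(true);
    loadedDocument.Close(true);

    // Reopen the merged result for the next append, except after the last file.
    if (i < files.Length - 1)
        document = new PdfLoadedDocument(@"Sample.pdf");
}

Process.Start("Sample.pdf");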

This way, we avoid keeping all the files in memory. Also, when a document holds a large number of resources in memory, it tends to throw an out-of-memory exception. For example, a very large image that paginates across a few pages could cause an exception because of the memory it consumes. Hence the page count is not always the deciding factor.
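The trade-off of saving after every append can be estimated with a rough cost model. The sketch below is an illustration only, not part of the Syncfusion API: the class and method names are hypothetical, and it assumes the merged file is roughly the sum of its inputs and counts only bytes written, ignoring parsing and reopening time.

```csharp
using System;

public static class MergeCostModel
{
    // Bytes written when the output is re-saved after every append.
    // The first save happens once the output holds 2 files, the last
    // once it holds all of them, so the total grows quadratically.
    public static long BytesWrittenResavingEachStep(int fileCount, long bytesPerFile)
    {
        long total = 0;
        for (int filesInOutput = 2; filesInOutput <= fileCount; filesInOutput++)
        {
            total += filesInOutput * bytesPerFile;
        }
        return total;
    }

    // Bytes written when the merged output is saved a single time.
    public static long BytesWrittenSavingOnce(int fileCount, long bytesPerFile)
    {
        return fileCount * bytesPerFile;
    }
}
```

Under these assumptions, calling `MergeCostModel.BytesWrittenResavingEachStep(50, 60 * 1024)` gives about 78 MB of writes for 50 files of 60 KB, against about 3 MB for `BytesWrittenSavingOnce` with the same arguments. The gap widens quickly as the file count rises, so the lower memory use is paid for in extra disk work.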

Please let me know if you have any questions.

Regards,
Geetha




SM Scott Mury July 11, 2012 10:17 PM UTC

Come on Syncfusion. Fix your code please. I'm already searching for a new software provider because of this.

I'm encountering similar issues. From what I can see, there does in fact appear to be some kind of issue with the Syncfusion code. A memory leak or something. Has Syncfusion dug into their code to find out what the underlying issue is?

I'm trying to merge the same 50 KB document into a single document 1000 pages long while keeping the main document open. This consumes over 1 GB of memory on my development machine, but completes in 90 seconds.

This amount of memory consumption does not make sense to me. Even if 1000 50 KB documents are read into memory, that only accounts for a fraction of 1 GB of memory consumption.

Also, I've tried Geetha's code snippet AND PERFORMANCE IS HORRIBLE even though the memory issue has disappeared.

To merge 50 one-page 60 KB documents into one 50-page document, it took a blistering 15 minutes. That is quite possibly the worst performing code I've seen on any level on any platform to do such a seemingly simple operation. You cannot tell me that there is not an issue with Syncfusion, because from my perspective there is.

If I'm wrong, then please post a code snippet that allows me to merge MANY single-page documents into a single PDF without crashing our code on a workstation or server OR taking absolutely forever.



GM Geetha M Syncfusion Team July 12, 2012 12:42 PM UTC

Hi Scott,

Thank you for the details.

We were able to reproduce the problem. The merged document needs access to the resources of each imported document during save, so memory usage is high at that point. Saving immediately after each merge means the PdfLoadedDocument does not have to be kept in memory; however, as the merged document grows, it takes longer to open and merge. We have logged a feature request to modify the save process, and we recommend following up through the Direct-Trac incident for further details on the feature.
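Until that feature is available, one possible middle ground between the two approaches is to merge in batches: append a limited number of files, save once per batch, and then merge the batch files. The sketch below is untested and only an assumption about what might help. It uses the same calls as the earlier snippet (PdfLoadedDocument, Append, Save, Close); the class name, method names, and the batchSize and tempFolder parameters are hypothetical. Source documents stay open until their batch has been saved, because the save needs their resources. The final step still loads every batch file, so peak memory should be measured rather than assumed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Syncfusion.Pdf.Parsing;

public static class BatchedPdfMerge
{
    // Pure helper: splits the inputs into consecutive groups of at most batchSize files.
    public static List<List<string>> SplitIntoBatches(IList<string> files, int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
        var batches = new List<List<string>>();
        for (int start = 0; start < files.Count; start += batchSize)
        {
            batches.Add(files.Skip(start).Take(batchSize).ToList());
        }
        return batches;
    }

    // Appends inputs[1..] to inputs[0] and saves once to outputPath.
    // All sources stay open until the save has finished, then everything is closed.
    private static void MergeGroup(IList<string> inputs, string outputPath)
    {
        var target = new PdfLoadedDocument(inputs[0]);
        var sources = new List<PdfLoadedDocument>();
        try
        {
            for (int i = 1; i < inputs.Count; i++)
            {
                var source = new PdfLoadedDocument(inputs[i]);
                sources.Add(source);
                target.Append(source);
            }
            target.Save(outputPath);
        }
        finally
        {
            target.Close(true);
            foreach (PdfLoadedDocument source in sources) source.Close(true);
        }
    }

    // tempFolder should differ from the folder the inputs are listed from,
    // and outputPath should not collide with a batch file name.
    public static void Merge(IList<string> files, string outputPath, int batchSize, string tempFolder)
    {
        if (files.Count == 0) throw new ArgumentException("No input files.", nameof(files));

        var batchPaths = new List<string>();
        foreach (List<string> batch in SplitIntoBatches(files, batchSize))
        {
            string batchPath = Path.Combine(tempFolder, "batch_" + batchPaths.Count + ".pdf");
            MergeGroup(batch, batchPath);
            batchPaths.Add(batchPath);
        }

        // Second pass: merge the intermediate files into the final document.
        MergeGroup(batchPaths, outputPath);
        foreach (string path in batchPaths) File.Delete(path);
    }
}
```

A call such as `BatchedPdfMerge.Merge(files, "Sample.pdf", 100, tempFolder)` would save about 100 times for 10,000 inputs instead of once per file. The right batch size depends on how large and resource-heavy the documents are.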

Please let us know if you have any questions.

Regards
Geetha
