When generating a PDF, use a FileStream and periodically flush memory contents to disk

Hello,

I'm using the PDF library to generate documents. I have a requirement where I have to generate a PDF document listing all of a customer's data in tabular form. That means I need to generate a PDF with a PdfGrid containing up to 20,000,000 rows. Even generating a document with a PdfGrid containing 100,000 rows takes almost 10 minutes.

The PDF library (via PdfDocument) appears to create the entire document in memory, and only then allows saving it to disk when the document is fully created. 

Is it possible to use a FileStream as the underlying stream and periodically flush the document to disk while it is being generated? Like, after I add 10,000 rows to the PdfGrid, can I flush the current memory contents to disk?

Are there other ways to reduce the amount of memory used when generating large PDFs?

Thank you,

Jon



8 Replies

SV Surya Venkatesan Syncfusion Team November 24, 2021 11:01 AM UTC

Hi Jon, 
 
Thank you for contacting Syncfusion support. 
 
On the ASP.NET Core platform, you can save the PDF document to a FileStream instead of a MemoryStream to reduce the memory consumed by the PDF DOM during the save process. However, it is not possible to periodically flush the document to disk while the grid is being rendered. Instead, we suggest creating multiple PDF grids and drawing each of them to the PDF document. Grid rendering is performed only when the grid is drawn into the PDF document. 
 
Regards, 
Surya V 



JS Jon Sagara replied to Surya Venkatesan November 24, 2021 06:19 PM UTC

Thank you for the response.

I have a reproduction of the issue here: 

https://github.com/jonsagara/SyncfusionGenerateLargePdf

Is it possible for you to take a look at it and make recommendations to increase performance? 



SV Surya Venkatesan Syncfusion Team November 25, 2021 02:34 PM UTC

Hi Jon, 
 
Thank you for your update. 
 
We are currently analyzing your requirement to improve the performance of the attached project, and we will provide further details on November 29, 2021. 
 
Regards, 
Surya V 



GK Gowthamraj Kumar Syncfusion Team November 29, 2021 01:56 PM UTC

Hi Jon,  
  
Thank you for your patience.  
  
We tried to reproduce the reported issue with the project you provided, and we observed that creating a PDF grid with more than ~300,000 (3 lakh) rows takes a long time on our end. As we said earlier, grid rendering is performed only when the grid is drawn into the PDF document, so it is not possible to periodically flush the document to disk while rendering the grid. We are currently checking other possibilities to reduce the time taken to create a PDF document with a large number of grid rows, and we will provide further details on December 1, 2021. 
 
Regards, 
Gowthamraj K 



JS Jon Sagara November 29, 2021 03:16 PM UTC

Great. Thank you.



GK Gowthamraj Kumar Syncfusion Team December 1, 2021 04:06 PM UTC

Hi Jon, 

Thank you for your patience.  

With our current architecture, it is not possible to periodically flush memory during the rendering process. On further analysis, we found a workaround that reduces the time taken to create large PDF files with many PdfGrid rows: create multiple PDF files, each containing a PdfGrid limited to a range of rows, and then merge them into a single PDF document. We have attached a modified sample that improves performance for large PDF documents with many PdfGrid rows. With this approach, the last page of each individual document may contain blank space after the documents are merged. This is the expected behavior when merging documents.   

Please find the sample and iteration logs below for your reference. 
 
Please try the workaround sample above on your end and let us know the result. 

Regards, 
Gowthamraj K 
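
For illustration only, and not the sample Syncfusion attached: the split-and-merge workaround described above depends on dividing the full row set into fixed-size ranges, one per intermediate PDF. A minimal, library-agnostic sketch of that batching arithmetic follows. The function names `row_batches` and `batch_count` are hypothetical, and the batch size of 10,000 is an assumption borrowed from the figure in the original question, not a value recommended in this thread. No PDF library calls are shown.

```python
from typing import Iterator, Tuple


def row_batches(total_rows: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) row ranges, end exclusive, that cover 0..total_rows.

    Each range would correspond to one intermediate PDF that holds a grid
    with at most batch_size rows (hypothetical helper, not a library API).
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total_rows, batch_size):
        # The final range is clipped so it never runs past the last row.
        yield start, min(start + batch_size, total_rows)


def batch_count(total_rows: int, batch_size: int) -> int:
    """Number of intermediate documents needed (ceiling division)."""
    return -(-total_rows // batch_size)


if __name__ == "__main__":
    # Assumed figures: 20,000,000 rows from the question, 10,000 rows per batch.
    print(batch_count(20_000_000, 10_000))
    print(list(row_batches(25, 10)))
```

Each range would drive one intermediate document, which is then merged with the others in order. This only bounds how many rows are held in a single grid at a time; it does not remove the blank space at the end of each merged part.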



JS Jon Sagara December 1, 2021 05:51 PM UTC

Hi Gowthamraj,

Thank you for the time and effort you have put into coming up with a solution. Unfortunately, having breaks in the middle of the document where they were merged together doesn't work for us. We need the grid to be contiguous. It's unfortunate because the library works perfectly for all of our other use cases.

Is there a chance that you'll be able to improve the performance of the library to make my original code work?

Thanks,

Jon




GK Gowthamraj Kumar Syncfusion Team December 2, 2021 12:56 PM UTC

Hi Jon,

Thank you for your update.

No. It is not possible to improve performance by disposing of the underlying objects while rendering the grid. With the current architecture, creating a PDF grid with a large number of rows takes a considerable amount of time to complete the whole grid rendering process, and we cannot flush the document to disk while it runs.

 
Regards, 
Gowthamraj K 

