How to optimize the performance of converting a Word document to a PDF file

Question

Hi,

I am trying to convert a Word document to a PDF file using the code below.

    private static void CovertIntoPdf(string documentPath, string path)
    {
        FileStream docStream = new FileStream(documentPath, FileMode.Open, FileAccess.Read);
        WordDocument document = new WordDocument(docStream, Syncfusion.DocIO.FormatType.Automatic);
        DocIORenderer render = new DocIORenderer();
        render.Settings.ChartRenderingOptions.ImageFormat = ExportImageFormat.Jpeg;
        Console.WriteLine("Started " + DateTime.Now.ToString("yyyy.MM.dd HH:mm:ss:ffff"));
        PdfDocument pdfDocument = render.ConvertToPDF(document);
        Console.WriteLine("Ended " + DateTime.Now.ToString("yyyy.MM.dd HH:mm:ss:ffff"));
        render.Dispose();
        document.Dispose();
        FileStream stream = new FileStream(path, FileMode.Create, FileAccess.ReadWrite);
        var encodedData = ConvertPdfStreamToBase64(pdfDocument);
        pdfDocument.Save(stream);
        pdfDocument.Close();
    }

    PdfDocument pdfDocument = render.ConvertToPDF(document);

This particular method call takes almost 15 seconds to process the file, even though the file is only 195 KB. This is affecting the performance of my application. Could you please suggest an alternative way to convert DOCX files to PDF?

Thank you in advance.

Lokesh Baskar · Accepted Answer

Hi Sumit,

Thank you for contacting Syncfusion support.

Essential DocIO keeps the entire document contents (paragraphs, images, tables, and all other supported items, along with their formatting) in main memory. A document may appear to have very little content, yet Essential DocIO uses more memory because of the rich formatting applied to that content. When you open a Word document, the file size may look small while Essential DocIO still uses a large amount of memory: a “docx” file is a zip archive, so Essential DocIO internally decompresses it and populates the document object model with its content, which takes considerable main memory. The main memory utilized by an instance will not be released until the instance is removed from the document.

Additionally, during Word to PDF conversion, DocIO internally measures and lays out each element of the Word document and renders it into the PDF document. The time taken and the memory consumed by these steps vary with the elements in the document. So, on your side, the conversion may take time for this document, depending on the elements and formatting in it.

If this is a concern, please share your input Word document so that we can check it and provide an appropriate solution at the earliest.

Please let us know if you have any other questions.

Regards,
Lokesh B
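As an illustration of releasing each instance once it is no longer needed, here is a minimal sketch that scopes the stream, the Word document, and the renderer with `using`, and closes the PDF document in a `finally` block. The class name `WordToPdf` and method name `Convert` are hypothetical; the Syncfusion types and calls are the ones already used in the question, and the sketch assumes the DocIORenderer NuGet package is referenced. Disposing makes the objects eligible for collection; it does not by itself guarantee that process memory drops immediately.

```csharp
using System.IO;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;
using Syncfusion.DocIORenderer;
using Syncfusion.Pdf;

public static class WordToPdf
{
    // Hypothetical helper; not an official Syncfusion sample.
    public static void Convert(string documentPath, string pdfPath)
    {
        // Each "using" declaration disposes its object when the method exits,
        // even if an exception is thrown along the way.
        using FileStream docStream = new FileStream(documentPath, FileMode.Open, FileAccess.Read);
        using WordDocument document = new WordDocument(docStream, FormatType.Automatic);
        using DocIORenderer render = new DocIORenderer();

        PdfDocument pdfDocument = render.ConvertToPDF(document);
        try
        {
            // Write the PDF to disk; the output stream is closed at the end of this block.
            using FileStream pdfStream = new FileStream(pdfPath, FileMode.Create, FileAccess.Write);
            pdfDocument.Save(pdfStream);
        }
        finally
        {
            // Close(true) releases the resources held by the PDF document.
            pdfDocument.Close(true);
        }
    }
}
```

Keeping no reference to `document`, `render`, or `pdfDocument` outside this method (for example in a static field or a cache) matters as much as the dispose calls, because the garbage collector can only reclaim objects that are no longer reachable.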

Sander Groenenberg · Answer

Hi,

We have the same issue on our side, and I guess this has to do with the docx template that we use and the amount of data (80 pages).

What do you mean by:

"The main memory utilized by an instance will not be released until the instance is removed from the document."

Can you please provide a code example? We do call pdfDocument.Close(true); is that not enough? Our (k8s) pod takes 1 GB for every call we make and does not seem to release it.

I will also create a support ticket to check our situation.

Manikandan Ravichandran · Answer

Hi Sumit,

We are currently checking on your query and will share the details on 17th January 2022.

Regards,
Manikandan Ravichandran

Lokesh Baskar · Answer

Hi Sander,

On further checking with a 138-page document that contains a large number of tables, images, and formatted paragraphs, converting the Word document to PDF takes 14 seconds, and memory usage goes up to 900 MB during conversion. After calling GC.Collect, the memory usage reduced to 360 MB.

Please refer to the screenshots below.

Before starting conversion: 81 MB used
Time taken for conversion: [screenshot]
During conversion: [screenshot]
After GC.Collect, it reduced to 390 MB: [screenshot]
Code snippet used: [screenshot]

So, kindly share your input document so that we can check the problem and provide an appropriate solution at the earliest.

Regards,
Lokesh B
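The snippet used for the measurements above is not included in the thread. As a rough stand-in, the following sketch times an arbitrary piece of work and records the managed heap size before the run, right after it, and after a forced collection. `ConversionProbe` and `Measure` are hypothetical names, and the numbers come from `GC.GetTotalMemory`, which covers only the managed heap, so they will not match process-level figures such as Task Manager or pod metrics.

```csharp
using System;
using System.Diagnostics;

public static class ConversionProbe
{
    // Runs the given work once and reports elapsed time plus managed heap sizes in bytes.
    public static (TimeSpan Elapsed, long BytesBefore, long BytesAfterRun, long BytesAfterCollect)
        Measure(Action work)
    {
        // Force a full collection first so earlier garbage does not inflate the baseline.
        long before = GC.GetTotalMemory(forceFullCollection: true);

        Stopwatch stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();

        // Heap size right after the run, without forcing a collection.
        long afterRun = GC.GetTotalMemory(forceFullCollection: false);

        // Explicit full collection, mirroring the GC.Collect step described above.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        long afterCollect = GC.GetTotalMemory(forceFullCollection: false);

        return (stopwatch.Elapsed, before, afterRun, afterCollect);
    }
}
```

A call such as `ConversionProbe.Measure(() => CovertIntoPdf(documentPath, path))` would wrap the conversion from the original question. The sketch does not capture the peak during the run, only the heap size once the work has returned.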

Muhammad Sufyan MALIK · Answer

Hi,

I don't know why, but a document of only 21 pages without any images takes more than 1 minute for the conversion alone.

I am using the same method as mentioned above. Is there any way to optimize it?

I am sharing only the conversion part of the code:

    WordDocument document = new(byteDocument, FormatType.Automatic);
    document.Replace("{{cFullName}}", customerName, caseSensitive, wholeWord);
    document.Replace("{{cTckno}}", customerIdentity, caseSensitive, wholeWord);
    document.Replace("{{cTaxOfficeAndNumber}}", customerTaxInfo, caseSensitive, wholeWord);
    document.Replace("{{cAddress}}", customerAddress, caseSensitive, wholeWord);
    document.Replace("{{cPhoneNumber}}", customerPhone, caseSensitive, wholeWord);
    document.Replace("{{cEmail}}", customerEmail, caseSensitive, wholeWord);
    document.Replace("{{cAuthorizedNameSurname}}", customerAuthFullname, caseSensitive, wholeWord);
    document.Replace("{{cAuthorizedTCKN}}", customerAuthIdentity, caseSensitive, wholeWord);
    document.Replace("{{cAuthorizedPhone}}", customerAuthPhone, caseSensitive, wholeWord);
    document.Replace("{{packageNotIncludedAddress}}", packageNotIncludedAddress, caseSensitive, wholeWord);
    document.Replace("{{orderPhoneNumber}}", packageType, caseSensitive, wholeWord);
    document.Replace("{{orderCreatedDate}}", orderStartDate, caseSensitive, wholeWord);
    document.Replace("{{orderEndDate}}", orderEndDate, caseSensitive, wholeWord);
    document.Replace("{{orderMonth}}", orderMonth, caseSensitive, wholeWord);
    document.Replace("{{packageType}}", packageType, caseSensitive, wholeWord);
    document.Replace("{{monthlyCost}}", "", caseSensitive, wholeWord);
    document.Replace("{{totalCost}}", totalCostStr, caseSensitive, wholeWord);
    document.Replace("{{branchName}}", locationBranchName, caseSensitive, wholeWord);
    document.Replace("{{taxOffice}}", locationTaxOffice, caseSensitive, wholeWord);
    document.Replace("{{taxNumber}}", locationTaxNumber, caseSensitive, wholeWord);
    document.Replace("{{locationAddress}}", locationAddress, caseSensitive, wholeWord);
    document.Replace("{{agreementNumber}}", agreementNumber, caseSensitive, wholeWord);

    foreach (WSection section in document.Sections)
    {
        section.HeadersFooters.Footer.ChildEntities.Clear();
    }

    DocIORenderer render = new();
    render.Settings.ChartRenderingOptions.ImageFormat = (Syncfusion.OfficeChart.ExportImageFormat)ExportImageFormat.Jpeg;
    render.Settings.OptimizeIdenticalImages = true;

    Console.WriteLine("Started " + DateTime.Now.ToString("yyyy.MM.dd HH:mm:ss:ffff"));
    PdfDocument pdfDocument = render.ConvertToPDF(document);
    Console.WriteLine("Ended " + DateTime.Now.ToString("yyyy.MM.dd HH:mm:ss:ffff"));

    render.Dispose();
    document.Dispose();

    MemoryStream outputStream = new();
    pdfDocument.Save(outputStream);
    pdfDocument.Close(true);
    var pdfFile = outputStream.ToArray();
    outputStream.Close();

I have attached the file which I am converting.

Started 2024.07.26 17:11:47:0103

Ended 2024.07.26 17:12:22:7659

Kindly help me out with this problem.

Thanks

Attachment: Sanal_Ofis_Sozlesme_For_General_31d85682.rar
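To read the two log lines above as a duration, the timestamps can be parsed with the same format string that the snippet passes to `ToString`. This is a small sketch; `LogTiming` and `Elapsed` are hypothetical names, and the result covers only the span between the "Started" and "Ended" lines, that is, the `ConvertToPDF` call.

```csharp
using System;
using System.Globalization;

public static class LogTiming
{
    // Format string copied from the Console.WriteLine calls in the snippet above.
    private const string StampFormat = "yyyy.MM.dd HH:mm:ss:ffff";

    // Parses both log timestamps and returns the time between them.
    public static TimeSpan Elapsed(string started, string ended)
    {
        DateTime start = DateTime.ParseExact(started, StampFormat, CultureInfo.InvariantCulture);
        DateTime end = DateTime.ParseExact(ended, StampFormat, CultureInfo.InvariantCulture);
        return end - start;
    }
}
```

For the values logged above, `LogTiming.Elapsed("2024.07.26 17:11:47:0103", "2024.07.26 17:12:22:7659").TotalSeconds` comes to 35.7556 seconds. Using `System.Diagnostics.Stopwatch` around the call would give the same information without parsing log output.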

Dharanya Sakthivel · Answer

Hi Sumit,

Based on the provided code snippet and input document, we have created a sample
and converted it to PDF using the given code. The conversion takes 6.44
seconds on our end. Please refer to the screenshot below and the attached
sample for more details.

We recommend upgrading to our latest version (v26.2.5) if you are using
an older version.

Regards,
Dharanya.
Attachment: WordtoPDFperformance_bd88959f.zip