
Reusing images in PDF rendered from WordDocument using DocIORenderer

I am trying to convert a .docx to PDF with a letterhead in .NET Core.
I have created a .docx and put an image in the header that spans the complete page.
That works perfectly, even when I add pages.
I use DocIORenderer to render to a .pdf file.
However, the docs state that the image should be reused in the resulting PDF, and it looks like it is not:
The background image is relatively large, and just adding pages significantly adds to the PDF file size (600KB+ per page).
In this code I just load a .docx, store it in another .docx, and convert it to a .pdf.
The resulting .docx is not significantly impacted by each page that I add to the source document (not in code), but the PDF is.
How do I get the PDF converter to reuse the image in the PDF instead of importing it yet again for every page?

This is the code I use:

WordDocument document;
using (var fs = new FileStream(@"C:\syncfusion doc\CERTIFICAAT.docx", FileMode.Open))
{
    document = new WordDocument(fs, FormatType.Docx);
}

//Saves the document
using (var fss = new FileStream(@"C:\syncfusion doc\result-CERTIFICAAT.docx", FileMode.Create))
{
    document.Save(fss, FormatType.Docx);
}

//Instantiation of DocIORenderer for Word to PDF conversion
DocIORenderer render = new DocIORenderer();

//Sets Chart rendering Options.
render.Settings.ChartRenderingOptions.ImageFormat = ExportImageFormat.Jpeg;

//Converts Word document into PDF document
PdfDocument pdfDocument = render.ConvertToPDF(document);

//Releases all resources used by the DocIO Renderer object
render.Dispose();

//Saves the PDF file
using (var fspdfo = new FileStream(@"C:\syncfusion doc\result-CERTIFICAAT.pdf", FileMode.Create))
{
    pdfDocument.Save(fspdfo);
}

//Releases all resources used by the Word document
document.Close();
document.Dispose();

======== edit =======

In fact, this happens in non-core platforms as well, if you turn the EnableFastRendering property of the DocToPDFConverter renderer to true (unavailable for core).

To reproduce:
* create a blank Word docx.
* Add an image in the header (use a large image so you can notice the increase in file size easier). I used it as a full page background image (letterhead)
* convert the docx to pdf using either the code in the OP or using the DocToPDFConverter with the EnableFastRendering set to true
* note the file size. It should be just above the file size of the image.
* Add 20 or so pages to the docx (Word will repeat the image on the new pages as well)
* convert the file again and notice the increase in file size.

My original docx was 660KB with one page (image .jpg file size was 640KB).
After converting it to pdf, the pdf file was around the same size (730KB)
After adding 20 pages or so to the docx, the resulting pdf after conversion was a whopping 19MB.
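
The growth described above can be quantified with a small helper. This is only a minimal sketch using the .NET base class library; the class and method names (PdfSizeCheck, AverageGrowthPerPage) are made up for illustration and are not part of any Syncfusion API.

```csharp
using System;
using System.IO;

public static class PdfSizeCheck
{
    // Average number of bytes that each added page contributed to the PDF.
    public static double AverageGrowthPerPage(long baselineBytes, long grownBytes, int addedPages)
    {
        if (addedPages <= 0)
            throw new ArgumentOutOfRangeException(nameof(addedPages), "At least one page must have been added.");
        return (grownBytes - baselineBytes) / (double)addedPages;
    }

    // Convenience overload that reads the two sizes from files on disk.
    public static double AverageGrowthPerPage(string baselinePdfPath, string grownPdfPath, int addedPages)
    {
        long baseline = new FileInfo(baselinePdfPath).Length;
        long grown = new FileInfo(grownPdfPath).Length;
        return AverageGrowthPerPage(baseline, grown, addedPages);
    }
}
```

If the image were stored once and reused, the result should be far below the image's file size. A value close to the image size (about 640KB here) suggests the image is being embedded again for every page.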


7 Replies

VA Vijayasurya Anandhan Syncfusion Team August 8, 2019 08:38 AM UTC

Hi Bob,

Thank you for contacting Syncfusion support.

You mentioned that the Word document contains identical (duplicate) images. On non-Core platforms, we recommend using the OptimizeIdenticalImages API (setting it to true optimizes memory usage for duplicate images in Word to PDF conversion), which should resolve the reported issue “Increased file size of the generated PDF with identical images”. For reference, please use the code example below on non-Core platforms.

 
//Loads an existing Word document
WordDocument wordDocument = new WordDocument("Sample.docx", FormatType.Docx);

//Initializes the ChartToImageConverter for converting charts during Word to PDF conversion
wordDocument.ChartToImageConverter = new ChartToImageConverter();

//Creates an instance of the DocToPDFConverter - responsible for Word to PDF conversion
DocToPDFConverter converter = new DocToPDFConverter();

//Sets true to enable fast rendering.
converter.Settings.EnableFastRendering = true;

//Sets a value indicating whether to optimize the memory usage for the identical images in Word to PDF conversion.
converter.Settings.OptimizeIdenticalImages = true;

//Converts Word document into PDF document
PdfDocument pdfDocument = converter.ConvertToPDF(wordDocument);

//Saves the PDF file to file system
pdfDocument.Save("WordtoPDF.pdf");

//Closes the instance of document objects
pdfDocument.Close(true);

wordDocument.Close();
 

At present, we do not have support to identify the identical (duplicate) images in .NET Core. We have already logged this as a feature. We have planned to implement this feature in our upcoming 2019 Volume 3 release, which is estimated to be available in the month of September tentatively.
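
For readers unfamiliar with the idea, one common way to identify identical images is to hash each image's bytes and let every repeat point at the first occurrence, so the data only needs to be stored once. The sketch below illustrates that general technique only; it is not Syncfusion's implementation, and the class and method names (ImageDeduplication, AssignSharedIndices) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class ImageDeduplication
{
    // Maps each image to the index of the first image with the same SHA-256 hash.
    // Images that share an index could share one embedded object in the output.
    public static int[] AssignSharedIndices(IReadOnlyList<byte[]> images)
    {
        var firstSeen = new Dictionary<string, int>();
        var result = new int[images.Count];
        using (SHA256 sha = SHA256.Create())
        {
            for (int i = 0; i < images.Count; i++)
            {
                string key = Convert.ToBase64String(sha.ComputeHash(images[i]));
                if (!firstSeen.TryGetValue(key, out int index))
                {
                    // First time this content is seen: it becomes the shared copy.
                    index = i;
                    firstSeen[key] = index;
                }
                result[i] = index;
            }
        }
        return result;
    }
}
```

For a header image repeated on every page, all entries would map to index 0, meaning one stored copy could serve every page. A production implementation might also compare the raw bytes on a hash match to rule out collisions.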

EnableFastRendering property not available in Core:

On non-Core platforms, Word documents are converted to PDF using the EMF rendering approach by default, so we provide the EnableFastRendering property to switch to the direct PDF approach.

In Core, the Word document is converted to PDF using the faster direct PDF approach by default rather than the EMF rendering approach, so this property is not provided there.

Please refer to the UG documentation page below for more details about fast rendering:
https://help.syncfusion.com/file-formats/docio/word-to-pdf#fast-rendering   

Please let us know if you have any questions.

Regards,
Vijayasurya A



BO Bob August 11, 2019 09:47 PM UTC

Thank you for replying!

However, the example code you provided does not work as expected in this situation; every new page I add to the docx increases the resulting PDF file size by approximately the image file size.
Therefore I think the image in the header of the docx (which is shown on every page automatically) is included in the PDF separately for each and every page, which is a showstopper for any document with more than a couple of pages.
This happens even in non-Core projects (I just tested WinForms as well, see attached file), even with both the EnableFastRendering and OptimizeIdenticalImages options set to true.

Steps to reproduce (project also attached):

* Create a new winforms project (.NET 4.7.2)
* Add the three needed syncfusion nuget packages (I used current 17.2.0.40 version)
* Insert your snippet
* load and convert the docx file (supplied in ZIP) with a large image in the head. The document has 7 pages to illustrate the problem. For completeness, I supplied a 1-page docx as well to observe the difference. The code is copied verbatim from your code snippet, except for the document paths.

I use win10 Pro, Visual Studio 2019 Community 16.2.1

My questions:
Is what I am trying to do supported? If so, what should I change to make this work? If it is not supported, how should I solve this "letterhead" problem? I cannot imagine I am the only one trying to solve this.

Regards,
Bob



PR Poorani Rajendran Syncfusion Team August 15, 2019 11:16 AM UTC

Hi Bob,

Thank you for your update.

We can reproduce the reported issue, in which the generated PDF file size is too large because of the image in the header, on our side. We will validate this issue and update you with further details on 19th August, 2019.

Regards,
Poorani Rajendran
 



VA Vijayasurya Anandhan Syncfusion Team August 19, 2019 02:02 PM UTC

Hi Bob,

Thank you for your patience.

We will reduce the size of the generated PDF file when identical images are used in Word to PDF conversion. This implementation will be included in our upcoming 2019 Volume 3 release, which is estimated to be available in the month of October.

Please let us know if you still require a patch for this implementation.

Regards,
Vijayasurya A



PG Pon Geetha A J Syncfusion Team August 20, 2019 04:54 AM UTC

Dear Vijayasurya, 
 
Thank you for your reply. 
 
Good to hear the problem will be fixed.
However, it is such an important part of our application that, if possible, a hot fix/patch sooner than October would be greatly appreciated.
 
Please realize that moving on with the evaluation on our side without a confirmed fix increases our financial exposure and would turn out bad if the fix would somehow not solve the problem. 
 
So yes, to answer your question, we still require the patch so we can be confident the fix in October will solve the problem. 
 
Regards, 
Bob 



VA Vijayasurya Anandhan Syncfusion Team August 20, 2019 01:00 PM UTC

Hi Bob,

Thank you for your update.

Please find the details for your queries below:

As per your request, we will provide the patch for the reported issue with “Generated PDF size is large when use identical images in direct PDF approach”, which is estimated to be available on 23rd August 2019.

Note:
You mentioned your product version as “v17.2.0.40” (weekly release). At present, we do not provide patches for weekly NuGet releases, so the patch for the reported issue is provided for version “17.2.0.34”.

As mentioned earlier, at present, we do not have support to identify identical images in .NET Core. We have already logged this as a feature and also, we have planned to implement this feature in our upcoming 2019 Volume 3 release, which is estimated to be available in the month of October tentatively.

Please let us know if you have any questions.

Regards,
Vijayasurya A



VA Vijayasurya Anandhan Syncfusion Team August 23, 2019 06:42 PM UTC

Hi Bob,

Thank you for your patience

As per your request, we have implemented support for the OptimizeIdenticalImages API (setting it to true optimizes memory usage for duplicate images in Word to PDF conversion) in .NET Core applications. The patch for this feature in the mentioned version 17.2.0.34 can be downloaded from the following links.

Recommended approach - exe will perform automatic configuration
Please find the patch setup from below location:
http://syncfusion.com/Installs/support/patch/17.2.0.34/1177288/F146533/SyncfusionPatch_17.2.0.34_1177288_8232019121141381_F146533.exe

Advanced approach – use only if you have specific needs and can directly replace existing assemblies for your build environment
Please find the patch assemblies alone from below location:
http://syncfusion.com/Installs/support/patch/17.2.0.34/1177288/F146533/SyncfusionPatch_17.2.0.34_1177288_8232019121141381_F146533.zip

Please find the NuGet packages from below location:
http://syncfusion.com/Installs/support/patch/17.2.0.34/1177288/F146533/SyncfusionNuget_17.2.0.34_1177288_8232019121141381_F146533.zip


Assembly Version:
17.2.0.34

Installation Directions :
This patch should replace the files “Syncfusion.Compression.Portable.dll”, “Syncfusion.OfficeChart.Portable.dll”, “Syncfusion.DocIO.Portable.dll”, “Syncfusion.DocIORenderer.Portable.dll”, “Syncfusion.PDF.Portable.dll”

For reference, please use the code example below to make use of the feature on Core platforms.
 
//Open the file as Stream
FileStream docStream = new FileStream(@"Template.docx", FileMode.Open, FileAccess.Read);

//Loads file stream into Word document

WordDocument wordDocument = new WordDocument(docStream, Syncfusion.DocIO.FormatType.Automatic);

//Instantiation of DocIORenderer for Word to PDF conversion
DocIORenderer render = new DocIORenderer();

//Sets true to optimize the memory usage for identical images
render.Settings.OptimizeIdenticalImages = true;

//Converts Word document into PDF document
PdfDocument pdfDocument = render.ConvertToPDF(wordDocument);

//Releases all resources used by the Word document and DocIO Renderer objects
render.Dispose();
wordDocument.Dispose();

//Saves the PDF file
MemoryStream outputStream = new MemoryStream();
pdfDocument.Save(outputStream);

//Closes the instance of PDF document object
pdfDocument.Close();
 
Non-Core Platform:

As per your request, we have implemented a fix for the reported issue “Generated PDF size is large when use identical images in direct PDF approach”. The patch for the fix in the mentioned version 17.2.0.34 can be downloaded from the following links.

Recommended approach - exe will perform automatic configuration
Please find the patch setup from below location:
http://syncfusion.com/Installs/support/patch/17.2.0.34/1177288/F146533/SyncfusionPatch_17.2.0.34_1177288_8232019120300865_F146533.exe

Advanced approach – use only if you have specific needs and can directly replace existing assemblies for your build environment
Please find the patch assemblies alone from below location:
http://syncfusion.com/Installs/support/patch/17.2.0.34/1177288/F146533/SyncfusionPatch_17.2.0.34_1177288_8232019120300865_F146533.zip

Please find the NuGet packages from below location:
http://syncfusion.com/Installs/support/patch/17.2.0.34/1177288/F146533/SyncfusionNuget_17.2.0.34_1177288_8232019120300865_F146533.zip


Assembly Version:
17.2.0.34

Installation Directions :
This patch should replace the files “Syncfusion.Compression.Base.dll”,“Syncfusion.OfficeChart.Base.dll”, “Syncfusion.DocIO.Base.dll”,“Syncfusion.DocToPDFConverter.Base.dll”,“Syncfusion.PDF.Base.dll” under the following folder.
$system drive:\ Files\Syncfusion\Essential Studio\$Version # \precompiledassemblies\$Version#\4.0
Eg : $system drive:\Program Files\Syncfusion\Essential Studio\ 17.2.0.34\precompiledassemblies\ 17.2.0.34\4.0
 
To automatically run the Assembly Manager, please check the Run assembly manager checkbox option while installing the patch. If this option is unchecked, the patch will replace the assemblies in precompiled assemblies’ folder only. Then, you will have to manually copy and paste them to the preferred location or you will have to run the Syncfusion Assembly Manager application (available from the Syncfusion Dashboard, installed as a shortcut in the Application menu) to re-install assemblies.

Note:
To change how you receive bug fixes, ask your license management portal admin to change your project’s patch delivery mode.
https://www.syncfusion.com/account/license

Disclaimer:
Please note that we have created this patch for version 17.2.0.34 specifically to resolve the issue reported in Forum 146533.

If you have received other patches for the same version for other products, please apply all patches in the order received.

This fix will be included in our 2019 Volume 3 release, which will be available at the end of September 2019.

For reference, please use the code example below to overcome the reported issue on non-Core platforms.

 
//Loads an existing Word document
WordDocument wordDocument = new WordDocument("Sample.docx", FormatType.Docx);

//Initializes the ChartToImageConverter for converting charts during Word to PDF conversion
wordDocument.ChartToImageConverter = new ChartToImageConverter();

//Creates an instance of the DocToPDFConverter - responsible for Word to PDF conversion
DocToPDFConverter converter = new DocToPDFConverter();

//Sets true to enable fast rendering.
converter.Settings.EnableFastRendering = true;

//Sets a value indicating whether to optimize the memory usage for the identical images in Word to PDF conversion.
converter.Settings.OptimizeIdenticalImages = true;

//Converts Word document into PDF document
PdfDocument pdfDocument = converter.ConvertToPDF(wordDocument);

//Saves the PDF file to file system
pdfDocument.Save("WordtoPDF.pdf");

//Closes the instance of document objects
pdfDocument.Close(true);

wordDocument.Close();


 
Please let us know if you need further assistance with this.

Regards,
Vijayasurya A


 

