Converting Docx to PDF on a Linux Azure deployment leaves text content blank

Hello,


I'm running into an issue where converting a Docx file to PDF doesn't produce a correct result. Images in the file carry over to the PDF without any problem, but the text content does not: the converted PDF contains only the images and no text. Everything works on my local Windows machine, but once I deploy to our Linux Alpine Azure instance it stops working. There are no error messages, just missing text in the converted PDF. I suspect a missing font setting or dependency, but as far as I can tell the project references every dependency listed for Docx to PDF conversion. I know the documentation lists SkiaSharp.NativeAssets.Linux rather than SkiaSharp.NativeAssets.Linux.NoDependencies, but using the former gave me errors.


        <PackageVersion Include="Syncfusion.Licensing" Version="20.3.0.56" />
        <PackageVersion Include="Syncfusion.DocIO.Net.Core" Version="20.3.0.56" />
        <PackageVersion Include="Syncfusion.Compression.Net.Core" Version="20.3.0.56" />
        <PackageVersion Include="Syncfusion.DocIORenderer.Net.Core" Version="20.3.0.56" />
        <PackageVersion Include="Syncfusion.OfficeChart.Net.Core" Version="20.3.0.56" />
        <PackageVersion Include="Syncfusion.SkiaSharpHelper.Net.Core" Version="20.3.0.56" />
        <PackageVersion Include="Syncfusion.Pdf.Net.Core" Version="20.3.0.56" />
        <PackageVersion Include="Syncfusion.XlsIO.Net.Core" Version="20.3.0.56" />
        <PackageVersion Include="SkiaSharp.HarfBuzz" Version="2.88.2" />
        <PackageVersion Include="SkiaSharp.NativeAssets.Linux.NoDependencies" Version="2.88.2" />
        <PackageVersion Include="HarfBuzzSharp.NativeAssets.Linux" Version="2.8.2.2" />


2 Replies

DL Dylan Lyon August 29, 2023 09:16 PM UTC

public MemoryStream ConvertWordtoPDF(MemoryStream sourceStreamPath)
{
    using (WordDocument document = new WordDocument(sourceStreamPath, FormatType.Docx))
    {
        using (DocIORenderer render = new DocIORenderer())
        {
            using (PdfDocument pdfDocument = render.ConvertToPDF(document))
            {
                MemoryStream stream = new MemoryStream();
                pdfDocument.Save(stream);
                stream.Position = 0;
                return stream;
            }
        }
    }
}



I should mention that this is how I'm converting the file. I take the Docx file and convert it to a byte array to send through an API client, then use that byte array as the input to a MemoryStream that is passed to this function.



AA Akash Arul Syncfusion Team August 31, 2023 09:20 AM UTC

Hi Dylan,

On further analysis, we found that you are using SkiaSharp.NativeAssets.Linux.NoDependencies. With this package, we can’t properly get the font information from the environment, so the contents are not preserved in the output. To preserve the contents, we suggest using the SubstituteFont event.

Keep the required fonts in a folder and use the font substitution event to set the fonts from there.
To know more about font substitution, please refer to our documentation at the links below.
https://help.syncfusion.com/file-formats/docio/word-to-pdf#font-substitution
https://www.syncfusion.com/kb/8485/how-to-perform-font-substitution-in-word-to-pdf-conversion
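
For illustration, the font substitution approach could look roughly like the following. This is a minimal sketch and not a tested fix for this deployment: it assumes the fonts are shipped with the application in a hypothetical "Fonts" folder, that each file is named after the font (for example "Calibri.ttf"), and that a hypothetical "Fallback.ttf" exists for anything else. It ignores bold and italic variants, which would need their own font files.

```csharp
using System;
using System.IO;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;
using Syncfusion.DocIORenderer;
using Syncfusion.Pdf;

public static class WordToPdfWithFonts
{
    // Assumption: a "Fonts" folder containing .ttf files is deployed next to the app.
    private static readonly string FontFolder =
        Path.Combine(AppContext.BaseDirectory, "Fonts");

    public static MemoryStream Convert(Stream docxStream)
    {
        using (WordDocument document = new WordDocument(docxStream, FormatType.Docx))
        {
            // Raised when a font used in the document is not available in the environment.
            document.FontSettings.SubstituteFont += OnSubstituteFont;
            try
            {
                using (DocIORenderer renderer = new DocIORenderer())
                using (PdfDocument pdf = renderer.ConvertToPDF(document))
                {
                    MemoryStream output = new MemoryStream();
                    pdf.Save(output);
                    output.Position = 0;
                    return output;
                }
            }
            finally
            {
                document.FontSettings.SubstituteFont -= OnSubstituteFont;
            }
        }
    }

    private static void OnSubstituteFont(object sender, SubstituteFontEventArgs args)
    {
        // Hypothetical naming scheme: "<font name>.ttf", else "Fallback.ttf".
        string candidate = Path.Combine(FontFolder, args.OriginalFontName + ".ttf");
        string path = File.Exists(candidate)
            ? candidate
            : Path.Combine(FontFolder, "Fallback.ttf");

        if (File.Exists(path))
        {
            // Supply the font as an in-memory copy so no file handle stays open.
            args.AlternateFontStream = new MemoryStream(File.ReadAllBytes(path));
        }
    }
}
```

With something like this in place, the existing byte array could be passed as `WordToPdfWithFonts.Convert(new MemoryStream(bytes))`. Make sure you hold a license that allows redistribution for any font files you deploy.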

Otherwise, if the Word document is in Docx format, we suggest embedding the necessary fonts in the input Word document using the Microsoft Word application and then converting this Word document to PDF using DocIO.



Regards,
Akash.

