How to save HTML to PDF in Azure Blob storage

I am using the Syncfusion HTML to PDF converter (https://help.syncfusion.com/file-formats/pdf/converting-html-to-pdf) in an Azure Function to convert an HTML link to PDF.

The conversion works and I get the result back as a response, but I am unable to upload it to Azure Blob storage. I tried saving it to a memory stream, but instead of uploading the file it generates a zero-KB file. Can anyone help me with this, please?




8 Replies

GK Gowthamraj Kumar Syncfusion Team April 28, 2022 12:46 PM UTC

Hi Santanu,


We recommend the code snippet below to save the generated PDF document to Azure Blob storage:

string filePath = "saved pdf path";
var credentials = new StorageCredentials("storageaccount", "accesskey");
var client = new CloudBlobClient(new Uri("Blob url"), credentials);

// Retrieve a reference to a container. (You need to create one using the management portal, or call container.CreateIfNotExists().)
var container = client.GetContainerReference("folderpath");

// Retrieve a reference to a blob named "myfile.pdf".
var blockBlob = container.GetBlockBlobReference("myfile.pdf");

// Create or overwrite the "myfile.pdf" blob with contents from a local file.
using (var fileStream = System.IO.File.OpenRead(filePath))
{
    blockBlob.UploadFromStream(fileStream);
}


Please try the above code snippet on your end and let us know the result.


Regards,

Gowthamraj K



SK Santanu Kumar Das April 28, 2022 01:54 PM UTC

Thanks for your reply. I used the code below, which works fine locally, but when I deploy it to the function it gives an error:


string htmlText;

string htmlLink = "https://umtermshtml.z30.web.core.windows.net/index.html?lead_id="+ lead_id;




using (var client = new WebClient())

{

htmlText = client.DownloadString(htmlLink);

}




//Initialize HTML to PDF converter

HtmlToPdfConverter htmlConverter = new HtmlToPdfConverter(HtmlRenderingEngine.WebKit);


WebKitConverterSettings settings = new WebKitConverterSettings();


//Set WebKit path

settings.WebKitPath = Path.Combine(executionContext.FunctionAppDirectory, "QtBinariesWindows");


string baseURL = Path.Combine(executionContext.FunctionAppDirectory, "Data");


//Assign WebKit settings to HTML converter

htmlConverter.ConverterSettings = settings;


PdfDocument document;

document = htmlConverter.Convert(htmlText, baseURL);


System.IO.MemoryStream ms = new System.IO.MemoryStream();


//Save the PDF document

document.Save(ms);


string Blob_Container_Name = "agreement";

CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(Environment.GetEnvironmentVariable("Storage_Conn_String"));


var cloudBlobClient = cloudStorageAccount.CreateCloudBlobClient();


var cloudBlobContainer = cloudBlobClient.GetContainerReference(Blob_Container_Name);

var blobexists = await cloudBlobContainer.ExistsAsync();


Int32 unixTimestamp = (int)DateTime.UtcNow.Subtract(new DateTime(1970, 1, 1)).TotalSeconds;


string blobName_Front = la_no + "_frnt_" + unixTimestamp.ToString() + ".pdf";


ms.Position = 0;

/// front file

var cloudBlockBlob_Front = cloudBlobContainer.GetBlockBlobReference(blobName_Front);

cloudBlockBlob_Front.Properties.ContentType = "application/pdf";

await cloudBlockBlob_Front.UploadFromByteArrayAsync(ms.ToArray(), 0, ms.ToArray().Length);


var storedPolicy_Front = new SharedAccessBlobPolicy()

{

SharedAccessExpiryTime = DateTime.UtcNow.AddDays(10000),

Permissions = SharedAccessBlobPermissions.Read |

SharedAccessBlobPermissions.List

};


var filesugnature_Front = cloudBlockBlob_Front.GetSharedAccessSignature(storedPolicy_Front);


string pdffile = cloudBlockBlob_Front.Uri.AbsoluteUri.ToString() + filesugnature_Front.ToString();



Also, based on the standard demo from the Syncfusion website, I created the following function. It works fine locally but not after I deploy it to Azure. You can check it at:

https://urbanmoney.azurewebsites.net/api/check_pdf

Here goes my code:


using System;

using System.IO;

using System.Threading.Tasks;

using Microsoft.AspNetCore.Mvc;

using Microsoft.Azure.WebJobs;

using Microsoft.Azure.WebJobs.Extensions.Http;

using Microsoft.AspNetCore.Http;

using Microsoft.Extensions.Logging;

using Newtonsoft.Json;

using System.Net.Http;

using System.Net.Http.Headers;

using System.Net;

using Syncfusion.HtmlConverter;

using Syncfusion.Pdf;

using Syncfusion.Pdf.Graphics;


namespace Urbanmoney_AZ.Functions

{

public static class check_pdf

{

[FunctionName("check_pdf")]

public static async Task<HttpResponseMessage> Run(

[HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequest req,

ILogger log, ExecutionContext executionContext)

{

log.LogInformation("C# HTTP trigger function processed a request.");


//HTML string and Base URL

string htmlText = "<html><body><img alt=\"Syncfusion_logo\" /><p>Hello World</p></body></html>";


//Initialize HTML to PDF converter

HtmlToPdfConverter htmlConverter = new HtmlToPdfConverter(HtmlRenderingEngine.WebKit);


WebKitConverterSettings settings = new WebKitConverterSettings();


//Set WebKit path

settings.WebKitPath = Path.Combine(executionContext.FunctionAppDirectory, "QtBinariesWindows");


string baseURL = Path.Combine(executionContext.FunctionAppDirectory, "Data");


//Assign WebKit settings to HTML converter

htmlConverter.ConverterSettings = settings;


PdfDocument document;


try

{

//Convert URL to PDF

document = htmlConverter.Convert(htmlText, baseURL);

}

catch (Exception ex)

{

document = new PdfDocument();

document.PageSettings.Margins.All = 0;

PdfPage page = document.Pages.Add();

page.Graphics.DrawString(ex.Message, new PdfStandardFont(PdfFontFamily.Helvetica, 10), PdfBrushes.Red, new Syncfusion.Drawing.PointF(10, 10));

page.Graphics.DrawString(ex.Message, new PdfStandardFont(PdfFontFamily.Helvetica, 10), PdfBrushes.Red, new Syncfusion.Drawing.PointF(10, 100));

}


System.IO.MemoryStream ms = new System.IO.MemoryStream();


//Save the PDF document

document.Save(ms);


ms.Position = 0;


HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK);

response.Content = new ByteArrayContent(ms.ToArray());

response.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")

{

FileName = "Output.pdf"

};

response.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/pdf");


return response;

}

}

}





GK Gowthamraj Kumar Syncfusion Team April 29, 2022 02:54 PM UTC

Hi Santanu,


We are currently analyzing this requirement and will provide further details on May 4th, 2022.

Meanwhile, please share the error message and a properly accessible link (the provided link is not accessible) so that we can analyze this on our end.


Regards,

Gowthamraj K



GK Gowthamraj Kumar Syncfusion Team May 4, 2022 02:13 PM UTC

Hi Santanu,


We are still analyzing the requirement “Saving the PDF file in Azure blob storage” and will provide further details on May 6th, 2022.

As requested earlier, please share the error message and a properly accessible link (the provided link is not accessible) so that we can analyze this on our end.


Regards,

Gowthamraj K



GK Gowthamraj Kumar Syncfusion Team May 6, 2022 11:54 AM UTC

Hi Santanu,


To upload PDF files to Azure Blob storage, we need a connection string to connect to the Blob storage container. Open the Azure Storage resource in the portal -> Access Keys -> click Show Keys -> copy the Key 1 connection string.

Open the local.settings.json file in the function app and paste the connection string of the Azure Storage resource as the value of the “AzureWebJobsStorage” key. Also add another key called “ContainerName” and set it to the name of the container created earlier.


{

    "IsEncrypted": false,

    "Values": {

        "AzureWebJobsStorage": "<replace your blob storage connection key here>",

        "ContainerName": "file-upload",

        "FUNCTIONS_WORKER_RUNTIME": "dotnet"

    }

}


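To interact with Azure storage, we first have to install the NuGet package for the Blob storage client (the BlobContainerClient type used in the code below belongs to the Azure.Storage.Blobs package).


Now add the code below to upload a file into the container in the Blob storage.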

string path = Path.Combine(executionContext.FunctionAppDirectory, "QtBinariesWindows");

string html1 = "<!DOCTYPE html><html><body><h2>Html to PDF Azure function sample</h2></body></html>";

//Initialize HTML to PDF converter
HtmlToPdfConverter htmlConverter = new HtmlToPdfConverter(HtmlRenderingEngine.WebKit);
WebKitConverterSettings settings = new WebKitConverterSettings();

//Set WebKit path
settings.WebKitPath = path;
//Assign WebKit settings to HTML converter
htmlConverter.ConverterSettings = settings;

//Convert the HTML string to PDF
PdfDocument document = htmlConverter.Convert(html1, "");

//Save the PDF document to a memory stream and rewind the stream before uploading
MemoryStream ms = new MemoryStream();
document.Save(ms);
ms.Position = 0;

//Read the connection string and container name from the app settings
string Connection = Environment.GetEnvironmentVariable("AzureWebJobsStorage");
string containerName = Environment.GetEnvironmentVariable("ContainerName");

//Upload the stream as a blob named "HtmlAzure.pdf"
var blobClient = new BlobContainerClient(Connection, containerName);
var blob = blobClient.GetBlobClient("HtmlAzure.pdf");
await blob.UploadAsync(ms);
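
The `ms.Position = 0` line matters because a stream that has just been written to is positioned at its end, so anything that reads from it afterwards sees no data. This is one common cause of a zero-KB upload, though not necessarily the only one. The following is a minimal sketch of that behavior using only System.IO; the class name StreamRewindDemo and the four sample bytes are hypothetical stand-ins for the saved PDF document.

```csharp
using System;
using System.IO;

static class StreamRewindDemo
{
    // Writes four bytes into a MemoryStream (a stand-in for document.Save(ms)),
    // optionally rewinds it, then copies it the way an upload would read it.
    // Returns the number of bytes that reached the target.
    public static long CopiedBytes(bool rewind)
    {
        using (var source = new MemoryStream())
        using (var target = new MemoryStream())
        {
            byte[] payload = { 0x25, 0x50, 0x44, 0x46 }; // the ASCII bytes of "%PDF"
            source.Write(payload, 0, payload.Length);

            if (rewind)
            {
                source.Position = 0; // move back to the start before reading
            }

            source.CopyTo(target); // reads from the current position to the end
            return target.Length;
        }
    }
}
```

Calling `StreamRewindDemo.CopiedBytes(false)` returns 0, while `StreamRewindDemo.CopiedBytes(true)` returns 4.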


Configure the storage account connection string in the Configuration settings of the Function app. Click Configuration; for this app we have to set the storage connection string and the container name. Search for "AzureWebJobsStorage", click the Edit button, and paste your storage connection string. Then click "+ New application setting" and enter "ContainerName" as the name and "file-upload" as the value.

Finally, click the Save button to save the changes.
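
Because `Environment.GetEnvironmentVariable` returns null when a setting is not defined, a value that exists in local.settings.json but was never added in the portal only shows up later as a confusing failure. As a rough sketch, a small helper can fail early with a clear message instead. The AppSettings class and its Require method are hypothetical names used for illustration; they are not part of the Syncfusion or Azure libraries.

```csharp
using System;

static class AppSettings
{
    // Hypothetical helper: returns the app setting with the given name,
    // or throws a descriptive exception if it is missing or empty.
    public static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"App setting '{name}' is missing or empty.");
        }
        return value;
    }
}
```

It could be used in place of the direct lookups, for example `string containerName = AppSettings.Require("ContainerName");`.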


Then publish the application, go to the Azure portal, and select Function Apps. After running the service, click Get function URL -> Copy, and paste the URL into the browser.


The converted file will then be uploaded to the Azure Blob storage container; please see the screenshot below.



We have attached a sample for your reference. If an exception occurs during the conversion, the sample writes the exception message into a new PDF document. Please try the sample on your end and let us know the result.
Sample: https://www.syncfusion.com/downloads/support/directtrac/general/ze/FunctionAppWebKit-1033289812


Please try the above solution and let us know the result.


Regards,

Gowthamraj K.



GK Gowthamraj Kumar Syncfusion Team May 6, 2022 11:55 AM UTC

Hi Santanu


If you are facing the exception “Syncfusion.Pdf.PdfException: Html conversion failed” while converting HTML to a PDF document, it may occur because the QtBinariesWindows folder was not copied properly to Azure.

We can resolve this issue by copying all the files in QtBinariesWindows and setting the proper path. Please ensure that all the files and inner folders are properly copied to the Azure server, using the console in the Azure portal. If they are not copied, kindly add all the files manually. This is not an issue in our library. Please follow the steps below to copy the files properly.

  1. Open the Azure portal in a browser.
  2. Navigate to the deployed Azure function in the Azure portal.
  3. Open the console of the Azure function.
  4. As noted earlier, the QtBinaries assemblies are not copied properly when publishing to Azure Functions, so we need to copy all the assemblies manually from the Azure portal.
  5. QtBinariesWindows at the location “c:\home\site\wwwroot\QtBinariesWindows” does not have all the assemblies, so we need to copy all the assemblies and inner folders from the bin folder.
  6. Using the cd command, navigate to the location “c:\home\site\wwwroot\bin\QtBinariesWindows” and run the command below.

cp *.* c:\home\site\wwwroot\QtBinariesWindows\

  7. Then copy the files from the “platforms” and “imageformats” folders using the same approach.
  8. Using the cd command, navigate to the location “c:\home\site\wwwroot\bin\QtBinariesWindows\platforms” and run the command below.

cp *.* c:\home\site\wwwroot\QtBinariesWindows\platforms

  9. Using the cd command, navigate to the location “c:\home\site\wwwroot\bin\QtBinariesWindows\imageformats” and run the command below.

cp *.* c:\home\site\wwwroot\QtBinariesWindows\imageformats

  10. Using the dir command, ensure that all the files are copied to the “c:\home\site\wwwroot\QtBinariesWindows” location. Also ensure that the inner folders (imageformats, platforms) have all their files.
  11. Now try the conversion from the Azure function URL.
  12. These assemblies need to be copied only the first time.
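
The manual check in the steps above can also be approximated from code, for example by logging the result before the conversion runs. This is only a sketch under stated assumptions: the WebKitFolderCheck class and its method are hypothetical names, and the check only confirms that the folder and its "platforms" and "imageformats" subfolders exist and contain at least one file. It does not verify that every required assembly is present.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class WebKitFolderCheck
{
    // Hypothetical helper: returns the folders under webKitPath that are
    // missing or contain no files. An empty list means the basic layout is there.
    public static List<string> FindMissingOrEmpty(string webKitPath)
    {
        var problems = new List<string>();

        // "" stands for the QtBinariesWindows folder itself.
        foreach (var relative in new[] { "", "platforms", "imageformats" })
        {
            string folder = Path.Combine(webKitPath, relative);
            if (!Directory.Exists(folder) || !Directory.EnumerateFiles(folder).Any())
            {
                problems.Add(folder);
            }
        }

        return problems;
    }
}
```

In the function, the value passed in would be the same path that is assigned to `settings.WebKitPath`.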


Please refer to the link below to troubleshoot the WebKit HTML converter.

Troubleshooting: https://help.syncfusion.com/file-formats/pdf/convert-html-to-pdf/webkit#troubleshooting   


Please try the above suggestion and let us know the result.


Regards,

Gowthamraj K



SK Santanu Kumar Das May 10, 2022 11:08 AM UTC

Hi


I tried this and now I have a new issue: I am trying to convert a page that loads its data from an API using JavaScript.


The URL is given below:


https://umtermshtml.z30.web.core.windows.net/index_.html?loanno=L100253


I am using the following function to convert the same:


 public byte[] HTMLtoPDF(string url)

        {

            ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

            //Initialize the HTML to PDF converter

            HtmlToPdfConverter htmlConverter = new HtmlToPdfConverter(HtmlRenderingEngine.WebKit);

            WebKitConverterSettings webKitSettings = new WebKitConverterSettings();

            webKitSettings.WebKitPath = HostingEnvironment.MapPath("~/QtBinaries");

            string baseURL = HostingEnvironment.MapPath("~/Data");

            //Assign the WebKit settings to HTML to PDF converter

            webKitSettings.EnableJavaScript = true;

            webKitSettings.AdditionalDelay = 5000;

            webKitSettings.MediaType = MediaType.Screen;

            htmlConverter.ConverterSettings = webKitSettings;


            PdfDocument document = htmlConverter.Convert(url);

            MemoryStream stream = new MemoryStream();

            //Save and close the PDF document

            document.Save(stream);

            document.Close(true);

            return stream.ToArray();

        }


But after the conversion, the PDF is not populated with the data retrieved through the API (check the last page).

Can you help us with this, please?





SG Sivaram Gunabalan Syncfusion Team May 11, 2022 02:03 PM UTC

Hi Santanu,


We have checked the reported issue with the HTML to PDF converter. Our HTML converter internally uses the Qt WebKit rendering engine to convert HTML to PDF. The WebKit rendering engine renders the output PDF document the way the input HTML file or URL is displayed in WebKit-based web browsers (Safari, our internal tool). The reported issue with retrieving data through JavaScript occurs in the WebKit rendering engine itself, and the same behavior is replicated in our converter. We have attached a screenshot of the provided web page as displayed in the Fancy browser for your reference. So we cannot proceed further with the WebKit rendering engine.

We have attached the Qt WebKit browser for your reference. Please find the browser at the link below:

Qt WebKit browser: https://www.syncfusion.com/downloads/support/directtrac/general/ze/FancyBrowser-1169646589


Steps to use the WebKit browser (internal tool): 

  1. Download and extract the browser from the link above.
  2. Copy and paste fancybrowser.exe into the QtBinariesWindows folder.
  3. Run fancybrowser.exe.
  4. The browser window opens with a default URL (www.google.com); enter your input URL and check how the web page behaves.

Fancybrowser screenshot:

Please let us know if you need any further assistance on this.

Regards,

Sivaram G

