Syncfusion Feedback

Tagged PDF from html

Thread ID: 146802
Created: Aug 20, 2019 04:56 PM UTC
Updated: Aug 21, 2019 03:17 PM UTC
Platform: ASP.NET MVC - EJ 2
Replies: 3
Tags: PDF
Atari Elen
Asked On August 20, 2019 05:58 PM UTC

Am I doing something wrong here? The document saves correctly, but there are no tags and all of the text shows up as an image in the PDF. It's not accessible at all. I've also tried the IE conversion and updated the two registry keys, but no luck there either. Here's what I'm doing (based on this: https://help.syncfusion.com/file-formats/pdf/working-with-tagged-pdf#tagged-pdf-support-for-converting-html-to-pdf):
PdfDocument doc = new PdfDocument();
string baseUrl = "";
string htmlText = "...";  // string with all the html in it

using (HtmlConverter html = new HtmlConverter())
{
    html.ConvertToTaggedPDF(doc, htmlText, baseUrl);
}
doc.Save(@"c:\temp\htmlfiles\Sample.pdf");
doc.Close(true);

Prakash Viswanathan [Syncfusion]
Replied On August 21, 2019 09:17 AM UTC

Hi Atari, 

Thank you for contacting Syncfusion support.  

We have checked the HTML to Tagged PDF conversion, and it is working as expected. We have attached the sample and output PDF documents for your reference.

The IE-based conversion uses MSHTML (the IE rendering engine) to convert HTML to vector images, and we render the PDF document from those images. Starting with IE 9, Microsoft changed this behavior, so IE may generate bitmap images instead of vector images. We cannot extract text details from bitmap images, so the bitmap image itself may be rendered in the PDF document. When a bitmap image is rendered in the PDF document, tags are not added for the content.

To overcome this behavior, legacy drawing has to be enabled in the registry settings. Please follow the steps in the KB below, or use the advanced legacy drawing tool, to avoid a bitmap PDF document. Once legacy drawing is enabled, the output document will have selectable text, and tags will be added for the content.



  1. Download and extract the updated legacy drawing tool.
  2. Run the tool as administrator and select all the checkboxes.
  3. Then click the Enable GDI rendering button to update the registry for all the users.

Limitation with IE: If the input HTML contains HTML5 or CSS3, IE (MSHTML) will generate only bitmapped output (the PDF text cannot be selected). This is the behavior of IE itself.

Please let us know if you need any further assistance on this.  

Regards, 
Prakash V 


Atari Elen
Replied On August 21, 2019 12:30 PM UTC

Thank you! It turns out it was one of the meta tags that was being auto-generated by the site: <meta http-equiv="X-UA-Compatible" content="IE=edge">
Once I removed that meta tag, the document saved correctly.
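
For anyone who cannot change the site's templates, the same fix can be applied to the HTML string just before conversion. The following is only a minimal sketch: the class and method names (HtmlPreprocessor, StripUaCompatibleMeta) are hypothetical and not part of the Syncfusion API, and the regular expression is an assumption that covers the common forms of this tag, not a full HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not a Syncfusion API): removes the X-UA-Compatible
// meta tag from an HTML string before it is passed to the converter.
public static class HtmlPreprocessor
{
    // Matches any <meta ...> tag whose http-equiv attribute is
    // X-UA-Compatible, with or without quotes, in any letter case.
    private static readonly Regex UaCompatibleMeta = new Regex(
        @"<meta\b[^>]*http-equiv\s*=\s*[""']?X-UA-Compatible[""']?[^>]*>",
        RegexOptions.IgnoreCase);

    public static string StripUaCompatibleMeta(string html)
    {
        // Replace every matching tag with an empty string.
        return UaCompatibleMeta.Replace(html, string.Empty);
    }
}

public static class Program
{
    public static void Main()
    {
        string htmlText =
            "<html><head>" +
            "<meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">" +
            "<title>Sample</title></head><body>Hello</body></html>";

        // The cleaned string is what would be handed to the conversion call.
        string cleaned = HtmlPreprocessor.StripUaCompatibleMeta(htmlText);
        Console.WriteLine(cleaned);
    }
}
```

The cleaned string would then be passed in place of htmlText in the conversion code from the original question. A regular expression is a rough tool for editing HTML, so an HTML parser may be a safer choice if the markup is unusual.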

Prakash Viswanathan [Syncfusion]
Replied On August 21, 2019 03:17 PM UTC

Hi Atari, 

Thank you for the update.  
Please let us know if you need any further assistance on this.  

Regards, 
Prakash V 

