
Tagged PDF from html

Am I doing something wrong here? The document saves correctly, but there are no tags and all of the text shows up as an image in the PDF. It's not accessible at all. I've also tried the IE conversion and updated the two registry keys, but no luck there either. Here's what I'm doing (based on this: https://help.syncfusion.com/file-formats/pdf/working-with-tagged-pdf#tagged-pdf-support-for-converting-html-to-pdf):
PdfDocument doc = new PdfDocument();
string baseUrl = "";
string htmlText = "...";  // string with all the HTML in it

// The converter is disposed when the using block ends
using (HtmlConverter html = new HtmlConverter())
{
    html.ConvertToTaggedPDF(doc, htmlText, baseUrl);
}

3 Replies

PV Prakash Viswanathan Syncfusion Team August 21, 2019 09:17 AM UTC

Hi Atari, 

Thank you for contacting Syncfusion support.  

We have checked the HTML to Tagged PDF conversion, and it is working as expected. We have attached the sample and output PDF documents for your reference.  

The IE-based conversion uses MSHTML (the IE rendering engine) to convert HTML to vector images, and we render the PDF document from those images. From IE 9 onward, Microsoft changed this behavior, so IE may generate bitmap images instead of vector images. We cannot extract the text details from bitmap images, so the bitmap image itself may be rendered in the PDF document. When a bitmap image is rendered in the PDF document, no tags are added for the contents.  

To overcome this behavior, legacy drawing has to be enabled in the registry settings. Please follow the steps in the KB below or use the advanced legacy drawing tool to avoid a bitmap PDF document. Once legacy drawing is enabled, the output document will have selectable text, and the tags for the content will be added to the PDF document.  

  1. Download and extract the updated legacy drawing tool.
  2. Run the tool as administrator and select all the checkboxes.
  3. Then click the Enable GDI rendering button to update the registry for all the users.

Limitation with IE: if the input HTML contains HTML5 or CSS3, IE (MSHTML) will generate only bitmapped output (the PDF text cannot be selected). This is the behavior of IE. 

Please let us know if you need any further assistance on this.  

Prakash V 

AE Atari Elen August 21, 2019 12:30 PM UTC

Thank you! It turns out it was one of the meta tags that was being auto-generated by the site: <meta http-equiv="X-UA-Compatible" content="IE=edge">
Once I removed that meta tag, the document saved correctly.
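
When the page that generates the HTML cannot be changed, one option is to strip that meta tag from the HTML string before passing it to the converter. The following is only a minimal sketch under that assumption: `MetaTagStripper` and `RemoveCompatibilityMeta` are hypothetical names (not part of the Syncfusion API), and a regular expression is a rough way to edit HTML that may miss unusual markup.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any Syncfusion library.
public static class MetaTagStripper
{
    // Matches a <meta ...> tag whose http-equiv attribute is X-UA-Compatible,
    // with or without quotes around the value, in any letter case.
    private static readonly Regex CompatibilityMeta = new Regex(
        @"<meta\b[^>]*\bhttp-equiv\s*=\s*[""']?X-UA-Compatible[""']?[^>]*>",
        RegexOptions.IgnoreCase);

    // Returns the HTML with every X-UA-Compatible meta tag removed.
    public static string RemoveCompatibilityMeta(string html)
    {
        return CompatibilityMeta.Replace(html, string.Empty);
    }
}
```

The cleaned string would then be used in place of htmlText in the original snippet, for example `htmlText = MetaTagStripper.RemoveCompatibilityMeta(htmlText);` before the conversion call.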

PV Prakash Viswanathan Syncfusion Team August 21, 2019 03:17 PM UTC

Hi Atari, 

Thank you for the update.  
Please let us know if you need any further assistance on this.  

Prakash V 
