DocIO supports only well-formatted XHTML: <br> tag

In HTML5, <br> is preferred over <br/>.

We use the same HTML to display on a page and within a Word document created with DocIO. However, the paragraph.AppendHTML() method only supports XHTML.

I think it is time that the supported HTML format is updated to HTML5.


7 Replies

LB Lokesh Baskar Syncfusion Team January 24, 2022 06:30 PM UTC

Hi Pieter,

The given HTML string is not well formatted. DocIO supports only well-formatted HTML strings.

In the Word library (DocIO), we use an XML reader to read the content of the input HTML, so the input HTML must meet the XML 1.0 standard.

To check whether an HTML string is supported in DocIO, you can validate it against the XHTML 1.0 Strict or Transitional schema. Please refer to the following code snippet.
 
// Creates a new Word document with a minimal structure
WordDocument document = new WordDocument();
document.EnsureMinimal();
// HTML string to be validated
string htmlstring = "<br> is preferred over </br>";
// Validates the HTML string
bool isValidHtml = document.LastSection.Body.IsValidXHTML(htmlstring, XHTMLValidationType.None);
  
If this validation fails, the HTML cannot be processed with DocIO.
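
The XML well-formedness requirement described above can be seen without DocIO at all. The following is a minimal sketch that uses only .NET's System.Xml. The class name, the method name, and the <root> wrapper element are assumptions made for illustration; this is not how DocIO parses HTML internally, only a demonstration of why an XML reader rejects an unclosed <br>.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlWellFormedCheck
{
    // Returns true when the fragment parses as XML once wrapped in a
    // single root element (the wrapper is an assumption of this sketch).
    public static bool IsWellFormedFragment(string html)
    {
        try
        {
            using (var reader = XmlReader.Create(new StringReader("<root>" + html + "</root>")))
            {
                // Read to the end; a malformed document throws XmlException.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Unclosed <br>: the reader sees </root> while <br> is still open.
        Console.WriteLine(IsWellFormedFragment("line one<br>line two"));  // False
        // Self-closed <br/> is well-formed XML.
        Console.WriteLine(IsWellFormedFragment("line one<br/>line two")); // True
    }
}
```

A check like this only covers well-formedness; it does not validate against the XHTML 1.0 schema the way IsValidXHTML is described as doing.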

Regards,
 
Lokesh B


PV Pieter van Kampen January 24, 2022 07:16 PM UTC

Hi Lokesh,

Thank you. I understand how it works. However, according to the HTML specs, <br> is preferred. So my remark is merely a suggestion to consider changing your approach so that valid HTML5 is supported. There are many packages available that can do this.

Best regards,


Pieter 



LB Lokesh Baskar Syncfusion Team January 26, 2022 07:46 PM UTC

Hi Pieter,

Thank you for your update. We are glad to know that your problem has been resolved. 
Please let us know if you have any other questions. As always, we will be happy to assist you. 

Regards, 
Lokesh B  



AM Artur Michalski October 6, 2022 12:43 PM UTC

I also find this a bit odd, since the method name is AppendHtml (and not AppendXml), so I think it would be good to accept valid HTML :)



SB Suriya Balamurugan Syncfusion Team October 7, 2022 08:33 AM UTC

Hi Artur,

In the Word library (DocIO), we use XmlReader to parse the content of the input HTML. So the input HTML must meet the XML standard (have proper opening and closing tags), even if you specify the XHTMLValidationType parameter as XHTMLValidationType.None.

Therefore, Essential DocIO supports only the XHTML 1.0 standard for HTML. We internally perform XHTML 1.0 validation on the input HTML and then process the document further. If the input HTML is not XHTML 1.0 compliant, an exception is thrown for the unsupported elements.

Please refer to our UG documentation for more details:
https://help.syncfusion.com/file-formats/docio/html
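
For DocIO versions that reject unclosed tags, one possible workaround is to rewrite HTML5-style void tags into self-closing form before calling AppendHTML. The sketch below is only an illustration: Html5VoidTagFixer and CloseVoidTags are hypothetical names, not part of DocIO, and the tag list is an assumed subset of the HTML void elements. A regular expression is a crude tool for this job (it will mishandle attribute values that contain ">"), so a real HTML parser is the safer choice for anything beyond simple input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Html5VoidTagFixer
{
    // Matches a void tag such as <br>, <br/>, <hr> or <img src="x">.
    // Group 1 is the tag name, group 2 the attributes (possibly empty).
    private static readonly Regex VoidTag = new Regex(
        @"<(br|hr|img|input|meta|link)\b([^<>]*?)\s*/?>",
        RegexOptions.IgnoreCase);

    // Rewrites every matched void tag as a lower-case, self-closed tag.
    public static string CloseVoidTags(string html)
    {
        return VoidTag.Replace(
            html,
            m => "<" + m.Groups[1].Value.ToLowerInvariant() + m.Groups[2].Value + " />");
    }

    public static void Main()
    {
        Console.WriteLine(CloseVoidTags("first line<br>second line"));
        // prints: first line<br />second line
    }
}
```

The result could then be passed on, for example as paragraph.AppendHTML(Html5VoidTagFixer.CloseVoidTags(html)). This only addresses unclosed void tags; other HTML5 constructs that are not valid XHTML 1.0 would still need separate handling.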

Regards,
Suriya Balamurugan.




JO John November 4, 2024 09:36 PM UTC

Just a note: I am using the Syncfusion Rich Text Editor in ASP.NET Core, which adds <br> tags, but it seems the DocIO library won't accept them. It would be great if the products could be aligned.



DS Dharanya Sakthivel Syncfusion Team November 5, 2024 01:22 PM UTC

Hi John,

Currently, the DocIO library accepts the <br> tag without a closing tag. Kindly refer to the code snippet below and the generated result document in the attachment.

using (WordDocument document = new WordDocument())
{
    document.EnsureMinimal();
    // Define a string containing HTML with a line break tag
    string htmlstring = "You can use the <br> tag to add line breaks wherever needed.";
    // Append the HTML string to the last paragraph of the document
    document.LastParagraph.AppendHTML(htmlstring);

    using (FileStream fileStreamOutput = File.Create(@"C:\HTML-br-tag.docx"))
    {
        document.Save(fileStreamOutput, FormatType.Docx);
    }
}


Kindly refer to the documentation on performing Word-to-HTML and HTML-to-Word conversions:

Convert Word to HTML and vice versa in C# | Syncfusion

Regards,
Dharanya.



Attachment: HTMLbrtag_ee519e0d.docx
