IsValidXHTML always returns false

I am trying to validate HTML content with the WSection.IsValidXHTML method.
The method always returns false with the error message: "The 'http://www.w3.org/XML/1998/namespace:lang' attribute is not declared."
Sample:

    using System.IO;
    using Syncfusion.DocIO;
    using Syncfusion.DocIO.DLS;

    // Open an existing document and append a new section.
    var wordFile = File.OpenRead("demo.docx");
    var document = new WordDocument(wordFile, FormatType.Automatic);
    var section = document.AddSection();

    var html = "<html><body><h1>test</h1></body></html>";

    // Returns false; msg contains the error message quoted above.
    var isValid = section.Body.IsValidXHTML(html, XHTMLValidationType.Transitional, out var msg);


What format does this validation accept?
No matter what HTML I pass to this method, it returns the same error.

4 Replies 1 reply marked as answer

HC Hemalatha Chiranjeevulu Syncfusion Team November 12, 2020 07:57 AM UTC

Hi Pascal,

Thank you for contacting Syncfusion support.

To resolve this problem, we suggest using XHTMLValidationType.None to skip schema validation during HTML to Word conversion with DocIO.

Please refer to the UG documentation linked below to learn more about XHTML validation in DocIO:
 https://help.syncfusion.com/file-formats/docio/html#xhtml-validation
 

Please let us know if you have any other questions.

Regards,
Hemalatha C
 


Marked as answer

PS Pascal Seifert November 12, 2020 08:30 AM UTC

Sure, I could just skip the validation, but that does not help.
I need to know whether I can safely call the section.Body.InsertXHTML() method later, and that only works if the input is validated first.


PS Pascal Seifert November 12, 2020 08:31 AM UTC

What input does the IsValidXHTML("input", XHTMLValidationType.Transitional) method expect in order to return true?


HC Hemalatha Chiranjeevulu Syncfusion Team November 13, 2020 10:55 AM UTC

Hi Pascal,

Thank you for your update.

The Word library (DocIO) uses an XML reader to parse the content of the input HTML. The input HTML must therefore be well-formed XML, with every opening tag properly closed.
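As a rough illustration of what "meeting the XML standard" means, the sketch below checks whether an HTML string parses as well-formed XML using the standard .NET XDocument.Parse method. The XhtmlPrecheck class and its IsWellFormedXml method are hypothetical names introduced here, not part of DocIO, and this check only approximates what the DocIO parser may accept (for example, HTML entities such as &nbsp; that are not declared in XML will be rejected).

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of DocIO: approximates the
// "must be well-formed XML" requirement with the standard XML parser.
public static class XhtmlPrecheck
{
    public static bool IsWellFormedXml(string html, out string error)
    {
        try
        {
            // Throws XmlException on unclosed or mismatched tags.
            XDocument.Parse(html);
            error = string.Empty;
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }
}
```

With this helper, "<html><body><h1>test</h1></body></html>" passes, while "<html><body><p>unclosed</body></html>" fails because the p element is never closed.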

To check whether an HTML string is supported by DocIO, you can validate it against the XHTML 1.0 Strict or Transitional schema. However, validation with XHTMLValidationType.Transitional is a limitation in .NET Core applications. We therefore suggest using XHTMLValidationType.None for HTML to Word conversion with DocIO in .NET Core applications.

XHTMLValidationType.None validates the HTML against the XHTML format alone; it does not perform any schema validation.
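For the "validate first, then insert" flow asked about earlier in this thread, a minimal sketch could look like the following. The XhtmlCheck delegate and the SafeHtmlInsert class are hypothetical names introduced here so the snippet compiles without the Syncfusion package; the commented wiring at the end assumes the IsValidXHTML and InsertXHTML calls behave as they are used in the question, and is untested against DocIO itself.

```csharp
using System;

// Hypothetical delegate with the same shape as a call to
// IsValidXHTML(html, validationType, out message).
public delegate bool XhtmlCheck(string html, out string message);

public static class SafeHtmlInsert
{
    // Runs the validator first and only calls insert when it succeeds.
    // Exceptions thrown by insert itself are not caught here.
    public static bool TryInsert(string html, XhtmlCheck isValid,
                                 Action<string> insert, out string message)
    {
        if (!isValid(html, out message))
            return false;

        insert(html);
        return true;
    }
}

// Possible wiring with DocIO (assumes the `section` and `html`
// variables from the sample in the question):
//
// var ok = SafeHtmlInsert.TryInsert(
//     html,
//     (string h, out string m) =>
//         section.Body.IsValidXHTML(h, XHTMLValidationType.None, out m),
//     h => section.Body.InsertXHTML(h),
//     out var msg);
```

Keeping the validator and the insert call as parameters makes it easy to swap XHTMLValidationType.None for a stricter validation type on platforms where schema validation is available.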

Please refer to the UG documentation linked below to learn more about XHTML validation in DocIO:
 https://help.syncfusion.com/file-formats/docio/html#xhtml-validation

Please let us know if you have any other questions.

Regards,
Hemalatha C
 

