Error when trying to convert HTML to SFDT using Java WordProcessorHelper.loadString

Hi,

I am using the DocumentEditor control (Vue) to manage Word documents and the Java libraries to perform the backend operations.

I am trying to convert text copied from a Word document, but when I call the WordProcessorHelper.loadString method to convert from HTML to SFDT, I get an error.

This is the text in HTML format obtained from the clipboard after copying it from the attached document:

<p class="MsoNormal"><span style="font-size:12.0pt;line-height:107%;font-family:\n"Arial",sans-serif;color:red">Text to test<o:p></o:p></span></p>\n\n<p class="MsoNormal"><span style="font-size:12.0pt;line-height:107%;font-family:\n"Arial",sans-serif;color:red"><o:p> </o:p></span></p>\n\n<p class="MsoNormal"><span style="font-size:12.0pt;line-height:107%;font-family:\n"Arial",sans-serif;color:red"><span style="mso-spacerun:yes"> </span>Syncfusion\npaste<o:p></o:p></span></p>

This is the error obtained:

java.lang.UnsupportedOperationException: DocIO support only welformatted xhtml \nDetails:\nParseError at [row,col]:[2,57]\nMessage: http://www.w3.org/TR/1999/REC-xml-names-19990114\"."}] 


It seems that the HTML text read from the clipboard is not well formed.
Nevertheless, I tried the same thing in the DocumentEditor example (Vue) available here: https://ej2.syncfusion.com/vue/demos/#/bootstrap5/document-editor/default.html

In that case the paste worked well. I noticed that in this example the text sent for conversion is not HTML but RTF.


What can I do to perform the conversion without errors? Is it necessary to convert from HTML to RTF? Is there a function in the Java libraries to do this?


Thank you in advance for your answer

Regards

Gaspar




Attachment: SF_Text_to_paste_7b5bb48f.zip

5 Replies

AE Ajithamarlin Edward Syncfusion Team July 6, 2022 05:43 AM UTC

Hi Gaspar,


Sorry for the delay.


The Document Editor internally uses our Syncfusion DocIO library to paste content with formatting, and the Syncfusion DocIO library needs well-formed HTML in order to paste it.
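As a rough way to see why the clipboard HTML is rejected, the sketch below runs a fragment through the JDK's standard namespace-aware XML parser and returns the parser's message. It is only an approximation: it assumes that DocIO's XHTML check behaves like a strict XML parse, which the "ParseError at [row,col]" text in the reported exception suggests but this thread does not confirm. The class and method names (`XhtmlWellFormedCheck`, `findParseError`) are hypothetical and not part of any Syncfusion library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of the Syncfusion libraries.
class XhtmlWellFormedCheck {

    /**
     * Returns null when the fragment parses as namespace-aware XML,
     * otherwise the parser's error message.
     */
    static String findParseError(String htmlFragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Rethrow parse problems instead of letting the default handler print to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) throws SAXParseException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            // Clipboard fragments have several top-level elements, so wrap them in one root.
            String wrapped = "<root>" + htmlFragment + "</root>";
            builder.parse(new InputSource(new StringReader(wrapped)));
            return null;
        } catch (Exception e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        // Plain markup with balanced tags parses without complaint.
        System.out.println(findParseError("<p class=\"MsoNormal\">Text to test</p>"));
        // The Office tag <o:p> uses a prefix that is never declared in the fragment.
        System.out.println(findParseError("<p>Text to test<o:p></o:p></p>"));
        // &nbsp; is an HTML entity that plain XML does not define.
        System.out.println(findParseError("<p><span>&nbsp;</span>Syncfusion paste</p>"));
    }
}
```

With the fragments in this thread, the two likely offenders under this assumption are the undeclared `o:` prefix on `<o:p>` and the `&nbsp;` entity.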


Could you please share a sample of the HTML content that you tried to convert into SFDT? It will help us validate further and provide a solution at the earliest.


Please also share your Document Editor version details.


Regards,

Ajithamarlin E



GB Gaspar Blein July 8, 2022 07:29 AM UTC

Hi Ajithamarlin


Thank you for your answer,


In my first question I attached the Word document from which I am copying the text. The HTML value that I obtain from the clipboard is this:


<p
class="MsoNormal"><span
style="font-size:12.0pt;line-height:107%;font-family:\n&quot;Arial&quot;,sans-serif;color:red">Text
to test<o:p></o:p></span></p>\n\n<p
class="MsoNormal"><span
style="font-size:12.0pt;line-height:107%;font-family:\n&quot;Arial&quot;,sans-serif;color:red"><o:p>&nbsp;</o:p></span></p>\n\n<span
style="font-size:12.0pt;line-height:107%;font-family:&quot;Arial&quot;,sans-serif;\nmso-fareast-font-family:Calibri;mso-fareast-theme-font:minor-latin;color:red;\nmso-ansi-language:ES;mso-fareast-language:EN-US;mso-bidi-language:AR-SA"><span
style="mso-spacerun:yes">&nbsp;</span>Syncfusion
paste</span>


This HTML is not well formed and, as you said, the conversion to SFDT fails. That is clear.


So my question is: if this HTML is not well formed, why does it work when I paste it into the demo that Syncfusion provides here:

https://ej2.syncfusion.com/vue/demos/#/bootstrap5/document-editor/default.html


Inspecting in the Chrome console the data that is sent and the data that is received after pasting, I observed that the text is sent as RTF, not as HTML.



So my doubt is whether the Syncfusion demo converts the HTML to RTF before sending it for conversion to SFDT and, if so, whether Syncfusion provides a way to perform this HTML-to-RTF conversion.


At the moment our version of the Vue Document Editor is 20.1.51 and the DocIO Java library is 20.1.0.47.


Regards


Gaspar



AE Ajithamarlin Edward Syncfusion Team July 12, 2022 05:18 AM UTC

Hi Gaspar,


Whenever content is copied on the system, the copied content is kept in window.clipboardData or event.clipboardData.


The content is copied in three formats:


1. Plain text

2. HTML content

3. RTF content


Which format is used depends on the application we paste into.


For example, if we paste content into the Notepad application, only the text is pasted and no formatting is preserved; in this case the plain text format is used.


Whereas if we paste the copied content into the Document Editor, the copied RTF data is used for pasting.


We are not converting HTML to RTF; we use the RTF content in the clipboard data instead of the HTML content.


Regards,

Ajithamarlin E



GB Gaspar Blein July 12, 2022 06:57 AM UTC

Hi Ajithamarlin,


OK, now I understand.


What is happening is that I am using the JavaScript API to read the clipboard content, and I only have the plain text and HTML formats available. So I thought you were applying some transformation before pasting the content.
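When only the HTML format is available, one possible workaround is to clean the clipboard HTML on the Java side before passing it to WordProcessorHelper.loadString. The sketch below is an assumption, not a fix confirmed by Syncfusion in this thread: it removes the Word-specific `<o:p>` tags and replaces `&nbsp;` with the numeric reference `&#160;`, which are the two constructs in the posted fragments that a strict XML parser would normally reject. `ClipboardHtmlCleaner` and `clean` are hypothetical names, and other Word markup may need similar handling.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Syncfusion libraries.
class ClipboardHtmlCleaner {

    // Matches <o:p>, </o:p> and <o:p/>; the word boundary avoids touching longer names such as <o:pict>.
    private static final Pattern OFFICE_PARAGRAPH_TAG = Pattern.compile("</?o:p\\b[^>]*>");

    /** Returns the HTML with Office paragraph tags removed and &nbsp; made XML-safe. */
    static String clean(String html) {
        String withoutOfficeTags = OFFICE_PARAGRAPH_TAG.matcher(html).replaceAll("");
        // &#160; is the numeric form of the non-breaking space and needs no entity declaration.
        return withoutOfficeTags.replace("&nbsp;", "&#160;");
    }

    public static void main(String[] args) {
        String fromClipboard =
            "<p class=\"MsoNormal\"><span style=\"color:red\">Text to test<o:p></o:p></span></p>";
        System.out.println(clean(fromClipboard));
    }
}
```

The cleaned string would then be passed to loadString in place of the raw clipboard value. Whether DocIO accepts the result still has to be verified against the actual library version in use.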


Thank you so much for your answer

Regards

Gaspar






AE Ajithamarlin Edward Syncfusion Team July 13, 2022 04:17 PM UTC

Hi Gaspar,


Thanks for the update.


Regards,

Ajithamarlin E

