1. We Hungarians use some unusual characters (ő and ű) that don't make it into the HTML. They are present and correct in PdfLoadedPage.ExtractText(). I could actually patch them in from ExtractText(), but that's a hack, isn't it?
We use the open source software “OPX” to convert PDF to HTML, so only the characters that OPX supports are preserved.
2. Acrobat DC has options to skip page numbers and headers/footers. Is there any hope Syncfusion.Pdf can do that too?
We are able to skip the page numbers and header/footer in a PDF document by drawing the content in a PdfTemplate and then drawing that template in the PDF document as the header and footer. Please find the sample below, which illustrates this:
3. Memory consumption grows with repeated conversions, even if I reuse the same converter and settings objects, set
ldoc.EnableMemoryOptimization = true;
and call after each document:
ldoc.Close(true);
GC.Collect();
GC.WaitForPendingFinalizers();
Application.DoEvents();
Could you please share the PDF document and the complete code snippet/sample needed to replicate this issue? It will help us provide a precise solution.
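A reproduction for this kind of report could be sketched along the following lines. This is only a minimal harness under stated assumptions: `MemoryProbe` and `Measure` are hypothetical names and not part of Syncfusion.Pdf, and the conversion itself is passed in as a delegate because the exact converter call is not shown in this thread. It records both the managed heap size and the process private bytes after each run, since a leak inside a native converter would not show up in the managed figure.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for reproducing a suspected leak; not a Syncfusion API.
public static class MemoryProbe
{
    // Runs `work` the given number of times and, after each run, forces a
    // full garbage collection and samples memory usage.
    public static (long Managed, long PrivateBytes)[] Measure(Action work, int iterations)
    {
        var samples = new (long Managed, long PrivateBytes)[iterations];
        for (int i = 0; i < iterations; i++)
        {
            work();

            // Collect twice around the finalizer pass so that objects
            // released by finalizers are reclaimed as well.
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();

            long managed = GC.GetTotalMemory(true);
            long privateBytes;
            using (var process = Process.GetCurrentProcess())
            {
                // Includes native allocations made on behalf of the process.
                privateBytes = process.PrivateMemorySize64;
            }
            samples[i] = (managed, privateBytes);
        }
        return samples;
    }
}
```

Passing the load, convert and `ldoc.Close(true)` sequence as `work` and printing the samples gives a series that can be attached to the report. A managed figure that stays flat while private bytes keep climbing would point at memory held outside the .NET heap.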
For skipping headers and page numbers, I meant during text extraction and HTML generation.
We use the open source software “OPX” to convert PDF to HTML, so we are not able to skip the header and page number during HTML generation.
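For the text extraction side, one possible workaround is to post-process the per-page text yourself. The sketch below is not a Syncfusion feature: `PageTextCleaner` and `StripRepeatedLines` are hypothetical names, and it assumes you already have one string per page (for example from PdfLoadedPage.ExtractText()). It drops lines that recur on most pages, which is typical of running headers and footers, and lines that consist only of a page number.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical post-processing helper; not part of Syncfusion.Pdf.
public static class PageTextCleaner
{
    // A line that is nothing but a page number, e.g. "12" or "Page 12".
    private static readonly Regex PageNumber =
        new Regex(@"^\s*(page\s+)?\d+\s*$", RegexOptions.IgnoreCase);

    // pageTexts: extracted text, one entry per page.
    // minShare: fraction of pages a line must appear on to count as a header/footer.
    public static List<string> StripRepeatedLines(IList<string> pageTexts, double minShare = 0.6)
    {
        // Split every page into trimmed, non-empty lines.
        var pages = pageTexts
            .Select(text => text.Split('\n')
                                .Select(line => line.Trim())
                                .Where(line => line.Length > 0)
                                .ToList())
            .ToList();

        // Count on how many pages each distinct line occurs.
        var pageCounts = new Dictionary<string, int>();
        foreach (var lines in pages)
        {
            foreach (var line in lines.Distinct())
            {
                pageCounts[line] = pageCounts.TryGetValue(line, out var seen) ? seen + 1 : 1;
            }
        }

        // Require at least two occurrences so single-page documents are left alone.
        int threshold = Math.Max(2, (int)Math.Ceiling(pages.Count * minShare));

        return pages
            .Select(lines => string.Join("\n",
                lines.Where(line => !PageNumber.IsMatch(line) && pageCounts[line] < threshold)))
            .ToList();
    }
}
```

This is a heuristic: a body line that happens to contain only a number is removed too, and headers that change from page to page (for example alternating chapter titles) may fall below the threshold.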
We were able to reproduce the reported issue and suspect that it is a defect. We are currently validating it and will provide further details on 12th February, 2020.
I am also attaching another PDF that fails completely (40064_2016_Article_3041.zip).
Internally, we use the open source xpdf to convert PDF to HTML, and the conversion fails because of an exception that occurs in that open source library itself. So we cannot proceed further to resolve this issue on our end.
For the memory leak, this is the sample "barcode.pdf" that you provided in PdfToHtmlOPX1940268788.zip:
We have checked the reported memory leak issue, but the conversion does not consume additional memory on our end. We have also confirmed that the memory taken by PdfLoadedDocument is the memory actually needed to process the document. There is no bottleneck in our implementation, so we cannot optimize this further.