PdfLoadedPage.ExtractText() not extracting text

Hi,

I have a PDF file with Identity-H encoding that I am trying to parse using the PdfLoadedPage.ExtractText() method. It does not return the correct text on the page; garbled characters are returned instead. It works fine with ANSI encoding but fails for files in Identity-H. Could you please let me know why ExtractText() would not work, and whether there is any workaround? FYI, I am using Syncfusion 7.4.

Thanks,
Khalid


5 Replies

KA Khalid Ashraf April 19, 2012 04:57 PM UTC

Could you also please advise: if 7.4 does not support Identity-H, which is the earliest version that supports it?

Thanks,
Khalid



SM Suresh M Syncfusion Team April 20, 2012 09:09 AM UTC

Hi Khalid,

Thank you for using Syncfusion products.

Could you please provide us with the document so that we can reproduce the issue? Please also create a Direct-Trac incident to follow up on the issue.

Please let us know if you have any concerns.

Thanks,
Suresh




KA Khalid Ashraf April 20, 2012 09:50 AM UTC

As requested, please find the file attached. Please treat this request as highest priority, as it's a production issue.

Thanks,
Khalid



Payslips-Masked1_93801054.zip


KA Khalid Ashraf April 20, 2012 09:54 AM UTC

Incident 93535 has also been created.



SM Suresh M Syncfusion Team April 24, 2012 03:47 AM UTC

Hi Khalid,

Thank you for your response.

The fix for the text extraction issue with the provided PDF document is available in version 9.4.0.62 and later. Please upgrade to one of these versions to resolve the issue.

Please let us know if you have any concerns.

Thanks,
Suresh


