
PdfLoadedPage.ExtractText() extracts extra letter

Hi,

I am using the PdfLoadedPage.ExtractText() method. If a word in the PDF contains the letter y, that letter is duplicated in the extracted text. For example, "myth" is extracted as "myyth".


How can I fix this?

Thanks.


2 Replies 1 reply marked as answer

IJ Irfana Jaffer Sadhik Syncfusion Team March 10, 2023 05:53 AM UTC

We have already logged a similar issue, “The unwanted characters being returned during text extraction from PDF”, and the fix for the reported problem is planned for inclusion in our upcoming weekly NuGet release, which will be available on March 14th, 2023.


Please use the feedback link below to track the status of the reported bug.

https://www.syncfusion.com/feedback/41795/the-unwanted-characters-being-returned-during-text-extraction-from-pdf


If the issue persists after updating to that NuGet package, it may be related to the specific PDF document. To assist you further, we kindly request that you share the input document with us.


Note: If you require a patch for the reported issue in any of our Essential Studio Main or SP release versions, then kindly let us know the version, so that we can provide a patch in that version based on our SLA policy.


Disclaimer: “Inclusion of this solution in the weekly release may change due to other factors including but not limited to QA checks and work reprioritization.”


Marked as answer

IJ Irfana Jaffer Sadhik Syncfusion Team March 15, 2023 04:59 AM UTC

We have included the fix for the reported issue “The unwanted characters being returned during text extraction from PDF” in our latest weekly NuGet release (v20.4.0.54).


Please use the link below to download our latest weekly NuGet package:

https://www.nuget.org/packages/Syncfusion.Pdf.Net.Core/20.4.0.54
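After updating the package reference to 20.4.0.54, a quick check along the following lines may help confirm whether the duplicated letters are gone. This is only a minimal sketch, not an official sample: the file name "input.pdf" and the class name ExtractTextCheck are placeholders, and it assumes a .NET Core console project that references Syncfusion.Pdf.Net.Core.

```csharp
using System;
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;

// Hypothetical console check; the class name and default path are placeholders.
class ExtractTextCheck
{
    static void Main(string[] args)
    {
        // Assumption: the PDF that showed the problem is passed as the first
        // argument, otherwise a placeholder file name is used.
        string path = args.Length > 0 ? args[0] : "input.pdf";

        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // Load the existing PDF document from the stream.
            PdfLoadedDocument document = new PdfLoadedDocument(stream);
            try
            {
                for (int i = 0; i < document.Pages.Count; i++)
                {
                    // Pages of a loaded document are exposed as PdfPageBase,
                    // the base class of PdfLoadedPage.
                    PdfPageBase page = document.Pages[i];

                    // Extract the page text and print it for manual inspection.
                    string text = page.ExtractText();
                    Console.WriteLine("--- Page " + (i + 1) + " ---");
                    Console.WriteLine(text);
                }
            }
            finally
            {
                // Close the document and release its resources.
                document.Close(true);
            }
        }
    }
}
```

If words such as "myth" still appear as "myyth" in the printed output, the behavior is likely specific to the document, and sharing the input PDF would be the next step.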

