Hi Team,
Do you have any APIs to find the following between two documents?
Hi Mahesh,
You can retrieve the color, font name, size, and bounds of extracted text, along with the text itself, by using the TextLine API. For further details, please refer to the UG documentation:
Working with Text Extraction | Syncfusion
Class TextLine - API Reference (syncfusion.com)
As of now, to get hyperlinks, you should use the PdfLoadedAnnotation API. For further details, please refer to the UG documentation:
Working with Annotations | Syncfusion
PdfLoadedUriAnnotation Class - C# PDF Library API Reference | Syncfusion
Regards,
Jeyalakshmi T
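
Once the per-line attributes have been extracted from both documents, the comparison itself does not depend on the PDF library. Below is a minimal sketch in Python, assuming each extracted line has already been reduced to a record of text, font name, font size, and color. The `TextRun` type and the `diff_runs` helper are hypothetical names used for illustration and are not part of the Syncfusion API. Lines are paired by position, which is a simplification; documents with inserted or removed lines would need a proper sequence alignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRun:
    """Hypothetical record for one extracted line (not a Syncfusion type)."""
    text: str
    font_name: str
    font_size: float
    color: str  # assumed to be a hex string such as "#000000"


def diff_runs(first, second):
    """Pair lines by position and list the attributes that differ."""
    differences = []
    for index, (a, b) in enumerate(zip(first, second)):
        changed = [
            name
            for name in ("text", "font_name", "font_size", "color")
            if getattr(a, name) != getattr(b, name)
        ]
        if changed:
            differences.append((index, changed))
    # zip() stops at the shorter list, so report a length mismatch separately.
    if len(first) != len(second):
        differences.append((min(len(first), len(second)), ["line count"]))
    return differences
```

Each entry in the result is the line index followed by the names of the attributes that differ, so an empty list means the two documents matched on every compared attribute.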
Hi Jeyalakshmi,
The Syncfusion APIs for extracting text from a PDF work fine if the PDF has only plain text.
Please find the sample attached. Where the PDF has an unordered list/bullet points, the text is not read as it is: some characters are missing, or a question mark or junk character appears in that position.
For example, "Existing" is read as "Exising".
Hi Mahesh,
We tried to replicate the problem on our end using our test documents, but we were not able to reproduce it. We suspect that the issue is document-specific. Therefore, we request you to share the input PDF document with us so that we can replicate the problem on our end. This will help us analyze further and provide you with a prompt solution.
Sample:
https://www.syncfusion.com/downloads/support/directtrac/general/ze/Console_Sample575139362
Regards,
Jeyalakshmi T
Apologies, I forgot to attach the actual PDF we are testing with.
Please find it attached.
Hi Mahesh,
After a thorough review of the provided document, we discovered that the word appears as "Exisng" instead of "Existing." Consequently, our output is as expected. For your reference, we have attached a screenshot of the document. To replicate the issue on our end, we kindly request you to share the problematic input document. This will assist us in further analysis and allow us to provide a prompt solution.
Regards,
Jeyalakshmi T
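
For readers who see similar symptoms (missing letters, question marks, or junk characters near bullets), a quick check of the extracted string can help narrow down whether the characters come from the document's fonts. Below is a minimal sketch in Python that works on any already-extracted text, whichever library produced it. The function names are hypothetical, and the interpretation in the comments is an assumption: a dropped "ti" as in "Exisng" is consistent with a font-specific ligature glyph that has no Unicode mapping, but this thread does not confirm that as the cause.

```python
import unicodedata


def suspicious_characters(text):
    """Return (index, char, reason) for characters that often signal
    a missing or unusual font-to-Unicode mapping in extracted text."""
    findings = []
    for index, char in enumerate(text):
        code = ord(char)
        if char == "\ufffd":
            findings.append((index, char, "replacement character"))
        elif 0xE000 <= code <= 0xF8FF:
            # Symbol fonts commonly place bullet glyphs in this range.
            findings.append((index, char, "private use area"))
        elif 0xFB00 <= code <= 0xFB06:
            findings.append((index, char, "ligature"))
        elif unicodedata.category(char) == "Cc" and char not in "\n\r\t":
            findings.append((index, char, "control character"))
    return findings


def expand_ligatures(text):
    """Apply NFKC normalization, which replaces Unicode ligatures such as
    U+FB01 with plain letters (it also folds other compatibility forms)."""
    return unicodedata.normalize("NFKC", text)
```

If this check reports nothing and the letters are simply absent from the extracted string, the text layer of the PDF itself is the likely place to look, which matches the finding in the reply above.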