How can I get original Font Size for Highlight?

Hi,

I hope I can explain this well enough...  :-)

I am using your library to read PDF files that have been highlighted by readers (humans) using the "Copy selected text into Highlight, Cross-Out, and Underline comment pop-ups" option in Adobe Reader.  As a result, these PDFs are filled with "PdfLoadedTextMarkupAnnotation"s and I can read the text directly from these objects...

But I need to be able to identify which highlights are chapter titles, headings, subheadings, etc...  The only way I can reasonably think to do this is to identify the underlying font size of the original text.

How can I use a PdfLoadedTextMarkupAnnotation to locate the text within the document and identify its font size (and maybe other details like color, font type, etc.)?

Thanks,

Russell

7 Replies

RU Russell December 10, 2020 08:42 PM UTC

I've made some progress on this, maybe.

First, I get the PdfLoadedTextMarkupAnnotation.Text, then I can turn around and search for that text in the document:

                        // Search the whole document for the text captured in the highlight annotation.
                        List<string> searchItems = new List<string>();
                        searchItems.Add("Search Text from PdfLoadedTextMarkupAnnotation");
                        TextSearchResultCollection searchResults;
                        var result = this.ThisPDFDocument.FindText(searchItems, out searchResults);

I can then read the searchResults[0][0].Bounds.Height value...  but...  this has two huge challenges - I can't find a version that searches on a single page, so with multi-hundred page documents, false matches are very likely.  Additionally, I'm not able to successfully search for multi-line text (some PdfLoadedTextMarkupAnnotation highlights are multiple lines of text from the source PDF) - and I wonder if I were able to figure out how to search multiline text, would the value be increased for each line of text?

Thanks for any advice you can provide.

Russell




DD Divya Dhayalan Syncfusion Team December 11, 2020 02:27 PM UTC

Hi Russell, 
 
Thanks for contacting Syncfusion support. 
 
Please find the details below.

Query: I can then read the searchResults[0][0].Bounds.Height value...  but...  this has two huge challenges - I can't find a version that searches on a single page, so with multi-hundred page documents, false matches are very likely.

Detail: You can search a particular page using the PdfLoadedDocument FindText(List<string> searchItems, int pageIndex, out List<MatchedItem> searchResults) method. We have also created a sample for this, which can be downloaded from the link below: https://www.syncfusion.com/downloads/support/forum/160544/ze/PdfConsole-1403719267

Query: Additionally, I'm not able to successfully search for multi-line text (some PdfLoadedTextMarkupAnnotation highlights are multiple lines of text from the source PDF) - and I wonder if I were able to figure out how to search multiline text, would the value be increased for each line of text?

Detail: Sorry for the inconvenience. On analyzing further, we found that we do not have support for “Searching for multiline Text”. We have logged a feature request for this. At present, we do not have any immediate plans to implement it. At the planning stage for every release cycle, we review all open features and identify features for implementation based on specific parameters including product vision, technological feasibility, and customer interest. We will let you know when this feature is implemented.

You can track the status of this feature request here: https://www.syncfusion.com/feedback/12599/find-the-bounds-of-multiline-text

Please let us know if the provided information is helpful and if you need any further assistance on this.
 
Regards, 
Divya 
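
Editorial sketch (not part of the original replies): once the search is limited to the annotation's page, several matches for the same text can still come back. One possible way to pick the right one is to keep the match whose rectangle overlaps the highlight's rectangle the most, and then read that match's Bounds.Height as a rough stand-in for font size, as Russell does above. The Box type and the HighlightMatcher helper below are hypothetical names introduced only for illustration; they are not part of the Syncfusion API. The sketch also assumes that the annotation bounds and the match bounds are expressed in the same page coordinate system, which should be verified before relying on it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a rectangle such as the Bounds of a match or an annotation.
public readonly struct Box
{
    public float X { get; }
    public float Y { get; }
    public float Width { get; }
    public float Height { get; }

    public Box(float x, float y, float width, float height)
    {
        X = x;
        Y = y;
        Width = width;
        Height = height;
    }
}

// Hypothetical helper: chooses the search match that best coincides with a highlight.
public static class HighlightMatcher
{
    // Area shared by two rectangles; 0 when they do not overlap.
    public static float OverlapArea(Box a, Box b)
    {
        float w = Math.Min(a.X + a.Width, b.X + b.Width) - Math.Max(a.X, b.X);
        float h = Math.Min(a.Y + a.Height, b.Y + b.Height) - Math.Max(a.Y, b.Y);
        return (w > 0 && h > 0) ? w * h : 0f;
    }

    // Index of the candidate that overlaps the highlight most, or -1 if none overlaps.
    public static int BestMatchIndex(Box highlight, IReadOnlyList<Box> candidates)
    {
        int best = -1;
        float bestArea = 0f;
        for (int i = 0; i < candidates.Count; i++)
        {
            float area = OverlapArea(highlight, candidates[i]);
            if (area > bestArea)
            {
                bestArea = area;
                best = i;
            }
        }
        return best;
    }
}
```

In use, the highlight box would be filled from the annotation's bounds and the candidates from the Bounds of each MatchedItem returned by the page-scoped FindText call. The height of the chosen match is only a proxy: it reflects the text's bounding box, not the font size stored in the PDF, so it is better suited to ranking highlights relative to one another (larger boxes as likely titles or headings) than to reporting an exact point size.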



RU Russell December 18, 2020 08:50 PM UTC

I still can't seem to find a way to get the original font size (of what has been highlighted).  :-(

Thank you for your continued assistance,

Russell


DD Divya Dhayalan Syncfusion Team December 21, 2020 05:32 PM UTC

Hi Russell, 
 
The “FindText” method doesn’t expose any properties for font size or font color. However, we are analyzing whether it is possible to get the font details for highlighted text, and we will provide a further update on 23rd December 2020. 
 
Regards, 
Divya


RU Russell December 21, 2020 10:00 PM UTC

That is wonderful, Divya.  The SyncFusion Team is so awesome to listen and help when they can.

Happy Holidays!

Russell


DG Deepak Gunasekaran Syncfusion Team December 22, 2020 06:14 PM UTC

Hi Russell, 

You are most welcome. As we mentioned earlier, we are currently checking on this requirement and will share further details once the analysis is complete. 

Regards, 
Deepak G 



DD Divya Dhayalan Syncfusion Team December 24, 2020 07:40 PM UTC

Hi Russell, 
 
Thanks for your patience.

On analyzing this further, we found that it is not feasible to get font details directly from a highlighted text markup annotation because of the PDF structure. It would be possible, however, by fetching the font details of the text using FindText. At present, the FindText() method does not support returning font details. Could you please confirm whether such a feature would meet your requirement? This will help us proceed further.

Regards,
Divya
 

