ExtractText with position

Hi,

I am trying to use your PDF library to extract text from a document. I am also interested in the position of the text within the document. If I understand your documentation correctly, the code below should print the number of lines on each page of a document. However, it is not working for me: it prints either 0 or an incorrect number of lines. Could you please look into it?

public void ExtractTextTest(PdfLoadedDocument doc)
{
    for (int pageIndex = 0; pageIndex < doc.PageCount; pageIndex++)
    {
        var lines = new TextLineCollection();

        var page = doc.Pages[pageIndex];
        string text = page.ExtractText(true);
        page.ExtractText(out lines);

        Console.WriteLine($"Lines count: {lines.TextLine.Count}");
    }
}

I have also attached a PDF which I use for testing.

Thanks in advance

Attachment: doc_with_text.pdf_c7f47f4a.zip

4 Replies

MK Muralitharan Karikalan Syncfusion Team January 11, 2021 10:53 AM UTC

Hi Michal, 
 
Thank you for contacting Syncfusion support. 
 
We were able to reproduce the issue, “Number of lines in a page is wrong while using ExtractText API”. We are currently validating it and will provide further details on January 13, 2021.
 
Regards, 
Muralitharan K 



MK Muralitharan Karikalan Syncfusion Team January 13, 2021 04:15 PM UTC

Hi Michal,    
  
We have confirmed that the issue “Number of lines in a page is wrong while using ExtractText API” is a defect and logged a defect report for this issue. The patch for this issue will be delivered on February 03, 2021.   
Please find the feedback link below,  
   
Regards,  
Muralitharan K  



MK Muralitharan Karikalan Syncfusion Team February 3, 2021 05:01 PM UTC

Hi Michal, 
 
Sorry for the inconvenience caused.

We have resolved the issue, “Number of lines in a page is wrong while using ExtractText API”. The fix is currently in the testing phase, and we will provide the patch on February 9, 2021, without any further delay.

Regards, 
Muralitharan K 



AV Ashokkumar Viswanathan Syncfusion Team February 9, 2021 06:54 PM UTC

Hi Michal, 
 
We have resolved the issue, “Number of lines in a page is wrong while using ExtractText API”, and have provided the consolidated patch for it in forum 159356. Kindly follow that forum for further updates.
 
Regards,  
Ashok Kumar Viswanathan.  

