ExtractText not working for pdfDocument object?

Question

Hi,

I am using version 8.303.0.21 of Syncfusion PDF. I have a few questions:

1) Does the ExtractText function work only on a PdfLoadedDocument object, and not on a PdfDocument?
2) After importing a page with "pFinalDoc.ImportPage(pTempDoc, j)", I am not able to call ExtractText on the PdfDocument. It returns Nothing.

Any advice is appreciated. Thanks.

    Sub test(ByVal msInputFile As MemoryStream)
        Dim pDoc As Syncfusion.Pdf.Parsing.PdfLoadedDocument = New Parsing.PdfLoadedDocument(msInputFile)
        Dim found As Boolean
        Dim searchKey As String
        Dim searchList As New SortedList(Of String, Byte())
        Dim m As MemoryStream
        Dim pFinalDoc As PdfDocument
        Dim pTempDoc As Syncfusion.Pdf.Parsing.PdfLoadedDocument
        Dim s As String = String.Empty

        For i As Integer = 0 To pDoc.Pages.Count - 1
            'create a new PDF doc
            pFinalDoc = New Syncfusion.Pdf.PdfDocument()

            'search if there is any existing PDF having the same key info
            searchKey = pDoc.Pages(i).ExtractText().Substring(0, 10)
            found = searchList.Keys.Contains(searchKey)

            If (found = True) Then
                'already existing, load existing pages
                pTempDoc = New Parsing.PdfLoadedDocument(searchList(searchKey))
                For j As Integer = 0 To pTempDoc.Pages.Count - 1
                    pFinalDoc.ImportPage(pTempDoc, j)
                    'this is the call that returns Nothing on the PdfDocument
                    s &= pFinalDoc.Pages(j).ExtractText()
                Next
                If (pFinalDoc.Pages(0).ExtractText() = Nothing) Then
                    MsgBox("Error")
                End If
            End If

            'add current page
            pFinalDoc.ImportPage(pDoc, i)

            'save final doc to memory in order to get byte array
            m = New MemoryStream()
            pFinalDoc.Save(m)

            If (found = True) Then
                'set to the new value
                searchList(searchKey) = m.ToArray()
            Else
                searchList.Add(searchKey, m.ToArray())
            End If

            pFinalDoc.Close()
            m = Nothing
            pFinalDoc = Nothing
        Next
    End Sub

Angappan G · Answer

Hi HY,

Thank you for your interest in Essential PDF. We regret the delay in getting back to you.

1. The ExtractText method works only with PdfLoadedDocument, not with PdfDocument class objects.
2. ExtractText cannot be used with a PdfDocument even after importing the contents of an existing document, because the method works only with the PdfLoadedDocument class.

Please let us know if you have any queries.

Regards,
Angappan.

Rodrigo T · Answer

Hi, this topic is very useful. Please add it to the main documentation for pdf.PdfDocument.ExtractText.

With pdf.PdfDocument.ExtractText, formatted text and other content comes back dirty.

With pdf.PdfLoadedDocument.ExtractText, everything works fine.

Or is there still (in 2017) a bug in pdf.PdfDocument.ExtractText compared with the output of pdf.PdfLoadedDocument.ExtractText?

Thanks!

Sabari Anand Senthamarai Kannan · Answer

Hi Rodrigo, 

Thank you for contacting Syncfusion products. 

Text cannot be extracted using the PdfDocument class, even after pages have been imported from a PdfLoadedDocument object. Extraction can only be performed using the PdfLoadedDocument class. We will update our UG documentation accordingly, and it will be refreshed within a week. 

Please let us know if you need any further assistance. 

Regards, 
Sabari Anand
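
Since the answers above state that ExtractText works only on PdfLoadedDocument, one possible workaround is to save the PdfDocument to memory and reload the bytes as a PdfLoadedDocument before extracting. The following is a minimal sketch of that idea, not a confirmed Syncfusion recommendation. It uses only calls that already appear in the question's code (Save, the byte-array PdfLoadedDocument constructor, Pages, ExtractText, Close). The module name ExtractTextSketch and the helper ExtractAllText are hypothetical names introduced for illustration, and the sketch assumes these calls behave the same way in your version of the library.

```vbnet
Imports System.IO
Imports System.Text
Imports Syncfusion.Pdf
Imports Syncfusion.Pdf.Parsing

Module ExtractTextSketch
    ' Hypothetical helper (not part of the Syncfusion API).
    ' Saves a PdfDocument to memory, reloads it as a PdfLoadedDocument,
    ' and extracts the text of every page from the reloaded copy.
    Function ExtractAllText(ByVal doc As PdfDocument) As String
        Dim text As New StringBuilder()

        Using m As New MemoryStream()
            ' Serialize the in-memory document to a byte array.
            doc.Save(m)

            ' Reload the bytes as a PdfLoadedDocument, the only class on
            ' which ExtractText is stated to work in this thread.
            Dim loaded As New PdfLoadedDocument(m.ToArray())

            For i As Integer = 0 To loaded.Pages.Count - 1
                text.Append(loaded.Pages(i).ExtractText())
            Next

            loaded.Close()
        End Using

        Return text.ToString()
    End Function
End Module
```

In the question's loop, this would replace the direct `pFinalDoc.Pages(j).ExtractText()` call: import the pages first, then call `ExtractAllText(pFinalDoc)` before closing the document. The extra save and reload costs time and memory on every call, so it may be worth extracting text from pTempDoc (already a PdfLoadedDocument) instead when that is sufficient.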