Text Recognition error

Hello,
I need to extract text from PDF invoices,
but I often get a System.FormatException.

My code:

using (FileStream oFs = new FileStream(item.FilePath, FileMode.Open, FileAccess.Read))
{
    loadedDocument = new PdfLoadedDocument(oFs);
    PdfLoadedPageCollection loadedPages = loadedDocument.Pages;
    foreach (PdfLoadedPage loadedPage in loadedPages)
    {
        // The FormatException is thrown from this call (see the stack trace below).
        extractedText += loadedPage.ExtractText();
    }
    loadedDocument.Close(true);
}
if (extractedText.Trim().Length > 0)
{
    DocumentTextRecognition dt = new DocumentTextRecognition() { Id = item.IdDocumentText };
    dt.Flgread = true;
    dt.Updated = DateTime.Now;
    dt.Textread = extractedText.Trim();
    _backgroundService.UpdateDocumentText(dt).Wait();
}

StackTrace:
   at System.Number.ParseSingle(String value, NumberStyles options, NumberFormatInfo numfmt)
   at Syncfusion.Pdf.PdfPageBase.RenderText(PdfRecordCollection recordCollection, PdfPageResources m_pageResources)
   at Syncfusion.Pdf.PdfPageBase.ExtractText()

MessageException:
Input string was not in a correct format
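
Since the exception comes from ExtractText() on a single page, one way to narrow it down is to extract page by page and note which pages fail. The following is only a minimal sketch: PageTextExtractor and ExtractAll are hypothetical names, not part of the Syncfusion API, and the extraction call is passed in as a delegate so the helper does not depend on any particular library.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PageTextExtractor
{
    // Hypothetical helper: calls extractPage for each zero-based page index,
    // concatenates the results, and records the index of every page whose
    // extraction throws FormatException instead of aborting the whole run.
    public static string ExtractAll(int pageCount, Func<int, string> extractPage, List<int> failedPages)
    {
        var text = new StringBuilder();
        for (int i = 0; i < pageCount; i++)
        {
            try
            {
                text.Append(extractPage(i));
            }
            catch (FormatException)
            {
                // Remember the failing page and continue with the next one.
                failedPages.Add(i);
            }
        }
        return text.ToString();
    }
}

// Assumed wiring with the code from this post (not verified against a specific
// Syncfusion version):
//   var failed = new List<int>();
//   string text = PageTextExtractor.ExtractAll(
//       loadedDocument.Pages.Count,
//       i => loadedDocument.Pages[i].ExtractText(),
//       failed);
```

The indexes collected in failedPages show which pages to attach when reporting the problem.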

Thanks,
David



6 Replies

DD Davide D Angelo March 15, 2018 09:51 AM UTC

Hello,
I found that if I use PdfGraphics (Graphics.DrawImage and SetTransparency) on a page, I get this error when extracting text from it.
Why? Is there a solution for this case?
Thanks,
David


SL Sowmiya Loganathan Syncfusion Team March 16, 2018 01:08 PM UTC

Hi Davide,

Thank you for contacting Syncfusion Support.

We were unable to reproduce the System.FormatException when using PdfGraphics (DrawImage and SetTransparency) and then extracting the text from the page. We have shared the sample in which we tried to reproduce the System.FormatException while extracting text from a PDF page.

Please find the sample at the location below:


We suspect that both issues are specific to the PDF document. Could you please try the above sample and, if the issue still occurs at your end, send us the PDF document and the modified sample? This will help us investigate your requirement further and assist you better.

Please let us know if you need any other assistance.

Regards,
Sowmiya L



DD Davide D Angelo March 19, 2018 03:13 PM UTC

Hello, 
I couldn't test your sample because I get an error when running it.
Anyway, the attachment contains one page from your PDF file ("HTTP Succinctly"), on which we applied "DrawImage" and "DrawString" from PdfGraphics.

I get this error:
"
StackTrace:
   at System.Number.ParseSingle(String value, NumberStyles options, NumberFormatInfo numfmt)
   at Syncfusion.Pdf.PdfPageBase.RenderText(PdfRecordCollection recordCollection, PdfPageResources m_pageResources)
   at Syncfusion.Pdf.PdfPageBase.ExtractText()

MessageException:
Input string was not in a correct format
"

Thanks,
David



Attachment: example_65c7c1db.zip


AA Akshaya Arivoli Syncfusion Team March 20, 2018 01:02 PM UTC

Hi Davide, 

Thank you for your update.

We regret to let you know that we are unable to reproduce the “Input string was not in a correct format” exception with the PDF document provided in the forum. We have shared the extracted text output at the link below:


Please refer to the output and let us know the .NET Standard and Essential Studio versions in which the issue reproduces at your end. This will help us investigate your issue further and assist you better.
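
For gathering the version details requested above, a small sketch along these lines may help. EnvironmentReport is a hypothetical helper name. Note that it prints the assembly version, which is assumed (not guaranteed) to match the installed NuGet package version.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

public static class EnvironmentReport
{
    // Hypothetical helper: returns the name and version of the assembly
    // that defines the given type, e.g. a type from the PDF library.
    public static string DescribeAssembly(Type type)
    {
        AssemblyName name = type.GetTypeInfo().Assembly.GetName();
        return name.Name + " " + name.Version;
    }

    // Returns the description of the .NET runtime the application is running on.
    public static string DescribeRuntime()
    {
        return RuntimeInformation.FrameworkDescription;
    }
}

// Assumed usage in the application that shows the error:
//   Console.WriteLine(EnvironmentReport.DescribeAssembly(typeof(PdfLoadedDocument)));
//   Console.WriteLine(EnvironmentReport.DescribeRuntime());
```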

Regards, 
Akshaya 



DD Davide D Angelo March 21, 2018 03:39 PM UTC

Hello, I started a new ASP.NET Core solution using version 16.1.0.26 of the Syncfusion .NET Standard packages.

The entire solution is in the attachment.
Thanks,
David

Attachment: WebApplication2_c4de4b2e.zip


AA Akshaya Arivoli Syncfusion Team March 23, 2018 07:30 AM UTC

Hi Davide, 


Thank you for your update. 

A support incident to track the status of the reported issue has been created under your account. Please log on to our support website to check for further updates.

Please let us know if you have any concerns about this.

Regards, 
Akshaya 

