
Using a PDF with text for OCR

I am running OCR on a PDF file. If the PDF contains real text (text that I can select and copy), OCR does not work, but if the text is an image inside the PDF, it works.
What should I do?

1 Reply

SK Sasi Kumar Sekar Syncfusion Team November 7, 2016 12:22 PM UTC

Hi, 
 
Thank you for your update, 
 
We have run OCR on different types of input PDF files on our side, and we could not reproduce the reported issue with PDFs that contain copyable text.
Please find below the code snippet and sample for performing OCR on a PDF file.
Code snippet: 
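// Requires: using Syncfusion.OCRProcessor; using Syncfusion.Pdf.Parsing;
string tesseractBinaries = Server.MapPath("~/Tesseract binaries");
string tessdata = Server.MapPath("~/Tessdata");
// Initialize the OCR processor with the path to the Tesseract binaries
using (OCRProcessor processor = new OCRProcessor(tesseractBinaries))
{
    // Load the PDF document
    PdfLoadedDocument lDoc = new PdfLoadedDocument(Server.MapPath("~/App_Data/Region.pdf"));
    // Set the language used for OCR
    processor.Settings.Language = Languages.English;
    // Run OCR using the language data in the Tessdata folder
    processor.PerformOCR(lDoc, tessdata);
    // Save the processed document, then release its resources
    lDoc.Save(Server.MapPath("~/Output/sample.pdf"));
    lDoc.Close(true);
}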
 
 
Sample link: 
 
Kindly provide the input document; it will help us analyze the issue and provide a solution sooner.
 
Regards, 
Sasi kumar S. 

