OCR can be performed on individual pages of a PDF document so that the text of each page is obtained separately. The code example below processes the loaded document one page at a time and writes the recognized text, labeled with each page's index, to a text file. A sample is linked after the code.

C#:

    using System.IO;
    using Syncfusion.OCR;
    using Syncfusion.Pdf.Parsing;

    string resulttext = string.Empty;
    string out_filename = @"..\..\Data\result.txt";

    // Load the existing PDF document.
    PdfLoadedDocument lDoc = new PdfLoadedDocument(@"..\..\Data\Region.pdf");

    for (int i = 0; i < lDoc.Pages.Count; i++)
    {
        // Initialize the OCR processor with the path to the Tesseract binaries.
        using (OCRProcessor processor = new OCRProcessor(@"..\..\Tesseract binaries\"))
        {
            // Set the performance.
            processor.Settings.Performance = Performance.Slow;

            // Label the text with the page index (zero-based).
            resulttext += " \n" + "page no " + i.ToString() + "\n";

            // Run OCR on page i only: the start and end index are both i.
            resulttext += processor.PerformOCR(lDoc, i, i, @"..\..\Tessdata\");
        }
    }

    // Save the OCRed text with the page numbers.
    File.WriteAllText(out_filename, resulttext);

    // Close the document.
    lDoc.Close(true);
Sample Link: http://www.syncfusion.com/downloads/support/directtrac/147065/ze/OCRPageByPage-465922787
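If the text of each page needs to stay separate rather than being concatenated into one string, the following is a minimal sketch of one way to do that. It is not part of the linked sample. The helper names `CollectPageTexts` and `WritePageFiles`, the `page` file prefix, and the temp directory are assumptions made for illustration, and the OCR call is replaced by a stand-in delegate so the snippet runs without the Syncfusion assemblies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PerPageOcrText
{
    // Hypothetical helper: calls ocrPage once per zero-based page index and
    // returns the text of each page as its own list entry.
    public static List<string> CollectPageTexts(int pageCount, Func<int, string> ocrPage)
    {
        var pages = new List<string>(pageCount);
        for (int i = 0; i < pageCount; i++)
        {
            // Treat a null result as an empty page.
            pages.Add(ocrPage(i) ?? string.Empty);
        }
        return pages;
    }

    // Hypothetical helper: writes the page at index i to "<prefix><i + 1>.txt".
    public static void WritePageFiles(IReadOnlyList<string> pages, string directory, string prefix)
    {
        Directory.CreateDirectory(directory);
        for (int i = 0; i < pages.Count; i++)
        {
            string path = Path.Combine(directory, prefix + (i + 1) + ".txt");
            File.WriteAllText(path, pages[i]);
        }
    }

    public static void Main()
    {
        // Stand-in for the real OCR call (assumption for this sketch).
        Func<int, string> fakeOcr = i => "text of page " + (i + 1);

        List<string> pages = CollectPageTexts(3, fakeOcr);

        string dir = Path.Combine(Path.GetTempPath(), "ocr-pages");
        WritePageFiles(pages, dir, "page");

        // prints "text of page 2"
        Console.WriteLine(File.ReadAllText(Path.Combine(dir, "page2.txt")));
    }
}
```

In the loop from the answer above, the delegate would presumably wrap the existing call, for example `i => processor.PerformOCR(lDoc, i, i, @"..\..\Tessdata\")`, with `lDoc.Pages.Count` passed as the page count.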