How to extract rotated image text using OCR
Text in a rotated image can be extracted in readable form with OCR by enabling the AutoDetectRotation property.
The following code snippet enables automatic correction of the image rotation.
C#:
//Namespaces required by the snippet
using Syncfusion.OCRProcessor;
using Syncfusion.Pdf.Parsing;

using (OCRProcessor processor = new OCRProcessor(@"..\..\Tesseract binaries\"))
{
    //Load the PDF document (verbatim string, so the backslashes are not treated as escapes)
    PdfLoadedDocument ldoc = new PdfLoadedDocument(@"..\..\Input.pdf");

    //Language to process the OCR
    processor.Settings.Language = Languages.English;

    //Enable AutoDetectRotation
    processor.Settings.AutoDetectRotation = true;

    //Process OCR by providing the loaded PDF document and the path to the Tessdata folder
    String str = processor.PerformOCR(ldoc, @"..\..\Tessdata\");

    //Save the PDF document
    ldoc.Save("Output.pdf");
    ldoc.Close(true);
}
Note:
The rotated image text is readable only if the osd.traineddata file is present in the Tessdata folder. The osd.traineddata file is available at the link below.
https://www.syncfusion.com/downloads/support/directtrac/156562/ze/osd-425036592
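Because the note above makes osd.traineddata a prerequisite, it can help to check for the file before running OCR. The following is a minimal sketch that uses only System.IO and assumes the same relative Tessdata path as the snippet above. The class OsdCheck and the helper HasOsdData are hypothetical names chosen for illustration and are not part of the Syncfusion API.

```csharp
using System;
using System.IO;

public static class OsdCheck
{
    // Hypothetical helper (not part of the Syncfusion API): returns true when
    // osd.traineddata exists directly inside the given Tessdata folder.
    public static bool HasOsdData(string tessdataPath)
    {
        return File.Exists(Path.Combine(tessdataPath, "osd.traineddata"));
    }

    public static void Main()
    {
        // Assumed location: the same relative folder passed to PerformOCR.
        string tessdataPath = @"..\..\Tessdata\";

        if (!HasOsdData(tessdataPath))
        {
            Console.WriteLine("osd.traineddata was not found in " + tessdataPath
                + "; rotated image text may not be recognized.");
        }
    }
}
```

If the check fails, copy the osd.traineddata file into the Tessdata folder before calling PerformOCR.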