OCR - Tesseract Engine has not been initialised

I've built a quick forms app to test parsing text from an image PDF. I've downloaded the OCR processor, added references to OCRprocessor.base, compression.base and pdf.base and included the correct paths to the binaries and Tessdata. However, I am still getting a "Tesseract Engine has not been initialised" error.
My code is:

public Form1()
{
    InitializeComponent();
    OCRProcessor processor = new OCRProcessor(@"C:\Program Files(x86)\Syncfusion\Tesseract Binaries\3.02");
    PdfLoadedDocument loadedDocument = new PdfLoadedDocument(@"C:\.....\TestPDF.pdf");
    processor.Settings.Language = Languages.English;
    processor.PerformOCR(loadedDocument, @"C:\Program Files (x86)\Syncfusion\Tessdata\3.02"); /* Fails here */
    loadedDocument.Save(@"C:\......\Read.pdf");
}

Any ideas on why I keep getting this error?



3 Replies

SL Sowmiya Loganathan Syncfusion Team June 14, 2018 07:14 AM UTC

Hi Luke, 

Thank you for contacting Syncfusion support. 

Please follow the OCR troubleshooting steps in the UG documentation linked below to resolve the “Tesseract Engine has not been initialized” error. 

Also, make sure that Syncfusion.OCRProcessor.Base.dll is unblocked. Please refer to the screenshot below. 

 

Unblock the assembly and rebuild the project to resolve the “Tesseract Engine has not been initialized” error.

Note: Make sure the bin folder does not contain any blocked assemblies.
  
We have also created a sample that performs OCR on a PDF document. In it, all the required files (input document, Tesseract binaries, and Tessdata) are placed in the Data folder. 

Please find the sample at the location below: 

Please try this sample on your end and let us know whether it resolves the issue. 
  
Regards,   
Sowmiya L   



SK Seema Kahane July 7, 2025 02:26 PM UTC

Hello,
I am getting the error below after unblocking the 'Syncfusion.Compression.Base.dll' DLL.

Could not process Page '0' of file 'Input' from Property Documents Library. - Unhandled Exception: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Syncfusion.OCRProcessor.Native.OCRApi.InitializeDataPath(IntPtr pt, String path, String lang)
   at Syncfusion.OCRProcessor.OCRProcessor.DoOCR(String[] args)
   --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
   at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Program.Main(String[] args)

Please help



KS Karmegam Seerangan Syncfusion Team July 8, 2025 01:13 PM UTC

Hi Seema,

Thank you for reaching out to Syncfusion support.

We use Google's Tesseract engine internally to recognize text from scanned PDF documents and images. This engine relies on the Tesseract and Leptonica binaries to process images and extract text, using trained data files (.traineddata) for accurate recognition. These binaries are included in the runtimes/win-x64/native directory, and the trained data files are located in runtimes/tessdata/.

 

The issue you reported typically occurs when the Tesseract binaries are either missing or the path to them is incorrect. To resolve this, please ensure that the binaries are present in the correct location and that the path is properly configured.
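One way to rule out a missing or mistyped path before calling the OCR processor is to check both folders up front. The following is a minimal sketch, not part of the Syncfusion API: the class `OcrPathCheck` and its `Validate` method are hypothetical helpers, and the checks rest on two assumptions (the binaries folder holds at least one .dll, and the tessdata folder holds `eng.traineddata`, the standard Tesseract file name for English). The example paths are the two from the original question, which differ in the spacing before "(x86)".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of Syncfusion: sanity-checks the two folders
// that are passed to the OCR processor.
public static class OcrPathCheck
{
    // Returns a list of problems found; an empty list means both folders look plausible.
    public static List<string> Validate(string binariesDir, string tessdataDir, string languageCode = "eng")
    {
        var problems = new List<string>();

        // Assumption: the Tesseract binaries folder contains at least one .dll file.
        if (!Directory.Exists(binariesDir))
            problems.Add("Binaries folder not found: " + binariesDir);
        else if (!Directory.EnumerateFiles(binariesDir, "*.dll").Any())
            problems.Add("No .dll files in binaries folder: " + binariesDir);

        // Assumption: trained data follows the usual <language>.traineddata naming.
        if (!Directory.Exists(tessdataDir))
            problems.Add("Tessdata folder not found: " + tessdataDir);
        else if (!File.Exists(Path.Combine(tessdataDir, languageCode + ".traineddata")))
            problems.Add("Missing " + languageCode + ".traineddata in: " + tessdataDir);

        return problems;
    }

    public static void Main()
    {
        // Paths as written in the original question. The first has no space
        // before "(x86)", the second does.
        var problems = Validate(
            @"C:\Program Files(x86)\Syncfusion\Tesseract Binaries\3.02",
            @"C:\Program Files (x86)\Syncfusion\Tessdata\3.02");

        foreach (var problem in problems)
            Console.WriteLine(problem);
    }
}
```

If `Validate` reports nothing, the folders exist and contain the expected kinds of files, so the cause is more likely a blocked assembly, insufficient permissions, or a mismatch between the binaries and the process.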

 

Additionally, if the binaries do not have sufficient read, write, and execute permissions, this issue may also occur. We kindly request you to ensure that the required permissions are granted for these binaries.

 

We have included the necessary binaries in the NuGet package itself. When you install the package and build your application, the binaries are automatically copied to the project directory, and the paths are configured accordingly. For your reference, we have attached a sample project.

 

Sample:  https://www.syncfusion.com/downloads/support/directtrac/general/ze/OCR_Framework_Application

 

If you are still experiencing issues, we recommend manually specifying the paths for both the Tesseract binaries and the tessdata folder. If the problem persists, please share the following details with us so we can replicate the issue on our end:

  • Complete code snippet
  • Input documents
  • Environment details (OS platform, bit version, and RAM size)
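The environment details requested above can be gathered from within the failing application itself. This is a small sketch using only `System.Environment`; the class name `EnvReport` is a hypothetical helper. It reports the OS version and whether the OS and the current process are 64-bit, which matters because native binaries have to match the bitness of the process that loads them. RAM size is left out because the API for reading it differs between .NET versions.

```csharp
using System;

// Hypothetical helper: collects OS and bitness details for a support report.
public static class EnvReport
{
    public static string Describe()
    {
        return string.Join(Environment.NewLine, new[]
        {
            "OS version:     " + Environment.OSVersion,
            "64-bit OS:      " + Environment.Is64BitOperatingSystem,
            "64-bit process: " + Environment.Is64BitProcess,
            "Pointer size:   " + IntPtr.Size + " bytes"
        });
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Running this inside the same project (same platform target) as the OCR code shows the bitness the native binaries would need to match.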

We’ll be happy to assist you further.


Regards,

Karmegam

