How to convert a scanned image to a searchable PDF by performing OCR

Tesseract is an open-source optical character recognition (OCR) engine and one of the most accurate OCR engines currently available.

Syncfusion Essential PDF supports OCR by using the Tesseract open-source engine. With a few lines of code, a scanned paper document containing raster images can be converted to a searchable and selectable document.

The following assemblies are required to use the OCR feature in your application.

Syncfusion assemblies

  • Syncfusion.Compression.Base.dll
  • Syncfusion.Pdf.Base.dll
  • Syncfusion.OcrProcessor.Base.dll

Tesseract assemblies

  • SyncfusionTesseract.dll (Tesseract Engine Version 3.02)
  • liblept168.dll (Leptonica image processing library used by Tesseract engine)

Steps to convert scanned image to searchable PDF (OCR) programmatically:

  1. Create a new C# console application project.

  2. Install the Syncfusion.Pdf.WinForms and Syncfusion.OCRProcessor.Base NuGet packages from NuGet.org as references in your .NET Framework application.

  3. Include the following namespaces in the Program.cs file.

  4. The Tesseract assemblies are located in the folder where the NuGet package is installed. Copy them to your application folder and pass that folder's path as a parameter to the OCR processor.

  5. Place the Tesseract language data (e.g., eng.traineddata) in a local folder of the application and provide its path to the PerformOCR method.

The dictionary packs for the other languages can be downloaded from the following online location:

Note: You can get the Tesseract binaries (SyncfusionTesseract.dll and liblept168.dll) and the language pack (tessdata) by downloading the OCR processor zip file from the Add-On section at the following link.

  6. Use the following code snippet to convert a scanned image to a searchable PDF.
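
The original snippets for these steps are not reproduced on this page. The following is a minimal sketch of how the steps might fit together, not the article's original code. It assumes the Syncfusion.Pdf.WinForms and Syncfusion.OCRProcessor.Base packages are referenced, and that the Tesseract binaries and language data have been copied next to the project. The folder names (Tesseract binaries, Tessdata, Data) and the file names (Input.jpg, OCR.pdf) are placeholders chosen for illustration; adjust them to your own layout.

```csharp
using System.IO;
using Syncfusion.OCRProcessor;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Graphics;
using Syncfusion.Pdf.Parsing;

class Program
{
    static void Main(string[] args)
    {
        // Placeholder paths (assumptions): point these at your own folders.
        string tesseractBinaries = @"../../Tesseract binaries/"; // SyncfusionTesseract.dll, liblept168.dll
        string tessData = @"../../Tessdata/";                    // e.g. eng.traineddata
        string inputImage = @"../../Data/Input.jpg";             // the scanned image

        // Initialize the OCR processor with the folder that holds the Tesseract binaries.
        using (OCRProcessor processor = new OCRProcessor(tesseractBinaries))
        {
            // Build a one-page PDF whose page is fully covered by the scanned image.
            PdfDocument document = new PdfDocument();
            document.PageSettings.Margins.All = 0;
            PdfPage page = document.Pages.Add();
            PdfBitmap image = new PdfBitmap(inputImage);
            page.Graphics.DrawImage(image, 0, 0,
                page.GetClientSize().Width, page.GetClientSize().Height);

            // Save to memory and reload, because OCR runs on a loaded document.
            MemoryStream stream = new MemoryStream();
            document.Save(stream);
            document.Close(true);
            stream.Position = 0;
            PdfLoadedDocument loadedDocument = new PdfLoadedDocument(stream);

            // Select the recognition language and run OCR with the language data folder.
            processor.Settings.Language = Languages.English;
            processor.PerformOCR(loadedDocument, tessData);

            // Save the searchable PDF to disk and release resources.
            loadedDocument.Save("OCR.pdf");
            loadedDocument.Close(true);
            stream.Dispose();
        }
    }
}
```

The image is first drawn onto a PDF page and the document is then reloaded as a PdfLoadedDocument, since the sketch runs OCR on a loaded PDF rather than on the image directly. For a language other than English, the matching traineddata file would need to be present in the language data folder.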



You can download the working sample from OCRSample.Zip

By executing the program, you will get the PDF document as follows.

Take a moment to peruse the documentation, where you will find other options, such as performing OCR on an image, on a region of the document, and on large PDF documents, along with code examples.

Refer here to explore the rich set of Syncfusion Essential PDF features.


Starting with v16.2.0.x, if you reference Syncfusion assemblies from the trial setup or from the NuGet feed, include a license key in your projects. Refer to the link to learn about generating and registering a Syncfusion license key in your application to use the components without a trial message.


Article ID: 9144
Published Date: 08/14/2018
Last Revised Date: 02/01/2019
Platform: WinForms
Control: PDF