Syncfusion.PDF.OCR.Net.Core exception when running on Ubuntu "Exception has been thrown by the target of an invocation."

Hello,

I am trying to OCR a PDF document using Syncfusion.PDF.OCR.Net.Core with ASP.NET Core 5, as described at https://help.syncfusion.com/file-formats/pdf/working-with-ocr/dot-net-core#assemblies. It works without any problems on Windows, but when running on Ubuntu it throws the exception shown below.

I have installed everything as described:

1. sudo apt-get update
2. sudo apt-get install libgdiplus
3. sudo apt-get install -y libopenjp2-7

I have also set the paths to the Tesseract data and Tesseract binaries as described in "Prerequisites for Linux". I am using Syncfusion.PDF.OCR.Net.Core 19.4.0.40.
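One way to double-check these prerequisites on the Ubuntu machine is a small shell script. The following is only a sketch: it assumes a Debian/Ubuntu system where `dpkg-query` and `ldd` are available, it assumes the binaries live in a `TesseractBinaries/Linux` folder relative to the current directory (adjust to the actual content root), and the function names `pkg_status` and `unresolved_deps` are made up for illustration.

```shell
#!/bin/sh
# Sketch: sanity-check the Linux prerequisites for the OCR binaries.
# The function names are illustrative, not part of any Syncfusion tooling.

# Print "installed" or "missing" for a Debian/Ubuntu package name.
pkg_status() {
    if dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"; then
        echo "installed"
    else
        echo "missing"
    fi
}

# For every shared object in the given folder, print the dependencies
# that the dynamic loader cannot resolve. Prints nothing if all resolve.
unresolved_deps() {
    deps_dir="$1"
    for deps_lib in "$deps_dir"/*.so*; do
        [ -f "$deps_lib" ] || continue
        deps_missing=$(ldd "$deps_lib" 2>/dev/null | grep "not found" || true)
        if [ -n "$deps_missing" ]; then
            printf '%s:\n%s\n' "$deps_lib" "$deps_missing"
        fi
    done
}

# Check the two packages from the install steps above.
for pkg in libgdiplus libopenjp2-7; do
    echo "$pkg: $(pkg_status "$pkg")"
done

# Assumed location of the Linux binaries; pass a different path if needed.
unresolved_deps "${1:-./TesseractBinaries/Linux}"
```

If `unresolved_deps` prints any "not found" lines, the native Tesseract libraries cannot be loaded on this machine, which is a plausible (though not confirmed) cause of an exception wrapped by "Exception has been thrown by the target of an invocation."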

That's my code:

[HttpPost("Read")]
public async Task<IActionResult> Read([FromForm] IFormFile file)
{
    //Initialize the OCR processor with the Tesseract binaries folder path
    using (OCRProcessor processor = new OCRProcessor(GetTessBinariesPath()))
    {
        //Load a PDF document
        //FileStream stream = new FileStream(@"Input.pdf", FileMode.Open);
        PdfLoadedDocument document = new PdfLoadedDocument(file.OpenReadStream());

        //Set OCR language
        processor.Settings.Language = "eng+deu";

        //Perform OCR with input document and tessdata (language packs)
        processor.PerformOCR(document, GetTessDataPath());//@"tessdata\");

        MemoryStream stream = new MemoryStream();

        //Save the document into the stream.
        document.Save(stream);

        //If the position is not set to '0' then the PDF will be empty.
        stream.Position = 0;

        //Close the document.
        document.Close(true);

        //Define the content type for the PDF file.
        string contentType = "application/pdf";

        //Define the file name.
        string fileName = "Output.pdf";

        //Create a FileContentResult object from the file contents, content type, and file name.
        return File(stream, contentType, fileName);
    }
}

private string GetTessBinariesPath()
{
    if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
    {
        //return "TesseractBinaries\\Mac";
        return Path.Combine(_hostingEnvironment.ContentRootPath, "TesseractBinaries", "Mac");
    }
    else if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
    {
        return Path.Combine(_hostingEnvironment.ContentRootPath, "TesseractBinaries", "Windows");
        //return "TesseractBinaries\\Windows";
    }
    else if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
    {
        return Path.Combine(_hostingEnvironment.ContentRootPath, "TesseractBinaries", "Linux");
        //return "TesseractBinaries\\Linux";
    }
    return "";
}

private string GetTessDataPath()
{
    return Path.Combine(_hostingEnvironment.ContentRootPath, "tessdata");
}

And that's the exception:

Exception has been thrown by the target of an invocation.
   at Api.Controllers.OcrController.ReadImage(IFormFile file) in C:\Users\...\source\repos\API\Controllers\OcrController.cs:line 148
   at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.g__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.g__Awaited|19_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
   at Microsoft.AspNetCore.Routing.EndpointMiddleware.g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
   at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)
   at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)
   at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context)

Thank you in advance!


1 Reply

GK Gowthamraj Kumar Syncfusion Team January 4, 2022 11:04 AM UTC

Hi Sebastian, 

This exception can be resolved by replacing the existing Tesseract binaries with the new ones in the Linux (.NET 5.0) environment. At present, these binary files are not included in our latest OCR NuGet package; we will include them in our upcoming weekly NuGet release, which will be available on January 11th, 2022.

We request that you replace your existing binaries with the latest Tesseract binaries provided and perform the OCR operation in your sample. Please find the download location for the new Tesseract binaries,

We have attached the output document generated with the latest Tesseract binaries for your reference. Please try your sample with the latest binaries on your end and let us know the result.
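As a rough sketch of the replacement step on the Linux machine, assuming the new binaries have already been downloaded and extracted into a local folder: the helper name `replace_binaries` and any paths passed to it are hypothetical and not part of the Syncfusion package. The helper keeps a `.bak` copy of the old folder so the change can be rolled back.

```shell
#!/bin/sh
# Sketch: swap in a new set of Tesseract binaries, keeping a backup.
# replace_binaries is an illustrative helper, not a Syncfusion tool.
# Arguments: <folder-with-new-binaries> <application-binaries-folder>
replace_binaries() {
    rb_src="$1"
    rb_dst="$2"

    if [ ! -d "$rb_src" ]; then
        echo "source folder not found: $rb_src" >&2
        return 1
    fi

    # Keep the old binaries next to the target folder as <folder>.bak.
    if [ -d "$rb_dst" ]; then
        rm -rf "$rb_dst.bak"
        mv "$rb_dst" "$rb_dst.bak"
    fi

    mkdir -p "$rb_dst"
    # Copy all files, preserving permissions and timestamps.
    cp -Rp "$rb_src"/. "$rb_dst"/
}
```

It could be called as `replace_binaries <extracted-folder> <content-root>/TesseractBinaries/Linux`, where both placeholders depend on the local setup. Restart the application afterwards so the new binaries are loaded.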

Please refer to the documentation link below about OCR,

Please let us know if you need any further assistance with this. 

Regards, 
Gowthamraj K 

