Hello,
I am trying to perform OCR on a PDF document using Syncfusion.PDF.OCR.Net.Core with ASP.NET Core 5, as described at https://help.syncfusion.com/file-formats/pdf/working-with-ocr/dot-net-core#assemblies. It works without any problems on Windows, but on Ubuntu it throws the exception shown below.
I have installed everything as described:
1. sudo apt-get update
2. sudo apt-get install libgdiplus
3. sudo apt-get install -y libopenjp2-7
I have also set the paths to the Tesseract data and the Tesseract binaries as described in "Prerequisites for Linux". I am using Syncfusion.PDF.OCR.Net.Core 19.4.0.40.
Here is my code:
[HttpPost("Read")]
public async Task<IActionResult> ReadImage(IFormFile file)
{
    //Initialize the OCR processor with tesseract binaries folder path
    using (OCRProcessor processor = new OCRProcessor(GetTessBinariesPath()))
    {
        //Load a PDF document
        //FileStream stream = new FileStream(@"Input.pdf", FileMode.Open);
        PdfLoadedDocument document = new PdfLoadedDocument(file.OpenReadStream());
        //Set OCR language
        processor.Settings.Language = "eng+deu";
        //Perform OCR with input document and tessdata (Language packs)
        processor.PerformOCR(document, GetTessDataPath());//@"tessdata\");
        MemoryStream stream = new MemoryStream();
        //Save the document into stream.
        document.Save(stream);
        //If the position is not set to '0' then the PDF will be empty.
        stream.Position = 0;
        //Close the document.
        document.Close(true);
        //Defining the ContentType for pdf file.
        string contentType = "application/pdf";
        //Define the file name.
        string fileName = "Output.pdf";
        //Creates a FileContentResult object by using the file contents, content type, and file name.
        return File(stream, contentType, fileName);
    }
}
private string GetTessBinariesPath()
{
    if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
    {
        //return "TesseractBinaries\\Mac";
        return Path.Combine(_hostingEnvironment.ContentRootPath, "TesseractBinaries", "Mac");
    }
    else if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
    {
        return Path.Combine(_hostingEnvironment.ContentRootPath, "TesseractBinaries", "Windows");
        //return "TesseractBinaries\\Windows";
    }
    else if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
    {
        return Path.Combine(_hostingEnvironment.ContentRootPath, "TesseractBinaries", "Linux");
        //return "TesseractBinaries\\Linux";
    }
    return "";
}
private string GetTessDataPath()
{
    return Path.Combine(_hostingEnvironment.ContentRootPath, "tessdata");
}
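Since file and folder names are case-sensitive on Linux, one thing that could be checked before the OCR call is whether the two folders returned above actually exist on the Ubuntu machine. The following is only a minimal sketch: `PathDiagnostics` and `Describe` are hypothetical names and not part of the Syncfusion API, and a missing folder is just one possible cause of the exception, not a confirmed one.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of Syncfusion): reports whether a folder
// exists and how many files it contains, so the result can be logged.
public static class PathDiagnostics
{
    public static string Describe(string label, string path)
    {
        // Directory.Exists is case-sensitive on Linux file systems.
        if (!Directory.Exists(path))
        {
            return $"{label}: directory not found: {path}";
        }

        // Count only the files directly inside the folder.
        string[] files = Directory.GetFiles(path);
        return $"{label}: {files.Length} file(s) in {path}";
    }
}
```

It could be called as `PathDiagnostics.Describe("binaries", GetTessBinariesPath())` and `PathDiagnostics.Describe("tessdata", GetTessDataPath())`, with the results written to the log before `PerformOCR` runs.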
And that's the exception:
Exception has been thrown by the target of an invocation.
at Api.Controllers.OcrController.ReadImage(IFormFile file) in C:\Users\...\source\repos\API\Controllers\OcrController.cs:line 148
at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.
at Microsoft.AspNetCore.Routing.EndpointMiddleware.
at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)
at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)
at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context)
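The message "Exception has been thrown by the target of an invocation." is the default text of System.Reflection.TargetInvocationException, which wraps the real error in its InnerException. The trace above does not show that inner exception. A minimal sketch for logging the whole chain follows; `ExceptionDiagnostics` and `DescribeChain` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper: lists every exception in the InnerException chain,
// outermost first, one line per level.
public static class ExceptionDiagnostics
{
    public static string DescribeChain(Exception ex)
    {
        var sb = new StringBuilder();
        int depth = 0;

        // Follow InnerException until the innermost exception is reached.
        for (Exception current = ex; current != null; current = current.InnerException)
        {
            sb.AppendLine($"[{depth}] {current.GetType().FullName}: {current.Message}");
            depth++;
        }

        return sb.ToString();
    }
}
```

Wrapping the `PerformOCR` call in a try/catch and logging `ExceptionDiagnostics.DescribeChain(ex)` should show which underlying error (for example a missing native library or file) triggers the failure on Ubuntu.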
Thank you in advance!