I'm using ASP.NET Core 5.0 with Syncfusion Blazor 18.4.0.43.
The NuGet package Syncfusion.PDF.OCR.Net.Core is working fine and I get good results when loading PDF files, but the Tesseract library extracts the words as English.
I changed the language in my project this way:
//Initialize OCR processor
OCRProcessor processor = new OCRProcessor(hostingEnv.ContentRootPath + @"/TesseractBinaries/Windows");
//Load a PDF document
PdfLoadedDocument lDoc = new PdfLoadedDocument(source.ToArray());
//Set OCR language to process
processor.Settings.Language = "ita";
//OCRLayoutResult hocrBounds;
processor.PerformOCR(lDoc, hostingEnv.ContentRootPath + @"/tessdata/");
I set the language in code because Italian is not selectable in the Languages options. I then downloaded "ita.traineddata" to replace "eng.traineddata" inside the tessdata path, but the extracted text is still in English.
Is it not possible to use this line?
processor.Settings.Language = "ita";
Is there any other way to use the Italian language?
Thanks