Hello,
I have an existing .net core 3.1 project which uses the following Syncfusion packages:
I attempted to install the OCR support using the following command:
dotnet add package Syncfusion.PDF.OCR.Net.Core
After that, my project fails to restore:
Determining projects to restore...
Writing /var/folders/xq/g92c84y57ks8mt57rgxl8fh00000gn/T/tmpDs8Wlp.tmp
info : Adding PackageReference for package 'Syncfusion.PDF.OCR.Net.Core' into project '/Users/markorciuch/Projects/lcenterprisemisproduct/LCEnterpriseMIS/LCEnterpriseMIS.Web/LCEnterpriseMIS.Web.csproj'.
info : CACHE https://api.nuget.org/v3/registration5-gz-semver2/syncfusion.pdf.ocr.net.core/index.json
info : Restoring packages for /Users/markorciuch/Projects/lcenterprisemisproduct/LCEnterpriseMIS/LCEnterpriseMIS.Web/LCEnterpriseMIS.Web.csproj...
error: NU1605: Detected package downgrade: Syncfusion.Pdf.Net.Core from 21.1.41 to 20.2.0.40. Reference the package directly from the project to select a different version.
error: LCEnterpriseMIS.Web -> Syncfusion.PDF.OCR.Net.Core 21.1.41 -> Syncfusion.Pdf.Imaging.Net.Core 21.1.41 -> Syncfusion.Pdf.Net.Core (>= 21.1.41)
error: LCEnterpriseMIS.Web -> Syncfusion.Pdf.Net.Core (>= 20.2.0.40)
info : Package 'Syncfusion.PDF.OCR.Net.Core' is compatible with all the specified frameworks in project '/Users/markorciuch/Projects/lcenterprisemisproduct/LCEnterpriseMIS/LCEnterpriseMIS.Web/LCEnterpriseMIS.Web.csproj'.
info : PackageReference for package 'Syncfusion.PDF.OCR.Net.Core' version '21.1.41' added to file '/Users/markorciuch/Projects/lcenterprisemisproduct/LCEnterpriseMIS/LCEnterpriseMIS.Web/LCEnterpriseMIS.Web.csproj'.
info : Generating MSBuild file /Users/markorciuch/Projects/lcenterprisemisproduct/LCEnterpriseMIS/LCEnterpriseMIS.Web/obj/LCEnterpriseMIS.Web.csproj.nuget.g.targets.
info : Writing assets file to disk. Path: /Users/markorciuch/Projects/lcenterprisemisproduct/LCEnterpriseMIS/LCEnterpriseMIS.Web/obj/project.assets.json
log : Failed to restore /Users/markorciuch/Projects/lcenterprisemisproduct/LCEnterpriseMIS/LCEnterpriseMIS.Web/LCEnterpriseMIS.Web.csproj (in 1.31 sec).
How can I get around the "package downgrade detected" errors? Many thanks in advance.
I was able to resolve my problem and got it working on macOS. Now I am trying to make it work on Windows and Linux.
I am following instructions from here to get the Windows and Linux binaries:
I installed the following packages:
Syncfusion.HtmlToPdfConverter.Blink.Net.Core.Windows
Syncfusion.HtmlToPdfConverter.Blink.Net.Core.Linux
When I look in the \.nuget\packages\syncfusion.htmltopdfconverter.net.linux\21.1.41 folder, I do not see the BlinkBinariesLinux folder. The same goes for the Windows binaries. What am I missing?
The reported error is likely caused by mismatched product versions of the Syncfusion assemblies, so please reference the same product version for all of them. When a project references multiple Syncfusion assemblies, their dependent assemblies must all share the same version; otherwise this error occurs. We have created a sample that converts HTML to PDF and performs OCR on a PDF document using the Syncfusion library, and it works properly. The sample is attached for your reference; please try it on your end and let us know the result.
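As a rough sketch of what aligning the versions could look like in the .csproj: the version 21.1.41 and the two package names are taken from the restore log above, but the project's full package list is not shown in this thread, so treat the fragment as an assumption. Any other Syncfusion packages in the project would need the same version.

```xml
<ItemGroup>
  <!-- Assumption: the direct reference is raised from 20.2.0.40 to 21.1.41
       so that it matches what Syncfusion.PDF.OCR.Net.Core 21.1.41 requires. -->
  <PackageReference Include="Syncfusion.Pdf.Net.Core" Version="21.1.41" />
  <PackageReference Include="Syncfusion.PDF.OCR.Net.Core" Version="21.1.41" />
</ItemGroup>
```

This follows the hint in the NU1605 message itself: reference the package directly from the project and select the higher version.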
You can find the Blink binaries for Linux and Windows in the installed location of the corresponding NuGet package.
NuGet Package: https://help.syncfusion.com/file-formats/pdf/converting-html-to-pdf#nuget-packages-required-recommended
Thanks for the information. I was able to make it work on Windows for now. What I'm seeing is that Blink is slower and adds to the size of the docker image. Is there no way to continue using the legacy Webkit engine in .net core in the latest version of your software?
Is this the last release of WebKit engine?
WebKit-based HTML to PDF conversion is deprecated, and the public WebKit NuGet package is no longer listed in nuget.org search results. You can still install the Syncfusion.HtmlToPdfConverter.QtWebKit.Net.Core package from the Package Manager Console using the command below:
NuGet\Install-Package Syncfusion.HtmlToPdfConverter.QtWebKit.Net.Core -Version 21.1.41
https://www.nuget.org/packages/Syncfusion.HtmlToPdfConverter.QtWebKit.Net.Core
Thank you again for the quick response! It's great to have this option for backwards compatibility.
Now I have updated my assemblies and I am proceeding with the original goal which is redaction of sensitive information. I am using the following blog entry as a starting point: https://www.syncfusion.com/blogs/post/easy-ways-to-redact-pdfs-using-c.aspx
The following code throws a null object exception when calling GetImagesInfo() on loadedPage. I imagine I need to make some .NET Core specific adjustments because the referenced assembly "Syncfusion.PDF.OCR.WPF" is for .NET Framework.
I am also attaching the PDF document used in my test. Many thanks in advance for additional guidance on how to make this work in .net core.
We have checked the reported issue with the given document, but it works properly on our end. We have attached the sample for your reference, so please try it on your end and let us know the result.
Sample: https://www.syncfusion.com/downloads/support/directtrac/general/ze/Perform_OCR_ASPNetCore-1273977551
If you still face any issue, please share the modified sample and the input document so that we can reproduce the issue on our end. This will help us analyze it and assist you further.
Hello and thank you again for your most excellent support!
The sample works great except that the generated document is not redacted. The input document contains a fake SSN in plain text that I would expect to be redacted. Is there something else that I am missing?
We were able to reproduce the reported issue on our end with the provided details. We are currently validating it and will share further details on May 10th, 2023.
We have validated the reported redaction issue on our side. The image on the first page of the document is rotated by 180 degrees; you can confirm the rotation by saving the image to a file in the sample. Because of this rotation, the computed bounds of the SSN text are wrong and the content is not redacted properly.
We have modified the code to calculate the proper X and Y for the 180-degree rotation manually. To redact the content in .NET Core, the lDoc.Redact() method must also be called. Kindly refer to the modified code example below to resolve the issue in the sample.
foreach (PdfImageInfo imgInfo in imageInfoCollection)
{
    Bitmap ocrImage = imgInfo.Image as Bitmap;

    //Skip images that cannot be read as a bitmap
    if (ocrImage == null)
        continue;

    MemoryStream imgStream = new MemoryStream();
    ocrImage.Save(imgStream, System.Drawing.Imaging.ImageFormat.Bmp);

    //Process OCR by providing the image stream, the tessdata path and the layout result
    OCRLayoutResult ocrResult = null;
    string text = processor.PerformOCR(imgStream, tessdata, out ocrResult);

    //Calculate the scale factor for the image used in the PDF
    float scaleX = imgInfo.Bounds.Width / ocrImage.Width;
    float scaleY = imgInfo.Bounds.Height / ocrImage.Height;

    //Get the text from page and lines.
    foreach (var page in ocrResult.Pages)
    {
        foreach (var line in page.Lines)
        {
            if (line.Text != null)
            {
                //Regular expression for social security number
                var ssnMatches = Regex.Matches(line.Text, @"(\d{3})+[ -]*(\d{2})+[ -]*\d{4}", RegexOptions.IgnorePatternWhitespace);
                if (ssnMatches.Count >= 1)
                {
                    Syncfusion.Drawing.RectangleF redactionBound = new Syncfusion.Drawing.RectangleF(
                        line.Rectangle.X * scaleX,
                        line.Rectangle.Y * scaleY,
                        (line.Rectangle.Width - line.Rectangle.X) * scaleX,
                        (line.Rectangle.Height - line.Rectangle.Y) * scaleY);

                    //Image is rotated by 180 degrees, so apply height - y to get the correct y position.
                    redactionBound.Y = loadedPage.Size.Height - redactionBound.Y - redactionBound.Height;

                    //Image is rotated by 180 degrees, so apply width - x to get the correct x position.
                    redactionBound.X = loadedPage.Size.Width - redactionBound.X - redactionBound.Width;

                    //Create PDF redaction for the found SSN location
                    PdfRedaction redaction = new PdfRedaction(redactionBound);

                    //Adds the redaction to loaded page
                    loadedPage.AddRedaction(redaction);
                }
            }
        }
    }
}

//Apply the redactions to the document
lDoc.Redact();
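The coordinate correction for the 180-degree rotation can be sketched in isolation as follows. This is a minimal illustration, not Syncfusion's implementation: the RedactionBounds class and the Rotate180 method are hypothetical names, and System.Drawing.RectangleF stands in for Syncfusion.Drawing.RectangleF so that the snippet has no package dependency.

```csharp
using System.Drawing;

public static class RedactionBounds
{
    // Hypothetical helper: mirrors a rectangle through the centre of the page,
    // which is the effect of a 180-degree rotation on its position.
    // The width and height of the rectangle are unchanged.
    public static RectangleF Rotate180(RectangleF rect, SizeF pageSize)
    {
        return new RectangleF(
            pageSize.Width - rect.X - rect.Width,    // new X, measured from the opposite edge
            pageSize.Height - rect.Y - rect.Height,  // new Y, measured from the opposite edge
            rect.Width,
            rect.Height);
    }
}
```

For example, on a 612 x 792 page, a rectangle at (100, 50) with size 200 x 20 maps to (312, 722) with the same size. Applying the helper twice returns the original rectangle, which is a quick way to sanity-check the arithmetic.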
Thanks again for your help!
However, this change is still not redacting the embedded SSN. I am attaching the modified project.
Also, I think this line:
string text = processor.PerformOCR(imgStream, tessdata,out ocrResult);
should be:
string text = processor.PerformOCR(ocrImage, tessdata,out ocrResult);
We have checked the reported issue with the provided sample, but the text is redacted properly on our end. We have attached the output for your reference.
Output : https://www.syncfusion.com/downloads/support/directtrac/general/ze/Output712909651
Please refer to the screenshots below.
[Screenshots: Input | Output]