Hi,
I have the following questions.
I selected an area (using a rectangle annotation) in the PDF viewer on the client side (Angular) and got the coordinates of the selected area. I applied the same coordinates to extract text from the PDF using the OCR .NET Core library.
I have attached a screenshot of the area where we need to run OCR (bottom right), along with the sample code; the PDF is included in the sample.
1) I could not extract text from the PDF using the code below.
In the sample code, the controller action PerformOCRPDF() did not extract any text.
However, I tried a different approach: please check the PerformOCRUsingPDFImage() method. It extracts text but sometimes gives wrong text.
Please tell me how to extract the text.
2) Where can I get the libraries for Tesseract version 4.0 in .NET Core?
3) On the client side, when I try to add a rectangle to select an area, I could not change the cursor type to crosshair for the pdf-viewer control in Angular. Can you give a sample showing how to change the cursor type on the pdf-viewer?
Thanks
Dayakar
|
I could not extract text from the PDF using the code below.
In the sample code, the controller action PerformOCRPDF() did not extract any text. |
We are currently analyzing this sample and will update you with further details on February 16th, 2022. | |
|
Where can I get the libraries for Tesseract version 4.0 in .NET Core?
|
You can get the TesseractBinaries and tessdata from the OCR Processor download or from the installed location of the Syncfusion.PDF.OCR.Net.Core NuGet package. Please refer to the following example folder paths.
TesseractBinaries:
syncfusionocrprocessor\Tesseractbinaries_core (or)
C:\Users\username\.nuget\packages\Syncfusion.PDF.OCR.Net.Core\XX.X.X.XX\lib\TesseractBinaries
tessdata:
syncfusionocrprocessor\tessdata (or)
C:\Users\username\.nuget\packages\Syncfusion.PDF.OCR.Net.Core\XX.X.X.XX\lib\tessdata
| |
|
On the client side, when I try to add a rectangle to select an area, I could not change the cursor type to crosshair for the pdf-viewer control in Angular. Can you give a sample showing how to change the cursor type on the pdf-viewer?
|
You can change the cursor type shown while resizing annotations by using the resizerCursorType property. We have shared a sample and code snippet for your reference.
Code snippet:
|
Hi, thank you for the reply.
For the 3rd question: I have to draw annotations (e.g., circle, square) by selecting an annotation from a custom toolbox. When I select an annotation from the toolbox and start to draw, I need to show a crosshair cursor on the PDF viewer.
Thanks,
Dayakar
|
For the 3rd question: I have to draw annotations (e.g., circle, square) by selecting an annotation from a custom toolbox. When I select an annotation from the toolbox and start to draw, I need to show a crosshair cursor on the PDF viewer. |
The Syncfusion PDF Viewer shows a crosshair cursor while rectangle and circle annotations are being drawn. We have shared a video for your reference.
Could you please try this and get back to us with a screenshot of your exact requirement? It will help us investigate further and provide a solution at the earliest. |
|
I could not extract text from the PDF using the code below.
In the sample code, the controller action PerformOCRPDF() did not extract any text.
|
We were able to reproduce the reported issue with the provided sample on our end. We are currently analyzing it and will update you with further details on February 18th, 2022.
|
|
1) I could not extract text from the PDF using the code below.
In the sample code, the controller action PerformOCRPDF() did not extract any text.
|
We have checked the reported issue on our end. The provided PDF document does not contain any images on which to perform OCR. The OCR process returns text only when the PDF document contains a scanned image; if there are no scanned images, it returns empty text. That is why the resulting text for the provided document is empty.
|
|
However, I tried a different approach: please check the PerformOCRUsingPDFImage() method. It extracts text but sometimes gives wrong text. |
We have checked the PerformOCRUsingPDFImage() method; the quality of the cloned image in it is very low. If the input image is not of adequate quality, OCR returns empty or incorrect characters. You can check the cloned image quality by saving the image. This is the expected behavior of the OCR processor. |
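To put the image-quality point in numbers, here is a minimal sketch of how rendering resolution relates to pixel size, and how a selection rectangle has to be rescaled when the page image is exported at a different resolution than the one the coordinates were measured at. PDF page sizes are in points (1/72 inch). The helper names `renderedPixelSize` and `scaleRect` are hypothetical and not part of any Syncfusion API, and the resolution at which the viewer reports its coordinates is an assumption you would need to confirm for your setup.

```typescript
// PDF page sizes are expressed in points; 1 point = 1/72 inch.
const POINTS_PER_INCH = 72;

// Hypothetical rectangle shape used only for this illustration.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Pixel dimensions of a page of the given size (in points) rendered at `dpi`.
function renderedPixelSize(
  widthPt: number,
  heightPt: number,
  dpi: number
): { width: number; height: number } {
  const scale = dpi / POINTS_PER_INCH;
  return {
    width: Math.round(widthPt * scale),
    height: Math.round(heightPt * scale),
  };
}

// Rescale a rectangle measured at `fromDpi` so that it addresses the same
// region in an image (or coordinate space) at `toDpi`.
function scaleRect(rect: Rect, fromDpi: number, toDpi: number): Rect {
  const s = toDpi / fromDpi;
  return {
    x: rect.x * s,
    y: rect.y * s,
    width: rect.width * s,
    height: rect.height * s,
  };
}
```

For example, a US Letter page (612 x 792 points) rendered at 300 DPI comes out at 2550 x 3300 pixels. Note that rendering at a higher DPI cannot add detail that the embedded scan does not contain; it only avoids losing detail during export.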
Hi Gowthamraj,
For the 1st question: the provided PDF contains only scanned images for PerformOCRPDF().
For PerformOCRUsingPDFImage(), do you have a sample showing how to increase the image quality of PDF images for OCR extraction?
For the 3rd question, I have attached a sample video. I need to select the drawing number or revision number from the list view control on the right side, and then select the area where the drawing number is located.
So, in the PDF viewer's page click event, I add a rectangle annotation to the PDF to select the coordinates of the drawing number; please see the code below for your reference. My question is: how can I show a crosshair cursor on the PDF viewer when I select the drawing number from the list view control, so that I can draw the rectangle on the PDF viewer to get the coordinates?
|
For the 1st question |
As we said earlier, we internally use the Google Tesseract engine to OCR images. We checked this issue directly in the Tesseract engine from the command prompt (CMD), and the Tesseract engine itself does not return the characters properly. We have already tried exporting the image at a high DPI, but the image quality remains poor, so we are unable to proceed further on this issue.
| |
|
For the 3rd question |
We have checked the code snippets you shared. You are importing the annotations in the pageClick event. When annotations are imported directly, the Syncfusion PDF Viewer does not show the crosshair cursor for selecting the rectangle annotation. So, we suggest adding the rectangle annotation using the setAnnotationMode method, which shows the crosshair cursor while the rectangle annotation is being drawn. We have shared a sample and code snippet for your reference.
Code snippet:
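A minimal sketch of this suggestion follows. It assumes the viewer instance exposes setAnnotationMode on an `annotation` module and accepts the mode name 'Rectangle'; both are assumptions to verify against the PDF Viewer API reference for your version. `AnnotationModeHost` and `beginRectangleSelection` are hypothetical names introduced only for this illustration.

```typescript
// Minimal structural type covering only the call used in this sketch.
// Assumption: the real viewer component exposes an `annotation` module
// with a setAnnotationMode method; check the API reference to confirm.
interface AnnotationModeHost {
  annotation: {
    setAnnotationMode(mode: string): void;
  };
}

// Hypothetical helper: put the viewer into rectangle-drawing mode instead of
// importing a ready-made annotation, so that the viewer itself handles the
// drawing interaction (and, per the reply above, the crosshair cursor).
function beginRectangleSelection(viewer: AnnotationModeHost): void {
  viewer.annotation.setAnnotationMode('Rectangle');
}
```

In this arrangement you would call `beginRectangleSelection` from the list view's selection handler rather than adding the annotation inside the pageClick event, and then read the coordinates once the user has finished drawing.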
|