How to get the exact coordinates of a rectangle annotation on a PDF page in the PDF Viewer

I want to extract text from a PDF page or TIFF image using the PDF Viewer.

  1. Draw a rectangle on a page in the PDF Viewer (the bounds X, Y, Width, and Height come from the annotation-added event).
  2. Capture the coordinates of the drawn rectangle and pass them to the RectangleF class.
  3. Extract the text from the PDF based on those coordinates.

Currently I am using the code below for extraction, but the text extracted for these bounds is not accurate.

    public void ExtractTextfromPDF(double X, double Y, double Width, double Height, int PageIndex)
    {
        string docPath = Path.GetFullPath("wwwroot/Data/Input.pdf");

        // Initialize the OCR processor.
        using (OCRProcessor processor = new OCRProcessor())
        {
            FileStream fileStream = new FileStream(docPath, FileMode.Open, FileAccess.Read);
            PdfLoadedDocument loadedDocument = new PdfLoadedDocument(fileStream);
            processor.Settings.Language = "dan";
            RectangleF rectangle = new RectangleF((float)(X), (float)(Y), (float)(Width), (float)(Height)); // X, Y, width, height
            // Assign rectangles to the page.
            List<PageRegion> pageRegions = new List<PageRegion>();
            PageRegion region = new PageRegion();
            region.PageIndex = PageIndex;
            region.PageRegions = new RectangleF[] { rectangle };
            pageRegions.Add(region);
            processor.Settings.Regions = pageRegions;
            string extracttext = processor.PerformOCR(loadedDocument, @"wwwroot/Data/TessData/");
            loadedDocument.Close(true);
        }
    }
Could you please guide me on how to achieve this with the Syncfusion PDF Viewer? Any examples or references would be greatly appreciated.


3 Replies

SK Sathiyaseelan Kannigaraj Syncfusion Team January 24, 2025 03:03 PM UTC

Hi Nirmal Chandran,

Thank you for reaching out to us. Below, we have provided a sample for extracting text from specific coordinates obtained from the rectangle annotation. In the sample, you can add the rectangle using the "Select Area" button and then extract the text from that area by clicking the "Extract Text" button. Please review and confirm if this meets your requirements.

Sample to Extract Text


Demo on Extract Text
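
One thing worth checking when the extracted region looks offset or too large is a unit mismatch. The following is a minimal sketch, not the attached sample: it assumes the annotation bounds arrive in pixels at 96 DPI and 100% zoom, and that the region rectangle is interpreted in PDF points (72 per inch). Both are assumptions to verify against your viewer and library versions, and `ViewerBounds` and `ToPoints` are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper: converts rectangle bounds from pixels to PDF points.
// Assumption: the input is in pixels at 96 DPI and 100% zoom.
public static class ViewerBounds
{
    // 72 points per inch divided by 96 pixels per inch.
    private const double PixelsToPoints = 72.0 / 96.0;

    public static (float X, float Y, float Width, float Height) ToPoints(
        double x, double y, double width, double height)
    {
        if (width < 0 || height < 0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(width), "Width and height must be non-negative.");
        }

        // Scale every component by the same factor so the rectangle
        // keeps its position and proportions on the page.
        return ((float)(x * PixelsToPoints),
                (float)(y * PixelsToPoints),
                (float)(width * PixelsToPoints),
                (float)(height * PixelsToPoints));
    }
}
```

Under these assumptions, the converted values would be passed to `new RectangleF(...)` in place of the raw X, Y, Width, and Height.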
 


Regards,
Sathiyaseelan K



NC Nirmal Chandran January 25, 2025 06:39 AM UTC

Hi Sathya,

Thanks for your quick response. I tried your sample and it works as expected for English, but when I try to extract Danish text, it does not extract all of the selected text. Could you please update the sample to support extraction in other languages?


Thanks,
Nirmal C



SK Sathiyaseelan Kannigaraj Syncfusion Team January 29, 2025 01:32 PM UTC

Hi Nirmal Chandran,

Thank you for the update. We attempted to extract text from a PDF containing Danish text and did not encounter any issues; the text was extracted correctly. Below, we have provided the sample we tested. Please review the sample and PDF to confirm whether the issue persists on your end.

If you are still experiencing issues with your PDF or sample, kindly provide a modified sample and demo that replicates the problem so we can investigate further and provide an appropriate solution.

Sample extract Danish text


Demo on extracting Danish text
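
If Danish text is still incomplete, one quick check is whether the language data is actually present in the tessdata folder. The sketch below assumes the OCR engine follows the Tesseract convention of one `<language>.traineddata` file per language (for example `dan.traineddata`) inside the folder passed to `PerformOCR`. `TessDataCheck` is a hypothetical helper written for illustration, not part of the Syncfusion API.

```csharp
using System;
using System.IO;

// Hypothetical helper: checks that a Tesseract language file exists.
// Assumption: language data is stored as "<language>.traineddata".
public static class TessDataCheck
{
    // Builds the expected file path, e.g. ".../TessData/dan.traineddata".
    public static string LanguageFilePath(string tessDataFolder, string language)
    {
        if (string.IsNullOrWhiteSpace(language))
        {
            throw new ArgumentException("A language code is required.", nameof(language));
        }
        return Path.Combine(tessDataFolder, language + ".traineddata");
    }

    // Returns true only when the language file is present on disk.
    public static bool HasLanguage(string tessDataFolder, string language)
    {
        return File.Exists(LanguageFilePath(tessDataFolder, language));
    }
}
```

For the code in the original post, this would be called as `TessDataCheck.HasLanguage(@"wwwroot/Data/TessData/", "dan")` before running the OCR.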
 


Regards,
Sathiyaseelan K

