
How to compare text in two PDF documents

Platform: WinForms |
Control: PDF |
Published Date: September 3, 2018 |
Last Revised Date: May 3, 2019

Syncfusion Essential PDF is a .NET PDF library used to create, read, and edit PDF documents. Using this library, you can compare the text in two PDF documents by extracting the text from each page line by line. The resultant PDF document highlights the entire line of changed text.

Steps to compare the text in PDF documents programmatically:

  1. Create a new Windows Forms application project.
  2. Install the Syncfusion.Pdf.Base NuGet package as a reference to your .NET Framework application from NuGet.org.
  3. Include the following namespaces in the Form1.Designer.cs file.

C#

using System;
using System.Collections.Generic;
using System.Linq;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Graphics;
using Syncfusion.Pdf.Parsing;

 

  4. Add a label and a button in Form1.Designer.cs to compare the PDF files as follows.
    label = new Label();
    button = new Button();
     
    //Label
    label.Location = new System.Drawing.Point(0, 40);
    label.Size = new System.Drawing.Size(426, 35);
    label.Text = "Click the button to view the compared PDF file generated by Essential PDF";
    label.TextAlign = System.Drawing.ContentAlignment.MiddleCenter;
     
    //Button
    button.Location = new System.Drawing.Point(180, 110);
    button.Size = new System.Drawing.Size(85, 26);
    button.Text = "Compare PDF";
    button.Click += new EventHandler(ComparePDF);
     
    //Form
    ClientSize = new System.Drawing.Size(450, 150);
    Controls.Add(label);
    Controls.Add(button);
    Text = "Create PDF";
    

 

  5. Add the following code to the ComparePDF event handler to compare the text in two PDF documents.
    //Load the first PDF document
    PdfLoadedDocument loadedDocument = new PdfLoadedDocument("../../Data/Source1.pdf");
     
    //Load the second PDF document
    PdfLoadedDocument loadedDocument1 = new PdfLoadedDocument("../../Data/Source2.pdf");
     
    //Creating the list to store text data in PDF documents
    List<TextData> textData = new List<TextData>();
    List<TextData> textData1 = new List<TextData>();
    List<TextData> maxContainsData = new List<TextData>();
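    List<TextData> diff = new List<TextData>();
     
    //Compare only the pages that exist in both documents
    int pageCount = Math.Min(loadedDocument.Pages.Count, loadedDocument1.Pages.Count);
    for (int i = 0; i < pageCount; i++)
    {
        //Start each page with an empty list so differences from earlier pages are not redrawn
        diff.Clear();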
        //Get the page from first document
        PdfLoadedPage loadedPage = loadedDocument.Pages[i] as PdfLoadedPage;
        //Extract the text from page of first document 
        string extractedText = loadedPage.ExtractText(out textData);
     
        //Extract the text from page of second document 
        string extractedText1 = loadedDocument1.Pages[i].ExtractText(out textData1);
     
        //Compare the lines that both pages have in common
        int minCount = Math.Min(textData.Count, textData1.Count);
        for (int j = 0; j < minCount; j++)
        {
            //A line counts as changed when its text differs, even if its bounds are unchanged
            if (textData[j].Text != textData1[j].Text)
            {
                //Add diff text to the list
                diff.Add(textData[j]);
            }
        }
     
        //Lines beyond the shorter page exist in only one document, so all of them are differences
        maxContainsData = textData.Count > textData1.Count ? textData : textData1;
        for (int j = minCount; j < maxContainsData.Count; j++)
        {
            //Add diff text to the list
            diff.Add(maxContainsData[j]);
        }
        //Highlight the changed text
        foreach (TextData data in diff)
        {
            loadedPage.Graphics.DrawRectangle(PdfPens.Red, PdfBrushes.Transparent, data.Bounds);
        }
    }
     
    //Save and close the document 
    loadedDocument.Save("ComparedPDF.pdf");
    loadedDocument.Close(true);
    loadedDocument1.Close(true);
     
    //Open the PDF file so that the result can be seen in the default PDF viewer
    System.Diagnostics.Process.Start("ComparedPDF.pdf");
    

 

A complete working sample can be downloaded from PDFComparisonSample.zip.

Executing the program produces a PDF document in which each changed line of text is outlined with a red rectangle.
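
The line-by-line comparison used in step 5 can be tried without any PDF files. The following is a minimal sketch, not part of the Essential PDF API: `ExtractedLine` and `LineDiff.FindChangedLines` are hypothetical names that stand in for the extracted text data and the comparison loop. Lines are matched by index, and a line counts as changed when its text differs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one line of extracted text; not a Syncfusion type.
public sealed class ExtractedLine
{
    public ExtractedLine(string text, float top)
    {
        Text = text;
        Top = top;
    }

    public string Text { get; }

    // Vertical position of the line, kept only to show that layout data travels with the text.
    public float Top { get; }
}

public static class LineDiff
{
    // Returns the lines that differ between two pages, matching lines by index.
    public static List<ExtractedLine> FindChangedLines(
        IReadOnlyList<ExtractedLine> first, IReadOnlyList<ExtractedLine> second)
    {
        var changed = new List<ExtractedLine>();
        int common = Math.Min(first.Count, second.Count);

        // Lines present on both pages are changed when their text differs.
        for (int i = 0; i < common; i++)
        {
            if (!string.Equals(first[i].Text, second[i].Text, StringComparison.Ordinal))
            {
                changed.Add(first[i]);
            }
        }

        // Lines past the end of the shorter page exist on one side only.
        IReadOnlyList<ExtractedLine> longer = first.Count >= second.Count ? first : second;
        for (int i = common; i < longer.Count; i++)
        {
            changed.Add(longer[i]);
        }

        return changed;
    }
}
```

Because lines are matched by index, a line inserted in the middle of a page shifts every later line and marks all of them as changed. A sequence-alignment diff would avoid this, at the cost of more code.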

Note:

Starting with v16.2.0.x, if you reference Syncfusion assemblies from the trial setup or from the NuGet feed, include a license key in your projects. Refer to the Syncfusion licensing documentation to learn how to generate and register a Syncfusion license key in your application so that the components can be used without the trial message.
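
As a rough sketch of the registration, the key is typically registered once at application startup, before any form is created. This assumes the Syncfusion.Licensing NuGet package is referenced and that `Form1` is the form from the project created in step 1. The string "YOUR LICENSE KEY" is a placeholder for the key generated from your Syncfusion account.

```csharp
using System;
using System.Windows.Forms;

static class Program
{
    [STAThread]
    static void Main()
    {
        // Register the key before any Syncfusion component is used.
        // "YOUR LICENSE KEY" is a placeholder, not a working key.
        Syncfusion.Licensing.SyncfusionLicenseProvider.RegisterLicense("YOUR LICENSE KEY");

        Application.EnableVisualStyles();
        Application.SetCompatibleTextRenderingDefault(false);
        Application.Run(new Form1());
    }
}
```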
