PDF Compare

Question

Hello,
My test program checks the creation of several PDFs that are generated by the same mechanisms and forms. The files are visually identical, but the comparison always reports them as different. The comparison works as follows: I load the two files into two StreamReader instances and compare them as strings, line by line (.ReadLine). The result is that the files are different. Why?

I have attached an example of two files and the comparison report.

Furthermore, the file size seems excessive for the few items used; it increases significantly as more objects, pages, etc. are added.

Thanks,
Andrea

Attachment: TestPdf_bea52c23.zip
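
A PDF can contain binary data, so a text-mode, line-by-line comparison only says that the files differ, not where. As a diagnostic, here is a minimal Python sketch (not the .NET code used in this thread) that compares the raw bytes and reports the first differing offset with some surrounding context. The names `first_difference` and `context` are hypothetical and chosen only for illustration.

```python
def first_difference(a: bytes, b: bytes, context: int = 16):
    """Return (offset, slice_of_a, slice_of_b) for the first differing byte,
    or None when the two byte strings are identical."""
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            lo = max(0, i - context)
            return i, a[lo:i + context], b[lo:i + context]
    if len(a) != len(b):
        # One input is a strict prefix of the other.
        return limit, a[limit:limit + context], b[limit:limit + context]
    return None


def compare_files(path_a: str, path_b: str):
    """Read both files in binary mode and report the first difference."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return first_difference(fa.read(), fb.read())
```

Looking at the bytes around the reported offset usually shows quickly whether the difference is a generated name, a date, or actual page content.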

Geetha M · Answer

Hi Andrea,
Thank you for your continued interest in Syncfusion products. I am working on this and will update you by tomorrow.

Regards,
Geetha

Geetha M · Answer

Hi Andrea,
Thank you for your patience. It appears that there are a few differences between files generated by the same code. Could you please let me know whether you need to compare the text content of these files, or what your reason for comparing them is? This will help us provide a better solution.

Regards,
Geetha

Administrator · Answer

Thanks for your answer.
The first use is to compare the text. The second and most important use is to compare all objects in the file: I need to speed up troubleshooting of files generated by users, so I want to generate files identical to the users' files and compare them.

Thanks and regards,
Andrea

Geetha M · Answer

Hi Andrea,
Thank you for the details. We have forwarded this to our development team and will update you by tomorrow.

Regards,
Geetha

Geetha M · Answer

Hi Andrea,
Thank you for your patience. I got the following feedback from our developers:

A PDF document is made up of a number of objects, and some objects (images, fonts, XObjects, etc.) require a unique identifier so that they can be addressed inside the document. We generate these unique identifiers at runtime, so they will most likely differ between documents. They play a crucial role in the differences in file size and object content.

Another reason is that PDF supports more than one way of representing characters, text, images, etc., and each PDF generation library may use any of these approaches.

Please let me know if you have any questions.

Regards,
Geetha
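
If runtime-generated identifiers are the only difference, one possible workaround is to normalize them before comparing. The following Python sketch assumes, purely for illustration, that the identifiers are GUID-like tokens (8-4-4-4-12 hexadecimal digits); the thread does not state their actual format, so the pattern would need adjusting to what the files really contain. `normalize_ids` and `same_after_normalizing` are hypothetical helper names.

```python
import re

# Assumption: runtime-generated identifiers look like GUIDs.
GUID_RE = re.compile(
    rb"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    rb"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def normalize_ids(data: bytes) -> bytes:
    """Replace each distinct GUID-like token with a placeholder numbered
    by order of first appearance, so equal structure yields equal bytes."""
    seen = {}

    def repl(match):
        token = match.group(0).lower()
        if token not in seen:
            seen[token] = b"ID%04d" % len(seen)
        return seen[token]

    return GUID_RE.sub(repl, data)


def same_after_normalizing(a: bytes, b: bytes) -> bool:
    """True when the inputs differ only in their GUID-like tokens."""
    return normalize_ids(a) == normalize_ids(b)
```

Numbering the placeholders by first appearance keeps references consistent: a name used twice in one file must also be used twice, in the same positions, in the other.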

Administrator · Answer

Thanks for your reply.
Unfortunately, we are currently comparing and validating your two versions, 5.202.0.25 and 6.402.0.15, against other products, and I need to be able to compare two files generated by the same mechanism and the same data. They must be identical. I can make this comparison with other products, and the result is what I expect: the two files are equal.
Will this kind of comparison never be possible with your product?

Thank you
Andrea

Geetha M · Answer

Hi Andrea,
I have passed on your question to our development team and will update you once I hear back from them.

Regards,
Geetha

Geetha M · Answer

Hi Andrea,
The following is the update from the development team:

If your sole purpose is to compare the two documents based on their contents, you have to compare only the contents of the document, which are held in the “Contents” dictionary. This will most probably give you an identical result. However, PDF documents are very sophisticated: the data inside a document can be compressed using any one of the compression schemes supported by Adobe.

Regards,
Geetha
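
A rough Python sketch of a content-only comparison follows. It assumes the PDFs are uncompressed, and it simplifies heavily: rather than resolving each page's Contents reference (which needs a real PDF parser), it extracts every `stream ... endstream` body with a regular expression and compares them in order. The helper names `stream_bodies` and `first_differing_stream` are hypothetical.

```python
import re

# Match the body between "stream" and "endstream"; the lookbehind keeps
# the "stream" inside "endstream" from being taken as an opening keyword.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def stream_bodies(pdf_bytes: bytes) -> list:
    """Return the raw bodies of all stream objects, in file order."""
    return STREAM_RE.findall(pdf_bytes)


def first_differing_stream(a: bytes, b: bytes):
    """Return the index of the first stream that differs, or None when
    both inputs contain the same stream bodies in the same order."""
    sa, sb = stream_bodies(a), stream_bodies(b)
    for i, (x, y) in enumerate(zip(sa, sb)):
        if x != y:
            return i
    if len(sa) != len(sb):
        # One file has extra streams after the common prefix.
        return min(len(sa), len(sb))
    return None
```

Because this ignores dictionaries and object names, two files that differ only outside their streams compare as equal here; it does not cover the object-level comparison of margins, fonts, and so on.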

Administrator · Answer

Hello,
thank you for your reply, but my situation is different. I need the comparison for these reasons:

- To identify any differences introduced by successive versions of your libraries.
- To identify differences introduced by us through revisions and extensions.
- The comparison is not of the content alone but of the whole PDF (margins, fonts, rectangles, text, etc.; all objects).
- The format is not compressed, so compression is not relevant in our case.

I also tested Aspose for generating PDFs, and the comparison is OK.

Thanks,
Andrea

Geetha M · Answer

Hi Andrea,
Thank you for the details. I will check this with our development team and will let you know once I hear back from them.

Regards,
Geetha

Administrator · Answer

Hi Geetha,
Do you have any news?

Thanks,
Andrea

Geetha M · Answer

Hi Andrea,
I apologize for the long delay. I have forwarded this to the development team and will update you on its status by tomorrow.

Regards,
Geetha

Geetha M · Answer

Hi Andrea,
Regarding the comparison, our development team is currently analyzing the differences introduced by objects such as margins, fonts, etc. between different versions and will get back to you in two days.

Regards,
Geetha

Geetha M · Answer

Hi Andrea,
Thank you for your patience. Essential PDF is a library for creating and manipulating PDF files at a higher level of abstraction. What you may need here is a low-level parser for PDF files, that is, a custom parser based on the PDF specification. We regret to inform you that we do not have any plans to implement such a parser in the near future.

Please let me know if you have any questions.

Regards,
Geetha