Hi,
I've been looking at this thread: https://www.syncfusion.com/forums/172278/convert-existing-pdf-to-pdf-a-document
I seem to be able to reproduce the same issue on my end with a sample project (attached).
We would like to convert a Word file (docx) to a PDF/A file. To my knowledge, Syncfusion does not expose a ConvertToPDFA method directly on WordDocument, so I'm doing this instead:
1. Convert DOCX to PDF:
static void WordToPdf()
{
    using (FileStream fileStream = new FileStream(Path.GetFullPath(DocxPath), FileMode.Open))
    {
        //Loads an existing Word document.
        using (WordDocument wordDocument = new WordDocument(fileStream, Syncfusion.DocIO.FormatType.Automatic))
        {
            //Creates an instance of DocIORenderer.
            using (DocIORenderer renderer = new DocIORenderer())
            {
                //Sets Chart rendering Options.
                renderer.Settings.ChartRenderingOptions.ImageFormat = ExportImageFormat.Jpeg;
                //Converts Word document into PDF document.
                using (PdfDocument pdfDocument = renderer.ConvertToPDF(wordDocument))
                {
                    //Saves the PDF file to file system.
                    using (FileStream outputStream = new FileStream(Path.GetFullPath(PdfPath), FileMode.Create, FileAccess.ReadWrite, FileShare.ReadWrite))
                    {
                        pdfDocument.Save(outputStream);
                    }
                }
            }
        }
    }
}
2. Then convert PDF to PDF/A:
static void PdfToPdfa()
{
    //Load an existing PDF document.
    using (FileStream docStream = new FileStream(PdfPath, FileMode.Open, FileAccess.Read))
    {
        using (PdfLoadedDocument loadedDocument = new PdfLoadedDocument(docStream))
        {
            //Sample level font event handling.
            loadedDocument.SubstituteFont += LoadedDocument_SubstituteFont;
            //Convert the loaded document into PDF/A document.
            loadedDocument.ConvertToPDFA(PdfConformanceLevel.Pdf_A1B);
            using (MemoryStream memoryStream = new MemoryStream())
            {
                //Save the document.
                loadedDocument.Save(memoryStream);
                //Close the document.
                loadedDocument.Close(true);
                memoryStream.Position = 0;
                using (FileStream fileStream = new FileStream(PdfaPath, FileMode.OpenOrCreate, FileAccess.ReadWrite))
                {
                    memoryStream.WriteTo(fileStream);
                }
            }
        }
    }
}
Result is:
- PDF/A generated file "claims to be" PDF/A but is not compliant (verified with Acrobat Pro and verapdf)
- PDF file is OK, but inside the PDF/A file, some text is missing
Maybe there is something wrong with the font?
Could you please tell me if I'm doing something wrong or if there is a bug?
Thanks in advance
On further analysis: when converting a PDF to PDF/A conformance, we embed all the fonts used in the existing PDF document. During that step, a font is taken from the cache collection if it is the same font, and this causes the preservation issue (the missing text). You can work around this at the sample level by clearing the font cache before converting the PDF to PDF/A. Please use the code snippet below to clear the font cache.
static void PdfToPdfa()
{
    //Clear the font cache before loading and converting the document.
    PdfDocument.ClearFontCache();
    //Load an existing PDF document.
    using (FileStream docStream = new FileStream(PdfPath, FileMode.Open, FileAccess.Read))
    {
        using (PdfLoadedDocument loadedDocument = new PdfLoadedDocument(docStream))
        {
            //Sample level font event handling.
            loadedDocument.SubstituteFont += LoadedDocument_SubstituteFont;
            //Convert the loaded document into PDF/A document.
            loadedDocument.ConvertToPDFA(PdfConformanceLevel.Pdf_A1B);
            using (MemoryStream memoryStream = new MemoryStream())
            {
                //Save the document.
                loadedDocument.Save(memoryStream);
                //Close the document.
                loadedDocument.Close(true);
                memoryStream.Position = 0;
                using (FileStream fileStream = new FileStream(PdfaPath, FileMode.OpenOrCreate, FileAccess.ReadWrite))
                {
                    memoryStream.WriteTo(fileStream);
                }
            }
        }
    }
}
Kindly try the above solution on your end and let us know if you need any further assistance on this.
Hi,
Thank you for your answer.
I tried adding the PdfDocument.ClearFontCache() instruction and it did work for the missing text.
However, the file is still not compliant with PDF/A-1B (see the attached verapdf report).
Attachment: verapdfReport_eab7aaa0.zip
We suspect that the document contains the trial watermark, and that this is why the conformance is reported as invalid. This is not an issue in the conversion itself. To resolve it, register your license key so that the trial watermark is no longer added.
Please use the code snippet below to apply the license:
Syncfusion.Licensing.SyncfusionLicenseProvider.RegisterLicense("Your License Key");
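As a rough sketch of where that call could sit, assuming the sample's entry class is named Program, is declared partial, and has the WordToPdf() and PdfToPdfa() methods from the original post in its other part (the class name and layout are assumptions, not taken from the attached project):

```csharp
using Syncfusion.Licensing;

// Assumption: WordToPdf() and PdfToPdfa() are defined in the other part of
// this partial class, exactly as posted earlier in this thread.
static partial class Program
{
    static void Main()
    {
        // Register the key once at startup, before any conversion code runs,
        // so that the generated PDF does not receive the trial watermark.
        SyncfusionLicenseProvider.RegisterLicense("Your License Key");

        WordToPdf();   // DOCX -> PDF
        PdfToPdfa();   // PDF -> PDF/A-1B (with PdfDocument.ClearFontCache() applied)
    }
}
```

"Your License Key" is a placeholder and has to be replaced with a valid key for the installed Syncfusion version.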
We have attached the output document for your reference:
https://www.syncfusion.com/downloads/support/directtrac/general/pd/WordSample_pdfa1273051673
Please find the below steps to add licensing to the pdf document
Please try this on your end and let us know the result. If you still face any issues, please share more details, such as screenshots, documents, or a video demo, so that we can understand the problem clearly and give you accurate guidance.
Hi,
You were right, when I uploaded my project to the thread, I removed the license and I just forgot to put it back.
I tried the same code with the license activation and the PDF/A file is valid.
Case solved, thank you very much!