Generated WordDocument to PDF : find specific location on page

Hi,

We are generating a word document on our server. As of now, we are generating the entire document in one step.

This document consists of a cover page and multiple groups of pages (cover, page 1-1, page 1-2, page 2-1, page 2-2, page 2-3, page 3-1, etc.).

We then convert this document to PDF.

Now we have to insert a specific PDF after each group of pages, so I will probably import that PDF and then rearrange the pages. To do this, I need to know on which page each group starts.

Using my previous example, we want to end up with: cover, page 1-1, page 1-2, [inserted PDF], page 2-1, page 2-2, page 2-3, [inserted PDF], page 3-1, [inserted PDF], etc.
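
For illustration only (this is not part of the original post): once the first page index of each group is known, the positions at which the extra PDFs must be inserted follow from simple arithmetic. The sketch below is plain C# with no Syncfusion calls. PagePlanner, InsertionIndexes and their parameters are hypothetical names, and the sketch assumes zero-based page indexes, groups listed in document order, and inserts applied one after another from the first group to the last.

```csharp
using System;
using System.Collections.Generic;

static class PagePlanner
{
    // groupStarts:   zero-based index of the first page of each group, in document order.
    // totalPages:    page count of the converted document before any insert.
    // insertLengths: page count of the PDF to insert after each group.
    // Returns the index at which each PDF must be inserted so that it lands
    // directly after the last page of its group, assuming the inserts are
    // applied in order (each insert shifts the pages that follow it).
    public static List<int> InsertionIndexes(
        IReadOnlyList<int> groupStarts, int totalPages, IReadOnlyList<int> insertLengths)
    {
        if (groupStarts.Count != insertLengths.Count)
            throw new ArgumentException("One insert length is required per group.");

        var result = new List<int>();
        int shift = 0; // pages already added by earlier inserts

        for (int g = 0; g < groupStarts.Count; g++)
        {
            // A group ends where the next group starts, or at the end of the document.
            int groupEnd = g + 1 < groupStarts.Count ? groupStarts[g + 1] : totalPages;
            result.Add(groupEnd + shift);
            shift += insertLengths[g];
        }
        return result;
    }
}
```

With the layout from the example above cut off after page 3-1 (cover at index 0, groups starting at indexes 1, 3 and 6, 7 pages in total) and one-page inserts, InsertionIndexes(new[] { 1, 3, 6 }, 7, new[] { 1, 1, 1 }) returns 3, 7 and 9.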

I am struggling to find a way to locate the first page of each group in the PDF.

I have tried using PDF bookmarks (generated either from Word bookmarks or from heading styles) without success: the PdfBookmark carries no information about its location (the page alone would be enough).

Do you have any suggestion about how to do so?



TB Thomas B June 25, 2021 09:42 AM UTC

Never mind,

I was able to obtain the page index using pdfDocument.Pages.IndexOf((PdfPage)bookmark.Destination.Page);

My first attempt relied on the Destination.PageIndex value, which was always 0. However, Destination.Page does refer to the correct page, so I can use the index of that page to rearrange the document.


Marked as answer

LB Lokesh Baskar Syncfusion Team June 28, 2021 08:22 AM UTC


Please let us know if you have any other questions and we will be happy to assist you as always.
 

Regards,
Lokesh B
 



TB Thomas B June 28, 2021 08:30 AM UTC

I find it strange that PdfBookmark has a Destination property of type PdfDestination whose PageIndex is always 0, even though its Page property refers to a page that is not at index 0.

That is why I was convinced that Page referred to the first page of the document (instead of the correct one).

After discovering I was wrong, I had to use PdfDocument.Pages.IndexOf(destination.Page) to find the correct page index.

I don't know what the PageIndex property is meant to represent, since it has no documentation and does not contain the actual index of the page returned by the Page property.

Perhaps there is a bug here, or perhaps the documentation should be clearer about the intent of this property.

Regards.



LB Lokesh Baskar Syncfusion Team June 29, 2021 04:56 PM UTC

Hi Thomas,  

Thank you for your update.  

We suspect that the reported problem might be caused by the input Word document used at your end. To analyze the problem further, could you please provide us with the following:  

 1. Input Word document. 
 2. Code snippets or simplified sample to reproduce the issue. 
 3. Syncfusion product version used at your end. 
 4. Output PDF document.  

This will help us check the issue and share the details at the earliest. 

Note: The PageIndex property represents the actual index of the page. 

Please let us know if you have any other questions. 

Regards,  
Lokesh B 



TB Thomas B June 30, 2021 09:43 AM UTC

Here is a simple code sample that reproduces the problem:



using Syncfusion.DocIO.DLS;
using Syncfusion.DocToPDFConverter;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Interactive;
using System;

namespace WindowsFormsApp1
{
    static class Program
    {
        static void Main()
        {
            using (var wordDoc = new WordDocument())
            {
                IWSection section;
                IWParagraph paragraph;

                // First section: a single page without any bookmark.
                section = wordDoc.AddSection();
                section.BreakCode = SectionBreakCode.NewPage;
                paragraph = section.AddParagraph();
                paragraph.AppendText("First Page");

                // Second section: its text is wrapped in a Word bookmark.
                section = wordDoc.AddSection();
                paragraph = section.AddParagraph();
                paragraph.AppendBookmarkStart("Second Page");
                paragraph.AppendText("Second Page");
                paragraph.AppendBookmarkEnd("Second Page");

                using (var converter = new DocToPDFConverter())
                {
                    var pdfDoc = converter.ConvertToPDF(wordDoc);

                    foreach (PdfBookmark bookmark in pdfDoc.Bookmarks)
                    {
                        // PageIndex stays at 0 on a newly created document.
                        Console.WriteLine(bookmark.Destination.PageIndex); // output 0
                        // Looking the page up in the page collection gives the real index.
                        Console.WriteLine(pdfDoc.Pages.IndexOf((PdfPage)bookmark.Destination.Page)); // output 1
                    }
                }

                Console.ReadKey();
            }
        }
    }
}



LB Lokesh Baskar Syncfusion Team July 1, 2021 03:43 PM UTC

Hi Thomas,

On further analysis, we have found that this is the actual behavior, because we use the same API (PdfDestination) for both creation and loading support. The PageIndex value is therefore retrieved only when the document is loaded. For a newly created document, you can use the second approach to get the page index: 
Console.WriteLine(pdfDoc.Pages.IndexOf((PdfPage)bookmark.Destination.Page)); 

For a loaded document, you can retrieve the page index from the existing document by using the code snippet below. 
PdfLoadedDocument ldoc = new PdfLoadedDocument("Output.pdf"); 
PdfLoadedBookmark lbook = (ldoc.Bookmarks[0] as PdfLoadedBookmark); 
var index = lbook.Destination.PageIndex; 

Please let us know if you have any other questions.

Regards
Lokesh B 



TB Thomas B July 1, 2021 03:46 PM UTC

Thanks for the explanation.

Perhaps you could describe this behavior in the official documentation, explaining that the PageIndex property is only set when the destination is owned by a PdfLoadedDocument?


Regards,



LB Lokesh Baskar Syncfusion Team July 2, 2021 11:40 AM UTC

Hi Thomas,

Thank you for your update.

We will update the details in our UG documentation, and we will notify you once the update is live.

Please let us know if you have any other questions.

Regards
Lokesh B  

