Removing page(s) from PDF OK but file size stays the same ..?

I can remove pages from my PDF doc without any problem, but I don't understand why the saved file is then the same size (in MB) as the original. I have a 7-page PDF file, containing images and text on each page, which is some 2MB, and after removing all but the first two pages, the resultant file is still 2MB.

This is all done using the simple 
.Pages.RemoveAt(i)  // for i = Pages.Count - 1 To 2 Step -1

commands. I have also tried adding in lines, before calling these, as per

.Pages[i].ExtractImages()
.Pages[i].ExtractText()

but these don't appear to do anything - if I check afterwards, the images array is not empty and the text is not Nothing.

What am I missing?


17 Replies 1 reply marked as answer

IJ Irfana Jaffer Sadhik Syncfusion Team June 23, 2025 11:05 AM UTC

Hi Phil Uribe,

Based on your description, we suspect that the problem may be related to the reuse of shared PDF objects such as images, fonts, and other resources. When a document contains shared resources, these are not automatically removed during page deletion, which could lead to unexpected behavior.

To help us investigate further and provide a prompt resolution, we kindly request you to share the input PDF file. This will allow us to analyze the issue in detail.

In the meantime, we recommend trying the following approach when removing pages:

By default, the PDF is saved using incremental update mode, which means only the modified objects are written to the file, while the existing content remains unchanged. To ensure a clean update, we suggest disabling incremental updates as shown below:

 

//Disable the incremental update
loadedDocument.FileStructure.IncrementalUpdate = false;

 

Documentation link: https://help.syncfusion.com/document-processing/pdf/pdf-library/net/working-with-document#performing-incremental-update-for-pdf-document 

Please try this approach on your end. If the issue persists, do share the input PDF file so we can assist you further.

We appreciate your cooperation and look forward to resolving this promptly.


Regards,

Irfana J. 

 




PU Phil Uribe June 23, 2025 02:47 PM UTC

Hi

Thanks for your reply. I did in fact have the line
FileStructure.IncrementalUpdate = false in my code. As requested, I have attached here a zip file containing four documents:

  1. template.docx
    this is read into a Syncfusion WordDocument, and using FindText and ReplaceText various text and images are inserted, and the document is then saved as
  2. dogs.docx
    this is then converted into
  3. dogs.pdf
    using 
    converter.ConvertToPDF
    This is then opened and all but the first two pages are removed as per my original post, and saved as
  4. dogs_4.pdf

To save bandwidth, I have used smaller files than my original post, but again you can see that the 14-dog pdf and the 4-dog pdf are pretty much the same size, whereas I'd expect the latter to be only a quarter of it.

regards
Phil

Attachment: template_1de92459.zip


IJ Irfana Jaffer Sadhik Syncfusion Team June 24, 2025 11:27 AM UTC

Hi Phil Uribe,


Upon further analysis, we found that the provided document contains a single image resource that is reused across multiple pages. The image itself is approximately 600KB in size.

As a result, removing pages from the PDF does not significantly reduce the file size. Even if 6 out of 7 pages are removed, the shared image resource remains embedded in the file, maintaining its original size. This is expected behavior due to how shared resources are handled in PDFs, and unfortunately, we cannot proceed further on this front.

Please refer to the attached screenshot for additional context.

[Screenshots: Page 1, Page 2]

 

However, if reducing the file size is still a priority, we recommend using our PDF compression feature, which is designed to optimize and minimize file sizes effectively.

Working with PDF compression | Syncfusion UG documentation


Regards,

Irfana J. 




PU Phil Uribe June 24, 2025 01:45 PM UTC

I only used the same image to save upload bandwidth. It makes no difference if I use different ones. I've run it again using 14 different images, and the 14-image PDF is 3.81MB, as is the 4-image one! I've attached them here.

Just to add: as a community-licence holder only, I can't expect you to waste too much time on this. But I am quite intrigued as to why removing 3/4 of the pages of a PDF doc doesn't reduce the file size accordingly.

Thanks for your replies.

[edit] If I inspect this 4-page PDF file in https://pdfcrowd.com/inspect-pdf/ I can see that all 14 of the images are still contained within it, even though only 4 are shown. So the page remove feature doesn't remove the actual images from the file.

This, though, is what I would expect your
Pages[i].ExtractImages()
to do, but it doesn't appear to. Unless I'm missing something about it?


Attachment: dogs_f0ab0a77.zip



PU Phil Uribe June 24, 2025 02:26 PM UTC

HA! OK, I've got it, finally. ExtractImages doesn't remove the images, it just gives one access to them. To actually remove images from the file, an array of PdfImageInfo objects needs to be obtained for each page, and then for each PdfImageInfo object imgInfo in that array we can apply RemoveImage(imgInfo)

That does it.
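
For anyone following along, here is a minimal sketch of that workaround as a whole program. It assumes the WinForms/.NET Framework build of the library (where a document can be opened from and saved to a file path) and that ImagesInfo is exposed as a property returning PdfImageInfo[]; depending on the library version the accessor may differ, so check the API reference. The class name and file names are placeholders.

```csharp
using Syncfusion.Pdf;
using Syncfusion.Pdf.Exporting;
using Syncfusion.Pdf.Parsing;

class TrimPdf
{
    static void Main()
    {
        // Placeholder file names, borrowed from the attachments in this thread.
        PdfLoadedDocument pdfDoc = new PdfLoadedDocument("dogs.pdf");

        // Rewrite the whole file on save instead of appending only the changes.
        pdfDoc.FileStructure.IncrementalUpdate = false;

        // Walk backwards so that removing a page does not shift the
        // indices of the pages still to be visited. Pages 0 and 1 are kept.
        for (int i = pdfDoc.Pages.Count - 1; i >= 2; i--)
        {
            PdfPageBase page = pdfDoc.Pages[i];

            // Assumption: ImagesInfo is a property returning PdfImageInfo[].
            foreach (PdfImageInfo imgInfo in page.ImagesInfo)
            {
                page.RemoveImage(imgInfo);
            }

            // Only now remove the page itself.
            pdfDoc.Pages.RemoveAt(i);
        }

        pdfDoc.Save("dogs_4.pdf");
        pdfDoc.Close(true);
    }
}
```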




IJ Irfana Jaffer Sadhik Syncfusion Team June 25, 2025 02:37 PM UTC

Hi Phil,


Yes, the RemoveImage(imgInfo) method worked as expected—thanks for confirming that!

Regarding your initial observation about the file size not decreasing proportionally when reducing the number of pages: we acknowledge the behavior and will analyze it further. We'll share our validation findings with you by June 27th, 2025.

We appreciate your curiosity and the detailed investigation—it’s always insightful to see how these internal mechanics behave in real-world scenarios.


Regards,

Irfana J.




IJ Irfana Jaffer Sadhik Syncfusion Team June 26, 2025 06:27 AM UTC

Hi Phil,


Regarding your initial observation about the file size not decreasing proportionally when reducing the number of pages:

Removing pages from the PDF does not significantly reduce the file size. Even if 10 out of 14 pages are removed, the shared image resource remains embedded in the file, maintaining its original size. This is expected behavior due to how PDFs manage internal resources.

Regarding your use of Pages[i].ExtractImages(), this method is designed to extract visible image content from individual pages, but it does not alter or remove the underlying image resources embedded in the document. So even if certain pages are removed or images are no longer displayed, the original image data may still persist in the file unless explicitly removed during optimization or re-saving with resource cleanup.


Regards,

Irfana J



PU Phil Uribe June 26, 2025 06:52 AM UTC

Yes, thanks, I'd come to realise this. However, the interesting point (and perhaps you should make this clear in the documentation) is that even if images are not shared, i.e. are unique on each page, simply removing the page still does not remove the images from the document data. Images are treated as a shared resource even if they are used only once. Hence they need to be explicitly removed before the page is, if the file size is to be reduced.

I imagine the same thing applies to text, though of course this isn't so crucial as far as document size goes. It could though be crucial with regard to sensitive information: one could end up leaving some sensitive text embedded, thinking it had been removed along with a deleted page...

Perhaps the Pages.RemoveAt() method could have a Boolean parameter added which, if set True, would remove all images and text automatically... just a thought!


Thanks.





JT Jeyalakshmi Thangamarippandian Syncfusion Team June 27, 2025 01:33 PM UTC

Hi Phil,


Thank you for the update. We will include these details in our documentation.

Most PDF creators optimize file size by identifying identical images and storing them as shared resources. These shared objects are referenced across multiple pages rather than being duplicated. Therefore, removing a specific page does not automatically remove the shared image resource, as it may still be used on other pages.

This behavior is by design and not considered an issue. If a shared resource is still referenced by other pages, removing it would result in missing content on those pages. In our current implementation, shared resources—such as images, text, and other objects—are only removed when all pages referencing them are deleted.

We do not plan to introduce an API to forcefully remove shared resources, as doing so could lead to unintended issues in the remaining pages of the document.

Thank you for your understanding on this matter.


Regards,

Jeyalakshmi T



PU Phil Uribe June 27, 2025 02:01 PM UTC

Sure, I understand.  However, removing images from a page does not affect other pages that may use the same image - An image won't be removed from the PDF's shared resources unless it isn't used anywhere.

Currently, to remove a page (say the first page) and any images on it, you have to use code like this:

// remove any images from the first page (index 0)
    Exporting.PdfImageInfo[] imageInfo = pdfDoc.Pages[0].ImagesInfo;
    foreach (Exporting.PdfImageInfo imgInfo in imageInfo)
          pdfDoc.Pages[0].RemoveImage(imgInfo);

// finally remove the page
    pdfDoc.Pages.RemoveAt(0);

It won't matter if any of the images removed are used elsewhere - they will still be visible in the cut-down PDF document. Hence that first block above could safely be incorporated into the Pages.RemoveAt() method.

If an image is used elsewhere it will remain in the PDF shared resources (so the file size won't be reduced on account of it), but if it isn't used elsewhere, it will be removed from shared resources and the file size reduced accordingly.

This is what my tests reveal, anyway.





IJ Irfana Jaffer Sadhik Syncfusion Team June 30, 2025 09:20 AM UTC

Hi Phil Uribe,


Thank you for your question regarding image and page removal in a PDF using the Syncfusion PDF library.

When you remove a page from a PDF document using the PdfDocument.Pages.RemoveAt(index) method, the page is deleted from the document. However, images used on that page are not automatically removed from the PDF’s shared resources unless they are not referenced elsewhere in the document.

  • If an image is used only on the removed page, it will be removed from the shared resources, reducing the file size.
  • If the image is used on other pages, it will remain in the shared resources and continue to appear in the document.

This behavior ensures that shared resources are preserved when still in use, while unused resources are cleaned up automatically.
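
The rule described above can be illustrated with a small toy model. This is not Syncfusion code and says nothing about how the library is implemented; it only models a document as a list of pages, each holding the names of the images it references, and computes which images would still be needed after one page is removed. All names here (SharedResourceModel, ImagesKeptAfterRemoval, the image names) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: pages are lists of image names, nothing more.
public static class SharedResourceModel
{
    // Returns the images that are still referenced by at least one
    // remaining page once the page at pageIndex has been removed.
    public static HashSet<string> ImagesKeptAfterRemoval(
        List<List<string>> pages, int pageIndex)
    {
        var kept = new HashSet<string>();
        for (int i = 0; i < pages.Count; i++)
        {
            if (i == pageIndex) continue; // skip the removed page
            foreach (string image in pages[i]) kept.Add(image);
        }
        return kept;
    }

    public static void Main()
    {
        var pages = new List<List<string>>
        {
            new List<string> { "logo", "dog1" }, // page 0
            new List<string> { "logo", "dog2" }, // page 1
        };

        // Remove page 0: "dog1" is no longer referenced, "logo" still is.
        HashSet<string> kept = ImagesKeptAfterRemoval(pages, 0);
        Console.WriteLine(string.Join(", ", kept.OrderBy(n => n))); // dog2, logo
    }
}
```

In this model "dog1" is used only on the removed page and so drops out, while "logo" survives because page 1 still references it.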


Regards,

Irfana J.




PU Phil Uribe June 30, 2025 09:48 AM UTC

Hi,

But it doesn't - this is my point! Unless I explicitly add

  Exporting.PdfImageInfo[] imageInfo = pdfDoc.Pages[0].ImagesInfo;
  foreach (Exporting.PdfImageInfo imgInfo in imageInfo)
        pdfDoc.Pages[0].RemoveImage(imgInfo);

before calling 
 pdfDoc.Pages.RemoveAt(0);

non-shared images (i.e. images only used on this page) are NOT removed, and the file size does not reduce. I have to remove them myself first in order to reduce the file size.

regards
Phil



IJ Irfana Jaffer Sadhik Syncfusion Team July 1, 2025 11:36 AM UTC

Hi Phil,


Thank you for the detailed explanation. We’ve successfully replicated the reported issue on our end and are currently analyzing it. We will share further updates with you by July 3rd, 2025.


Regards,

Irfana J. 



AJ Antro James Loordhu Raj Syncfusion Team July 3, 2025 02:14 PM UTC

Hi Phil,

We have confirmed the issue “Resources were not removed from the PDF even after removing the Pages from the PDF document” as a defect in our product, and we will include the fix in the weekly release on July 22, 2025.

 

Please use the below feedback link to track the status of the reported bug.

Resources were not removed from the PDF even after removing the Pages from the PDF document in WinForms | Feedback Portal

 

Note: If you require a patch for the reported issue in any of our Essential Studio Main or SP release version, then kindly let us know the version, so that we can provide a patch in that version based on our SLA policy.

 

Disclaimer: “Inclusion of this solution in the weekly release may change due to other factors including but not limited to QA checks and works reprioritization.”


Regards,
Antro James L



AJ Antro James Loordhu Raj Syncfusion Team July 22, 2025 01:40 PM UTC

Hi Phil,

We have resolved the issue “Resources were not removed from the PDF even after removing the Pages from the PDF document” and created a custom patch for version 30.1.40. Kindly find the NuGet package below. The fix will be included in the upcoming weekly release on July 29, 2025.

NuGet Link: syncfusion.pdf.winforms.30.1.40.nupkg

Regards,
Antro James L



SG Sivaram Gunabalan Syncfusion Team July 29, 2025 12:43 PM UTC

Hi Phil,

As promised earlier, we have included the fix for the reported issue "Resources were not removed from the PDF even after removing the Pages from the PDF document" in our latest weekly NuGet release (v30.1.42).

Please use the below link to download our latest weekly NuGet:
NuGet Gallery | Syncfusion.Pdf.WinForms 30.1.42


Root cause: The PDF size did not decrease after deleting pages from the document. This is because the resource dictionary of the removed pages was not deleted properly.


Regards,

Sivaram G



Marked as answer

PU Phil Uribe July 29, 2025 01:14 PM UTC

Thanks!

