
Better pagination for PDF output of HTML URL

I am creating a PDF from a URL (which points to an HTML5 file with embedded CSS on my local machine).

Syncfusion.HtmlConverter.HtmlToPdfConverter converter = new Syncfusion.HtmlConverter.HtmlToPdfConverter();
Syncfusion.Pdf.PdfDocument document = converter.Convert(MyUrlString);

document.Save(fname);

I have some nice CSS padding (top, bottom, left and right) for the HTML body element, so all of its content is well away from the page edges.

In the PDF output, the left and right padding is great, but only the first page has the nice padding at the top of the page (but no padding at the bottom).
All subsequent pages have no padding at all on the top or bottom, so the pages don't read or print well.

Is there any way to tell the PDF to pad the tops and bottoms of pages when creating this kind of output?



5 Replies

PV Prakash Viswanathan Syncfusion Team November 1, 2016 11:54 AM UTC

Hi Zander Westendarp, 
 
Thanks for contacting Syncfusion support. 
 
We internally use MSHTML (the IE rendering engine) to convert HTML to PDF. Using MSHTML, we take a snapshot of the HTML as it is displayed in the web browser and draw it into the PDF document. As a result, the top padding is applied only on the first page and the bottom padding only on the last page. This is the default behavior of our HTML converter. 
 
To overcome this behavior, kindly use margins as a workaround. You can set the margins of the PDF page with the code snippet below, which separates the content from the edges of the page. 
//Set the same margin on all sides 
htmlConverter.ConverterSettings.Margin.All = 10; 
 
or 
 
//Set margin for each side separately 
htmlConverter.ConverterSettings.Margin.Left = 10; 
htmlConverter.ConverterSettings.Margin.Right = 10; 
htmlConverter.ConverterSettings.Margin.Top = 20; 
htmlConverter.ConverterSettings.Margin.Bottom = 20; 
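 
To illustrate why the body padding shows up only on the first and last pages, here is a minimal sketch of the snapshot-and-slice idea described above. It is a simplified model, not Syncfusion's actual implementation: the class SnapshotPaging and its Slice method are hypothetical names, and all heights are assumed to be in points.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of paginating one tall snapshot; not part of any Syncfusion API.
public static class SnapshotPaging
{
    // Splits a snapshot of the given height into page-sized slices.
    // Returns the (Start, End) offset of each slice, measured from the top of the snapshot.
    public static List<(double Start, double End)> Slice(
        double contentHeight, double pageHeight, double marginTop, double marginBottom)
    {
        // Page margins shrink the area available for content on every page.
        double usable = pageHeight - marginTop - marginBottom;
        if (usable <= 0)
        {
            throw new ArgumentException("Margins leave no usable page height.");
        }

        var slices = new List<(double Start, double End)>();
        for (double y = 0; y < contentHeight; y += usable)
        {
            // The last slice may be shorter than a full page.
            slices.Add((y, Math.Min(y + usable, contentHeight)));
        }
        return slices;
    }
}
```

For a 2000-point snapshot on a 792-point (Letter) page with margins of 30 and 40, SnapshotPaging.Slice(2000, 792, 30, 40) yields three slices: 0 to 722, 722 to 1444, and 1444 to 2000. CSS padding on the body lives only at the very top and very bottom of the snapshot, so it falls inside the first and last slices only, whereas the page margins are applied around every slice.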
 
Kindly let us know if this workaround meets your requirement. 
 
Regards, 
Prakash V. 



ZW Zander Westendarp November 1, 2016 04:42 PM UTC

Hey, Prakash
Many thanks for your kind reply. As you can tell, I am a new Syncfusion user!

I have tried your fix, and it almost works! I now have the margins I need.
However, I now have this problem: lines of text are being cut in half between two pages.

I am attaching some sample files:
  • The original HTML source file I am trying to process (SampleSource.htm)
  • An example of the Syncfusion PDF output (SF Code Test for PDF Conversion.pdf) -- see the bottom of page 3 for an example of the problem.
  • An example of PDF output using MS Word (MS Word Test for PDF Conversion.pdf) -- I copied the HTML from the browser window, pasted it into MS Word, and then used Save As PDF. No problems in the output.

The C# code I am using is shown below. Can you help, or should I use MS Word? Thanks!

Syncfusion.HtmlConverter.HtmlToPdfConverter converter = new Syncfusion.HtmlConverter.HtmlToPdfConverter();

Syncfusion.HtmlConverter.IEConverterSettings ieConverterSettings = new Syncfusion.HtmlConverter.IEConverterSettings();
ieConverterSettings.IsPDFA1B = true; //embeds fonts in the PDF
ieConverterSettings.Margin.Top = 30;
ieConverterSettings.Margin.Bottom = 40;
ieConverterSettings.Margin.Left = 25;
ieConverterSettings.Margin.Right = 25;
ieConverterSettings.PdfPageSize = Syncfusion.Pdf.PdfPageSize.Letter;
converter.ConverterSettings = ieConverterSettings;

Syncfusion.Pdf.PdfDocument document = new Syncfusion.Pdf.PdfDocument();
//adding the following document PageSettings did not seem to help solve the output problem
document.PageSettings.Size = Syncfusion.Pdf.PdfPageSize.Letter;
document.PageSettings.Margins.Top = 30;
document.PageSettings.Margins.Bottom = 40;
document.PageSettings.Margins.Left = 25;
document.PageSettings.Margins.Right = 25;

document = converter.Convert(ReportWebBrowser.Url.ToString());
document.Save(fname);
document.Close();


Attachment: TestPDF_d6485682.zip


PV Prakash Viswanathan Syncfusion Team November 2, 2016 12:50 PM UTC

Hi Zander Westendarp, 
 
Thanks for the update. 
 
We have analyzed the documents you provided. In our IE-based HTML to PDF converter, we internally use MSHTML (the IE rendering engine) to convert HTML to vector images, and we render the PDF document from those images. Microsoft changed this behavior in IE 9 and above, so IE now generates bitmap images instead of vector images. Because of this, we cannot parse the text in the image and cannot handle text splitting in our converter. To overcome this behavior, legacy drawing has to be enabled in the registry settings. Please refer to the KB link below to avoid bitmapped output, 
 
 
After enabling legacy drawing, you can avoid text splitting by using the code snippet below, 
//Set to false to avoid splitting text lines between pages 
ieConverterSettings.SplitTextLines = false; 
 
 
Note: If the input HTML contains HTML5 or CSS3, IE (MSHTML) will generate only bitmapped output images, and this issue cannot be fixed with the IE HTML converter. 
  
Suggestion: We suggest you try our new WebKit HTML converter. The WebKit converter never produces a bitmapped document, and it has more features and enhancements than the IE-based HTML converter. You can get the latest WebKit converter from the link below, 
 
Kindly refer to the link below for more information on our WebKit converter. 
 
Please let us know if you need any further assistance on this. 
 
Regards, 
Prakash V. 



ZW Zander Westendarp November 2, 2016 03:33 PM UTC

Thanks, Prakash!
You guys are truly the best! :)


PV Prakash Viswanathan Syncfusion Team November 3, 2016 08:58 AM UTC

Hi Zander Westendarp, 
 
Thanks for your appreciation. 
Please let us know if you need any further assistance on this. As always, we will be happy to assist you. 
 
Regards, 
Prakash V. 

