Speed up filling Excel sheet

I have a dataset of 34,173 rows.
I'm creating a C# webjob (console application) using Syncfusion.XlsIO.Base v16.1460.0.24
I need to create an Excelsheet with that data. It is taking over 20 min. to do so.
So I wonder if I can speed it up a bit.
The Excelsheet will have 130 columns:

First I open a template file:
  var workbook = application.Workbooks.Open(new MemoryStream(Resource.Basis, false));
Next I access the worksheet:
  var worksheet = workbook.Worksheets[0];
Next I do a foreach on the data and use to fill all columns for all rows
  migrantRange.ResetRowColumn(row, ++column);
  migrantRange.Text = dto.Firstname;

When the loop finishes I copy the formulas from the first row that was in the template to all rows and columns:
               var endColumn = worksheet.UsedRange.LastColumn;
                for (var i = 1; i <= endColumn; i++)
                {
                    // Update column styles:
                    worksheet.Range[startRow, i, row, i].CellStyle = worksheet.Range[startRow, i].CellStyle;
                } 

And at the end I call
  workbook.SaveAs(saveAsFilename);
When I have a lot of data SaveAs to filename causes a `System.OutOfMemoryException`, at least in Debug mode, which makes it hard to debug.

I added some timing:
* Looping data (34173 rows) and filling cells took 00:00:34.6421465. This is acceptable.
* Updating column styles took 00:09:36.5526678. This takes much to long. How can I improve this?
* Saving file, crashes in my integration test but takes at least 2 minutes.

Any suggestions will be much appreciated.


Here's the stack trace when crashing after saving the file:
System.OutOfMemoryException: 
   bij System.IO.MemoryStream.set_Capacity(Int32 value)
   bij System.IO.MemoryStream.EnsureCapacity(Int32 value)
   bij System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   bij System.IO.Compression.DeflateStream.WriteDeflaterOutput(Boolean isAsync)
   bij System.IO.Compression.DeflateStream.InternalWrite(Byte[] array, Int32 offset, Int32 count, Boolean isAsync)
   bij System.IO.Compression.DeflateStream.Write(Byte[] array, Int32 offset, Int32 count)
   bij Syncfusion.Compression.Zip.ZippedContentStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   bij System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   bij System.IO.StreamWriter.Write(Char value)
   bij System.Xml.XmlTextWriter.AutoComplete(Token token)
   bij System.Xml.XmlTextWriter.WriteStartAttribute(String prefix, String localName, String ns)
   bij System.Xml.XmlWriter.WriteAttributeString(String localName, String value)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.Excel2007Serializator.SerializeBlankCell(XmlWriter writer, Int32 iRowIndex, Int32 iColumnIndex, Int32 iXFIndex, Dictionary`2 hashNewParentIndexes)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.Excel2007Serializator.SerializeCells(XmlWriter writer, RowStorage row, CellRecordCollection cells, Dictionary`2 hashNewParentIndexes, String cellTag)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.Excel2007Serializator.SerializeRow(XmlWriter writer, RowStorage row, CellRecordCollection cells, Int32 iRowIndex, Dictionary`2 hashNewParentIndexes, String cellTag, Boolean isSpansNeeded)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.Excel2007Serializator.SerializeSheetData(XmlWriter writer, CellRecordCollection cells, Dictionary`2 hashNewParentIndexes, String cellTag, Dictionary`2 additionalAttributes, Boolean isSpansNeeded)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.Excel2007Serializator.SerializeWorksheet(XmlWriter writer, WorksheetImpl sheet, Stream streamStart, Stream streamConFormats, Dictionary`2 hashXFIndexes, Stream streamExtCondFormats)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.WorksheetDataHolder.SerializeWorksheetPart(WorksheetImpl sheet, Dictionary`2 hashNewXFIndexes)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.WorksheetDataHolder.SerializeWorksheet(WorksheetImpl sheet, Dictionary`2 hashNewXFIndexes, Dictionary`2 cacheFiles)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveWorksheet(WorksheetImpl sheet, String itemName, Dictionary`2 hashNewXFIndexes, Dictionary`2 cacheFiles)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveSheet(WorksheetBaseImpl sheet, String itemName, RelationCollection relations, String workbookPath, Dictionary`2 hashNewXFIndexes, Dictionary`2 cacheFiles)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveSheets(RelationCollection relations, String workbookItemName, Dictionary`2 hashNewXFIndexes, Dictionary`2 cacheFiles)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveWorkbookPart(Dictionary`2 hashNewXFIndexes, Dictionary`2 cacheFiles)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveWorkbook(ExcelSaveType saveAsType)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveDocument(ExcelSaveType saveType)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveDocument(Stream stream, ExcelSaveType saveType)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.SaveDocument(String filename, ExcelSaveType saveType)
   bij Syncfusion.XlsIO.Implementation.XmlSerialization.FileDataHolder.Serialize(String fullName, WorkbookImpl book, ExcelSaveType saveType)
   bij Syncfusion.XlsIO.Implementation.WorkbookImpl.SaveAs(String FileName, ExcelSaveType saveType, ExcelVersion version)
   bij Syncfusion.XlsIO.Implementation.WorkbookImpl.SaveAs(String FileName, ExcelSaveType saveType)

I published my webjob to Azure and run it.
This version is not saving to file but to stream which is saved as a Storage Blob.
Actual timing:
2018-11-01 15:30:49,069 [1] DEBUG Services.XlsxExports.ExportXlsService - Start looping data. 34173 rows
2018-11-01 15:31:50,132 [1] DEBUG Services.XlsxExports.ExportXlsService - Looping data took 00:01:01.0626923
2018-11-01 15:57:04,693 [1] DEBUG Services.XlsxExports.ExportXlsService - Updating column styles took 00:25:14.3921469
2018-11-01 15:57:05,119 [1] DEBUG Services.XlsxExports.ExportXlsService - Saving stream
2018-11-01 15:59:50,069 [1] DEBUG Services.XlsxExports.ExportXlsService - Stream saved
2018-11-01 16:00:05,628 [1] DEBUG Services.XlsxExports.ExportXlsService - Workbook closed.

As you can see the main bottleneck is in the 'Updating column styles'.

6 Replies

PM Paul Meems November 2, 2018 07:57 AM UTC

I did some more testing.

When I comment these lines:

var endColumn = worksheet.UsedRange.LastColumn;
for (var i = 1; i <= endColumn; i++)
{
  // Update column styles:
  worksheet.Range[startRow, i, row, i].CellStyle = worksheet.Range[startRow, i].CellStyle;
}

My code is much faster. 3 minutes vs 25 minutes AND I can save the file without getting an OutOfMemory exception.

The result is:


But I want it like this:

The cell style of the first cell of every row applied to every cell in that column.

How to do this more efficiently?


PK Prakash Kumar D Syncfusion Team November 2, 2018 12:20 PM UTC

Hi Paul, 
 
Thank you for using Syncfusion support. 
 
We have tried to reproduce the issues, but it works properly at our end. We suspect that the issue might be occurred due to given column styles. However, we suggest you use the below code snippet to improve the performance while updating the column styles. 
 
Code Snippet: 
for (var i = 1; i <= endColumn; i++) 
  { 
  //Gets the style index of a cell (first row of each column) 
  int index = (sheet[1, i] as RangeImpl).ExtendedFormatIndex; 
  for (int iRow = startRow; iRow <= row; iRow++) 
    { 
    // Update column styles: 
       (sheet as WorksheetImpl).CellRecords.SetCellStyle(iRow, i, index); 
     } 
   } 
 
Please let us know if the above solution helps you. 
 
Regards 
Prakash Kumar 



PM Paul Meems November 2, 2018 03:13 PM UTC

Thanks for the reply.
I will try your suggestion.

After applying the style I also change the width of the column. This takes very long as well.
Do you also have a faster solution:

// Update column width
// Get current width header:
var columnHeaderWidth = worksheet.Range[startRow - 1, i].ColumnWidth;
// Make columns autofit:
worksheet.Range[startRow, i, row, i].AutofitColumns();
// Check new width:
var newWidth = worksheet.Range[startRow - 1, i].ColumnWidth;
// If smaller, reset:
if (newWidth < columnHeaderWidth)
{
    worksheet.Range[startRow - 1, i].ColumnWidth = columnHeaderWidth;
}

Thanksk


DB Dilli Babu Nandha Gopal Syncfusion Team November 5, 2018 10:17 AM UTC

Hi Paul, 
 
Thank you for your update. 
 
We request to get the column width using IWorksheet.GetColumnWidth(int iColumnindex) and set the column width using IWorksheet.SetColumnWidth(int iColumnindex, double value) method to improve the performance. Please refer the code snippet below. 
 
Code snippet: 
for (var i = 1; i <= endColumn; i++) 
{ 
        // Update column width 
        // Get current width header: 
        var columnHeaderWidth = worksheet.GetColumnWidth(i); 
       // Make columns autofit: 
       worksheet.AutofitColumn(i); 
        // Check new width: 
        var newWidth = worksheet.GetColumnWidth(i); 
        // If smaller, reset: 
        if (newWidth < columnHeaderWidth) 
        { 
            worksheet.SetColumnWidth(i, columnHeaderWidth); 
        } 
} 
 
To know more about improving performance, please refer the following documentation link. 
 
Please let us know if you have any questions. 

Regards, 
Dilli babu 



PM Paul Meems November 8, 2018 08:19 AM UTC

Thanks for the snippets.

The main problem was 

  var endColumn = worksheet.UsedRange.LastColumn;

It returned 1024 instead of the actual number of columns (130). After changing this the copying of the styles took 3 min. instead of 25 min.
After replacing the code with your first snippet I could reduce the time needed to 2:40 min.

But the second snippet is still not working.
With my code the file cannot be saved and with your code I get the same error.

For now I can work without changing the width of the columns. But it would be great if this could be solved as well.


PK Prakash Kumar D Syncfusion Team November 8, 2018 12:31 PM UTC

Hi Paul, 
 
Thank you for updating us.  
 
Query 1: var endColumn = worksheet.UsedRange.LastColumn; It returned 1024 instead of the actual number of columns (130).  
 
We suspect that the input template has styles in 1024 columns. So, we request to disable the UsedRangeIncludesFormatting  property to skip the blank cells while accessing the used range. 
 
Code Snippet: 
sheet.UsedRangeIncludesFormatting = false; 
 
 
Query 2: But the second snippet is still not working. With my code the file cannot be saved and with your code I get the same error. 
 
We request to use the below code snippet to improve the performance.  
 
Code Snippet: 
sheet.UsedRangeIncludesFormatting = false; 
sheet.UsedRange.AutofitColumns(); 
Note: Invoking AutofitCoumns() will set the column width automatically. so, it is not necessary to set the column width explicitly. If the above code snippet doesn’t meet your requirement. Kindly share your complete scenario. 
 
 
Regards, 
Prakash Kumar

 


Loader.
Up arrow icon