We use cookies to give you the best experience on our website. If you continue to browse, then you agree to our privacy policy and cookie policy. Image for the cookie policy date

Weak read performance

I have a 100 columns and 3000 rows xls-file which I read row by row:
IRange row = ws.Rows[i]
and then getting the values of some cells in the row.

Each row reading (+accessing some 15 cells) costs me 0.5 secs (measured with a Stopwatch).
0.5secs*3000rows = 25 mins which is unquestionably unacceptable.
Are there any secret tips to make my life easier? How do I read such files?


15 Replies

VA Vadim April 23, 2008 12:51 PM UTC

Saving Rows collection a priori to iterating through the rows helps drastically as well as saving the Cells collection. Applying both we gain a 10 times performance boost.

Why are not they cached under the hoods?!

What are the guidelines in such type of work???



VA Vadim April 23, 2008 12:53 PM UTC

And it's still some 2 mins(0.05secs*3000 = 150secs = 2.5mins) which is still barely unacceptable!!!

Waiting for the reply of the support guys...



VA Vadim April 23, 2008 06:07 PM UTC

I had to do all this above even with enabled UseRangesCache...

Not to mention that loading of the file occpies 5 secs (java poi does it some as twice as faster and as ten times as faster it iterates through the file) :(



YG Yavanaarasi G Syncfusion Team April 24, 2008 01:05 PM UTC

Hi Vadim,

Sorry for the delay in responding.

Weak Read Performance:

The issue mentioned here is suspected to be a defect and we have sent this to our development team for more analysis. We will update you once we get back from them.

Please let me know if you have any other concerns.

Regards,
G.Yavana



VA Vadim April 24, 2008 05:48 PM UTC

I would accept Open method working for 5-6 secs :) and iterating should be instant (we've already loaded the data into the memory). Please provide me with some performance trick :)

Another question: how do I define real last row of the sheet (which contains some data except from styles and borders)? Is there any way to tell XlsIO miss the styles and borders?



YG Yavanaarasi G Syncfusion Team April 28, 2008 01:53 PM UTC

Hi Vadim,

Sorry for the delay in responding.

Performance:

We have forwarded to our development team regarding the performance issue and we will update you once we get back from them.

How do I define real last row of the sheet:

UsedRange property is used to get the range of used cells on a worksheet. It determines the actual used range. Using used range property you can get the specified row and apply the desired border styles.

Here is the code snippet:

sheet.UsedRange[sheet.UsedRange.End.Row,sheet.UsedRange.End.Column].EntireRow.Borders[ExcelBordersIndex.EdgeRight].LineStyle = ExcelLineStyle.None;
sheet.UsedRange[sheet.UsedRange.End.Row,sheet.UsedRange.End.Column].EntireRow.Borders[ExcelBordersIndex.EdgeLeft].LineStyle = ExcelLineStyle.None;
sheet.UsedRange[sheet.UsedRange.End.Row,sheet.UsedRange.End.Column].EntireRow.Borders[ExcelBordersIndex.EdgeBottom ].LineStyle = ExcelLineStyle.None;
sheet.UsedRange[sheet.UsedRange.End.Row,sheet.UsedRange.End.Column].EntireRow.Borders[ExcelBordersIndex.EdgeTop].LineStyle = ExcelLineStyle.None;


Please try this and let me know if this helps.

Regards,
G.Yavana




VA Vadim April 28, 2008 05:01 PM UTC

I'm sorry but this doesn't help at all. The sheet.Rows.Length still remains near 60000 whilst the last meaning row is near 3000. Maybe the fact that the sheet is protected impacts?



MA Manopriya Syncfusion Team April 30, 2008 04:38 AM UTC

Hi,

Thank you for your patience.

The issue of 'Reading values from 3000 rows takes long time' has been confirmed as a defect and a defect report has been created. You can track the status of this defect in the following link:

http://www.syncfusion.com/support/issues/xlsio/Default.aspx?ToDo=view&questId=10145

I will get back to you once this issue has been fixed.

Regards,
Manopriya



YG Yavanaarasi G Syncfusion Team April 30, 2008 09:49 AM UTC

Hi Vadim,

Thank you for your update.

As I have mentioned in the previous update "UsedRange" property is used to get the range of used cells or edited cells on a worksheet. It determines the actual used range. In your worksheet you have used 3000 rows. So it takes the last row as 3000. If you have used 60000 rows and the used range shows 3000 as the last row means please send your excel file so that we could sort out the cause of the issue and provide you a better solution.

Please let me know if you have any other concerns.

Regards,
G.Yavana






VA Vadim May 1, 2008 08:17 AM UTC

You must have misunderstood my point. UsedRange.End.Row returns 60000 while I expect it to return 3000! This is caused by the fact that I have some borders set in the cell of 60000. But the trick is I have no data after 3000! Only borders! But I need the last row contatining some data

Once again, my problems:
1) iteration over the rows isn't instant for some reason (they are loaded into the memory, aren't they?), but also it is terribly slow (some 2 secs slower than the loading actually: 8-9secs vs 5-6secs)
2) the last row isn't actually the last row with meaningful data



KH Khonsort May 1, 2008 08:41 AM UTC

Hello,

Have you try to use the "IMigrantRange" instance.

I use it to fill (create) worksheet with about the same number of rows and cols and using a MigratRange that take less than 15 sec (applying multiple styles on each cells)

Regards,



KH Khonsort May 1, 2008 08:45 AM UTC

Here is the code of one of my tools methods



private static bool SetCellValue( IMigrantRange pMigrantRange, bool pEmptyIsNan, object pValue ) {

if ( pValue == null || pValue == DBNull.Value )
if ( pEmptyIsNan ) {
pMigrantRange.Formula = "=NA()";
pMigrantRange.CellStyle.FormulaHidden = true;
return true;
}
else
pMigrantRange.Clear( ExcelMoveDirection.None );
else
pMigrantRange.Value2 = pValue;

return false;
}




internal static IRange FillHorizontalRange( IWorksheet pSheet, string pCellName, bool pNanIsEmpty,
ExcelStyleApplyer pStyleApplyer,
params double[] pValues ) {

IRange rangeName = GetRange( pSheet, pCellName );

if ( rangeName != null ) {

IMigrantRange migrantRange = pSheet.MigrantRange;

// Application des style alternatif (pair/impair)
IRange resultRange = pSheet.Range[ rangeName.Row, rangeName.Column,
rangeName.Row, rangeName.Column + pValues.Length - 1];

pStyleApplyer.ApplyRowStyle( resultRange );

for ( int i = 0; i < pValues.Length; i++ ) {

migrantRange.ResetRowColumn( rangeName.Row, rangeName.Column + i );

if ( SetCellValue( migrantRange, pNanIsEmpty, pValues[i] ) ) {
// Pour éviter de voir les "=NA()" dans le fichier généré
migrantRange.CellStyle.Font.Color = migrantRange.CellStyle.ColorIndex;
}
else {
// Application des styles cellule/cellule
pStyleApplyer.ApplyCellStyle( migrantRange, -1, 1 + i );
}
}
return resultRange;
}
return null;
}





BP Bhuvaneswari P Syncfusion Team May 5, 2008 10:02 AM UTC

Hi khonsort,

Thank you for your patience.

On further analysis, we found we found the below details.


We have used 3000 rows and 10 columns there. And we have such loop


for (int i = 1; i <= sheet.UsedRange.Rows.Length; i++)
{
for (int j = 1; j <= sheet.UsedRange.Columns.Length; j++)
{
sheet.GetText[i, j];
}
}



As we know, for-loop works in such way that after every loop, end condition is re-evaluated. So here calling UsedRange.Rows 3000 times, and UsedRange.Columns 3000*10 = 30000 times. Both Rows and Columns properties are arrays of IRange objects. These properties are evaluated each time user tries to access it. We are creating array with 3000 rows 3000 times, and with 10 columns 30000 times, and each item is Range object which should be created and initiated correctly. So we have large number of arrays, large number of ranges, and extremely inefficient memory usage. That loop must be replaced by something like this:


int iRowCount = sheet.UsedRange.Rows.Length;
int iColumnCount = sheet.UsedRange.Columns.Length;


for (int i = 1; i <= iRowCount; i++)
{
for (int j = 1; j <= iColumnCount; j++)
{
sheet.GetText( i, j );
}
}



This will increase the performance. Here we have calculated the sheet.UsedRange.Rows.Length and sheet.UsedRange.Columns.Length only once.

Here is the sample for your reference:

http://www.syncfusion.com/development/uploads/73154_sample_f75260ba.zip


This takes less than 2 secs to read the cells.

Please try this and let me know if this helps.

Best Regards,
Bhuvaan




AD Administrator Syncfusion Team May 6, 2008 08:09 PM UTC

I perform the reading in exactly this fashion. And the iteration over rows isn't instant. why? It's even more expensive than the reading for some reason! __This__ is odd!

P.S. Why don't cache the UsedRange propery under the covers? I mean the neccessity to cache it manually is by no means obvious!



YG Yavanaarasi G Syncfusion Team May 12, 2008 12:01 PM UTC

Hi Vadim,

We have forwarded this issue to our development team. We will update you on this issue within two business days.

Please let me know if you have any other concerns.

Regards,
G.Yavana


Loader.
Live Chat Icon For mobile
Up arrow icon