I am using XlsIo to parse large (50k rows) Excel documents. This takes a while. I'm not expecting it to be instant, but I'm wondering if there are any simple tricks to improve performance. An example of my current code:
using (ExcelEngine ee = new ExcelEngine())
{
    IWorkbook workbook = await ee.Excel.Workbooks.OpenAsync(await schema.OpenStreamForReadAsync());
    IEnumerable<IRange> rows = workbook.Worksheets[0].Rows.Skip(1);
    foreach (IRange row in rows)
    {
        double latitude = row.Columns[0].Number;
        double longitude = row.Columns[1].Number;
        if (!double.IsNaN(latitude) && !double.IsNaN(longitude))
        {
            Location location = new Location(latitude, longitude);
        }
    }
}
For my test spreadsheet, this takes about six seconds. The spreadsheet has nine columns, and if I remove all but the first two (the only ones I'm actually interested in), I can get this down to under two seconds.
I notice that XlsIo tries to parse the values of each cell in a number of ways, which gives you nice Number and DateTime properties, but this probably has a performance cost. Is there any way to customize this behavior? I'd like to tell the parser: "the first two columns are numbers, don't do anything with the rest", or "columns zero and one are numbers, two is a DateTime, so don't try to parse dates into numbers or anything".
Or maybe there's some other trick I'm missing?