Problem when opening certain Excel files with missing cell nodes

Hello,

I encounter a problem when I upload certain Excel files to our platform using XlsIO. The version I use is 18.1.0.42.

I use an Excel template that my client fills in. After the Excel file is uploaded, I get its XML and send it to a database, where it gets parsed and processed.
The problem is that some of the cells may remain empty. For some of the Excel files that the client fills in, that is not a problem. But for others, like the one in the attachment, the XML and the way the engine interprets the Excel file are not as expected with regard to the empty cells. You'll find 2 pictures inside the attachment showing what I mean. This behaviour renders our parsing useless, because we cannot retain a direct correlation between the header row and the subsequent rows. This leads to XML parsing problems and shifted column data in the rows that have empty cells.

Could you help us identify this problem? Is this working as intended, or is there a problem in the Excel file itself? How can I fix this issue?

Inside the attachment you'll find the problematic Excel file, pictures showing what the XML should look like and what it actually looks like, and the resulting XML files labeled good/bad. The "good" one was obtained by copying the data from the bad Excel file directly into a new Excel file.

This is the code we use to open the file and get the XML:
  
using (ExcelEngine excelEngine = new ExcelEngine())
{
    IApplication application = excelEngine.Excel;

    // Open the uploaded workbook from its raw bytes (data holds the uploaded file).
    IWorkbook workbook = application.Workbooks.Open(new System.IO.MemoryStream(data), ExcelOpenType.Automatic);

    // Save the workbook as Excel XML into an in-memory stream.
    System.IO.MemoryStream stream = new System.IO.MemoryStream();
    workbook.SaveAsXml(stream, ExcelXmlSaveType.MSExcel);
    workbook.Close();

    // Extract the XML string from the stream; it is then sent to the database for parsing and processing.
    string xml = System.Text.Encoding.UTF8.GetString(stream.ToArray());
}

Attachment: syncfusion_issue_9112f8b5.7z

5 Replies

Konduru Keerthi Konduru Ravichandra Raju, Syncfusion Team, July 30, 2020 03:15 PM UTC

Hi David, 

Greetings from Syncfusion. 

We have understood your query from the shared screenshots and are trying to reproduce it. Kindly let us know whether the shared bad_file.xlsx can be used to reproduce the issue.

It would be helpful if you could also share a sample that reproduces the issue, along with the input Excel document, if any.

Regards, 
Keerthi.


David Dumitru, July 31, 2020 08:19 AM UTC

Hi,

Yes, you can use the bad_file.xlsx file to reproduce the issue.


Konduru Keerthi Konduru Ravichandra Raju, Syncfusion Team, August 3, 2020 12:02 PM UTC

Hi David, 

Thanks for sharing the details. 

We found that styles are set for empty cells in the bad_file.xlsx document, and hence those cells are serialized in the output file. Syncfusion XlsIO serializes empty cells if they have styles set.

Kindly let us know if you need any further assistance. 

Regards, 
Keerthi. 



David Dumitru, August 6, 2020 08:01 AM UTC

It's OK that they are serialized; the problem is the order in which they appear to be serialized. Compare the order of the header row with the order of the next row: you can see the difference in the 2 pictures. In the output for bad_file.xlsx, the empty cells are serialized at the end of the row, which is not what we want. The order should be the way it is in the ok.jpg file.


Konduru Keerthi Konduru Ravichandra Raju, Syncfusion Team, August 7, 2020 04:01 PM UTC

Hi David, 

Thanks for the update. 

We have created a new support incident under your Direct-Trac account and request that you follow up there.

Regards, 
Keerthi. 

