Open XML SDK v2.0: performance issue when deleting the first row in a 20,000+ row Excel file

Has anyone come across a performance issue when deleting the first row of a 20,000+ row Excel file using Open XML SDK v2.0?

I am using the row-deletion code suggested in the Open XML SDK documentation. It takes several minutes just to delete the first row with the Open XML SDK, whereas the same operation takes about a second in the Excel application.

I eventually found that the bottleneck is the bubble-up approach to row deletion: every row after the deleted one has to be updated. In my case that means around 20,000 rows, each shifted up one at a time.

I wonder if there is a faster way to delete the row.

Does anybody have an idea?

Best answer

Well, the bad news here is: yep, that's the way it is.

You may get slightly better performance by moving outside the SDK itself into System.IO.Packaging: load all the rows into an IEnumerable/List with something like LINQ to XML, copy them to a new IEnumerable/List without the first row, rewrite the r attribute of each <row r="?"/> to be its place in the index, and then write that back inside <sheetData/> over the existing children.
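
The renumbering step described above could be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the class and method names (SheetRowRemover, RemoveFirstRow) are hypothetical, the worksheet XML is assumed to be already loaded into an XDocument (for example from the worksheet part obtained through System.IO.Packaging), and the rows are assumed to be contiguous starting at row 1. It also rewrites the r attribute of each cell, since cell references such as "B2" carry the row number too. Formulas, merged cells and defined names are not adjusted.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SheetRowRemover
{
    // Namespace used by SpreadsheetML worksheet parts.
    private static readonly XNamespace S =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Removes the first <row> under <sheetData> and renumbers the
    // remaining rows (and their cells) in a single pass.
    public static void RemoveFirstRow(XDocument sheet)
    {
        XElement sheetData = sheet.Root.Element(S + "sheetData");
        if (sheetData == null) return;

        var rows = sheetData.Elements(S + "row").ToList();
        if (rows.Count == 0) return;

        rows[0].Remove();

        int index = 1; // new 1-based row number
        foreach (XElement row in rows.Skip(1))
        {
            row.SetAttributeValue("r", index);
            foreach (XElement cell in row.Elements(S + "c"))
            {
                XAttribute cellRef = cell.Attribute("r");
                if (cellRef == null) continue;
                // Keep the column letters, replace the row digits.
                string column = new string(
                    cellRef.Value.TakeWhile(ch => char.IsLetter(ch)).ToArray());
                cellRef.Value = column + index;
            }
            index++;
        }
    }
}
```

Calling SheetRowRemover.RemoveFirstRow(doc) on a loaded worksheet document and then saving doc back to the part stream touches each row once, instead of shifting rows one at a time.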

You'd need to do much the same for any strings in the sharedStrings.xml file, i.e. removing the <sst>/<si> elements that were in the row that was deleted, but in this case they are implicitly indexed, so you'd be able to get away with just removing them outright.

Other answers

The approach of unzipping the file, manipulating it and repacking it is very error-prone.

How about this: since you say it works fine in Excel, have you tried using Interop? It starts a new instance of Excel (either visible or invisible); you can then open the file, delete the row, save, and close the application again.

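using Excel = Microsoft.Office.Interop.Excel;

// Opens the workbook at 'path' (a parameter added for illustration), deletes row 1 of the first sheet, and saves.
public void DeleteFirstRow(string path)
{
    Excel.Application excelApp = new Excel.Application();
    try
    {
        Excel.Workbook workbook = excelApp.Workbooks.Open(path);
        Excel.Worksheet sheet = (Excel.Worksheet)workbook.Worksheets[1]; // Interop indexes are 1-based
        ((Excel.Range)sheet.Rows[1]).Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
        workbook.Close(true); // true saves the changes
    }
    finally
    {
        excelApp.Quit(); // always shut down the Excel instance
    }
}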

The Range object is suitable for many purposes, including deleting elements. Have a look at the MSDN Range description. One more hint: Interop drives Excel itself, so all objects have to be addressed with a 1-based index. For more resources, take a look at this StackOverflow thread.




