I have a large table of customers, invoice numbers, invoice references and amounts. Sometimes invoices get cancelled/credited. When that happens, the cancellation gets a unique invoice number but the same invoice reference as the original invoice.

I read the entire data set into an array, but I am struggling with the following: when the customer number and invoice reference are the same and the postings total 0, I want to delete those rows from my array.


Sample data:

Customer Invoice Reference Amount
3233 91324941 91324941 143.966
3233 91323172 91323172 155.418
3233 91323173 91323172 -418
3233 91330112 91324941 -143.966


Expected result:

Customer Invoice Reference Amount
3233 91323172 91323172 155.418
3233 91323173 91323172 -418

The other two rows need to be removed, as they share the same reference and their amounts sum to zero.

PS! I have about 70 customers and 4,000 invoices to go through.


This can also be done with Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac).


  • Select some cell in your Data Table
  • Data => Get&Transform => from Table/Range or from within sheet
  • When the PQ Editor opens: Home => Advanced Editor
  • Make note of the Table Name in Line 2
  • Paste the M Code below in place of what you see
  • Change the Table name in line 2 back to what was generated originally.
  • Read the comments and explore the Applied Steps to understand the algorithm

let
//Change next line to reflect actual data source
    Source = Excel.CurrentWorkbook(){[Name="Invoices"]}[Content],

//Set Column Data Types
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"Customer", Int64.Type}, {"Invoice", Int64.Type}, 
        {"Reference", Int64.Type}, {"Amount", Currency.Type}}),

/*Group by Reference and Customer
    Then Add the invoice amounts in that reference*/
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Customer","Reference"}, {
        {"Total Amt", each List.Sum([Amount]), type nullable number}, 

//Also retain the "subtable" for later expansion
        {"all", each _, type table 
            [Customer=nullable number, Invoice=nullable number, 
            Reference=nullable number, Amount=Currency.Type]}}),

//Select grouped rows where "Total Amt" is not zero
    #"Filtered Rows" = Table.SelectRows(#"Grouped Rows", each ([Total Amt] <> 0)),

//Remove the original Customer, Reference and the Total Amt columns
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Customer","Reference", "Total Amt"}),

//Expand the subtables
    #"Expanded all" = Table.ExpandTableColumn(#"Removed Columns", "all", {"Customer", "Invoice", "Reference", "Amount"})
in
    #"Expanded all"


Option 1: Dictionary

  • Group the full dataset by the customer and reference columns, summing the amount column for each group.
  • While grouping, track the original row indexes of each group in a dictionary.
  • Using the summed amounts, extract only the rows whose group total is not zero.
Option Explicit
Sub Demo()
    Dim objDic As Object, arrData
    Dim i As Long, j As Long, sKey, Cus_Ref As String
    Dim arrRes(), iIdx As Long, aRow
    Dim dAmt As Double, sCust As String
    Set objDic = CreateObject("scripting.dictionary")
    arrData = Range("A1").CurrentRegion.Value
    ' Dict key: Customer+Reference
    ' Dict item: Array(sum(Amount), row list)
    For i = LBound(arrData) + 1 To UBound(arrData)
        Cus_Ref = arrData(i, 1) & "|" & arrData(i, 3)
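        If objDic.exists(Cus_Ref) Then
            ' Existing group: add to the running total and append the row index
            dAmt = objDic(Cus_Ref)(0) + Val(arrData(i, 4))
            sCust = objDic(Cus_Ref)(1) & " " & CStr(i)
        Else
            ' New group: start the total and the row list
            dAmt = Val(arrData(i, 4))
            sCust = CStr(i)
        End If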
        objDic(Cus_Ref) = Array(dAmt, sCust)
    Next i
    iIdx = 1
    For Each sKey In objDic.keys
        If objDic(sKey)(0) <> 0 Then
            aRow = Split(objDic(sKey)(1))
            For i = 0 To UBound(aRow)
                ReDim Preserve arrRes(1 To 4, 1 To iIdx)
                For j = 1 To 4
                    arrRes(j, iIdx) = arrData(aRow(i), j)
                Next j
                iIdx = iIdx + 1
            Next i
        End If
    Next sKey
    ' Transpose(arrRes) is what you are looking for
    ' load result into worksheet
    ActiveSheet.Range("A1").Resize(iIdx - 1, 4).Value = Application.Transpose(arrRes)
    Set objDic = Nothing
End Sub
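
One caveat on the zero test in the Dictionary approach: the totals are accumulated as Double, and a sum of decimal amounts that is mathematically zero is not always exactly 0 in floating point. A minimal sketch of a tolerance-based check is below. IsNearZero and its default tolerance are hypothetical, not part of the answer above, and the tolerance should be chosen to suit the precision of your amounts.

```vba
Option Explicit

' Hypothetical helper (an assumption, not part of the original answer):
' treats a Double as zero when its absolute value is below a small tolerance.
Function IsNearZero(ByVal dValue As Double, _
                    Optional ByVal dTol As Double = 0.000001) As Boolean
    IsNearZero = (Abs(dValue) < dTol)
End Function

Sub DemoNearZero()
    Dim dSum As Double
    ' Mathematically zero, but the Double result may carry a tiny rounding error
    dSum = 0.1
    dSum = dSum + 0.2
    dSum = dSum - 0.3
    Debug.Print "Exact test says non-zero: "; (dSum <> 0)
    Debug.Print "Tolerance test says zero: "; IsNearZero(dSum)
End Sub
```

To use it in Demo, the line `If objDic(sKey)(0) <> 0 Then` could become `If Not IsNearZero(objDic(sKey)(0)) Then`.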


Option 2: ADO

Note: the sheet name is Sales (referenced as [Sales$] in the query).

Option Explicit
Sub ExtractADO()
    Dim conn As Object
    Dim rs As Object
    Dim strSql As String
    Set conn = CreateObject("ADODB.Connection")
    conn.Open "Provider=Microsoft.ACE.OLEDB.12.0" & _
        ";Data Source=" & ThisWorkbook.FullName & _
        ";Extended Properties=""Excel 12.0;HDR=Yes;IMEX=1"";"
    Set rs = CreateObject("ADODB.Recordset")
    strSql = " SELECT T1.* FROM [Sales$] T1 " & _
    " INNER JOIN (SELECT Customer,Reference FROM [Sales$] " & _
    " GROUP BY Customer,Reference HAVING sum(Amount)<>0) T2 " & _
    "  ON T1.Customer=T2.Customer AND T1.Reference=T2.Reference "
    rs.Open strSql, conn
    ' Write the matching rows starting at A1 of the active sheet (no header row)
    Cells(1, 1).CopyFromRecordset rs
    rs.Close
    conn.Close
    Set rs = Nothing
    Set conn = Nothing
End Sub

