English 中文(简体)
Excel Find Speed vs. VBA binary Search?
原标题:

How good/fast is Excel VBA s Find vs. binary search? My platform is Office 11|2003 and I ll be searching for strings against Column A on three sheets of values. Total number of rows ~140,000

If worth it which Library & functions should I reference to do the sorting and then the binary search? Binary searching strings/text reportedly has potential problems.

... one thing must be noted. Using binary search formulas with sortedtextrequires caution. Aladin A., Excel MVP

Excel Find:

Worksheets(1).Range("A:A").Find("PN-String-K9", LookIn:=xlValues, LookAt:=xlWhole)
最佳回答

Much against my intuition a VBA binary search strongly outperforms an Excel Find. At least with the scenario below where 120,000 6 character strings are distributed evenly over 3 worksheets.

Excel Find takes 1 minute 58 seconds,
VBA binary search takes 36 seconds on my particular machine.

The advantage of knowing that the text is in order obviously outweighs Excel s natural advantage. Note Aladin A s warning about sort order.

Option Explicit

  Call Search to look for a thousand random strings
  in 3 worksheets of a workbook

  requires a workbook with 3 sheets and
  column A populated with values between "00001" to "120000"
  split evenly 40,000 to a worksheet in ascending order.
  They must be text, not numbers.

Private Const NUM_ROWS As Long = 120000
Private Const SHEET_1 As String = "Sheet1"
Private Const SHEET_2 As String = "Sheet2"
Private Const SHEET_3 As String = "Sheet3"

  This uses VBA Binary Search
Public Sub Search()
    Worksheets(SHEET_1).Range("B:B").ClearContents
    Worksheets(SHEET_2).Range("B:B").ClearContents
    Worksheets(SHEET_3).Range("B:B").ClearContents
    DoSearch True         change to False to test Excel search
End Sub

  Searches for a thousand values using binary  or excel search depending on
  value of bBinarySearch
Public Sub DoSearch(ByVal bBinarySearch As Boolean)
    Debug.Print Now
    Dim ii As Long

    For ii = 1 To 1000
        Dim rr As Long
        rr = Int((NUM_ROWS) * Rnd + 1)
        If bBinarySearch Then
            Dim strSheetName As String
            Dim nRow As Long
            If BinarySearch(MakeSearchArg(rr), strSheetName, nRow) Then
                Worksheets(strSheetName).Activate
                Cells(nRow, 1).Activate
            End If
        Else
            If Not ExcelSearch(SHEET_1, MakeSearchArg(rr)) Then
                If Not ExcelSearch(SHEET_2, MakeSearchArg(rr)) Then
                    ExcelSearch SHEET_3, MakeSearchArg(rr)
                End If
            End If
        End If
        ActiveCell.Offset(0, 1).Value = "FOUND"
    Next
    Debug.Print Now

End Sub

  look for one cell value using Excel Find
Private Function ExcelSearch(ByVal strWorksheet As String _
  , ByVal strSearchArg As String) As Boolean
    On Error GoTo Err_Exit
    Worksheets(strWorksheet).Activate
    Worksheets(strWorksheet).Range("A:A").Find(What:=strSearchArg, LookIn:=xlValues, LookAt:= 
        xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=True 
        , SearchFormat:=False).Activate
    ExcelSearch = True
    Exit Function
Err_Exit:
    ExcelSearch = False
End Function

  Look for value using a vba based binary search
  returns true if the search argument is found in the workbook
  strSheetName contains the name of the worksheet on exit and nRow gives the row
Private Function BinarySearch(ByVal strSearchArg As String _
  , ByRef strSheetName As String, ByRef nRow As Long) As Boolean
    Dim nFirst As Long, nLast As Long
    nFirst = 1
    nLast = NUM_ROWS
    Do While True
        Dim nMiddle As Long
        Dim strValue As String
        If nFirst > nLast Then
            Exit Do       Failed to find search arg
        End If
        nMiddle = Round((nLast - nFirst) / 2 + nFirst)
        SheetNameAndRowFromIdx nMiddle, strSheetName, nRow
        strValue = Worksheets(strSheetName).Cells(nRow, 1)
        If strSearchArg < strValue Then
            nLast = nMiddle - 1
        ElseIf strSearchArg > strValue Then
            nFirst = nMiddle + 1
        Else
            BinarySearch = True
            Exit Do
        End If
    Loop
End Function

  convert 1 -> "000001", 120000 -> "120000", etc
Private Function MakeSearchArg(ByVal nArg As Long) As String
    MakeSearchArg = Right(CStr(nArg + 1000000), 6)
End Function

  converts some number to a worksheet name and a row number
  This is depenent on the worksheets being named sheet1, sheet2, sheet3

  and containing an equal number of vlaues in each sheet where
  the total number of values is NUM_ROWS
Private Sub SheetNameAndRowFromIdx(ByVal nIdx As Long _
  , ByRef strSheetName As String, ByRef nRow As Long)
    If nIdx <= NUM_ROWS / 3 Then

        strSheetName = SHEET_1
        nRow = nIdx
    ElseIf nIdx > (NUM_ROWS / 3) * 2 Then
        strSheetName = SHEET_3
        nRow = nIdx - (NUM_ROWS / 3) * 2
    Else
        strSheetName = SHEET_2
        nRow = nIdx - (NUM_ROWS / 3)
    End If
End Sub
问题回答

I find using the AutoFilter works a lot faster than manually searching the records with any method.

I filter, check to see if there are any results, then move on. If any are found (by checking the count of results), I can search the small portion that are filtered manually or return them all.

I used it on around 44,000 records, searching for a list of 100+ parts against it.

Binary searches can easily get stuck in infinite loops if you re not careful.

If you use vlookup with the sorted option, it will likely be faster than your vba.

I have become interested in this because I was using the .Find function, and on one PC it failed to work on some lookups, but on another it was OK! So I ve done some testing on timings - I ve got a sheet with 985 names sorted into order, and I ve written a small subroutine to run through them, and look each one up in the same list using a different method (the times are in milliseconds):

  1. Brute force: 2000
  2. .Find: 750
  3. Application.VLookup: 265
  4. Binary Search: 234

The problem with VLookup is that it can t return the row number unless you include it in your table.

Here is my code for the Binary search, I ve assumed that the sheet has a header row, but you could easily modify the header and code to pass that information. The optional Col parameter is used to indicated if you want the row number, or a cell s value. The function returns 0 (zero) if the find fails.

Function Find(Sheet As Worksheet, What As String, Optional Col As Long = 0) As Variant
Dim Top As Long
Dim Mid As Long
Dim Bot As Long  Bottom
Dim S As String
Dim T As String

   With Sheet
     Top = 2  Sheet has a header row
     Bot = .UsedRange.Rows.Count
     S = LCase(What)
     Do
       Mid = (Top + Bot) / 2
       T = LCase(.Cells(Mid, 1))
       Select Case True
       Case T > S
         Bot = Mid - 1
       Case T < S
         Top = Mid + 1
       Case Else  T = S
         If Col = 0 Then
           Find = Mid   Return the row
         Else
           Find = .Cells(Mid, Col).Value2  Return the cell s value
         End If
         Exit Function
       End Select
     Loop Until Bot < Top
   End With
   Find = 0
End Function




相关问题
import of excel in SQL imports NULL lines

I have a stored procedure that imports differently formatted workbooks into a database table, does work on them then drops the table. Here is the populating query. SELECT IDENTITY(INT,1,1) AS ID ...

Connecting to Oracle 10g with ODBC from Excel VBA

The following code works. the connection opens fine but recordset.recordCount always returns -1 when there is data in the table. ANd If I try to call any methods/properties on recordset it crashes ...

Excel date to Unix timestamp

Does anyone know how to convert an Excel date to a correct Unix timestamp?

C# GemBox Excel Import Error

I am trying to import an excel file into a data table using GemBox and I keep getting this error: Invalid data value when extracting to DataTable at SourceRowIndex: 1, and SourceColumnIndex: 1. As ...

Importing from excel "applications" using SSIS

I am looking for any tips or resources on importing from excel into a SQL database, but specifically when the information is NOT in column and row format. I am currently doing some pre-development ...

热门标签