Should a .NET generic Dictionary be initialized with a capacity equal to the number of items it will contain?
  • Time: 2009-01-05 18:57:56
var myDictionary = new Dictionary<Key, Value>(100);






What you should initialize the dictionary capacity to depends on two factors: (1) the distribution of the GetHashCode function, and (2) how many items you have to insert.





Improved benchmark:

  • Hardware: Intel Core i7-10700K x64, .NET 5, Optimized build. LINQPad 6 for .NET 5 run and LINQPad 5 for .NET Fx 4.8 run.
  • Times are in fractional milliseconds to 3 decimal places.
    • 0.001ms is 1 microsecond.
    • I am unsure of the actual resolution of Stopwatch as it's system-dependent, so don't stress over differences at the microsecond level.
  • Benchmark was re-run dozens of times with consistent results. Times shown are averages of all runs.
  • Conclusion: Consistent 10-20% overall speedup by setting capacity in the Dictionary<String,String> constructor.

                                         .NET Framework 4.8    .NET 5
With initial capacity of 1,000,000
  Constructor                            1.170ms               0.003ms
  Fill in loop                           353.420ms             181.846ms
  Total time                             354.590ms             181.880ms
Without initial capacity
  Constructor                            0.001ms               0.001ms
  Fill in loop                           400.158ms             228.687ms
  Total time                             400.159ms             228.688ms
Speedup from setting initial capacity
  Time                                   45.569ms              46.8ms
  Speedup %                              11%                   20%

  • I did repeat the benchmark for smaller initial sizes (10, 100, 1000, 10000, and 100000) and the 10-20% speedup was also observed at those sizes, but in absolute terms a 20% speedup on an operation that takes a fraction of a millisecond is negligible.
  • I saw consistent results (the numbers shown are averages), but there are some caveats:
    • This benchmark was performed with a rather extreme size of 1,000,000 items but with tight-loops (i.e. not much else going on inside the loop body) which is not a realistic scenario. So always profile and benchmark your own code to know for sure rather than trusting a random benchmark you found on the Internet (just like this one).
    • The benchmark doesn't isolate the time spent generating the million or so String instances (caused by i.ToString()).
    • A reference type (String) was used for both keys and values, so each key and value slot is the size of a native pointer (8 bytes on x64). Results will differ if the keys and/or values use a larger value type (such as a ValueTuple). There are other type-size factors to consider as well.
    • Because things improved drastically from .NET Framework 4.8 to .NET 5, you shouldn't trust these numbers if you're running on .NET 6 or later.
      • Also, don't assume that newer .NET releases will always make things faster: there have been times when performance actually worsened with both .NET updates and OS security patches.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Warmup:
{
    var foo1 = new Dictionary<string, string>();
    var foo2 = new Dictionary<string, string>( capacity: 10_000 );
    foo1.Add( "foo", "bar" );
    foo2.Add( "foo", "bar" );
}

Stopwatch sw = Stopwatch.StartNew();

// Pre-set capacity:
TimeSpan pp_initTime;
TimeSpan pp_populateTime;
{
    var dict1 = new Dictionary<string, string>(1000000);

    pp_initTime = sw.GetElapsedAndRestart();

    for (int i = 0; i < 1000000; i++)
        dict1.Add(i.ToString(), i.ToString());
}
pp_populateTime = sw.GetElapsedAndRestart();

// Default (empty) capacity:
TimeSpan empty_initTime;
TimeSpan empty_populateTime;
{
    var dict2 = new Dictionary<string, string>();

    empty_initTime = sw.GetElapsedAndRestart();

    for (int i = 0; i < 1000000; i++)
        dict2.Add(i.ToString(), i.ToString());
}
empty_populateTime = sw.GetElapsedAndRestart();

Console.WriteLine("Pre-set capacity. Init time: {0:N3}ms, Fill time: {1:N3}ms, Total time: {2:N3}ms.", pp_initTime.TotalMilliseconds, pp_populateTime.TotalMilliseconds, ( pp_initTime + pp_populateTime ).TotalMilliseconds );
Console.WriteLine("Empty capacity. Init time: {0:N3}ms, Fill time: {1:N3}ms, Total time: {2:N3}ms.", empty_initTime.TotalMilliseconds, empty_populateTime.TotalMilliseconds, ( empty_initTime + empty_populateTime ).TotalMilliseconds );

// Extension methods:

public static class StopwatchExtensions
{
    [MethodImpl( MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization )]
    public static TimeSpan GetElapsedAndRestart( this Stopwatch stopwatch )
    {
        // Read the elapsed time, then restart so the next reading covers only the next phase.
        TimeSpan elapsed = stopwatch.Elapsed;
        stopwatch.Restart();
        return elapsed;
    }
}

Original benchmark:

  • With capacity (dict1) total time is 1220.778ms (for construction and population).
  • Without capacity (dict2) total time is 1502.490ms (for construction and population).
  • So setting a capacity saved about 282ms (~19%) compared to not setting a capacity.

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        const int ONE_MILLION = 1000000;

        DateTime start1 = DateTime.Now;
        {
            var dict1 = new Dictionary<string, string>( capacity: ONE_MILLION );

            for (int i = 0; i < ONE_MILLION; i++)
                dict1.Add(i.ToString(), i.ToString());
        }
        DateTime stop1 = DateTime.Now;

        DateTime start2 = DateTime.Now;
        {
            var dict2 = new Dictionary<string, string>();

            for (int i = 0; i < ONE_MILLION; i++)
                dict2.Add(i.ToString(), i.ToString());
        }
        DateTime stop2 = DateTime.Now;

        Console.WriteLine("Time with size initialized: " + (stop1.Subtract(start1)) + "\nTime without size initialized: " + (stop2.Subtract(start2)));
    }
}



Given that you specify an initial capacity k in the `Dictionary` constructor:

  1. The Dictionary will reserve the amount of memory necessary to store k elements;
  2. QUERY performance against the dictionary is not affected: lookups will be neither faster nor slower;
  3. ADD operations will not trigger further (possibly expensive) memory allocations while the dictionary holds no more than k elements, and thus will be faster.


The capacity of a Dictionary(TKey, TValue) is the number of elements that can be added to the Dictionary(TKey, TValue) before resizing is necessary. As elements are added to a Dictionary(TKey, TValue), the capacity is automatically increased as required by reallocating the internal array.

If the size of the collection can be estimated, specifying the initial capacity eliminates the need to perform a number of resizing operations while adding elements to the Dictionary(TKey, TValue).
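
The resizing described above can be observed directly. The following is a minimal sketch, not part of the original answers: `ResizeCounter` and `CountResizes` are hypothetical names, and it assumes .NET Core 2.1 or later, where `Dictionary<TKey, TValue>.EnsureCapacity` exists (it is not available on .NET Framework 4.8). Calling `EnsureCapacity(0)` never grows the dictionary, so it serves here only to read the current capacity after each `Add`.

```csharp
using System;
using System.Collections.Generic;

public static class ResizeCounter
{
    // Adds itemCount entries to a dictionary created with initialCapacity and
    // returns how many times its capacity changed along the way. The first
    // allocation of a dictionary created with capacity 0 counts as one change.
    public static int CountResizes(int initialCapacity, int itemCount)
    {
        var dict = new Dictionary<int, int>(initialCapacity);

        // EnsureCapacity(0) only reports the current capacity; it never grows the dictionary.
        int capacity = dict.EnsureCapacity(0);
        int resizes = 0;

        for (int i = 0; i < itemCount; i++)
        {
            dict.Add(i, i);

            int newCapacity = dict.EnsureCapacity(0);
            if (newCapacity != capacity)
            {
                resizes++;
                capacity = newCapacity;
            }
        }

        return resizes;
    }

    public static void Main()
    {
        Console.WriteLine("Capacity changes with capacity preset: " + CountResizes(100_000, 100_000));
        Console.WriteLine("Capacity changes with default capacity: " + CountResizes(0, 100_000));
    }
}
```

With the capacity preset to the item count, the count of capacity changes should be zero; with the default constructor it should be greater than zero. The exact number depends on the runtime's growth policy, so it is deliberately not stated here.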

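Yes. In contrast to a Hashtable, which uses rehashing as its method of resolving collisions, Dictionary uses chaining. So yes, using the count is a good approach. For a Hashtable, you would probably want to use count * (1 / fill factor).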
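
The count * (1 / fill factor) arithmetic can be sketched as follows. `CapacityMath` and `SlotsFor` are hypothetical names introduced only for illustration, and the sketch makes no claim about how any particular hash table type applies its load factor internally; it only works through the formula.

```csharp
using System;

public static class CapacityMath
{
    // Hypothetical helper: the number of slots needed so that `count` items
    // fill at most `fillFactor` (0 < fillFactor <= 1) of the table.
    public static int SlotsFor(int count, double fillFactor)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException(nameof(count));
        if (fillFactor <= 0.0 || fillFactor > 1.0)
            throw new ArgumentOutOfRangeException(nameof(fillFactor));

        // count * (1 / fillFactor), rounded up to a whole slot.
        return (int)Math.Ceiling(count / fillFactor);
    }

    public static void Main()
    {
        Console.WriteLine(SlotsFor(1000, 0.5)); // 2000
        Console.WriteLine(SlotsFor(1000, 1.0)); // 1000
    }
}
```

A fill factor of 1.0 reduces to the plain item count, which is the value suggested above for Dictionary.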

