我有特克斯特案,其中载有普韦6号地址范围和其他信息,如:
minip maxip border ...
2001:: 2001:0:ffff:ffff:ffff:ffff:ffff:ffff domestic nan Hurricane Electric LLC nan nan nan nan nan nan
2001:1:: 2001:1:ffff:ffff:ffff:ffff:ffff:ffff outbound nan nan nan nan nan nan nan nan
2001:2:: 2001:3:ffff:ffff:ffff:ffff:ffff:ffff outbound nan nan nan nan nan nan nan nan
2001:4:: 2001:4:ff:ffff:ffff:ffff:ffff:ffff outbound nan nan nan nan nan nan nan nan
2001:4:1000:: 2001:4:1fff:ffff:ffff:ffff:ffff:ffff outbound nan nan nan nan nan nan nan nan
It contains millions of records, I want to create a hash map use the records. And other modules can query the hash table and get corresponding information by given an ipv6 address. The problem is How can i generate hash key from the minip and maxip, eg: minip = 2001::, maxip = 2001:0:ffff:ffff:ffff:ffff:ffff:ffff, and the query ipv6 address is 2001:0:50:124:70:65:80:214. If I generate the hash key with high 4 Bytes, there can be a long chain of conflicts and the query efficiency is low. eg:
2001:250:2:: 2001:250:2:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:3:: 2001:250:3:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:4:: 2001:250:4:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:5:: 2001:250:5:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:6:: 2001:250:7:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:8:: 2001:250:f:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:10:: 2001:250:1f:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:20:: 2001:250:3f:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
2001:250:40:: 2001:250:5f:ffff:ffff:ffff:ffff:ffff domestic nan nan nan nan 86 110000 110000 51
Also, I can t use every class C network number as a Hash key like IPv4 does. IPv6 address is 128 bits, and ipv6 does not have a clear network division. If you use the first 8 Bytes as network numbers, there may be many network numbers in each record belonging to [minip, maxip]. At the same time, this does not guarantee that the conflict chain is relatively short, and will cause a lot of memory consumption. How can I design a hash map so that memory consumption is within acceptable limits and queries are as efficient as possible?