If the array is sorted you can find duplicates by looking a neighbours. To compare absolute values to need to start at both the start and the end. This avoid creating a new structure.
EDIT:IMHO HashMap/HashSet是O(log(n))由于碰撞而成的,只有O(1)有完美的散射功能。 我本会认为,不会产生更快但似乎只有4x条快的物体。
In summary, you can see that using a Set is simpler, clearer and easier to maintain. It is still very fast and would be the best solution in 98% of cases.
public static void main(String[] args) throws Exception {
for (int len : new int[]{100 * 1000 * 1000, 10 * 1000 * 1000, 1000 * 1000, 100 * 1000, 10 * 1000, 1000}) {
int[] nums = new int[len];
for (int i = 0; i < len; i++)
nums[i] = (int) (Math.random() * (Math.random() * 2001 - 1000));
Arrays.sort(nums);
long timeArray = 0;
long timeSet = 0;
int runs = len > 1000 * 1000 ? 10 : len >= 100 * 1000 ? 100 : 1000;
for (int i = 0; i < runs; i++) {
long time1 = System.nanoTime();
int count = countDistinct(nums);
long time2 = System.nanoTime();
int count2 = countDistinctUsingSet(nums);
long time3 = System.nanoTime();
timeArray += time2 - time1;
timeSet += time3 - time2;
assert count == count2;
}
System.out.printf("For %,d numbers, using an array took %,d us on average, using a Set took %,d us on average, ratio=%.1f%n",
len, timeArray / 1000 / runs, timeSet / 1000 / runs, 1.0 * timeSet / timeArray);
}
}
private static int countDistinct(int[] nums) {
int lastLeft = Math.abs(nums[0]);
int lastRight = Math.abs(nums[nums.length - 1]);
int count = 0;
for (int a = 1, b = nums.length - 2; a <= b;) {
int left = Math.abs(nums[a]);
int right = Math.abs(nums[b]);
if (left == lastLeft) {
a++;
lastLeft = left;
} else if (right == lastRight) {
b--;
lastRight = right;
} else if (lastLeft == lastRight) {
a++;
b--;
lastLeft = left;
lastRight = right;
count++;
} else if (lastLeft > lastRight) {
count++;
a++;
lastLeft = left;
} else {
count++;
b--;
lastRight = right;
}
}
count += (lastLeft == lastRight ? 1 : 2);
return count;
}
private static int countDistinctUsingSet(int[] nums) {
Set<Integer> s = new HashSet<Integer>();
for (int n : nums)
s.add(Math.abs(n));
int count = s.size();
return count;
}
印刷
每100 000 000 000 000人中,平均使用279 623人,平均使用1 270 029人,比率=4.5
For 10,000,000 numbers, using an array took 28,525 us on average, using a Set took 126,591 us on average, ratio=4.4
1 000 000 000人,使用阵列平均为2 846人,使用一套办法平均为12 131人,比率为4.3
100 000人,使用阵列平均为297人,采用一套计算方法平均为1 239人,比率为4.2
10 000人中,使用阵列平均占42人,使用一套计算方法平均占156人,比率=3.7
1 000人,使用阵列平均为8人,平均使用一套计算方法计算30人,比率=3.6。
在“Kevin K”点,即使Integer通过它具有独特的散射价值,也有可能发生碰撞,它可绘制出与能力有限的同一桶。
public static int hash(int h) {
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
public static void main(String[] args) throws Exception {
Map<Integer, Integer> map = new HashMap<Integer, Integer>(32, 2.0f);
for (int i = 0; i < 10000 && map.size() < 32 * 2; i++) {
if (hash(i) % 32 == 0)
map.put(i, i);
}
System.out.println(map.keySet());
}
印刷
[2032、2002、1972、1942、1913、183、183、183、183、183、183、183、1669、1642、1608、152、1548、1524、1454、1456、1426、1405、1375、1337、1307、1255、1221、1187、1153、1134、1100、1066、1032、1016、986、956、926、881、851、821、791、747、713、6、6、610、610、35、576、550、37、508、137、440
这些价值观正相反,因为哈希姆普公司已形成联系文化。