Fred, I notice that you said in a comment that you're on OS X. The best way to get very accurate timings of small-scale functions on OS X is with the mach_absolute_time( ) function. You can use it as follows:
#include <mach/mach_time.h>
#include <stdint.h>

// iterations and functionBeingTimed( ) are assumed to be defined elsewhere.
int loopCount;
uint64_t startTime = mach_absolute_time( );
for (loopCount = 0; loopCount < iterations; ++loopCount) {
    functionBeingTimed( );
}
uint64_t endTime = mach_absolute_time( );
double averageTime = (double)(endTime - startTime) / iterations;
This gives you the average timing across iterations calls to the function. The measurement can be affected somewhat by effects outside of your process on the system, so you may instead want to take the fastest time:
#include <mach/mach_time.h>
#include <stdint.h>

int loopCount;
double bestTime = __builtin_inf();
for (loopCount = 0; loopCount < iterations; ++loopCount) {
    uint64_t startTime = mach_absolute_time( );
    functionBeingTimed( );
    uint64_t endTime = mach_absolute_time( );
    // Keep the smallest observed time. Note that bestTime must not be
    // redeclared here, or the outer variable would never be updated.
    bestTime = __builtin_fmin(bestTime, (double)(endTime - startTime));
}
This can have its own problems, especially if the function being timed is very fast. You need to think about what you are really trying to measure and pick an approach that is scientifically justified (good experimental design is hard). I often use a hybrid of these two approaches as a first attempt at measuring a novel task: a minimum of averages over many calls, as sketched below.
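For concreteness, here is a minimal sketch of that hybrid (trials and callsPerTrial are illustrative names I'm introducing here, not anything from a system API):

#include <mach/mach_time.h>
#include <stdint.h>

int trial, call;
double bestAverage = __builtin_inf();
for (trial = 0; trial < trials; ++trial) {
    uint64_t startTime = mach_absolute_time( );
    for (call = 0; call < callsPerTrial; ++call) {
        functionBeingTimed( );
    }
    uint64_t endTime = mach_absolute_time( );
    // Averaging within a trial smooths out timer granularity; taking the
    // minimum across trials rejects interference from the rest of the system.
    bestAverage = __builtin_fmin(bestAverage,
                                 (double)(endTime - startTime) / callsPerTrial);
}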
Note also that in the code samples above, the timings are in "mach time units". If you just want to compare algorithms, this is usually fine. For some other purposes, you may want to convert them to nanoseconds or cycles. To do this, you can use the following functions:
#include <mach/mach_time.h>
#include <sys/sysctl.h>
#include <stdint.h>

double ticksToNanoseconds(double ticks) {
    static double nanosecondsPerTick = 0.0;
    // The first time the function is called,
    // ask the system how to convert mach
    // time units to nanoseconds.
    if (0.0 == nanosecondsPerTick) {
        mach_timebase_info_data_t timebase;
        // To be completely pedantic, check the return code of this call:
        mach_timebase_info(&timebase);
        nanosecondsPerTick = (double)timebase.numer / timebase.denom;
    }
    return ticks * nanosecondsPerTick;
}

double nanosecondsToCycles(double nanoseconds) {
    static double cyclesPerNanosecond = 0.0;
    // The first time the function is called,
    // ask the system what the CPU frequency is.
    if (0.0 == cyclesPerNanosecond) {
        uint64_t freq;
        size_t freqSize = sizeof(freq);
        // Again, check the return code for correctness =)
        sysctlbyname("hw.cpufrequency", &freq, &freqSize, NULL, 0);
        cyclesPerNanosecond = (double)freq * 1e-9;
    }
    return nanoseconds * cyclesPerNanosecond;
}
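Putting the pieces together, here's a rough usage sketch (reportTiming is just a name I made up, and bestTicks stands in for the result of one of the timing loops above):

#include <stdio.h>

void reportTiming(double bestTicks) {
    // Convert the raw mach-time measurement into human-friendly units.
    double ns = ticksToNanoseconds(bestTicks);
    printf("best: %.1f ns (~%.0f cycles)\n", ns, nanosecondsToCycles(ns));
}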
Be aware that the conversion to nanoseconds will always be sound, but the conversion to cycles can go awry in various ways, because modern CPUs do not run at one fixed speed. Nonetheless, it generally works pretty well.