How to add/merge several Big Os into one

If I have an algorithm which is comprised of (let's say) three sub-algorithms, all with different O() characteristics, e.g.:

  • algorithm A: O(n)
  • algorithm B: O(log(n))
  • algorithm C: O(n log(n))

How do I theoretically estimate the O() for the entire algorithm? I.e. not clocking it or performing any other practical measurement.

Is there a well known formula or procedure?

Best answer

Is there a well known formula or procedure?

Since that’s basic maths, yes. But it’s even simpler in your case: since all of these give an upper bound, you can simply take the maximum of all the bounds to get the total upper bound.

– Provided that all of these algorithms are chained (rather than nested). If the algorithms are nested, you need to multiply their upper bounds (in the simplest case).
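To make the chained/nested distinction concrete, here is a minimal Python sketch; the `algorithm_a`/`algorithm_b`/`algorithm_c` names are illustrative stand-ins for the A, B, C of the question, not anything from the original post:

```python
import bisect

def algorithm_a(data):              # O(n): a single pass
    return sum(data)

def algorithm_b(sorted_data, x):    # O(log n): binary search
    return bisect.bisect_left(sorted_data, x)

def algorithm_c(data):              # O(n log n): comparison sort
    return sorted(data)

def chained(data, x):
    # Chained: O(n log n) + O(n) + O(log n) work in sequence;
    # the maximum of the bounds, O(n log n), dominates the total.
    s = algorithm_c(data)
    total = algorithm_a(data)
    pos = algorithm_b(s, x)
    return total, pos

def nested(data):
    # Nested: an O(log n) lookup inside an O(n) loop multiplies
    # the bounds, giving O(n log n) overall.
    s = algorithm_c(data)
    return [algorithm_b(s, v) for v in data]
```

`chained` runs each step once; `nested` pays the logarithmic lookup once per element, which is where the multiplication of bounds comes from.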

To illustrate, let’s say that you have a container data structure which has a cost of O(log n) for lookup of a single element. You also have an algorithm that needs such a lookup, but has running time O(n) itself, assuming constant costs for lookup, and using a single loop over the input, with a constant number of lookups per loop.

Now, if you want to use the previously mentioned container data structure with this algorithm, its lookup runtime obviously replaces the (assumed) constant-time lookup. So we’ve got the same loop, but each of its iterations now takes O(log n) instead of constant-time O(1), so the overall runtime becomes O(n log n).
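A minimal sketch of that substitution, assuming a hash-based container for the constant-time case and a sorted list with binary search for the O(log n) case (the function names are illustrative):

```python
import bisect

def count_hits_hash(queries, table):
    # O(n) loop with O(1) expected lookups per iteration -> O(n) overall.
    return sum(1 for q in queries if q in table)

def count_hits_sorted(queries, sorted_keys):
    # The same loop, but each lookup is now an O(log n) binary
    # search, so the overall runtime becomes O(n log n).
    hits = 0
    for q in queries:
        i = bisect.bisect_left(sorted_keys, q)
        if i < len(sorted_keys) and sorted_keys[i] == q:
            hits += 1
    return hits
```

The loop structure is identical in both; only the cost of the lookup inside it changes, and that cost multiplies into the loop's bound.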

Other answers

The "total" complexity is the worst case among the sub-algorithms. In your example: O(n log n).

The definition of O(f(n)) is that, starting from some number n0, there is a constant Const such that the total number of actions on an input of size n is less than Const*f(n).

Therefore, if you have a group of sub-algorithms, the total number of actions will always be less than the sum of all the constants (Const1 + Const2 + ...) multiplied by the worst complexity function (e.g., of "n", "n log n" and "n^2", it'll be n^2). By the definition, the complexity of the group is therefore the worst complexity in the group.

Example:

  1. Let's assume we're sorting an array (n log n)
  2. Then finding the item with a key greater than x (log n)

Assume Const1 of the sort is 5 (meaning we can sort a list of n items in fewer than 5*n*log n actions) and Const2 is 3 (meaning we can find the element in fewer than 3*log n actions).

In this case, it's clear that the total number of actions of both algorithms is less than (5+3)*n*log n, and therefore the whole thing is O(n log n).
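The arithmetic can be checked numerically with the constants from the example (Const1 = 5, Const2 = 3); this is just a sanity check of the bound, not a benchmark:

```python
import math

# Budgets from the example: sorting in under 5 * n * log2(n) actions,
# searching in under 3 * log2(n) actions.
def sort_budget(n, c1=5):
    return c1 * n * math.log2(n)

def search_budget(n, c2=3):
    return c2 * math.log2(n)

n = 1024
combined = sort_budget(n) + search_budget(n)
# Both together stay below (Const1 + Const2) * n * log2(n),
# so the pair is O(n log n) as a whole.
bound = (5 + 3) * n * math.log2(n)
```

For n = 1024 the combined budget is 51230 actions against a bound of 81920, and the gap only widens as n grows.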

One of the assumptions with big-O efficiency estimations is that our actual running time approaches the theoretical value, so we shouldn't get too wrapped around the axle trying to figure out small variances. An O(n) algorithm may finish in about n/2 steps if we're doing a simple iterative search through an unsorted dataset (on average we'll find the value by the halfway point), for instance, but it is still an O(n) algorithm.
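The halfway-point claim can be verified by counting comparisons; a small sketch (the helper name is illustrative):

```python
def linear_search(items, target):
    """Return (index, number of comparisons); worst case n -> O(n)."""
    for i, v in enumerate(items):
        if v == target:
            return i, i + 1
    return -1, len(items)

# Searching for every element of an n-item list in turn costs
# (1 + 2 + ... + n) / n = (n + 1) / 2 comparisons on average:
# about n/2, but still O(n).
items = list(range(100))
comparisons = [linear_search(items, t)[1] for t in items]
average = sum(comparisons) / len(items)   # (100 + 1) / 2 = 50.5
```

The constant factor of one-half disappears in the O() notation, which is exactly the point of the paragraph above.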

If you have a function that has to complete three subprocesses on a dataset before it can exit, then the slowest subprocess's running time on that dataset will be the longest pole in the tent, so to speak. In the specific example given, the function ("algorithm") runs in O(n log n) time, and we don't worry that the true total is O(n + n*log(n) + log(n)); the difference between that sum and the dominant term is at most a constant factor. What we care about is the relative magnitude, i.e., whether the running time is log(n), or n, or n^2, or n^3, ad infinitum: a factor of 10, or 100, or 1000.

The complexity of a problem is determined as "n" tends towards infinity. That limit is fundamentally why all lower-order terms are dropped; even the O(n) notation drops coefficients, which would change on different hardware anyway. That said, if you have the actual running times of the functions, you can reasonably add them up. This can be meaningful data in a time-dependent environment, such as handling large sets of data where the called functions run more than once. It can also indicate the scalability of a solution and the performance gained from an upgrade on, let us say, a server. This would be a one-machine solution, and the coefficients would not be dropped either.

All machines will have different running coefficients for any function, based on architecture and on how the compiler generates the binary code. This means that if you are designing programs for multiple users on different computers, the specific calculations are not only extraneous, but inaccurate as well.

In the case where the calculations are not extraneous or inaccurate:

The trick with separate structures is that the time function of one is not the time function of any of the others.

T(n) = x(n) + y(n) + z(n)
x(n) = (t1) * (n^2)
y(n) = (t2) * (log n)
z(n) = (t3) * (n log n)

...

The times (t1), (t2), and (t3) are given as the times of the corresponding functions on a particular machine.
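Under that model, the total could be sketched as follows; `total_time` is a hypothetical helper, and t1..t3 are assumed to be coefficients measured on one machine:

```python
import math

def total_time(n, t1, t2, t3):
    # T(n) = t1*n^2 + t2*log n + t3*n*log n, with t1..t3 measured
    # on one particular machine (hypothetical single-machine model).
    return t1 * n**2 + t2 * math.log2(n) + t3 * n * math.log2(n)
```

For large n the t1*n^2 term dominates regardless of the measured coefficients, which is exactly what the asymptotic notation captures by dropping them.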




