Question

Wikipedia says on A* complexity the following (link here):

More problematic than its time complexity is A*’s memory usage. In the worst case, it must also remember an exponential number of nodes.

I fail to see this is correct because:

Say we explore node A, with successors B, C, and D. Then we add B, C, and D to the list of open nodes, each accompanied by a reference to A, and we move A from the open nodes to the closed nodes.

If at some time we find another path to B (say, via Q), that is better than the path through A, then all that is needed is to change B s reference to A to point to Q and update its actual cost, g (and logically f).

Therefore, if we store in a node its name, its referring node name, and its g, h, and f scores, then the maximum amount of nodes stored is the actual amount of nodes in the graph, isn t it? I really cannot see why at any time the algorithm would need to store an amount of nodes in memory that is exponential to the length of the optimal (shortest) path.

Could someone please explain?

edit As I understand now reading your answers, I was reasoning from the wrong viewpoint of the problem. I took for granted a given graph, whereas the exponential complexity refers to a an conceptual graph that is defined solely by a "branching factor".

Answer 1

A* is just a guided version of breadth-first search, which is exponential in memory complexity with respect to the length of the solution.

When using a constant heuristic, A* will become a normal breadth-first search; uniform cost search to be exact.

When using the optimal heuristic, A* will be O(n) in both space and time complexity if we disregard the complexity of the heuristic calculation itself. Again n is the length of the solution path.

Answer 2

I think the exponential-ness comes into play when you backtrack to node B to expand it, but then backtrack to node C to expand it, and then backtrack to node D. Now we have to keep track of all the children of nodes A, B, C, and D.

The backtracking is based on the cost of the edges to move to the next node, so this is a real possibility, but is the worse case.

If each node has exactly 2 children off of it, and each node has the same cost, then the equation is 2^n, where n is the depth of the search so far.

For example, you start off with node 0. 0 has 2 children 00 and 01. 00 has 2 children 000 and 001. At the worse case with a depth of 4 the equation is 2^4, where 2 is the number of children each node has and 4 is the depth of the search.

Answer 3

I am not an expert, but I studied the Wikipedia article for a while and my explanation would be this one (hope i have understood it well :)

Say, we have a 4x4 matrix of nodes.
A,B,C,D are the directions we can take at a given time (North,South,East,West)
The A* algorithm starts searching.

A
Queue: B,C,D
AA
Queue: B,C,D,AB,AC,AD
AAA-->Goal
Queue: B,C,D,AB,AC,AD,AAB,AAC,AAD
The goal is reached but there are still other possibilities to consider.

D
Queue: B,C,AB,AC,AD,AAB,AAC,AAD
DC
Queue: B,C,AB,AC,AD,AAB,AAC,AAD,DA,DB,DD
DCA
Queue: B,C,AB,AC,AD,AAB,AAC,AAD,DA,DB,DD,DCB,DCC,DCD
DCAB-->Goal
Queue: B,C,AB,AC,AD,AAB,AAC,AAD,DA,DB,DD,DCB,DCC,DCD,DCAA,DCAC,DCAD
Etc etc

As you can see, for every step taken, three more nodes are added to the queue.
Since A* follows only acyclic paths [1], the maximum number of steps per route is 15.
The max number of possible routes in this case is 3^15, or directions^nodes.
Since every route has 15 steps,the worst case steps taken is 15*3^15.
In the absolute worst case, every step ever taken is "wrong".
In that case 3*15*3^15 nodes are in the queue before finding the answer.
So the worst case amount of nodes that needs to be kept in memory is a constant, to the power of the number of nodes available. In other words the memory use is exponential to the amount of nodes.

[1] http://www.autonlab.org/tutorials/astar08.pdf, slide 15

友情链接