Although Bayesian analysis encompasses much more, the Naive Bayes algorithm well known from spam filters is based on one very fundamental assumption: all variables are essentially independent of each other. So for instance, in spam filtering each word is usually treated as a variable so this means assuming that if the email contains the word viagra , that knowledge does affect the probability that it will also contain the word medicine (or foo or spam or anything else). The interesting thing is that this assumption is quite obviously false when it comes to natural language but still manages to produce reasonable results.
Now one way people sometimes get around the independence assumption is to define variables that are technically combinations of things (like searching for the token buy viagra ). That can work if you know specific cases to look for but in general, in a game environment, it means that you can t generally remember anything. So each time you have to move, perform an action, etc, its completely independent of anything else you ve done so far. I would say for even the simplest games, this is a very inefficient way to go about learning the game.
I would suggest looking into using q-learning instead. Most of the examples you ll find are usually just simple games anyway (like learning to navigate a map while avoiding walls, traps, monsters, etc). Reinforcement learning is a type of online unsupervised learning that does really well in situations that can be modeled as an agent interacting with an environment, like a game (or robots). It does this trying to figure out what the optimal action is at each state in the environment (where each state can include as many variables as needed, much more than just where am i ). The trick then is maintain just enough state that helps the bot make good decisions without having a distinct point in your state space for every possible combination of previous actions.
To put that in more concrete terms, if you were to build a chess bot you would probably have trouble if you tried to create a decision policy that made decisions based on all previous moves since the set of all possible combinations of chess moves grows really quickly. Even a simpler model of where every piece is on the board is still a very large state space so you have to find a way to simplify what you keep track of. But notice that you do get to keep track of some state so that your bot doesn t just keep trying to make a left term into a wall over and over again.
The wikipedia article is pretty jargon heavy but this tutorial does a much better job translating the concepts into real world examples.
The one catch is that you do need to be able to define rewards to provide as the positive reinforcement . That is you need to be able to define the states that the bot is trying to get to, otherwise it will just continue forever.