Naive Bayesian classifier - multiple decisions

I need to know whether the Naive Bayesian classifier can be used to generate multiple decisions. I couldn't find any examples with evidence of it supporting multiple decisions. I'm new to this area, so I'm a bit confused.

Actually, I need to develop character recognition software, where I need to identify what a given character is. It seems the Bayesian classifier can be used to decide whether a given character is a particular character or not, but it cannot give any other suggestions.

For example, suppose an image of a 3 is given (we think it's a 3), but the system cannot identify it as a 3. If it looks like a 2 to the system, the system should return 2.

The idea I have about the Naive Bayesian classifier is that once we train it on data, we can ask the system whether a given character is a particular character or not. E.g., we draw an image of a particular number and ask the system whether it's a 2 or not.

I further noticed that KNN (k-nearest neighbors) gives multiple decisions: given a character, it picks the nearest compatible character from the training data.

It would be highly appreciated if someone could explain whether the Naive Bayesian classifier can be used to make multiple decisions such as the above.

Best answer

The assumptions of a Naive Bayesian classifier are that the data dimensions are independent (the naive part) and that the model is generative (the Bayesian part). In other words, you model how data are generated from world states: P(data|world_state), where world_state can be a continuous or categorical variable (with multiple classes/categories). This stands in contrast to discriminative models, which ignore data generation and describe the posterior probability of world states directly from the data: P(world_state|data).

Here are the steps you have to follow to implement a Naive Bayesian classifier:
1. Model your data with a generative model, for example, a Gaussian distribution. Each class has its own Gaussian. In the naive model you take a product of one-dimensional Gaussians, one per data dimension; in a more complete model, the dimensionality of the Gaussian equals the dimensionality of the data.
2. Figure out a prior for each of your classes (for example, a categorical distribution with a single probability assigned to each class);
3. Learn parameters by fitting Gaussians to your data;
4. Evaluate the class of test data via Bayes' formula:

P(class|data) = P(data|class) * P(class)/P(data);           [1]
P(data) = sum_over_classes[P(data|class) * P(class)];       [2]

The left-hand side of formula [1] is called the posterior; on the right-hand side, the first factor is the likelihood and the second is the prior. The denominator shown in [2] often gets ignored when you compute the maximum a posteriori (MAP) estimate, that is, the most probable class responsible for generating the data. However, the denominator is very important in understanding how the class models work together.
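The four steps above can be sketched in plain Python (a minimal sketch, assuming per-class, per-dimension Gaussians with maximum-likelihood estimates; the function and variable names are illustrative):

```python
import math
from collections import defaultdict

def fit(X, y):
    """Steps 1-3: per class, estimate a prior (class frequency) and one
    Gaussian (mean, variance) per data dimension -- the 'naive' product model."""
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    model = {}
    for c, rows in by_class.items():
        prior = len(rows) / len(X)
        dims = []
        for col in zip(*rows):                       # one column = one dimension
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            dims.append((mean, var))
        model[c] = (prior, dims)
    return model

def posteriors(x, model):
    """Step 4: Bayes' formula [1] with the normalizer [2], done in log space
    to avoid underflow when many dimensions are multiplied together."""
    def log_gauss(v, mean, var):
        return -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)

    log_joint = {c: math.log(prior) + sum(log_gauss(v, m, s2)
                                          for v, (m, s2) in zip(x, dims))
                 for c, (prior, dims) in model.items()}
    shift = max(log_joint.values())                  # numerical stability
    unnorm = {c: math.exp(l - shift) for c, l in log_joint.items()}
    z = sum(unnorm.values())                         # P(data), formula [2]
    return {c: p / z for c, p in unnorm.items()}     # P(class|data) per class

# Toy usage: two classes, two dimensions.
model = fit([(1.0, 1.1), (1.2, 0.9), (3.0, 3.2), (2.9, 3.1)],
            ["a", "a", "b", "b"])
p = posteriors((1.1, 1.0), model)   # full distribution over classes
```

Note that `posteriors` returns a probability for every class, not a yes/no answer for one class; picking `max(p, key=p.get)` is the MAP decision.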

For example, you can create a very sophisticated generative model for each class, yet the posterior can look very simple because during normalization one of the likelihoods was reduced to 0. In that case it is better to abandon the Bayesian approach and build a discriminative model with fewer parameters than you would put into the generative one. In the diagram below, the vertical axes are probabilities of a world state (class) while the horizontal axes represent the data. [figure omitted: per-class generative models and the resulting posteriors]

Other answers

A Bayes classifier gives a probability for an item to belong to each of several classes. It's definitely possible to have more than two classes.

With the probabilities for the classes, you will normally want to make a decision, which can be done by, e.g., choosing the most likely class. This may be why you're seeing it as providing just one answer.
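To make that concrete: the classifier hands you a full distribution over classes, and the single answer you usually see is just its argmax (the posterior numbers below are made up to match the 3-vs-2 example from the question):

```python
# Hypothetical posterior over digit classes for one test image.
posterior = {"2": 0.55, "3": 0.30, "7": 0.15}

# Collapsing the distribution to a single (MAP) decision:
decision = max(posterior, key=posterior.get)
print(decision)  # prints "2"
```

The distribution itself is still available, so the system can also report that "3" was the second-best guess.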




