English 中文(简体)
Pruning Deductions in Expert Systems
原标题:

In a rule system, or any reasoning system that deduces facts via forward-chaining inference rules, how would you prune "unnecessary" branches? I m not sure what the formal terminology is, but I m just trying to understand how people are able to limit their train-of-thought when reasoning over problems, whereas all semantic reasoners I ve seen appear unable to do this.

For example, in John McCarthy s paper An Example for Natural Language Understanding and the AI Problems It Raises, he describes potential problems in getting a program to intelligently answer questions about a news article in the New York Times. In section 4, "The Need For Nonmonotonic Reasoning", he discusses the use of Occam s Razer to restrict the inclusion of facts when reasoning about the story. The sample story he uses is one about robbers who victimize a furniture store owner.

If a program were asked to form a "minimal completion" of the story in predicate calculus, it might need to include facts not directly mentioned in the original story. However, it would also need some way of knowing when to limit its chain of deduction, so as not to include irrelevant details. For example, it might want to include the exact number of police involved in the case, which the article omits, but it won t want to include the fact that each police officer has a mother.

最佳回答

Good Question.

From your Question i think what you refer to as pruning is a model-building step performed ex ante--ie, to limit the inputs available to the algorithm to build the model. The term pruning when used in Machine Learning refers to something different--an ex post step, after model construction and that operates upon the model itself and not on the available inputs. (There could be a second meaning in the ML domain, for the term pruning. of, but i m not aware of it.) In other words, pruning is indeed literally a technique to "limit its chain of deduction" as you put it, but it does so ex post, by excision of components of a complete (working) model, and not by limiting the inputs used to create that model.

On the other hand, isolating or limiting the inputs available for model construction--which is what i think you might have had in mind--is indeed a key Machine Learning theme; it s clearly a factor responsible for the superior performance of many of the more recent ML algorithms--for instance, Support Vector Machines (the insight that underlies SVM is construction of the maximum-margin hyperplane from only a small subset of the data, i.e, the support vectors ), and Multi-Adaptive Regression Splines (a regression technique in which no attempt is made to fit the data by "drawing a single continuous curve through it", instead, discrete section of the data are fit, one by one, using a bounded linear equation for each portion, ie., the splines , so the predicate step of optimal partitioning of the data is obviously the crux of this algorithm).

What problem is solving by pruning?

At least w/r/t specific ML algorithms i have actually coded and used--Decision Trees, MARS, and Neural Networks--pruning is performed on an initially over-fit model (a model that fits the training data so closely that it is unable to generalize (accurately predict new instances). In each instance, pruning involves removing marginal nodes (DT, NN) or terms in the regression equation (MARS) one by one.

Second, why is pruning necessary/desirable?

Isn t it better to just accurately set the convergence/splitting criteria? That won t always help. Pruning works from "the bottom up"; the model is constructed from the top down, so tuning the model (to achieve the same benefit as pruning) eliminates not just one or more decision nodes but also the child nodes that (like trimming a tree closer to the trunk). So eliminating a marginal node might also eliminate one or more strong nodes subordinate to that marginal node--but the modeler would never know that because his/her tuning eliminated further node creation at that marginal node. Pruning works from the other direction--from the most subordinate (lowest-level) child nodes upward in the direction of the root node.

问题回答

暂无回答




相关问题
Resample Filter of WEKA - How to interpret the result

I am currently strugeling with a machine learning problem whereas I have to deal with great unbalanced data sets. That is, there are six classes ( 1 , 2 ... 6 ). Unfortunately there are e.g. for class ...

How to recognize rectangles in this image?

I have a image with horizontal and vertical lines. In fact, this image is the BBC website converted to horizontal and vertical lines. My problem is that I want to be able to find all the rectangles in ...

Question About Using Weka, the machine learning tool

I m using the explorer feature of Weka for classification. So I have my .arff file, with 2 features of NUMERIC value, and my class is a binary 0 or 1 (eg {0,1}). Sample: @RELATION summary @...

Implementing a linear, binary SVM (support vector machine)

I want to implement a simple SVM classifier, in the case of high-dimensional binary data (text), for which I think a simple linear SVM is best. The reason for implementing it myself is basically that ...

libsvm model file format

According to this FAQ the model format in libsvm should be straightforward. And in fact it is, when I call just svm-train. As an example, the first SV for the a1a dataset is 1 3:1 11:1 14:1 19:1 39:...

Competitive Learning in Neural Networks

I am playing with some neural network simulations. I d like to get two neural networks sharing the input and output nodes (with other nodes being distinct and part of two different routes) to compete. ...

热门标签