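I have one year of data from a website. I want to train a machine learning algorithm to predict the success of new content based on certain variables (e.g. word count, posting date, etc.).
I want to take a new piece of data, ask about certain features of it, and get a probability of how well it will do on the website.
In addition, I want to keep adding future data to the training set and keep retraining the algorithm so that it improves over time.
My question is: how can I implement this with scikit-learn?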
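This is a binary classification problem: you have to decide whether a given input is good or not.
You can try different regression algorithms; scikit-learn makes switching between algorithms very easy, so you can see what works and what doesn't.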
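To illustrate how little code changes when you swap algorithms, here is a minimal sketch. It assumes a recent scikit-learn release, and the data is synthetic: the two feature columns (word count, posting hour) and the rule that generates the "success" label are invented for illustration only. The two estimators are just examples of interchangeable models, not a recommendation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): word count and posting hour.
rng = np.random.RandomState(0)
word_count = rng.randint(100, 2000, size=200).astype(float)
post_hour = rng.randint(0, 24, size=200).astype(float)
X = np.column_stack([word_count, post_hour])
# Hypothetical label: 1 = successful; longer posts succeed more often here.
y = (word_count + rng.normal(0, 300, size=200) > 1000).astype(int)

# Every estimator exposes the same fit / predict_proba interface,
# so trying another algorithm means changing one entry in this dict.
candidates = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "k_nearest_neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

new_post = np.array([[1500.0, 9.0]])  # 1500 words, posted at 09:00
probabilities = {}
for name, model in candidates.items():
    model.fit(X, y)
    # Column 1 of predict_proba is the probability of class 1 ("successful").
    probabilities[name] = model.predict_proba(new_post)[0, 1]
    print(name, round(probabilities[name], 3))
```

The `predict_proba` call is what gives a probability of success for a new piece of content rather than only a yes/no answer.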
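Off the top of my head, some approaches I would try:
How do you evaluate the quality of a method? Use cross-validation (10-fold if you have enough data, 5-fold otherwise). There is a section on it in the manual (5.1).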
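A minimal sketch of 10-fold cross-validation follows. It assumes a recent scikit-learn release, where the helper lives in `sklearn.model_selection` (the older releases linked in this answer used a different module name). The data is synthetic and the labelling rule is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): word count and posting hour.
rng = np.random.RandomState(0)
word_count = rng.randint(100, 2000, size=200).astype(float)
post_hour = rng.randint(0, 24, size=200).astype(float)
X = np.column_stack([word_count, post_hour])
y = (word_count + rng.normal(0, 300, size=200) > 1000).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression())

# cv=10 gives 10-fold cross-validation; use cv=5 when data is scarce.
scores = cross_val_score(model, X, y, cv=10)
print("mean accuracy: %.3f (std %.3f)" % (scores.mean(), scores.std()))
```

Each entry of `scores` is the accuracy on one held-out fold, so the mean estimates how the model would do on content it has not seen.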
Adding new data to the training set means you have to retrain your model. Depending on the computing power you have at hand, that may or may not be a problem. If you already have a lot of examples, adding a single one won't change much, so retrain only once you have accumulated a handful of new examples. That will save computation time.
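One way to retrain only after a handful of new examples is to buffer them, sketched below in plain Python. The class name `BatchedRetrainer`, the `train_fn` callback, and the batch size are all hypothetical; in practice `train_fn` would fit a real model on the accumulated data.

```python
class BatchedRetrainer:
    """Collect new examples and retrain only once a batch has accumulated."""

    def __init__(self, train_fn, batch_size=50):
        self.train_fn = train_fn      # hypothetical callback: fits a model on (X, y)
        self.batch_size = batch_size  # assumed threshold; tune to your compute budget
        self.X, self.y = [], []
        self.pending = 0              # examples added since the last retrain
        self.model = None
        self.retrain_count = 0

    def add_example(self, features, label):
        self.X.append(features)
        self.y.append(label)
        self.pending += 1
        if self.pending >= self.batch_size:
            self.retrain()

    def retrain(self):
        # Retrain from scratch on everything seen so far.
        self.model = self.train_fn(self.X, self.y)
        self.pending = 0
        self.retrain_count += 1


def success_rate(X, y):
    # Stand-in "training" function: the fraction of successful examples.
    return sum(y) / len(y)


retrainer = BatchedRetrainer(success_rate, batch_size=3)
for i in range(7):
    retrainer.add_example([i], i % 2)

print(retrainer.retrain_count)  # 2 retrains for 7 examples with batch_size=3
```

A larger `batch_size` trades freshness of the model for less time spent retraining.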
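Algorithms that learn from a complete training set are called offline algorithms. Online algorithms, on the other hand, learn each time a new example is presented. If you really need that, try an online method, such as nearest neighbors.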
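Nearest neighbors works online because "learning" is just storing the new example. The sketch below is a hand-written 1-nearest-neighbor classifier in plain Python, not scikit-learn's implementation; the class and method names are made up for illustration, and the features are left unscaled for brevity.

```python
class OnlineNearestNeighbor:
    """1-nearest-neighbor classifier that learns one example at a time."""

    def __init__(self):
        self.examples = []  # list of (features, label) pairs

    def learn_one(self, features, label):
        # Learning is simply remembering the example; no retraining step.
        self.examples.append((tuple(features), label))

    def predict(self, features):
        if not self.examples:
            raise ValueError("no examples learned yet")

        def squared_distance(example):
            stored, _ = example
            return sum((a - b) ** 2 for a, b in zip(stored, features))

        _, label = min(self.examples, key=squared_distance)
        return label


clf = OnlineNearestNeighbor()
clf.learn_one([300, 9], 0)    # short post, not successful
clf.learn_one([1500, 14], 1)  # long post, successful
before = clf.predict([1400, 10])

clf.learn_one([1400, 11], 0)  # a new example arrives and is used immediately
after = clf.predict([1400, 10])
print(before, after)  # 1 0
```

The prediction changes as soon as the new example is stored, which is the defining property of an online method. The cost is that every prediction scans all stored examples.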
If you need example code, the scikit-learn documentation is very helpful:
- http://scikit-learn.org/0.10/auto_examples/linear_model/logistic_l1_l2_sparsity.html#example-linear-model-logistic-l1-l2-sparsity-py
- http://scikit-learn.org/0.10/modules/linear_model.html#ridge-regression