MySQL multivariable linear regression

I am trying to do a multivariable (9 variables) linear regression on data in my MySQL 5.0 database (the result value field only has 2 possible values, 1 and 0).

I've done some searching and found I can use:

mysql> SELECT
    -> @n := COUNT(score) AS N,
    -> @meanX := AVG(age) AS "X mean",
    -> @sumX := SUM(age) AS "X sum",
    -> @sumXX := SUM(age*age) "X sum of squares",
    -> @meanY := AVG(score) AS "Y mean",
    -> @sumY := SUM(score) AS "Y sum",
    -> @sumYY := SUM(score*score) "Y sum of square",
    -> @sumXY := SUM(age*score) AS "X*Y sum"

That gets at many of the basic regression quantities, but I really don't want to type this out for every combination of the 9 variables. All of the sources I can find on multivariable regression require matrix operations. Can I do matrix operations with MySQL, or are there other ways to do a 9-variable linear regression?

Should I export the data out of MySQL first? It's ~80,000 rows, so it would be all right to move it; I'm just not sure what else I should use.

Thanks, Dan

Best answer

Keeping this data in MySQL is fine, but you can generate the queries from a programming language that has access to the database. Pseudocode:

variables = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'];

# One query per pair of columns; $X and $Y are spliced into the SQL text.
for X in $variables do
    for Y in $variables do
        query = 'SELECT
            @n'+$X+$Y+' := COUNT('+$Y+') AS "N",
            @mean'+$X+' := AVG('+$X+') AS "X mean",
            @sum'+$X+' := SUM('+$X+') AS "X sum",
            @sum'+$X+$X+' := SUM('+$X+'*'+$X+') AS "X sum of squares",
            @mean'+$Y+' := AVG('+$Y+') AS "Y mean",
            @sum'+$Y+' := SUM('+$Y+') AS "Y sum",
            @sum'+$Y+$Y+' := SUM('+$Y+'*'+$Y+') AS "Y sum of squares",
            @sum'+$X+$Y+' := SUM('+$X+'*'+$Y+') AS "X*Y sum"
            FROM measurements';
        db_execute(query);
    done
done

But why not store the results in a table? That is more appropriate for a database.

for X in $variables do
    for Y in $variables do
        query = 'INSERT INTO regression
            SELECT
                "'+$X+'" AS X,
                "'+$Y+'" AS Y,
                COUNT('+$Y+') AS N,
                AVG('+$X+') AS meanX,
                SUM('+$X+') AS sumX,
                SUM('+$X+'*'+$X+') AS squareX,
                AVG('+$Y+') AS meanY,
                SUM('+$Y+') AS sumY,
                SUM('+$Y+'*'+$Y+') AS squareY,
                SUM('+$X+'*'+$Y+') AS sumXY
            FROM measurements';
        db_execute(query);
    done
done

Put a separate index on each of the X and Y columns.
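
For what it is worth, the pairwise sums above are exactly the entries of the normal equations for a multivariable least-squares fit: SUM(x_i*x_j) fills a small square matrix and SUM(x_i*y) fills the right-hand side, so for 9 variables only a 10x10 system has to be solved. The following is a minimal Python sketch of that idea, not part of the original answer. The function name fit_linear and the sample data are made up for illustration; in practice the rows would come from the exported table.

```python
def fit_linear(rows, targets):
    """Least-squares fit of targets ~ b0 + b1*x1 + ... via the normal equations."""
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])

    # Pairwise sums: xtx[i][j] = SUM(x_i * x_j), xty[i] = SUM(x_i * y).
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]

    # Solve xtx * beta = xty by Gaussian elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back-substitution, from the last coefficient up to the intercept.
    beta = [0.0] * k
    for i in range(k - 1, -1, -1):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - tail) / aug[i][i]
    return beta


# Made-up sample: two predictors, targets generated as 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1, 3, -2, 0, 2]
print(fit_linear(rows, targets))
```

The returned list holds the intercept first, then one coefficient per predictor column. The sums could equally be computed in SQL and only the small solve done outside the database.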

Answers

I would recommend moving the data out of MySQL and into R. With 1/0 response data, a logistic regression is much more appropriate, and fitting it is not the simple sum-of-squares computation you are implementing.

http://en.wikipedia.org/wiki/Logistic_regression

This seems to do a good job of showing how to solve the logistic regression:

http://www.omidrouhani.com/research/logisticregression/html/logisticregression.htm#_Toc147483467
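
To make the difference from the sum-of-squares approach concrete, here is a minimal sketch of a logistic regression fit. It is written in plain Python rather than R, and it uses simple batch gradient ascent on the log-likelihood, which is one of several ways to fit the model and not necessarily the method the linked page describes. The function names, the learning rate, the step count, and the sample data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, rate=0.1, steps=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x1 + ...) by batch gradient ascent."""
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    n, k = len(design), len(design[0])
    beta = [0.0] * k
    for _ in range(steps):
        # Gradient of the log-likelihood: SUM((y - p) * x_j) over all rows.
        grad = [0.0] * k
        for x, y in zip(design, labels):
            err = y - sigmoid(sum(b * v for b, v in zip(beta, x)))
            for j in range(k):
                grad[j] += err * x[j]
        # Step uphill, averaging over rows so the rate does not depend on n.
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta


def predict(beta, row):
    # Estimated probability that the response is 1 for one row of predictors.
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))


# Made-up sample: one predictor, noisy 0/1 response that tends to rise with x.
rows = [(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,)]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
beta = fit_logistic(rows, labels)
print(beta, predict(beta, (0,)), predict(beta, (7,)))
```

Unlike the linear case, there is no closed-form solution built from sums alone; the coefficients are found iteratively. For ~80,000 rows and 9 predictors a pure-Python loop like this would be slow, which is one reason a statistics package such as R is the more practical choice.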




