English 中文(简体)
GAM-GEE, in gamm4 R Pack?
原标题:conducting GAM-GEE in gamm4 R package?

我正试图分析生物体的一些直观渗透数据,以形成一种生境分布模式。 一旦发现生物体,就会随着特定时间间隔收集点数据而加以跟踪。 由于这些“跟踪器”之间的自动校正,我希望采用类似于Pirotta等人(2011年)的GAM-GEE方法,使用包裹和间谍(http://www.int-res.com/abstracts/meps/v436/p257-272/)。 他们的区域说明载于本文(http://www.int-res.com/articles/suppl/m436p257_supp/m436p257_supp1-code.r)。 我利用了这一守则,但成功有限,模型的多重问题未能趋同。

我的数据结构如下:

> str(dat2)

 data.frame :   10792 obs. of  4 variables:

 $ dist_slag       : num  26475 26340 25886 25400 24934 ...
 $ Depth           : num  -10.1 -10.5 -16.6 -22.2 -29.7 ...
$ dolphin_presence: int  0 0 0 0 0 0 0 0 0 0 ...


 $ block           : int  1 1 1 1 1 1 1 1 1 1 ...


> head(dat2)

  dist_slag    Depth dolphin_presence block
1  26475.47 -10.0934                0     1
2  26340.47 -10.4870                0     1
3  25886.33 -16.5752                0     1
4  25399.88 -22.2474                0     1



5  24934.29 -29.6797                0     1
6  24519.90 -26.2370                0     1

这里是我块状变量的概述(表明每个区有汽车校正的团体数目)。

> summary(dat2$block)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00   39.00   76.00   73.52  111.00  148.00

然而,我想利用一揽子措施,因为我更熟悉Simon Wood教授的一揽子计划和职能,似乎伽马4是最合适的。 必须指出,这些模型具有双轨反应(瞬间存在的有机存在),因此我认为赌博4比赌博更合适。 在赌博帮助中,它提供了以下因素中的汽车校服的例子:

## more complicated autocorrelation example - AR errors
## only within groups defined by `fac 
e <- rnorm(n,0,sig)
for (i in 2:n) e[i] <- 0.6*e[i-1]*(fac[i-1]==fac[i]) + e[i]
y <- f + e
b <- gamm(y~s(x,k=20),correlation=corAR1(form=~1|fac))

Following this example, the following is the code I used for my dataset

b <- gamm4(dolphin_presence~s(dist_slag)+s(Depth),random=(form=~1|block),  family=binomial(),data=dat)

然而,通过审查产出(摘要(b$gam)和具体摘要(b$mer),我要么不知道如何解释结果,要么不认为小组内部的汽车校正正在得到考虑。

> summary(b$gam)

Family: binomial 
Link function: logit 

Formula:
dolphin_presence ~ s(dist_slag) + s(Depth)

Parametric coefficients:


            Estimate Std. Error z value Pr(>|z|)   
    (Intercept)  -13.968      5.145  -2.715  0.00663 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Approximate significance of smooth terms:


               edf Ref.df Chi.sq  p-value    
s(dist_slag) 4.943  4.943  70.67 6.85e-14 ***
s(Depth)     6.869  6.869 115.59  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 



R-sq.(adj) =  0.317glmer.ML score =  10504  Scale est. = 1         n = 10792
> 

> summary(b$mer)
Generalized linear mixed model fit by the Laplace approximation 


   AIC   BIC logLik deviance
 10514 10551  -5252    10504
Random effects:
 Groups Name         Variance Std.Dev.
 Xr     s(dist_slag) 1611344  1269.39 
 Xr.0   s(Depth)       98622   314.04 
Number of obs: 10792, groups: Xr, 8; Xr.0, 8



Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)   
X(Intercept)      -13.968      5.145  -2.715  0.00663 **
Xs(dist_slag)Fx1  -35.871     33.944  -1.057  0.29063   
Xs(Depth)Fx1        3.971      3.740   1.062  0.28823   


---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Correlation of Fixed Effects:
            X(Int) X(_)F1
Xs(dst_s)F1  0.654       
Xs(Dpth)Fx1 -0.030  0.000
> 

我如何确保在“锁”变数的每个独特价值范围内确实考虑到汽车校正? 什么是解释“消费(mer)”产出的最简单方式?

这些结果与使用相同变量和参数的普通gam(包装毫克v)不同,没有“关联=.......”一词,这表明情况有所不同。

然而,当我对相关术语(海森)使用不同的变量时,我获得SAME输出:

> dat2 <- data.frame(dist_slag = dat$dist_slag, Depth = dat$Depth, dolphin_presence = dat$dolphin_presence,

+ block = dat$block, season=dat$season)
 > head(dat2)
      dist_slag    Depth dolphin_presence block season
1  26475.47 -10.0934                0     1      F
2  26340.47 -10.4870                0     1      F

3  25886.33 -16.5752                0     1      F
4  25399.88 -22.2474                0     1      F
5  24934.29 -29.6797                0     1      F
6  24519.90 -26.2370                0     1      F

> summary(dat2$season)

   F    S 
3224 7568 


> b <- gamm4(dolphin_presence~s(dist_slag)+s(Depth),correlation=corAR1(1, form=~1 |   season), family=binomial(),data=dat2)
> summary(b$gam)

Family: binomial 
Link function: logit 


Formula:
dolphin_presence ~ s(dist_slag) + s(Depth)

Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)   
    (Intercept)  -13.968      5.145  -2.715  0.00663 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


Approximate significance of smooth terms:
               edf Ref.df Chi.sq  p-value    
s(dist_slag) 4.943  4.943  70.67 6.85e-14 ***
s(Depth)     6.869  6.869 115.59  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


R-sq.(adj) =  0.317glmer.ML score =  10504  Scale est. = 1         n = 10792
> summary(b$mer)
Generalized linear mixed model fit by the Laplace approximation 
   AIC   BIC logLik deviance

 10514 10551  -5252    10504
Random effects:
 Groups Name         Variance Std.Dev.
 Xr     s(dist_slag) 1611344  1269.39 
 Xr.0   s(Depth)       98622   314.04 
Number of obs: 10792, groups: Xr, 8; Xr.0, 8


Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)   
X(Intercept)      -13.968      5.145  -2.715  0.00663 **
Xs(dist_slag)Fx1  -35.871     33.944  -1.057  0.29063   
Xs(Depth)Fx1        3.971      3.740   1.062  0.28823   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Correlation of Fixed Effects:
            X(Int) X(_)F1
Xs(dst_s)F1  0.654       
Xs(Dpth)Fx1 -0.030  0.000
> 

我只想确保正确允许在“锁定”变量的每个价值范围内进行相互关系。 我如何制定这一模式,以说在每一块块的单一价值中都可能存在汽车校正,而是在各块块之间独立?

在另一个方面,我还收到大模型模型完成模型之后的以下警告信息(变数超过2个):

Warning message:
 In mer_finalize(ans) : false convergence (8)
问题回答
  • gamm4 is built on top of lme4, which does not allow for a correlation parameter (in contrast to the nlme, package, which underlies mgcv::gamm). mgcv::gamm does handle binary data, although it uses PQL, which is generally less accurate than Laplace/GHQ approximations as in gamm4/lme4. It is unfortunate (!!) that you re not getting a warning telling you that the correlation argument is being ignored (when I try a simple example using a correlation argument with lme4, I do get a warning, but it s possible that the extra argument is getting swallowed somewhere inside gamm4).
  • Your desired autocorrelation structure ("autocorrelation can exist within each single value for block, but assume independence among blocks") is exactly the way correlation structures are coded in nlme (and hence in mgcv::gamm).
  • I would use mcgv::gamm, and would suggest that if at all possible you try it out on some simulated data with known structure (or use the data set provided in the supplementary material above and see if you can reproduce their qualitative conclusions with your alternative approach).
  • StackOverflow is nice, but there is probably more mixed model expertise at [email protected]




相关问题
How to manage a pageview DB

I am interested in tracking my users pageviews on my site. Being that traffic is expanding very quickly, I am worried about robots, etc, and I also want to be able to use tracked data live to alter ...

Statistics Question

Suppose I conduct a survey of 10 people asking whether to rank a movie as 0 to 4 stars. Allowable answers are 0, 1, 2, 3, and 4. The mean is 2.0 stars. How do I calculate the certainty (or ...

Calculating variance with large numbers

I haven t really used variance calculation that much, and I don t know quite what to expect. Actually I m not too good with math at all. I have a an array of 1000000 random numeric values in the ...

R statistical package: wrapping GOFrame objects

I m trying to generate GOFrame objects to generate a gene ontology mapping in R for unsupported organisms (see http://www.bioconductor.org/packages/release/bioc/vignettes/GOstats/inst/doc/...

Generating correlated numbers

Here is a fun one: I need to generate random x/y pairs that are correlated at a given value of Pearson product moment correlation coefficient, or Pearson r. You can imagine this as two arrays, array ...

Multivariate time series modelling in R

I want do fit some sort of multi-variate time series model using R. Here is a sample of my data: u cci bci cpi gdp dum1 dum2 dum3 dx 16.50 14.00 53.00 45.70 80....

热门标签