RでSteel-Dwass（スティールドゥワス）検定をする - 統計学と疫学と時々、助教生活

今日はノンパラメトリックな多重比較の一例として、Steel.Dwass testのRでの実行方法を記載する。
（2016年6月13日にコメントされたようにsteel-dwassについてNSM3パッケージにあるコマンドについて追記した。）

方法1

Steel.dwass検定は下記のHP（群馬大学・青木先生による）を参考して、自作の関数を作成し、実行することになる。
http://aoki2.si.gunma-u.ac.jp/R/Steel-Dwass.html

早速、上のHPから関数をcopyアンドpasteする。

Steel.Dwass <- function(data,group)
{
        OK <- complete.cases(data, group)
        data <- data[OK]
        group <- group[OK]
        n.i <- table(group)
        ng <- length(n.i)
        t <- combn(ng, 2, function(ij) {
                i <- ij[1]
                j <- ij[2]
                r <- rank(c(data[group == i], data[group == j]))
                R <- sum(r[1:n.i[i]])
                N <- n.i[i]+n.i[j]
                E <- n.i[i]*(N+1)/2
                V <- n.i[i]*n.i[j]/(N*(N-1))*(sum(r^2)-N*(N+1)^2/4)
                return(abs(R-E)/sqrt(V))
        })
        p <- ptukey(t*sqrt(2), ng, Inf, lower.tail=FALSE)
        result <- cbind(t, p)
        rownames(result) <- combn(ng, 2, paste, collapse=":")
        return(result)
}

Steel.Dwass()を実行する。

#個人の中性脂肪値（TG）を目的変数にして、個人の運動習慣（exercise（3群））で検定を行う。
#この関数では指定するグループは数値データであるべきなので、as.numericで変換している。
Steel.Dwass(sample$TG, as.numeric(sample$exercise))

##            t         p
## 1:2 1.631953 0.2321546
## 1:3 0.502737 0.8699921
## 2:3 1.781280 0.1758568

多重比較の結果、特にどの群間にも差は見られなかった。

NSM3パッケージでsteel-dwass検定をする。

コメントを頂いた内容としては、上記の青木先生の方法は漸近検定（p値の求め方）であるとの指摘を受けた。
一方で、NSM3パッケージでは正確検定も実施できるとのコメントであった。
実際に下記の通り、1)青木先生のコマンド、2)pSDCFlig（Exact）、3)pSDCFlig（Asymptotic）を実行した。

非常に簡単な事例をご紹介する（下記のデータを用意する）

A = c(56, 50, 43, 68, 68, 69, 99, 100)
B = c(1, 1, 1, 2, 2, 2, 3, 3)　#グループの数値に変換したもの

1)青木先生のコマンドから...

Steel.Dwass(A, B)
##            t         p
## 1:2 1.992633 0.1139883
## 1:3 1.732051 0.1932252
## 2:3 1.777047 0.1773049

2)pSDCFlig（Exact）

pSDCFlig(A, B, method="Exact")

## Ties are present, so p-values are based on conditional null distribution. 
## Group sizes:  3 3 2 
## Using the Exact method: 
##  
## For treatments 1 - 2 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is 2.818 . 
## The smallest experimentwise error rate leading to rejection is 0.0214 .
##   
## For treatments 1 - 3 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is 2.4495 . 
## The smallest experimentwise error rate leading to rejection is 0.325 .
##   
## For treatments 2 - 3 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is 2.5131 . 
## The smallest experimentwise error rate leading to rejection is 0.1321 .

3)pSDCFlig（Asymptotic）

pSDCFlig(A, B, method="Asymptotic")

## Ties are present, so p-values are based on conditional null distribution. 
## Group sizes:  3 3 2 
## Using the Asymptotic method: 
##  
## For treatments 1 - 2 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is 2.818 . 
## The smallest experimentwise error rate leading to rejection is 0.1141 .
##   
## For treatments 1 - 3 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is 2.4495 . 
## The smallest experimentwise error rate leading to rejection is 0.1935 .
##   
## For treatments 2 - 3 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is 2.5131 . 
## The smallest experimentwise error rate leading to rejection is 0.1775 .

最後にこのページの最初に実行している運動習慣3群ごとの中性脂肪値についての比較をpSDCFlig関数で行う。

pSDCFlig(sample$TG, as.numeric(sample$exercise), method="Exact")
## vector("list", count) でエラー:  ベクトルのサイズは∞ではありえません

あれ、エラー出るの？（試行数の問題？）

Exactではなく、モンテカルロ（Monte Carlo）なら答えが返ってきます。
（何かご存知の方はコメント宜しくお願いします。）

exercise1 <- sample$TG[as.numeric(sample$exercise)==1]
exercise2 <- sample$TG[as.numeric(sample$exercise)==2]
exercise3 <- sample$TG[as.numeric(sample$exercise)==3]
list <- list(dt1=exercise1, dt2=exercise2, dt3=exercise3)
pSDCFlig(list, method="Monte Carlo")

## Ties are present, so p-values are based on conditional null distribution. 
## Group sizes:  297 1294 1392 
## Using the Monte Carlo(with  10000 Iterations)  method: 
##  
## For treatments 1 - 2 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is -2.3079 . 
## The smallest experimentwise error rate leading to rejection is 0.2213 .
##   
## For treatments 1 - 3 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is -0.711 . 
## The smallest experimentwise error rate leading to rejection is 0.8653 .
##   
## For treatments 2 - 3 , the  Dwass, Steel, Critchlow-Fligner W  Statistic is 2.5191 . 
## The smallest experimentwise error rate leading to rejection is 0.1662 .

以上です。

20160529
RF