How to test for a treatment effect in a single factor design (in R, using perm)
Task
Suppose you are given a dataset that has more than one treatment level and you wish to see if there is a unit-level treatment effect. How would you check that?
Solution
The solution below uses an example dataset about the teeth of 10 guinea pigs at three Vitamin C dosage levels (in mg) with two delivery methods (orange juice vs. ascorbic acid). (See how to quickly load some sample data.)
1
df <- ToothGrowth
In this dataset, there are only two treatments (orange juice and ascorbic acid, in the variable supp
). We can therefore perrform a two-sample len
(tooth length) based on supp
.
1
t.test(len ~ supp, data=df)
Welch Two Sample t-test
data: len by supp
t = 1.9153, df = 55.309, p-value = 0.06063
alternative hypothesis: true difference in means between group OJ and group VC is not equal to 0
95 percent confidence interval:
-0.1710156 7.5710156
sample estimates:
mean in group OJ mean in group VC
20.66333 16.96333
The
Since the t.test
makes some assumptions, we can use the permTS
function instead. It can conduct a permutation or randomization test, but it requires us to load the perm
package first.
1
2
3
# install.packages("perm") # If you have not already installed it
library(perm)
permTS(len ~ supp, data=df)
Permutation Test using Asymptotic Approximation
data: len by supp
Z = 1.8734, p-value = 0.06102
alternative hypothesis: true mean supp=OJ - mean supp=VC is not equal to 0
sample estimates:
mean supp=OJ - mean supp=VC
3.7
The
If there are multiple levels (2 or more), you can apply the parametric ANOVA test which in this case will provide a similar
1
2
aov1 <- aov(len ~ supp, data = df)
summary(aov1)
Df Sum Sq Mean Sq F value Pr(>F)
supp 1 205 205.35 3.668 0.0604 .
Residuals 58 3247 55.98
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The supp
is shown at the end of the supp
row, in the Pr(>F)
column. Because it is 0.0604, which is greater than 0.05, at a 5% significance level, we see that the length of the tooth does not differ between the delivery methods.
However, if the assumptions of ANOVA are not met, we can utilize the non parametric approach via the Kruskal-Wallis Test.
1
kruskal.test(len ~ supp, data = df)
Kruskal-Wallis rank sum test
data: len by supp
Kruskal-Wallis chi-squared = 3.4454, df = 1, p-value = 0.06343
The
Content last modified on 24 July 2023.
See a problem? Tell us or edit the source.
Contributed by Krtin Juneja (KJUNEJA@falcon.bentley.edu)