# How to test for a treatment effect in a single factor design

## Description

Suppose you are given a dataset that has more than one treatment level and you wish to see if there is a unit-level treatment effect. How would you check that?

## Using SciPy and statsmodels, in Python

View this solution alone.

The solution below uses an example dataset about the teeth of 10 guinea pigs at three Vitamin C dosage levels (in mg) with two delivery methods (orange juice vs. ascorbic acid). (See how to quickly load some sample data.)

1
2
from rdatasets import data
df = data('ToothGrowth')


In this dataset, there are only two treatments (orange juice and ascorbic acid, in the variable supp). We can therefore perrform a two-sample $t$ test. But first we must filter the outcome variable len (tooth length) based on supp.

1
2
3
4
5
subjects_receiving_oj = df[df['supp']=='OJ']['len']
subjects_receiving_vc = df[df['supp']=='VC']['len']

import scipy.stats as stats
stats.ttest_ind( subjects_receiving_oj, subjects_receiving_vc, equal_var=False )

1
Ttest_indResult(statistic=1.91526826869527, pvalue=0.06063450788093387)


At the 5% significance level, we see that the length of the tooth does not differ between the two delivery methods. We assume that the model assumptions are met, but do not check that here.

If there are multiple levels (two or more), you can apply the parametric ANOVA test which in this case will provide a similar $p$ value.

1
2
3
4
5
from statsmodels.formula.api import ols
model = ols('len ~ supp', data = df).fit()

import statsmodels.api as sm
sm.stats.anova_lm(model, typ=1)

df sum_sq mean_sq F PR(>F)
supp 1.0 205.350000 205.350000 3.668253 0.060393
Residual 58.0 3246.859333 55.980333 NaN NaN

We see the $p$ value in the final column is very similar.

However, if the assumptions of ANOVA are not met, we can utilize a nonparametric approach via the Kruskal-Wallis Test. We use the filtered variables defined above and import the kruskal function from SciPy.

1
2
from scipy.stats import kruskal
kruskal( subjects_receiving_oj, subjects_receiving_vc )

1
KruskalResult(statistic=3.4453580631407035, pvalue=0.06342967639688878)


Similar to the previous results, the length of the tooth does not differ between the delivery methods at the 5% significance level.

See a problem? Tell us or edit the source.

## Using perm, in R

View this solution alone.

The solution below uses an example dataset about the teeth of 10 guinea pigs at three Vitamin C dosage levels (in mg) with two delivery methods (orange juice vs. ascorbic acid). (See how to quickly load some sample data.)

1
df <- ToothGrowth


In this dataset, there are only two treatments (orange juice and ascorbic acid, in the variable supp). We can therefore perrform a two-sample $t$ test. But first we must filter the outcome variable len (tooth length) based on supp.

1
t.test(len ~ supp, data=df)

1
2
3
4
5
6
7
8
9
10
Welch Two Sample t-test

data:  len by supp
t = 1.9153, df = 55.309, p-value = 0.06063
alternative hypothesis: true difference in means between group OJ and group VC is not equal to 0
95 percent confidence interval:
-0.1710156  7.5710156
sample estimates:
mean in group OJ mean in group VC
20.66333         16.96333


The $p$-value is reported in the first row of numerical output as 0.06063. Because this is greater than 0.05, at a 5% significance level, we see that the length of the tooth does not differ between the two delivery methods.

Since the t.test makes some assumptions, we can use the permTS function instead. It can conduct a permutation or randomization test, but it requires us to load the perm package first.

1
2
3
# install.packages("perm") # If you have not already installed it
library(perm)
permTS(len ~ supp, data=df)

1
2
3
4
5
6
7
8
Permutation Test using Asymptotic Approximation

data:  len by supp
Z = 1.8734, p-value = 0.06102
alternative hypothesis: true mean supp=OJ - mean supp=VC is not equal to 0
sample estimates:
mean supp=OJ - mean supp=VC
3.7


The $p$-value is reported in the first row of numerical output as 0.06102. Because this is greater than 0.05, at a 5% significance level, we see that the length of the tooth does not differ between the two delivery methods. We assume that the model assumptions are met but not shown in this task.

If there are multiple levels (2 or more), you can apply the parametric ANOVA test which in this case will provide a similar $p$-value.

1
2
aov1 <- aov(len ~ supp, data = df)
summary(aov1)

1
2
3
4
5
Df Sum Sq Mean Sq F value Pr(>F)
supp         1    205  205.35   3.668 0.0604 .
Residuals   58   3247   55.98
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


The $p$-value for supp is shown at the end of the supp row, in the Pr(>F) column. Because it is 0.0604, which is greater than 0.05, at a 5% significance level, we see that the length of the tooth does not differ between the delivery methods.

However, if the assumptions of ANOVA are not met, we can utilize the non parametric approach via the Kruskal-Wallis Test.

1
kruskal.test(len ~ supp, data = df)

1
2
3
4
Kruskal-Wallis rank sum test

data:  len by supp
Kruskal-Wallis chi-squared = 3.4454, df = 1, p-value = 0.06343


The $p$-value is the last part of the output, and is 0.06343. Because it is greater than 0.05, at a 5% significance level, we see that the length of the tooth does not differ between the delivery methods.