Non-parametric tests

Inferential Statistics (cont.)

Both the t-test and Pearson's correlation assume normality. What if the data are not normally distributed?

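Let us test for normality using the Shapiro-Wilk test. Its null hypothesis is that the data come from a normal distribution.

Normality can be assumed only if p > 0.05; a smaller p-value means we reject normality.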

Input

shapiro.test(carbon$AverageTemperature)

Output

Shapiro-Wilk normality test

data:  carbon$AverageTemperature
W = 0.94052, p-value = 0.008176
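To see how the p > 0.05 rule behaves, here is a minimal sketch on simulated data rather than the carbon data set. The helper `looks_normal()` and both sample names are hypothetical, introduced only for illustration.

```r
# Simulated data (not the carbon data set): one normal and one skewed sample
set.seed(42)
normal_sample <- rnorm(50, mean = 20, sd = 3)
skewed_sample <- rexp(200, rate = 0.2)

# Hypothetical helper: TRUE when Shapiro-Wilk does not reject normality
looks_normal <- function(x, alpha = 0.05) {
  shapiro.test(x)$p.value > alpha
}

looks_normal(normal_sample)  # decision for the normal sample
looks_normal(skewed_sample)  # decision for the strongly skewed sample
```

A strongly skewed sample like the second one should be rejected, while a truly normal sample is still rejected about 5% of the time by chance.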

For AverageTemperature, p = 0.008 is below 0.05, so we reject normality. We can also check normality visually with geom_density():

Input

library(ggplot2)  # if not already loaded
ggplot(carbon, aes(x = AverageTemperature)) +
  geom_density()

Notice that the curve is not the normal curve that we presented before.
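One way to make this comparison explicit is to draw a normal curve with the same mean and standard deviation on top of the density. The sketch below uses simulated, right-skewed values; the data frame `sim` and its column `value` are assumptions, not part of the carbon data set.

```r
library(ggplot2)

# Simulated, right-skewed values standing in for a non-normal variable
set.seed(1)
sim <- data.frame(value = rexp(200, rate = 0.2))

# Empirical density (solid) versus a normal curve with the same
# mean and standard deviation (dashed)
p <- ggplot(sim, aes(x = value)) +
  geom_density() +
  stat_function(
    fun = dnorm,
    args = list(mean = mean(sim$value), sd = sd(sim$value)),
    linetype = "dashed"
  )
print(p)
```

The further the solid line departs from the dashed one, the less plausible the normality assumption is.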

Whenever the data are normal, we run a parametric test; when they are not, we use a non-parametric test instead. Most parametric tests have a non-parametric sibling. For instance:

| Parametric test | R function | Non-parametric test | R function |
|---|---|---|---|
| Independent t-test | `t.test(y~x)` | Mann-Whitney test | `wilcox.test(y~x)` |
| Paired t-test | `t.test(y1, y2, paired=TRUE)` | Wilcoxon signed rank test | `wilcox.test(y1, y2, paired=TRUE)` |
| One-way ANOVA | `aov(y ~ x, data = my_data)` | Kruskal-Wallis test | `kruskal.test(y~x)` |
| Pearson’s correlation | `cor.test(x, y, method="pearson")` | Spearman’s correlation | `cor.test(x, y, method="spearman")` |
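As a rough illustration of the non-parametric column, the sketch below runs three of these tests on simulated data. The names `group`, `x`, and `y` are hypothetical and the values are randomly generated, so the results say nothing about the carbon data set.

```r
# Simulated data (hypothetical names): two groups and two numeric variables
set.seed(123)
group <- factor(rep(c("a", "b"), each = 30))
y <- c(rexp(30, rate = 1), rexp(30, rate = 0.5))  # skewed outcome
x <- y + rnorm(60)                                # numeric variable related to y

# Mann-Whitney test: non-parametric sibling of the independent t-test
mw <- wilcox.test(y ~ group)

# Kruskal-Wallis test: non-parametric sibling of one-way ANOVA
kw <- kruskal.test(y ~ group)

# Spearman's correlation: non-parametric sibling of Pearson's correlation
sp <- cor.test(x, y, method = "spearman")

mw$p.value
kw$p.value
sp$estimate
```

Each call returns an `htest` object, so the p-value and the test statistic can be extracted in the same way as for `t.test()`.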

If we are on track, try to run the proper non-parametric tests for our data/analysis: