Example 1-power simulation

Power simulation for example 1

Chicken Welfare - bone density (difference between 2 means)

load the required packages

library('paramtest')
library('pwr')
library('tidyverse')

Create a function to simulate data and perform t-test

the function creates two normally distributed samples of data, namely x1 and x2
x1 has mean = 0 and SD = 1
x2 has mean = d and SD = 1
the function requires you to specify the standardised effect size, d
and the sample size for each group, N

t_func <- function(simNum, N, d) {
  x1 <- rnorm(N, 0, 1)
  x2 <- rnorm(N, d, 1)
  
  t <- t.test(x1, x2, var.equal = TRUE)  # run t-test
  stat <- t$statistic
  p <- t$p.value
  
  return(c(t = stat, p = p, sig = (p < .05)))
  # return a named vector with the results we want to keep
  # value of sig will be TRUE or FALSE based on the p value
}

Specify the values of sample size N (per group) and cohens d for the smallest effect size of interest

We will begin with using values of d and N from the G*Power example
Sample size N = 64
cohen’s d = 0.5
number of iterations = 1000
Use paramtest::run_test() to run our required number of iterations of the function results go in the object “power_ttest”

power_ttest <-
  run_test(
    t_func,
    n.iter = 1000,
    output = 'data.frame',
    N = 64,
    d = .5
  )

## Running 1,000 tests...

report the proportion of “sig” values that are TRUE

results(power_ttest) %>%
  summarise(power = mean(sig))

##   power
## 1 0.793

Compare the reported value with the value from G*Power (power = 0.8015)
Note that it will be slightly different, but accuracy can be improved by increasing the number of iterations.

Use paramtest::grid_search() to explore scenarios

give the params argument a list of parameters we want to vary.
lets try N=50, N=100 and N=150 (per group)

power_ttest_vary <-
  grid_search(
    t_func,
    params = list(N = c(50, 100, 150)),
    n.iter = 1000,
    output = 'data.frame',
    d = .5
  )

## Running 3,000 tests...

results(power_ttest_vary) %>%
  group_by(N.test) %>%
  summarise(power = mean(sig))

## # A tibble: 3 x 2
##   N.test power
##    <dbl> <dbl>
## 1     50 0.68 
## 2    100 0.925
## 3    150 0.986

Scenarios: 2 parameters varying

vary N and Cohen’s d
We will use values to cover the range shown in the G*Power example.

power_ttest_vary2 <-
  grid_search(
    t_func,
    params = list(N = c(25, 50, 100, 200), d = c(.33, .5, .67)),
    n.iter = 1000,
    output = 'data.frame'
  )

## Running 12,000 tests...

power <- results(power_ttest_vary2) %>%
  group_by(N.test, d.test) %>%
  summarise(power = mean(sig))

## `summarise()` has grouped output by 'N.test'. You can override using the `.groups` argument.

print(power)

## # A tibble: 12 x 3
## # Groups:   N.test [4]
##    N.test d.test power
##     <dbl>  <dbl> <dbl>
##  1     25   0.33 0.223
##  2     25   0.5  0.413
##  3     25   0.67 0.654
##  4     50   0.33 0.361
##  5     50   0.5  0.696
##  6     50   0.67 0.902
##  7    100   0.33 0.636
##  8    100   0.5  0.943
##  9    100   0.67 0.998
## 10    200   0.33 0.897
## 11    200   0.5  0.997
## 12    200   0.67 1

Plot the power simulation

ggplot(power, aes(
  x = N.test,
  y = power,
  group = factor(d.test),
  colour = factor(d.test)
)) +
  geom_point() +
  geom_line() +
  ylim(c(0, 1)) +
  xlim(c(0, 200)) +
  labs(x = 'Sample Size', y = 'Power', colour = "Cohen's d") +
  theme_minimal()

Compare the plot with the G*Power plot. They are roughly similar.

Note: I have set n.iter = 1000 for each simulation to keep the run time very short. Consider increasing n.iter to improve accuracy of results.

This simulation method can be generalised to a wide variety of statistical models. Other packages may be preferred for specific situations.

Reference: https://cran.r-project.org/web/packages/paramtest/vignettes/Simulating-Power.html