Home  >  assets  >  files

Power simulation for example 1

```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` ### Chicken Welfare - bone density (difference between 2 means) load the required packages ```{r load packages, echo=TRUE, message=FALSE, warning=FALSE} library('paramtest') library('pwr') library('tidyverse') ``` ### Create a function to simulate data and perform t-test the function creates two normally distributed samples of data, namely x1 and x2 x1 has mean = 0 and SD = 1 x2 has mean = d and SD = 1 the function requires you to specify the standardised effect size, d and the sample size for each group, N ```{r t_func, echo=TRUE} t_func <- function(simNum, N, d) { x1 <- rnorm(N, 0, 1) x2 <- rnorm(N, d, 1) t <- t.test(x1, x2, var.equal = TRUE) # run t-test stat <- t$statistic p <- t$p.value return(c(t = stat, p = p, sig = (p < .05))) # return a named vector with the results we want to keep # value of sig will be TRUE or FALSE based on the p value } ``` ### Specify the values of sample size N (per group) and cohens d for the smallest effect size of interest We will begin with using values of d and N from the G*Power example Sample size N = 64 cohen's d = 0.5 number of iterations = 1000 Use paramtest::run_test() to run our required number of iterations of the function results go in the object "power_ttest" ```{r power_ttest, echo=TRUE} power_ttest <- run_test( t_func, n.iter = 1000, output = 'data.frame', N = 64, d = .5 ) ``` ### report the proportion of "sig" values that are TRUE ```{r results, echo=TRUE} results(power_ttest) %>% summarise(power = mean(sig)) ``` Compare the reported value with the value from G*Power (power = 0.8015) Note that it will be slightly different, but accuracy can be improved by increasing the number of iterations. ### Use paramtest::grid_search() to explore scenarios give the params argument a list of parameters we want to vary. lets try N=50, N=100 and N=150 (per group) ```{r scenario_N, echo=TRUE} power_ttest_vary <- grid_search( t_func, params = list(N = c(50, 100, 150)), n.iter = 1000, output = 'data.frame', d = .5 ) results(power_ttest_vary) %>% group_by(N.test) %>% summarise(power = mean(sig)) ``` ### Scenarios: 2 parameters varying vary N and Cohen's d We will use values to cover the range shown in the G*Power example. ```{r vary2, echo=TRUE} power_ttest_vary2 <- grid_search( t_func, params = list(N = c(25, 50, 100, 200), d = c(.33, .5, .67)), n.iter = 1000, output = 'data.frame' ) power <- results(power_ttest_vary2) %>% group_by(N.test, d.test) %>% summarise(power = mean(sig)) print(power) ``` ### Plot the power simulation ```{r plot2, echo=TRUE} ggplot(power, aes( x = N.test, y = power, group = factor(d.test), colour = factor(d.test) )) + geom_point() + geom_line() + ylim(c(0, 1)) + xlim(c(0, 200)) + labs(x = 'Sample Size', y = 'Power', colour = "Cohen's d") + theme_minimal() ``` Compare the plot with the G*Power plot. They are roughly similar. Note: I have set n.iter = 1000 for each simulation to keep the run time very short. Consider increasing n.iter to improve accuracy of results. This simulation method can be generalised to a wide variety of statistical models. Other packages may be preferred for specific situations. Reference: https://cran.r-project.org/web/packages/paramtest/vignettes/Simulating-Power.html