#effectsize


When designing a scientific experiment, a key decision is the sample size needed for the results of the experiment to be meaningful.

How many cells do I need to measure? How many people do I interview? How many patients do I try my new drug on?

This is especially important for quantitative studies, where we use statistics to determine whether a treatment or condition has an effect. Indeed, when we test a drug on a (small) number of patients, we do so in the hope that our results will generalise to any patient, because it would be impossible to test the drug on everyone.

The solution is to perform a "power analysis", a calculation that tells us whether, given our experimental design, the statistical test we are using is able to detect an effect of a certain magnitude, if that effect is really there. In other words, it tells us whether the experiment we're planning could give us meaningful results.
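As a rough illustration, here is a minimal power-analysis sketch in Python, assuming the statsmodels library and a simple two-group comparison; the effect size, alpha, and power values are placeholders, not numbers from any particular study:

```python
# Minimal power-analysis sketch using statsmodels (assumed available).
# Given a hypothesised effect size (Cohen's d), a significance level, and a
# desired power, solve_power() returns the sample size needed per group.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,   # hypothesised Cohen's d (placeholder value)
    alpha=0.05,        # significance level
    power=0.8,         # desired probability of detecting the effect
    alternative="two-sided",
)
print(f"Required sample size per group: {n_per_group:.1f}")
```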

But, as I said, to do a power analysis we need to decide what size of effect we would like to be able to detect. So... do scientists actually do that?

We explored this question in the context of the chronic variable stress literature.

We found that only a few studies gave a clear justification for the sample size used, and of those that did, only a very small fraction used a biologically meaningful effect size as part of the sample size calculation. We discuss challenges around identifying a biologically meaningful effect size and ways to overcome them.

Read more here!
physoc.onlinelibrary.wiley.com

It always takes me a few minutes to look up the interpretation guidelines for the various effect size measures (yes, I know the rules of thumb are somewhat arbitrary). Today I edited Wikipedia so that it shows three different guidelines for four different measures in the same table. Hopefully this saves some time for other researchers.
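For a flavour of what such rules of thumb look like, here is a small sketch using Cohen's (1988) conventional benchmarks only; this is an illustration, not the table on Wikipedia, and the measure names are just placeholder keys:

```python
# Cohen's (1988) conventional benchmarks for a few common effect size
# measures -- one of several rule-of-thumb guidelines, shown here only
# as an illustration (not the exact table referred to above).
COHEN_BENCHMARKS = {
    "cohens_d":    {"small": 0.2,  "medium": 0.5,  "large": 0.8},
    "pearsons_r":  {"small": 0.1,  "medium": 0.3,  "large": 0.5},
    "eta_squared": {"small": 0.01, "medium": 0.06, "large": 0.14},
    "cohens_f":    {"small": 0.10, "medium": 0.25, "large": 0.40},
}

def label_effect(measure: str, value: float) -> str:
    """Return the conventional label for an observed effect size."""
    cuts = COHEN_BENCHMARKS[measure]
    if value >= cuts["large"]:
        return "large"
    if value >= cuts["medium"]:
        return "medium"
    if value >= cuts["small"]:
        return "small"
    return "negligible"

print(label_effect("cohens_d", 0.45))  # -> "small"
```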

#statstab #260 Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data

Thoughts: "A_w and d_r were generally robust to these violations"

#robust #effectsize #ttest #2groups #metaanalysis #assumptions #cohend

link.springer.com/article/10.3

SpringerLink: Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data - Behavior Research Methods

In psychological science, the “new statistics” refer to the new statistical practices that focus on effect size (ES) evaluation instead of conventional null-hypothesis significance testing (Cumming, Psychological Science, 25, 7–29, 2014). In a two-independent-samples scenario, Cohen’s (1988) standardized mean difference (d) is the most popular ES, but its accuracy relies on two assumptions: normality and homogeneity of variances. Five other ESs—the unscaled robust d (d_r*; Hogarty & Kromrey, 2001), scaled robust d (d_r; Algina, Keselman, & Penfield, Psychological Methods, 10, 317–328, 2005), point-biserial correlation (r_pb; McGrath & Meyer, Psychological Methods, 11, 386–401, 2006), common-language ES (CL; Cliff, Psychological Bulletin, 114, 494–509, 1993), and nonparametric estimator for CL (A_w; Ruscio, Psychological Methods, 13, 19–30, 2008)—may be robust to violations of these assumptions, but no study has systematically evaluated their performance. Thus, in this simulation study the performance of these six ESs was examined across five factors: data distribution, sample, base rate, variance ratio, and sample size. The results showed that A_w and d_r were generally robust to these violations, and A_w slightly outperformed d_r. Implications for the use of A_w and d_r in real-world research are discussed.
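As a rough sketch of two of the measures discussed above, here is classical pooled-SD Cohen's d next to an A_w-style nonparametric probability-of-superiority estimate; this is an illustration under my own assumptions, not the authors' implementation, and the group data are made up:

```python
# Minimal sketch: classical Cohen's d (pooled SD) versus a nonparametric
# probability-of-superiority estimate (A_w-style), which does not assume
# normality or equal variances. Illustrative only.
import numpy as np

def cohens_d(x, y):
    """Standardised mean difference using the pooled standard deviation."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X == Y) over all pairs of observations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    diffs = x[:, None] - y[None, :]
    return (diffs > 0).mean() + 0.5 * (diffs == 0).mean()

rng = np.random.default_rng(0)
a = rng.normal(0.5, 1.0, size=30)   # toy "treatment" group
b = rng.normal(0.0, 2.0, size=30)   # toy "control" group, unequal variance
print(f"Cohen's d: {cohens_d(a, b):.2f}, P(X>Y): {prob_superiority(a, b):.2f}")
```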

#statstab #254 Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

Thoughts: I share tutorial papers, as people resonate with different writing styles and explanations.

#statistics #guide #tutorial #effectsize

link.springer.com/article/10.1

SpringerLink: Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations - European Journal of Epidemiology

Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so—and yet these misinterpretations dominate much of the scientific literature. In light of this problem, we provide definitions and a discussion of basic statistics that are more general and critical than typically found in traditional introductory expositions. Our goal is to provide a resource for instructors, researchers, and consumers of statistics whose knowledge of statistical theory and technique may be limited but who wish to avoid and spot misinterpretations. We emphasize how violation of often unstated analysis protocols (such as selecting analyses for presentation based on the P values they produce) can lead to small P values even if the declared test hypothesis is correct, and can lead to large P values even if that hypothesis is incorrect. We then provide an explanatory list of 25 misinterpretations of P values, confidence intervals, and power. We conclude with guidelines for improving statistical interpretation and reporting.
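One point from that abstract (that selecting analyses for presentation based on their P values inflates false positives) can be seen in a tiny simulation; this is a sketch with arbitrary settings of my own, not anything from the paper:

```python
# Tiny simulation: if we test several outcomes and report only the smallest
# p-value, the chance of a "significant" result exceeds the nominal 5% even
# when every null hypothesis is true. Illustrative sketch, arbitrary settings.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_experiments, n_outcomes, n_per_group = 2000, 5, 20

false_positives = 0
for _ in range(n_experiments):
    p_values = []
    for _ in range(n_outcomes):
        # Both groups drawn from the same distribution: no true effect.
        a = rng.normal(size=n_per_group)
        b = rng.normal(size=n_per_group)
        p_values.append(stats.ttest_ind(a, b).pvalue)
    if min(p_values) < 0.05:        # report only the "best" outcome
        false_positives += 1

print(f"False-positive rate with selection: {false_positives / n_experiments:.2f}")
# Typically close to 1 - 0.95**5, about 0.23, far above the nominal 0.05.
```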