Statistics

A simpler way of understanding (and teaching!) basic statistics

5. April 2019 8. April 2019

Last week, I published a cheat sheet and a post on how most common statistical tests are simple linear models. This started out as a hobby project last summer, but a few weeks ago, I realized that this was actually really important. So I spent many evenings polishing, and with my heart pounding, I tweeted it. It got a great reception and gathered more than half a million views on Twitter within the first day. On the bright side, this shows that people care about understanding statistics, and communicating it effectively. On the flip side, it …
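The idea that a common test is a linear model can be checked by hand for one case. The following is a minimal sketch, not taken from the cheat sheet itself: it assumes the equal-variance (Student's) two-sample t-test and compares its t statistic with the slope t statistic from an ordinary least-squares fit of y = b0 + b1*x, where x is a 0/1 group indicator. The function names and the simulated data are illustrative assumptions.

```python
import math
import random
import statistics


def students_t(a, b):
    """Classic equal-variance two-sample t statistic (mean of b minus mean of a)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(b) - statistics.mean(a)) / se


def ols_slope_t(x, y):
    """t statistic of the slope b1 in y = b0 + b1*x, fitted by least squares."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    # Residual sum of squares, then the standard error of the slope
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)
    return b1 / se_b1


# Simulated data (arbitrary group sizes and means, fixed seed)
rng = random.Random(1)
group_a = [rng.gauss(0.0, 1.0) for _ in range(20)]
group_b = [rng.gauss(0.5, 1.0) for _ in range(25)]

# Dummy-code group membership: 0 for group a, 1 for group b
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

t_classic = students_t(group_a, group_b)
t_linear = ols_slope_t(x, y)
print(t_classic, t_linear)
```

The two statistics agree up to floating-point error, because with a 0/1 predictor the slope is the difference in group means and the residual variance is the pooled variance.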

SPSS is dying. It’s time to change.

13. March 2019 8. April 2019

I predict that R will overtake SPSS in yearly citations by 2020. The implications are clear: If you use SPSS in your business or research, move to R now rather than later. Do not ask for SPSS competencies in job postings. You will scare away the good candidates. We are doing students a disservice by teaching SPSS. Switch to JASP for simple one-off analyses and R for complex or repeated analyses. RStudio Desktop is a highly recommended interface to R. The numbers: The numbers have been clear for a number of years now that SPSS was …

Scoring Complex Span tasks using performance discontinuities (VSS 2018 poster)

18. May 2018 12. November 2020

I’m in Tampa, Florida, for the annual Vision Sciences Society (VSS) meeting. I brought one of my pet projects with me. Performance discontinuities have been used in a sort-of-informally-eye-balling-graphs way to estimate working memory capacity. Some of this is cited by Cowan (2000) as evidence for a “magical” capacity of four chunks in working memory. I try to formalise the estimate of working memory capacity from performance discontinuities using a Bayesian analysis. In brief, I think that most of the current scores on serial recall tasks are either hard to interpret theoretically or very complex. Existing …

Can I use parametric analyses for my Likert scales? A brief reading guide to the evidence-based answer.

6. April 2018 8. April 2019

Update (Aug 7th, 2018): after reading this preprint by Liddell & Kruschke (2017), I am convinced that it would be even better to analyze Likert scales using ordered-probit models. This is still a parametric model; just with non-metric intervals between response category thresholds. What I write below still holds for the non-parametric vs. parametric discussion. Whether to use parametric or non-parametric analyses for questionnaires is a very common question from students. It is also an excellent question since there seem to be strong opinions on both sides and that should make you search for deeper …

New tutorial on computing Bayes factors in R

11. February 2018 8. April 2019

I just published a practical guide on computing Bayes factors using various packages in R. Head over to RPubs and check out How to compute Bayes factors using lm, lmer, BayesFactor, brms, and JAGS/stan/pymc3. My first goal is to present solutions to things that I found difficult in the respective packages and which are relatively undocumented. A second goal is to show a side-by-side comparison of whether the packages converge on the same Bayes factor estimates. I hope to keep the document updated. In particular, I’m keeping an eye on the development of BAS and I …

Software for graphical models

23. November 2017 31. July 2022

I’m currently writing a paper on a new Bayesian method for scoring Complex Span tasks. I needed some software to represent it using plate notation of a directed acyclic graph (DAG). Many people have pointed to the TikZ/PGF drawing library for LaTeX, but I did not want to install LaTeX for this simple task. Here I briefly review yEd, Daft, and LibreOffice Draw. yEd: I ended up using yEd and produced this graph, which contains plates, estimated nodes, deterministic nodes (double-lined), observed nodes (shaded), and categorical nodes (rectangles). yEd is a purely graphical editor, which is fast to work with and great …

Decimals of PI with consistent colors

10. November 2017 31. July 2022

I was invited to do a fun task by my office colleague, Hazel Anderson. She researches synesthesia, and she wanted to induce grapheme-color synesthesia by having participants learn pi using digit-color mapping as one available strategy. So she needed something that could generate a Word document with pi to an arbitrary number of decimal places. Approximately 40 minutes of the pure joy of structured procrastination later, it was done. Here’s the Python script to generate this beauty:

    # Settings here
    DIGITS = 5000  # Number of decimal places to print
    BLOCK_SIZE = 4  # Number of digits …
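The script excerpt above is truncated, so the following is only a hypothetical sketch of the core idea (one fixed color per digit, digits grouped into blocks), not the original script. It assumes HTML output instead of a Word document, hard-codes the first 20 decimals of pi instead of computing DIGITS of them, and uses an arbitrary color table; `COLORS`, `split_blocks`, and `colorize` are illustrative names.

```python
# First 20 decimals of pi, hard-coded for the sketch (assumption: the
# original script computes an arbitrary number of decimals instead).
PI_DECIMALS = "14159265358979323846"
BLOCK_SIZE = 4  # Number of digits per block

# Arbitrary but fixed digit-to-color mapping (hypothetical palette)
COLORS = {
    "0": "#000000", "1": "#e6194b", "2": "#3cb44b", "3": "#4363d8",
    "4": "#f58231", "5": "#911eb4", "6": "#008080", "7": "#9a6324",
    "8": "#800000", "9": "#808000",
}


def split_blocks(decimals, block_size=BLOCK_SIZE):
    """Split a digit string into consecutive blocks of block_size digits."""
    return [decimals[i:i + block_size] for i in range(0, len(decimals), block_size)]


def colorize(decimals, block_size=BLOCK_SIZE):
    """Return an HTML string in which every digit always gets the same color."""
    html_blocks = []
    for block in split_blocks(decimals, block_size):
        spans = "".join(
            f'<span style="color:{COLORS[d]}">{d}</span>' for d in block
        )
        html_blocks.append(spans)
    # Blocks are separated by a space to make the digits easier to chunk
    return "3." + " ".join(html_blocks)


html_output = colorize(PI_DECIMALS)
print(html_output)
```

Because the mapping is a plain dictionary lookup, a given digit is rendered in the same color wherever it occurs, which is the consistency the post title refers to.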

$$\sqrt{2}$$ is superior to Bessel’s correction

15. April 2014 8. April 2019

… for the estimation of the population standard deviation. That is, if you substitute $$n - 1$$ with $$n - \sqrt{2}$$, you get a much less biased estimate of the population SD. I just stumbled upon this when I desperately (!) looked to frequentist methods because of convergence problems with my Bayesian model. The details can be seen in my post on Cross Validated about this curious finding as well as some excellent answers/elaborations. I’m re-posting the simulation results here, just because they’re pretty and I want some content on this blog. Naturally, $$\sqrt{2}$$ doesn’t outperform the analytically …
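The bias claim can also be checked without simulation, under the assumption of normally distributed data. The sketch below is not the simulation from the post; it uses the standard result that, for $$n$$ normal observations, the expected value of $$\sqrt{SS/(n-1)}$$ is $$c_4(n)\,\sigma$$ with $$c_4(n) = \sqrt{2/(n-1)}\;\Gamma(n/2)/\Gamma((n-1)/2)$$, where $$SS$$ is the sum of squared deviations from the sample mean. Dividing by $$n - k$$ instead rescales that expectation by $$\sqrt{(n-1)/(n-k)}$$. The function name `expected_sd_ratio` and the chosen sample sizes are illustrative.

```python
import math


def expected_sd_ratio(n, k):
    """E[sqrt(SS / (n - k))] / sigma for n i.i.d. normal observations.

    A value of 1.0 means the SD estimator is unbiased.
    """
    # c4 is the expected value of the Bessel-corrected SD divided by sigma
    c4 = math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)
    # Swapping the denominator n - 1 for n - k rescales the estimator
    return c4 * math.sqrt((n - 1) / (n - k))


for n in (2, 3, 5, 10, 30):
    bessel = expected_sd_ratio(n, 1.0)
    root2 = expected_sd_ratio(n, math.sqrt(2))
    print(f"n={n:2d}  n-1: {bessel:.4f}  n-sqrt(2): {root2:.4f}")
```

For each of these sample sizes, the ratio for $$n - \sqrt{2}$$ lies closer to 1 than the ratio for $$n - 1$$, which is the sense in which the substitution is less biased for the SD.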

This PNAS paper could be the go-to example of how not to interpret statistics

5. December 2013 28. January 2017

A few days ago, a science reporter asked me to evaluate a PNAS paper titled Gender differences in the structural connectome of the human brain. As it turns out, this paper is awful. On the positive side, it’s an excellent opportunity to highlight common mistakes in psychology/neuroscience and how to do it (more) properly. 1. Minute effect sizes do not allow for bold generalizations. From the abstract: “In all supratentorial regions, males had greater within-hemispheric connectivity, as well as enhanced modularity and transitivity, whereas between-hemispheric connectivity and cross-module participation predominates in females.” The paper generalizes its significant …