A few days ago, a science reporter asked me to evaluate a PNAS paper titled “Sex differences in the structural connectome of the human brain”. As it turns out, this paper is awful. On the positive side, it's an excellent opportunity to highlight common mistakes in psychology/neuroscience and how to do things (more) properly.
1. Minute effect sizes do not allow for bold generalizations
“In all supratentorial regions, males had greater within-hemispheric connectivity, as well as enhanced modularity and transitivity, whereas between-hemispheric connectivity and cross-module participation predominates in females.”
The paper generalizes its significant findings of structural brain differences between males and females to ALL males and females. Let's evaluate how wrong you'd be if you took the authors' conclusion for granted (using the Common Language effect size): if you guess that a randomly picked female has more between-hemispheric connections than a randomly picked male, you'd be wrong more than 41 % of the time. And for their second major conclusion: if you do the same test, guessing male superiority with respect to within-hemispheric connections, you'd be wrong at least 37 % of the time. It is outright frightening to think that policy makers could be influenced by this paper!
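For readers who want to see how such error rates relate to a standardized effect size, here is a minimal sketch. It assumes two normal distributions with equal variance, in which case the Common Language effect size is Φ(d/√2) for a Cohen's d of d. The function names are mine, and the numbers fed in are only the 41 % and 37 % error rates quoted above, not values recomputed from the paper's data.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def common_language_effect_size(d: float) -> float:
    """Probability that a random member of the 'higher' group beats a random
    member of the 'lower' group, assuming normal, equal-variance groups
    separated by Cohen's d."""
    return STANDARD_NORMAL.cdf(d / sqrt(2))


def cohens_d_from_error_rate(error_rate: float) -> float:
    """Invert the relation: the Cohen's d at which guessing 'higher group wins'
    is wrong `error_rate` of the time (same normality assumption)."""
    return sqrt(2) * STANDARD_NORMAL.inv_cdf(1 - error_rate)


# Error rates quoted in the text above (assumed inputs, not recomputed).
for error_rate in (0.41, 0.37):
    d = cohens_d_from_error_rate(error_rate)
    print(f"wrong {error_rate:.0%} of the time -> Cohen's d of about {d:.2f}")
```

Under these assumptions, an effect that leaves you wrong roughly four times out of ten corresponds to a small-to-moderate Cohen's d, which is a thin basis for statements about "male brains" and "female brains" in general.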
2. Don't do reverse inference when you can do forward inference
“Overall, the results suggest that male brains are structured to facilitate connectivity between perception and coordinated action, whereas female brains are designed to facilitate communication between analytical and intuitive processing modes”.
The authors say that the (minute) structural differences explain behavioral differences between genders. This is reverse inference, which can be very uninformative when applied to macrostructures such as lobes or hemispheres (read: small effect size). The same critique as above therefore applies.
It is particularly strange that the authors collected gender-relevant behavioral data on all subjects (see page 4, right column) but did not statistically test them against the structural connectivity of those same subjects. This is the obvious analysis to run if they wanted to claim that the structural differences underlie behavioral gender differences, and a convincing relationship would shut down critics like me. The fact that they didn't explicitly test to what extent the relationship between individuals' connectivity and behavior is modulated by gender makes me worry that no such relationship is there.
3. Interaction is required to claim developmental differences
“Analysis of these changes developmentally demonstrated differences in trajectory between males and females, mainly in adolescence and adulthood”.
… but the age × sex interaction is non-significant (page 3, right column). It is therefore invalid to conclude that the sexes develop differently, since a major part of this difference might be present at the outset. This is an all too common error in neuroscience.
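To make the logic concrete, here is a small simulation sketch. Everything in it is made up for illustration (group sizes, intercepts, slope, noise level); it is not the paper's data or the authors' analysis. Two groups differ by a constant offset but change with age at exactly the same rate, so there is a clear group difference and yet no age × sex interaction. The interaction is approximated here by comparing the two per-group slopes with a simple z statistic.

```python
import random
from math import sqrt


def fit_slope(x, y):
    """Least-squares slope of y on x, plus the slope's standard error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sqrt(rss / (n - 2) / sxx)


def simulate_group(rng, n, intercept, slope, noise_sd=0.3):
    """Hypothetical connectivity scores that change linearly with age."""
    ages = [rng.uniform(8, 22) for _ in range(n)]
    scores = [intercept + slope * a + rng.gauss(0, noise_sd) for a in ages]
    return ages, scores


rng = random.Random(0)
# Same developmental slope in both groups; only the starting level differs.
male_age, male_y = simulate_group(rng, 500, intercept=1.0, slope=0.05)
female_age, female_y = simulate_group(rng, 500, intercept=1.5, slope=0.05)

mean_gap = sum(female_y) / len(female_y) - sum(male_y) / len(male_y)
male_slope, male_se = fit_slope(male_age, male_y)
female_slope, female_se = fit_slope(female_age, female_y)
slope_gap = female_slope - male_slope
z = slope_gap / sqrt(male_se ** 2 + female_se ** 2)

print(f"group difference in mean score: {mean_gap:.2f}")
print(f"difference in age slopes:       {slope_gap:.4f} (z = {z:.2f})")
```

In this setup the groups differ at every age, but the difference in slopes is the only quantity that speaks to "different developmental trajectories", and it is built to be zero.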
4. Keep conclusions close to what's actually tested
“between-hemispheric connectivity and cross-module participation predominates in females.”
and from the significance section:
“female brains are optimized for interhemispheric connections.”
... yet they only test it statistically on connections between frontal lobes. Equating a frontal lobe with a hemisphere is quite a violent generalization. Figure 2 clearly indicates that almost the only significant connections in females are those between frontal lobes. For example, there are no parietal-parietal or temporal-frontal connections. It seems very post hoc (only significant findings are reported), and it's certainly not a valid generalization.
If we remove the invalid parts of the abstract (and the study motivation), here's what's left:
“In this work, we modeled the structural connectome using diffusion tensor imaging in a sample of 949 youths (aged 8–22 y, 428 males and 521 females). Connection-wise statistical analysis, as well as analysis of regional and global network measures, presented a comprehensive description of network characteristics.”
… which is just a description of what they did without any interpretations. This is exactly the part that I like about the paper. The design is good, the data set is very impressive, and the modeling of connections between regions seems valid. I'd love to see these data in the hands of other researchers.
The fact that this paper got published in its current form is frankly discouraging for PNAS, for peer review, and for the reputation of neuroscience. Let's hope that cases like this only generalize to the same extent as the conclusions of this paper do.