Nature of Academia · Teaching Political Science · This Week in Bad Journalism

Breaking News: Students with Higher Grades Give Higher Evaluation Scores

This article in the Chronicle defends student evaluations as “not worthless.”

We’ve talked about evaluations before, and the Poli Sci Bitches have strong beliefs that they are, in fact, worthless and biased against women.

But I just want to stop for a second and examine the claim that there is a 0.5 correlation between evaluation scores and student learning, because when I saw that, I was thinking, “I wonder how they measure student learning…”

It turns out, they measure student learning with grades.


Yes, I took a look at a few of the linked studies, and while they do acknowledge the problem of using something like final exam grades to measure student achievement, that’s pretty much what many of them do.

So, there’s a 0.5 correlation between student evaluations of you and how well they did in your class. And this means that evaluations are measuring that you’re a good teacher?  Actually, I think the causal arrow goes the wrong way. You’re getting good evaluations because the students got good grades. Maybe you’re an easy grader!

But wait, it gets better.  In the blog post linked by the author of the Chronicle piece, there is a lovely chart showing a 0.53 correlation between course averages and evaluations.  But it’s a HYPOTHETICAL chart.  It’s hypothetical data, based on “what we would expect based on previous studies.”  Check it out. Hypothetical data.
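If you’re curious what “hypothetical data” with a 0.53 correlation even looks like, here’s a minimal sketch of how such a chart could be generated (this is our assumption about the method, not the blog’s actual code; the standardized scales and n = 100 professors are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
r = 0.53   # the correlation shown in the blog's hypothetical chart
n = 100    # number of hypothetical professors (assumed; the blog doesn't say)

# Draw (course average, evaluation score) pairs from a bivariate normal
# distribution with the target correlation, on standardized scales.
cov = [[1.0, r], [r, 1.0]]
course_avg, eval_score = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# The sample correlation lands near the target (sampling noise aside).
print(np.corrcoef(course_avg, eval_score)[0, 1])
```

Scatter-plot those two arrays and you get a cloud that looks exactly like the chart in question, which is kind of the point: any correlation you like, conjured from thin air.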

[Chart: hypothetical data showing a 0.53 correlation between course averages and evaluations]

I get the point that the blog was trying to make with this hypothetical data scatter plot – that professors with low evaluation scores aren’t necessarily worse, that they don’t necessarily give worse grades (pardon me, they don’t necessarily “fail to incite student learning”). I heartily agree with the notion that ordinal evaluations of professors cannot be compared or used to say whether one professor is better than another.

But to me, this is exactly the reason why student evaluations are worthless.

Being a Woman · This Week in Bad Journalism

This Week in Bad Journalism: Sex Sells, We Know.

An article was published in the Archives of Sexual Behavior that got picked up by the Guardian.  The Guardian provides a lot of cute charts and, in short, concludes that lesbian women are having a lot more orgasms than other women, and that women need a “Golden Trio” of deep kissing, oral sex, and genital stimulation to orgasm.

Well, first of all, let’s start by getting it out of the way: #notallwomen

Right? But beyond that, we can dig into the actual article a bit more to see where the cute charts and Golden Trio advice might not quite match up with the research itself.  Because all the cute charts in the world are no match for a good blaster at your side, kid.


First, let’s take a look at the study itself, because there are some methodological issues that the authors do, in their defense, mostly acknowledge.

  • The N is massive.  Yes, a large N is good, but with 52,000 observations, everything is going to be significant. Because of this, the authors select a cutoff point for the coefficient size: any coefficient less than 0.09 is not reported as significant. I’m not sure this makes much methodological sense, but at least they’ve acknowledged that sometimes, there IS such a thing as too big.
  • They run three (six) models when I think they should have run one (two).  Rather than controlling for sexual orientation, or, even better, creating interaction terms with it, the authors run three separate models for straight, gay/lesbian, and bisexual individuals.  I’d rather see a single model of all women with some indication of how (straight * oral sex) affects the likelihood of orgasm.
  • Their causal arrow might go the wrong way.  I know. I KNOW. Always my critique. But is it the case that the number of orgasms is caused by how long you’ve been together, or do you stay together longer because you’re getting lots of orgasms?  The authors partially acknowledge this by running a model in which they take out relationship satisfaction as a variable (because it might be circular, in the sense that satisfaction causes orgasms, which cause more satisfaction), but at the end of the day, we’re not proving that orgasms are actually caused by any of this.
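The “with 52,000 observations, everything is going to be significant” point is easy to demonstrate. Here’s a toy simulation (entirely made-up data, nothing from the actual study, and we’re treating the 0.09 cutoff as a standardized mean difference for illustration) showing that an effect sitting right at the authors’ own cutoff is overwhelmingly “significant” at that sample size:

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n_per_group = 26_000  # 52,000 observations split into two groups

# Two groups whose true means differ by only 0.09 standard deviations,
# i.e., an effect right at the study's own reporting cutoff.
a = rng.normal(0.00, 1.0, n_per_group)
b = rng.normal(0.09, 1.0, n_per_group)

# Two-sample z-test by hand (stdlib math + numpy only).
se = math.sqrt(a.var(ddof=1) / n_per_group + b.var(ddof=1) / n_per_group)
z = (b.mean() - a.mean()) / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

# A substantively tiny effect, yet the p-value is microscopic.
print(f"z = {z:.1f}, p = {p_two_sided:.2g}")
```

With an N this large, statistical significance stops being informative; what matters is whether the effect is big enough to care about.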

The Guardian article, however, decides not to take a very nuanced approach to reporting on this. Sex sells and talking about orgasms is a great way to get likes and shares (it worked, obviously, because we’re talking about it here), but this article makes some pretty strong claims about what’s happening in the bedroom that aren’t backed up by the data.

The cute chart with the waterworks?  Percentage who say they “Always” orgasm? Is that really the appropriate level of comparison? ALWAYS? Always is a lot.  The article itself does include comparisons between groups on whether they “usually/always” orgasm, but the chart is what you look at, and it’s not really telling a good story about how heterosexual and bisexual women are enjoying their sex lives.

There is a bit of a throwaway statement that another question might be to find out whether women are happy with the frequency with which they orgasm. Actually, I think that’s a much BETTER question!  It’s quality, not quantity, folks (I haven’t got any data on that).

And we haven’t even begun to talk about the problems with asking people to report on their own sex lives.

Overall, I think Dr. Lloyd’s hope that women will “talk about [the Golden Trio] with their partners” is a great one, because we can probably all agree that more communication about what we want in order to be sexually satisfied is probably a good thing (yes, I know, where’s the data?).  But this is Bad Journalism. It’s making strong claims about sexual behavior that result in lots of clicks, without enough support to back it up.

Let’s be honest:  if Sociologists aren’t careful, we’re going to lower them in the Social Science rankings. Watch it, Sociology!

This Week in Bad Journalism

This Week in Bad Journalism: Don’t (Necessarily) Ban Laptops!

The Poli Sci Bitches love a good Twitterstorm, and while we know that, as Twitterstorms go, yesterday’s discussion of this New York Times article on laptops in the classroom was pretty tame, it really resonated with us.  Which has led us to our first edition of This Week in Bad Journalism!

While there are others who can break down flawed statistics in journal articles better than we can (we only teach intro methods, after all), there are countless examples of bad research journalism that go viral on our Facebook and Twitter feeds every day. Usually what happens is:  a journalist looking to make a click-bait headline or to prove a point will link to a study without really reading it.  People share it without clicking on the study at all. Sometimes without even reading the article. Then all of a sudden you have everyone saying that giving blow jobs cures depression.


So we’re taking it upon ourselves to find and point out examples of bad research journalism, starting with the NYT article.

The Twitterverse* did an excellent job at capturing exactly why the Leave Your Laptops article was so flawed.  Our biggest beef with the article was this: it’s easy to read an abstract, find a statement that supports your point of view, and link to a study to prove your point.  It’s a lot trickier to actually read the study and see if it says what you think it says.


You see, the author quickly runs through reasons why laptops in classrooms are bad, one of which is that they reduce exam scores by 18%.  Eighteen percent! Wow! That’s a big effect!  Clearly, if removing laptops from classrooms can improve my students’ performance by almost two letter grades, I should look into it.

We aren’t going to beat a dead horse with the West Point study’s flaws (we really don’t think it’s a bad study). Here are our bullet point highlights for why it doesn’t necessarily support the NYT author’s point:

  1. The authors use information about groups (a laptop-permitted or -prohibited class) to make claims about individual student performance, what we call an ecological fallacy.
  2. The authors’ model showed that permitting or prohibiting laptops in the classroom had an effect only on the multiple choice and short answer portions, but there was no significant effect on the essay portion.
  3. The effect of the treatment is relatively small.
  4. While we acknowledge that saying, “This isn’t generalizable” is what every political scientist says when he or she is woken from an accidental slumber during any research presentation, we’re going to say it here too. This isn’t generalizable.
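The ecological fallacy in point 1 is easy to simulate. In this toy sketch (invented numbers, nothing to do with the actual West Point data), the relationship across class-section averages is strongly positive even though the relationship across individual students is negative, which is exactly why group-level results can’t be read off as claims about individuals:

```python
import numpy as np

rng = np.random.default_rng(2)

x_all, y_all, mean_x, mean_y = [], [], [], []
for g in range(10):            # ten hypothetical class sections
    base = 0.2 * g             # section-level baseline (rises across sections)
    x = base + rng.normal(0.0, 1.0, 50)
    # Within a section, y FALLS as x rises; across sections, both rise together.
    y = base - 0.8 * (x - base) + rng.normal(0.0, 0.3, 50)
    x_all.append(x)
    y_all.append(y)
    mean_x.append(x.mean())
    mean_y.append(y.mean())

group_r = np.corrcoef(mean_x, mean_y)[0, 1]
individual_r = np.corrcoef(np.concatenate(x_all), np.concatenate(y_all))[0, 1]
print(group_r, individual_r)  # group-level positive, individual-level negative
```

Same data, opposite signs depending on the level of analysis. A study built on section averages tells you about section averages.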

But our problem isn’t with the study itself (and we loved the excellent Twitter conversations about its flaws and its strengths!), it’s with the op-ed that used it in a fly-by to support a total laptop ban.

Click-bait headlines are the hill that the Poli Sci Bitches are going to die on. So, stay tuned for the next issue of This Week in Bad Journalism!

*Trying to give credit where credit is due, because these ideas were certainly not all original to us:  Joshua Eyler @joshua_r_eyler, Kevin Gannon @TheTattooedProf, Daniel Franke @danfranke79, and many others. Please let us know if you’re among those we should credit here!