So a day or so after my last post, "It's not the science that is junk, it's the measures," I came across this interview of Jesse Rothstein by Rachel Cohen in the American Prospect. There's lots of good stuff in there and it's worth reading. I don't mean to take away from the import of Jesse Rothstein's work (I am a big fan of his work and of Rachel Cohen's work), but a piece of it demonstrates what I was trying to get at in my last post.
Talking about VAM, Rothstein said,
It’s very controversial and I’ve argued that one of the flaws of it is that even though VAM shows the average growth of a teacher’s student, that’s not the same thing as showing a teacher’s effect, because teachers teach very different groups of students.
If I’m a teacher who is known to be really good with students with attention-deficit disorder, and all those kids get put in my class, they don’t, on average, gain as much as other students, and I look less effective. But that might be because I was systematically given the kids who wouldn’t gain very much.
So, yes, this is a very good point: there is a difference between showing "the growth of a teacher's student" and "showing a teacher's effect." And yes, according to test scores, and how well students perform on them, teachers can look more effective or less effective, regardless of how good they are at teaching.
Then he says, when she asks if he is skeptical of VAM,
I think the metrics are not as good as the plaintiffs made them out to be. There are bias issues, among others. One big issue is that evaluating teachers based on value-added encourages teachers to teach to the state test.
During the Vergara trials you testified against some of Harvard economist Raj Chetty's VAM research, and the two of you have been going back and forth ever since. Can you describe what you two are arguing about?
Raj’s testimony at the trial was very focused on his work regarding teacher VAM. After the trial, I really dug in to understand his work, and I probed into some of his assumptions, and found that they didn’t really hold up. So while he was arguing that VAM showed unbiased results, and VAM results tell you a lot about a teacher’s long-term outcomes, I concluded that what his approach really showed was that value-added scores are moderately biased, and that they don’t really tell us one way or another about a teacher’s long-term outcomes.
If you look at this response and then go back to the previous one I pulled out, you see that Rothstein is referencing "growth" and then "bias": that certain types of students won't "gain as much as other students," that the value-added scores are "moderately biased," and that they don't tell us much about a teacher's "long-term outcomes."
Nowhere in there is there a repudiation of the measures, of the tests themselves, or even a question about their validity. His responses seem to assume two things: that determining a teacher's effectiveness according to test scores is unfair because some students won't perform well on them, and that these tests can show growth and gains in learning. Nowhere does he ask whether the tests themselves reflect real learning, good teaching, or quality education.
And the critique about bias and assumptions has to do with the model, not with what is being fed into the model, i.e., test scores. Arguments about the strength of statistical models are worth having, but they should start with probing what's being fed into them.
If someone like Jesse Rothstein isn't questioning that, then test-based accountability isn't going away anytime soon. It will forever be a matter of tinkering with models.