There is a fair amount of dispute over using “value-added” methodology to evaluate teachers. I have my concerns as well, but I am glad the debate has moved from whether to use data at all to what kind of data should be used.
The debate over how to use data to assess teachers reminds me of the debate in baseball statistics over so-called “advanced statistics,” and the sabermetric revolution of the last 25 years or so. Many baseball fans insisted that their eyes could tell them the whole story, and the basic easily-calculated statistics could give the best analysis. Purveyors of new acronyms — OBP, VORP, WARP, etc. — were quickly derided as pointy-heads with no knowledge of the game.
Similarly, non-teachers attempting to evaluate teachers based on student test scores have faced vociferous objections, particularly in the rather comical case of the LA teachers union threatening to boycott the LA Times for releasing its analysis.
The reason I bring up VORP, in particular, is that advanced statistical analysis should still yield results relatively similar to what one’s own eyes see. The current VORP rankings have a fairly unsurprising set of players at the top. If one were to ask old-timer baseball writers about the best position players in baseball, I imagine the list would include many of the players on the top 50 VORP list. (NOTE: There are no Cubs on this list. Big shock, I know.) Yes, there are counterintuitive findings (Rickie Weeks??), but that’s the nature of statistical analysis.
Furthermore, different advanced value-added metrics may yield quite different results.
Per the NYT:
“If these teachers were measured in a different year, or a different model were used, the rankings might bounce around quite a bit,” said Edward Haertel, a Stanford professor who was a co-author of the report. “People are going to treat these scores as if they were reflections on the effectiveness of the teachers without any appreciation of how unstable they are.”
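The instability Haertel describes is easy to see in a toy simulation. Everything below is invented for illustration: 100 hypothetical “teachers” with a fixed underlying effect, each measured in two “years” with noisy scores, after which we check how many land in the top decile both times.

```python
import random

random.seed(42)

# Invented numbers: 100 teachers, each with a stable "true" effect.
n = 100
true_effect = [random.gauss(0, 1) for _ in range(n)]

def noisy_year(effects, noise_sd=1.5):
    # One year's value-added estimate = true effect + measurement noise.
    # noise_sd is an assumption chosen to make the noise comparable to the signal.
    return [e + random.gauss(0, noise_sd) for e in effects]

year1 = noisy_year(true_effect)
year2 = noisy_year(true_effect)

def top_decile(scores):
    # Indices of the 10 highest-scoring teachers this "year".
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return set(ranked[: n // 10])

overlap = len(top_decile(year1) & top_decile(year2))
print(f"Teachers ranked in the top decile both years: {overlap} of {n // 10}")
```

With noise this size, far fewer than all ten of one year’s “best” teachers repeat the next year, even though no one’s true ability changed at all.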
Nowhere is this clearer than in basketball, where the “correct” advanced metric to use is still a crowded field, with Wins Produced, Win Shares, PER, and NBA Efficiency all with varying degrees of support.
My point, then, is that even in a sport with clear winners and losers and cleanly recorded points and statistics, there will always be debate over the merits of one statistical model versus another. Nevertheless, these models give us a better picture of the talent we have and how to use it.
In teaching, we don’t even necessarily know which statistics to look for.
Yet, just as in sports, the best teachers are often the teachers we already know. One need look no further than Zenaida Tan, one of the best teachers in LA Unified according to the LA Times analysis. Everything about her classroom reflects what we already know about good teachers from a qualitative standpoint: she is nurturing, persistent, and determined, with high expectations for her students. Just as everything I see with my own eyes about Albert Pujols tells me he is one of the best players in baseball, the statistics confirm what I already suspected.
Besides, just as in baseball, we have already used statistics in teaching for a long time; they’re just very bad statistics. Teachers are often fired or marked down on evaluations because of tardiness or absences from school, largely because these are easy to quantify. The numbers we use to evaluate teachers, like batting average for hitters or wins for pitchers in baseball, are obsolete and need revision to be effective tools.
This is not to say that I fully endorse the use of test scores to evaluate teachers. Far from it; test scores are yet another in a long line of statistics that I believe are flawed. But the only way we can start down the path to better metrics is with the statistics we have now. Basketball didn’t even keep track of steals or turnovers until the mid-70s, while advanced defensive metrics for baseball are still a work in progress.
The flaws listed in the NYTimes article don’t indicate a need to end value-added analysis; rather they point to a need to refine the process. Using the value-added analysis as one factor in a wide variety of metrics can provide a valuable tool in improving our talent and resource allocation in education, just as value-added analysis has been a boon for statistically-minded baseball teams.
EDIT: I forgot to mention that performance pay is by no means a perfect idea either. In fact, professional sports, along with a number of other fields, demonstrate how poorly even educated observers analyze talent. Players are routinely “overpaid” or “underpaid,” and many disbelievers in advanced stats inevitably assemble a team of schmucks. Thus, the whole idea of making all teachers at-will employees strikes me as rather misguided, particularly since salary allocation is poorly executed in almost every field, even those with legitimate value-added statistical methodology.