The New York State Common Core test scores are meaningless

If you know me at all, you already know the disdain I feel towards our increasing reliance on tests and test scores to make educational decisions. As the Common Core continues its slow march into our classrooms, we got to see its first effects in the test scores of New York State, and according to pretty much everyone, they were terrible. I posted the other day with some of the explanations as to why.

However, all of that leaves out some of the intricacies as to how the proficiency rates are calculated. Essentially, students take the test, they get a certain percentage of questions correct, and they are placed into different categories (failing, passing, superawesome, whathaveyou) based on cut points.

This is one of those moments that should make it plain to anyone that test scores are not “objective” measures by any means: they’re actually heavily reliant on the biases and assumptions of the individuals and groups who decide where the cutoff points for each category fall.
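To make the cut-point mechanics concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the ten scores, the category labels, and both sets of cut points are hypothetical, not New York's actual numbers. The point is only that the same students with the same answers can produce very different "proficiency rates" depending on where someone decides to draw the lines.

```python
from bisect import bisect_right

# Hypothetical percent-correct scores for ten students (not real data).
SCORES = [35, 42, 48, 55, 58, 61, 64, 70, 77, 88]

# Hypothetical category labels, lowest to highest.
LABELS = ["Level 1", "Level 2", "Level 3", "Level 4"]


def categorize(score, cuts):
    """Return the label for a score, given ascending cut points.

    A score exactly equal to a cut point lands in the higher category.
    """
    return LABELS[bisect_right(cuts, score)]


def proficiency_rate(scores, cuts, proficient_from=2):
    """Share of students whose category index is at or above `proficient_from`."""
    hits = sum(1 for s in scores if bisect_right(cuts, s) >= proficient_from)
    return hits / len(scores)


# Two hypothetical sets of cut points chosen by two hypothetical committees.
lenient = [40, 55, 80]
strict = [50, 65, 85]

print(proficiency_rate(SCORES, lenient))  # 0.7
print(proficiency_rate(SCORES, strict))   # 0.3
```

Nothing about the students changed between those two print statements. Only the cut points did.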

At least in the case of New York, if not most Common Core states, the process for setting cut scores (and thus proficiency rates) relied on a series of assumptions that makes the final proficiency rates pretty much useless. The ever-hilarious Jersey Jazzman has a great rundown of how we ended up with the cut scores we got:

  • Start with “college and career ready,” an ill-defined phrase that could mean just about anything. Leap to freshman year GPA in selected courses at a limited number of four-year colleges. Could be graded on or off a curve (normative or criteria-based – more on this later); varies widely between professors, schools, and courses; doesn’t necessarily indicate whether the student’s entire college experience was “successful.”

  • Spring to SAT/PSAT scores, somewhat correlated to first year college GPA, but a normative assessment (meaning a set number of students must score at each percentile – someone’s got to lose). This is a test, by the way, tightly correlated to family income.

  • Bounce to 8th grade NY State test scores, which are given three years before the SAT. Carom (got a thesaurus?) to 3rd through 7th NY State test scores, which would assume all children follow the same learning trajectory.

  • Jounce (SAT word!) to teacher/principal evaluations and school evaluations and student retention decisions.

This is absurd. You start with a meaningless assumption, tie it to wildly varied data, and then project it back 10 years through more meaningless assumptions and useless data to come up with cut points for 3rd graders.

This is the data we’re going to use to make major decisions about teachers (which ones get fired, which ones get raises) and about schools (which ones get closed, which ones get funding). It borders on the absurd. The big problem is that those who actually make the decisions don’t have any idea what this is all about. They don’t know where the numbers come from, they don’t know what they mean, and they don’t know how tenuous the data really is.

The absurdity of this system should be obvious to anyone:

It does not matter that SATs are nearly immovable. It does not matter that there is absolutely no proof that if you increase a third grader’s scores that his SATs, nearly a decade later, will go up. It does not matter that those pesky, unscientific grades that teachers give are better predictors of college success.

It does not matter that the hard work that a student puts into her GPA is a far better predictor of college graduation than her test scores.  We are to believe that if we buy all of those Common Core products and data systems and our third-graders sit through days of difficult SAT, NAEP and PSAT aligned tests, their scores will soar and they will do better in college.

On some level, what they’re trying to accomplish makes a modicum of sense: they have a complicated system with varying outcomes that they’re trying to harmonize so it optimizes a single variable, college readiness.

Fine. I guess.[1. Although I have problems with this as well. I’m not convinced that optimizing for “college readiness” makes a lot of sense. Part of that is that going to college may not really be the best path for everyone, and we’ve done little to provide alternate paths to middle-class life. Going to college is similar to owning a home, in that it’s now pretty much a doctrine of faith that you must do these things in order to be middle-class. Any suggestion to the contrary is met with a level of resistance far beyond the significance of the point.]

But we’ve then placed far more weight on that data than it deserves as a guide to achieving that objective, and we end up with an absurd and unworkable system.

When will the madness end?

