Thursday, October 29, 2009

Good Tests, Bad Tests: Can The Testing Fanatics Tell The Difference?


And do they really care?

Consider the following problem:

=====================================

3, 4, 6, 7, 10, 12

The number n is to be added to the list above. If n is an integer, which of the following could be the median of the new list of seven numbers?

I. 6

II. 6 1/2

III. 7

(A) I only (B) II only (C) III only (D) I and III only (E) I, II, and III

==============================================

In considering your evaluation of the question, you may wish to know the following facts: this is the 13th of 16 questions for which the total time allotted is 20 minutes. The question appeared on the last of three mathematics sections and on the 8th of 10 sections overall in an exam that takes a total of 225 minutes. The test is one commonly used as part of college admissions in the United States.

What factors enter into your determination of how good or bad a question this is? What assumptions are you making, if any, about what comprises a good mathematics assessment in general?

Some interesting responses
I posted the above on a couple of the math-education lists I frequent, primarily to elicit reactions from the usual suspects who seem to mindlessly support multiple-choice standardized testing in mathematics as the only valid way to assess learning, students, teachers, schools, districts, books, pedagogy, etc. I had a particular issue in mind with this problem and didn't expect several people to become concerned about the use of the word "added" in the problem's exposition. Here is part of one such comment:


This seems more of an aptitude question than a content
question, and it's probably in the upper 3rd to upper
4th portion of all math questions in difficulty (but
the actual difficulty level is obtained from the item
statistics after pre-testing and not by people sitting
in a room rating how hard the question is). This is not
the kind of question that should be on a college math
placement test, but from what you said, it's not. In
isolation, I don't really like it all that much (the
test taker may have forgotten what 'median' means),
but as one of many other questions, I don't see any
major flaws with it right now, except I don't like
the use of "added" in the statement of the problem,
since a possibly valid mathematical interpretation
(at least before the appearance of "seven" at the end)
is that the new list is the old list translated by n,
giving 3+n, 4+n, 6+n, 7+n, 10+n, 12+n.

A similar response appeared on another list to which I posted the problem, followed by one post that questioned whether there was any real ambiguity at all. I then made the following reply:

I think it's a stretch to say this is ambiguous, and if we allow that it is, what word(s) should be substituted for "added"? "Appended"? You immediately lose a host of kids who've never seen the word and have no clue what that means. Further, it's too restrictive given the intent of the problem.

"Concatenated"? Even more problematic. "Inserted"? Fails to make the point that the number could come at the beginning or end of the list (though of course, given how n is restricted, there are two pairs of adjacent values in particular that it can't fall between, which winds up being rather significant).

That said, I wasn't thinking about that aspect of the problem when I posted it. Indeed, my "real" target was people who seem to have enormous faith in standardized tests except for when they don't. When don't they? When they don't like the results or the implications of the results for their educational politics.

I don't think the problem is horrid. But I did note at least one point I found a little annoying, and it's yet another example of the difference between SAT/ACT-type math problems and actual assessment. (For the record, this one is from the SAT; that might have been obvious to some readers from the additional information I gave, since the ACT has only ONE section of mathematics.)

The annoying thing is the restriction that n is an integer. Because of this restriction, n cannot fall between 6 and 7, and hence it's impossible for 6 1/2 to be the new median. Either the new number is 6 or less, in which case 6 is the fourth of the seven values and so the new median, or it is 7 or more, in which case 7 is the new median. No other possibilities exist, so the answer is (D).
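For anyone who wants to check that reasoning by brute force, here is a minimal sketch in Python. The names ORIGINAL and possible_medians are mine, invented for illustration, and the range of integers tried is an arbitrary assumption; the median itself comes from Python's standard statistics module.

```python
from statistics import median

# The six numbers given in the problem.
ORIGINAL = [3, 4, 6, 7, 10, 12]


def possible_medians(candidates):
    """Return every median obtainable by adding one candidate n to ORIGINAL."""
    return {median(ORIGINAL + [n]) for n in candidates}


# Assumption: integers from -100 to 100 are plenty, since any n <= 6
# behaves the same way, as does any n >= 7.
integer_medians = possible_medians(range(-100, 101))
print(sorted(integer_medians))

# Drop the integer restriction and 6 1/2 becomes reachable.
print(possible_medians([6.5]))
```

With integers only, the set of medians is just 6 and 7. Allowing n = 6.5 makes 6.5 itself the middle value of the seven, which is exactly why the restriction carries so much weight in the problem.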

What's wrong with that? Nothing, per se. Except that I believe it's really easy for a student who has an average of 1 minute 15 seconds per problem (20 minutes for 16 questions) to miss that restriction. Of course, in HINDSIGHT, the restriction is pretty much the point, and if the purpose of the problem were to illustrate that point and teach students something, I'd have no complaints.

But of course TEACHING students is the last thing the SAT is used for and certainly is NOT the reason the test-writers construct it. The next conversation or article you encounter in which those who are responsible for the SAT or ACT address what students learn from these tests will be the first. They do NOT provide formative feedback, at least not in the vast majority of cases. Students take the tests and await the numbers. End of story.

So trying to trip kids up, which is so often the strategy test-makers use to help produce the holy bell curve they seek, is increasingly the rule as the problems "increase in difficulty." There's an upper bound on what topics the test makers have decided to address, and that means they need to get trickier rather than more thought-provoking. And why do the latter when no one is going to be reflecting on the problems afterwards?

So what does a teacher, a student, a parent, an administrator, a politician, or anyone else learn about what any given student, class, school, district, state, or nation knows about the median from this problem? Can anyone state with a straight face what it MEANS about a student's mathematical knowledge if s/he gets this one wrong (or right)? What's being tested here, really? What does the teacher tell the student about where s/he went amiss? What does the teacher need to teach "better" for the next time?

In my view, while of course there is mathematical content here, it's impossible for anyone to claim to know whether any student gets this wrong because of a misunderstanding about what the median is or how to find it, because of misreading, because of not knowing what an integer is, because of forgetting or missing the restriction of n to integers, because of not correctly figuring the implications of that restriction, or anything else, EVEN if we know what answer choice the student selected.

And yet huge claims are made and consequences suffered based on the results of student performance on such problems. Imagine, if you can, a state deciding to make its high school graduation test the SAT. Impossible, you say? What state would abuse a test like that? What state would use a test not based closely on its curriculum framework? Well, Michigan, for one, which has been using the ACT to that end for the last few years, despite the fact that its curriculum framework and grade-level content expectations (GLCEs) do not in any way concern or inform the production of the ACT.

Amazing? Not really. We continue to be in the grip of testing insanity. All one needs to remember to understand this is one simple thing: it has NOTHING to do with learning or education.