PARCC scores lower on computer than on paper

A year and a half ago, Voxitatis reported on research showing that taking notes on a computer during lectures leads to lower retention of the knowledge gained in those lectures. Now the PARCC multistate testing consortium has issued a report claiming that students who took its standardized math and English tests on computer during the 2014-15 school year scored lower than peers who took the tests using paper and pencil, Education Week reports.


Most of our best students learn, write, teach, and test in mathematics by drawing pictures, working little side problems in the corner of their papers, etc., none of which is possible on the PARCC tests. This forces students into a foreign mode of expression that does not resemble their classroom instruction in (good) mathematics. Scores go down.

On average, students who took standardized tests developed by the Partnership for Assessment of Readiness for College and Careers, or PARCC, scored lower if they took the test on computer than if they took it using pencil and paper. But the differences weren’t universal across all subgroups, schools, or states, PARCC noted.

“There is some evidence that, in part, the differences we’re seeing may be explained by students’ familiarity with the computer-delivery system,” the education journal quoted Jeffrey Nellhaus, PARCC’s chief of assessment, as saying.

Let’s start with the inescapable fact that the delivery system forces students to express themselves about math in ways that bear little resemblance to actual mathematics, even when computers and servers are functioning properly (which was not always the case). Then we’ll move to a demonstration using a public-release PARCC math question.

The instruction manual for the equation editor, the tool students use to type constructed responses to math questions, is flawed: it neither fully explains the plethora of palettes students must navigate nor helps them enter most of the answers they’ll be required to type on the test, which combine math symbols and plain text. Any time a purported tool requires so many pages of instructions yet explains so little, you know the tool itself is getting in the way of valid measurement.

“The differences are significant enough that it makes it hard to make meaningful comparisons between students and [schools] at some grade levels,” the journal quoted Russell Brown, chief accountability and performance-management officer for Baltimore County Public Schools, as saying. “I think it draws into question the validity of the first year’s results for PARCC.”

UPDATE Feb. 24: Liz Bowie in the Baltimore Sun writes that when the Maryland State Board of Education was told about the score difference reported here, board member S. James Gates said, “If you told me I had to use an equation editor, that would be an impediment. You might think I was a blithering idiot.” Dr. Gates is a physics professor at the University of Maryland and received the National Medal of Science from President Obama three years ago.

Now let’s move to that example I promised. Anecdotally, I can tell you that when my colleagues from other states and I looked at student responses in high school math, we noticed that students who entered their answers online tended to type less than their paper-and-pencil counterparts wrote.

Using the public-release PARCC questions in algebra 1, let me now demonstrate this phenomenon with an actual example. Consider this problem, whose Part A is worth two points: one for the right answer and one for showing the work:


PARCC algebra 1, public release question VF736473, PBA #12, “Amount School Earns”

This student receives full credit in Part A. Not all of that work is required for full credit, but the student is required, according to the scoring rubric published for this question, to show correct “work to support the function.” Now look at this one:

The answer is correct, but because the student didn’t show where the 450 or 6 came from, the student gets only partial credit, 1 point. Now, 450 isn’t a number a student can just pull out of the air, so it could be argued that there’s evidence to support the statement: “The second student performed the exact same mathematical operations as the first student.”

But not all of that work was shown or, more to the point, typed in. Maybe this student found the equation editor on the test-delivery system cumbersome and didn’t want to waste time on a timed test to enter what he might have considered simple arithmetic. It’s obvious to his teacher and any reasonable adult where the numbers came from.
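To make the contrast concrete, assume for illustration that the function in question is linear, say f(x) = 6x + 450; the released item’s exact givens aren’t reproduced here, so the supporting steps below are placeholders rather than the real work:

Full-credit response (answer plus supporting work):
    450 = (fixed amount computed from the problem’s givens)
    6 = (per-unit rate computed from the problem’s givens)
    f(x) = 6x + 450

Partial-credit response (answer only):
    f(x) = 6x + 450

The second rubric point rides entirely on those two supporting lines, and typing lines like them into the equation editor is precisely the step the second student skipped.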

It can also be argued that we need to train students to use the tool so they can earn full credit for the work they do. But that argument is a little off the mark: what we actually care about is that both students understand this particular algebra 1 learning standard at about the same level yet received different scores on the test. We couldn’t care less how well students can operate a contrived tech tool they’ll never need again once the PARCC test is over.

If these students had taken the test using paper and pencil, the second student would probably have shown some scratch work in the corner of the page, which could have counted toward the score. On computer, that work is never captured, so it can’t simply be “assumed” and credited, even though every reasonable adult knows the student didn’t just pull 450 out of thin air.

Longhand is better for learning as well as testing

Studies showing that it’s better for learning to take notes using pencil and paper have been widely analyzed and extended, in publications as diverse as Scientific American and the Harvard Business Review. The original study, conducted by psychological scientists Pam Mueller of Princeton University and Daniel Oppenheimer of the University of California, Los Angeles, was published in the journal Psychological Science.

Their research “suggests that even when laptops are used solely to take notes, they may still be impairing learning because their use results in shallower processing,” they wrote. “In three studies, we found that students who took notes on laptops performed worse on conceptual questions than students who took notes longhand. We show that whereas taking more notes can be beneficial, laptop note takers’ tendency to transcribe lectures verbatim rather than processing information and reframing it in their own words is detrimental to learning.”

Online testing has its advantages

Perhaps the biggest advantage of online testing over paper testing is that the student responses themselves are better protected from teacher interference or mishaps during shipping. If a teacher wants to change a student’s wrong answer to a right one, for instance, she will have to access the student’s test online using a secure login and password. Tracking information will then be available to investigators down the road; investigating security violations on paper tests is much more difficult.
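As a rough sketch of why that matters, here is what a minimal audit trail might look like in Python; the schema and field names are invented for illustration, and no vendor’s actual logging format is implied:

from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ResponseEvent:
    """One immutable record of a response being saved or changed."""
    student_id: str
    item_id: str
    response: str
    login: str          # the account that made the change
    timestamp: datetime

audit_log: list[ResponseEvent] = []

def save_response(student_id: str, item_id: str, response: str, login: str) -> None:
    # Every save appends a new event; nothing is overwritten in place.
    audit_log.append(ResponseEvent(student_id, item_id, response, login,
                                   datetime.now(timezone.utc)))

# A student's original answer, then a later "correction" under a different
# login, both leave timestamped records an investigator can reconstruct.
save_response("S-1024", "item-12", "original answer", login="student-S-1024")
save_response("S-1024", "item-12", "changed answer", login="teacher-jdoe")
for event in audit_log:
    print(event.timestamp.isoformat(), event.login, event.response)

A paper answer sheet, by contrast, records only the final pencil marks, with no trace of who made them or when.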

Cost is another advantage. Although scores are lower for students, states save money by using online testing as a general rule: there’s no expense to print or ship test booklets. However, these savings accrue to the state for the PARCC tests and are not noticed by individual schools. Unofficially (I don’t have public records with the figure), Maryland saved more than $2 million because so many schools chose to give the PARCC tests online in 2014-15.

It’s also easier, in general, to meet the needs of students with individualized education plans using computers. Screen colors can be adjusted, for example, if students are sensitive to high-contrast visual fields. Kids with motor deficiencies may be able to use an adapted keyboard.
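To give a sense of how such accommodations might be configured, here is a hypothetical per-student profile; every field name below is invented for illustration, since real delivery systems each have their own settings:

# Hypothetical accommodation profile a delivery system might apply
# when a student with an individualized education plan logs in.
profile = {
    "student_id": "S-1024",
    "display": {
        "contrast": "reduced",     # soften high-contrast visual fields
        "background": "#1e3a5f",   # muted background color
        "text": "#f0ead6",         # off-white text
        "font_scale": 1.5,         # enlarge on-screen text
    },
    "input": {
        "keyboard": "adapted",     # e.g., large-key or switch-access layout
    },
}

None of this is possible with a printed test booklet short of reprinting it.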

In addition, it has been argued that online testing allows results from the tests to be delivered more quickly to teachers so that they can modify instruction, where needed, for individual students. Again, this expected benefit of online testing, while it exists with smaller, school-based tests, has not materialized on the PARCC tests.

Online testing also allows different question formats to be used in assessing students’ understanding, and a few of these question types can’t be delivered on paper at all. Examples include an audio or video clip, a simulation of a science experiment, and similar “technology-enabled” questions.

But not all tech enhancements are true enhancements

As with cost and time savings, the benefits of new-age question types have largely failed to materialize on the PARCC tests, although videos have been included on some English tests. Furthermore, technology often gets in the way instead of making the test more valid or reliable, especially in math, and may hinder students or frustrate them as they try to respond to a question in a way that’s not available to them.

Consider the PARCC released question 15 for algebra 2, here. The student is required to use an on-screen tool to plot the graph of a quadratic function. Given students’ limited knowledge of algebra, some may want to draw a straight line or a cubic curve on the coordinate axes provided.

Those answers would be incorrect, but whereas a student taking the test on paper could draw whatever graph he believed to be correct, the online test-taker is stuck with a system that allows only a parabola to be drawn on the coordinate axes. No matter how hard he tries or how badly he wants to draw a straight line, the system will keep thwarting him.
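PARCC’s actual graphing code isn’t public, so the following Python sketch is purely hypothetical, but it shows how easily such a constraint shuts out every non-quadratic answer: the tool accepts two points and always renders the unique parabola through them, so a line is literally unrepresentable.

def plot_quadratic(vertex, other_point):
    """Return the only curve the tool can draw: the parabola
    y = a(x - h)^2 + k through the chosen vertex and second point."""
    h, k = vertex
    x1, y1 = other_point
    if x1 == h:
        raise ValueError("second point needs a different x than the vertex")
    a = (y1 - k) / (x1 - h) ** 2   # coefficient forced by the two clicks
    return lambda x: a * (x - h) ** 2 + k

# A student convinced the answer is the line y = x clicks (0, 0) and (1, 1),
# but the tool hands back y = x^2 instead; there is no way to say "line."
curve = plot_quadratic((0, 0), (1, 1))
print(curve(2))   # prints 4.0, not the 2 the student's line would give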

Unable to enter the answer he wants to give, the student may grow frustrated, which could impair his ability to answer subsequent questions on the test correctly. That outcome is worse than simply getting the one problem wrong and will lower the score of any student stymied by the limited options in the online test-delivery system.

That is, even when computer systems are functioning properly and schools don’t have to go to extraordinary lengths to schedule a whole school of students into a computer lab with a limited number of machines over the lengthy testing window, other qualities of the technology may make the tests invalid, unreliable, or unfair, despite the hoped-for benefit of delivering accommodations to students with special needs.

About the Author

Paul Katula

Paul Katula is the executive editor of the Voxitatis Research Foundation, which publishes this blog. For more information, see the About page.