Education
1:54 am
Thu June 7, 2012

Computers Grade Essays Fast ... But Not Always Well

Originally published on Thu June 7, 2012 5:04 am

Imagine a school where every child gets instant, personalized writing help for a fraction of the cost of hiring a human teacher — and where a computer, not a person, grades a student's essays.

It's not so far-fetched. Some schools around the country are already using computer programs to help teach students to write.

There are two big arguments for automated essay scoring: lower expenses and better test grading. Using computers instead of humans would certainly be cheaper, but not everyone agrees on argument No. 2.

Les Perelman, director of the student writing program at MIT, is among the skeptics. Perelman recently tried out a computer essay grading program made by testing giant Educational Testing Service.

"Of the 12 errors noted in one essay, 11 were incorrect," Perelman says. "There were a few places where I intentionally put in some comma errors and it didn't notice them. In other words, it doesn't work very well."

Perelman says any student who can read can be taught to score very highly on a machine-graded test.

That's because software developers build the computer programs by feeding in thousands of student essays that have already been graded by humans.

Then, by identifying the elements of essays that human graders seem to like, the programs create a model used to grade new essays. If human graders give essays with long sentences high marks, for example, the programs will tend to do so, as well. If human graders like big words, the programs will also, say, "manifest a tantamount predilection for meretricious vocabulary."
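The training process described above can be illustrated with a toy sketch. It is not how ETS or any commercial grader works. It assumes just two surface features (average sentence length and average word length), a hypothetical handful of human-graded essays invented for this example, and a 1-to-6 scale. The names `features`, `fit_line`, `train`, `predict` and `TRAINING` are all made up for illustration.

```python
import re
from statistics import mean

def features(essay):
    """Surface features only: average words per sentence, average letters per word."""
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay)
    return [len(words) / len(sentences), mean(len(w) for w in words)]

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return slope, my - slope * mx

def train(graded_essays):
    """Fit one line per feature against the scores human graders assigned."""
    rows = [features(text) for text, _ in graded_essays]
    scores = [score for _, score in graded_essays]
    return [fit_line([row[i] for row in rows], scores) for i in range(2)]

def predict(model, essay):
    """Average the per-feature predictions and clamp to the assumed 1-6 scale."""
    preds = [slope * x + intercept
             for (slope, intercept), x in zip(model, features(essay))]
    return max(1.0, min(6.0, mean(preds)))

# Hypothetical human-graded essays (invented for this sketch, not real data).
TRAINING = [
    ("I like dogs. Dogs are fun. We play.", 2),
    ("My summer vacation was enjoyable because my family visited "
     "several interesting museums.", 4),
    ("Although technological innovation frequently accelerates productivity, "
     "thoughtful communities nevertheless deliberate carefully regarding "
     "unintended consequences.", 6),
]

model = train(TRAINING)

plain = "The war was bad. Men died here."
fancy = ("Manifestly tantamount predilections characterize meretricious "
         "vocabulary throughout extraordinarily elongated sentences.")

print(round(predict(model, plain), 2), round(predict(model, fancy), 2))
```

Because the toy model sees only length statistics, the second sample outscores the first regardless of whether it says anything sensible, which is the weakness Perelman describes.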

So, Perelman says, it's possible for students to score an A on a computer-graded essay simply by combining all the elements of an essay that would be scored highly by a human grader.

Of course, if you know the elements of an A essay and are able to combine them, odds are you're already a pretty good writer.

Mark Shermis, dean of the University of Akron's College of Education, recently co-authored a study of nine different essay-grading computer programs. On shorter writing assignments, Shermis says, the computer programs matched grades from real, live humans up to 85 percent of the time.

But on longer, more complicated responses, the technology didn't do quite as well.

"It will not identify the next great American novelist," Shermis says. "But if what you're trying to do is communicate thoughts and ideas in a very straightforward manner, then the technology is actually a wonderful tool."

But not always. Shermis ran the Gettysburg Address through one of the earlier-generation computer grading programs, one usually used to evaluate the writing abilities of college freshmen.

Suffice it to say, Abe did not ace the test.

"On a scale of 1 to 6, one of the greatest presidents of the United States was only getting 2s and 3s," Shermis says of Lincoln's scores. "We were actually very shocked."

A history professor told Shermis he shouldn't worry; the speech is more famous for its context than for the actual words themselves.

Still, school officials trying to cut expenses are intrigued by the promise of scoring thousands of student essays in seconds, without the need to hire human graders.

Jeff Pence, who teaches writing to seventh-graders in a Georgia middle school, is already sold on the idea.

The computer graders he uses give students instant feedback on every draft. Pence says there's no way he and his red teacher's pen could do that. And quicker responses, he says, lead to more writing.

"The quantity drives the quality up," Pence says. "It's kind of the old bicycle thing — the best way to learn how to ride a bicycle is to ride a bicycle. And the best way to get better at writing is to write and receive consistent, timely feedback."

Pence says it would be great to have a couple of dozen real, live human teachers reading every student draft. It would also be nice, he says, if his district found the money to hire those extra teachers. But until then, he's holding on to his computer programs.

Copyright 2013 WKSU-FM. To see more, visit http://www.wksu.org/.

Transcript

DAVID GREENE, HOST:

OK, we have to admit it - there are some things that computers do better than humans. But essay grading? Molly Bloom, of member station WKSU, reports on a push to use computer programs to teach students how to write.

MOLLY BLOOM, BYLINE: Imagine a school where every child gets instant, personalized writing help for a fraction of the cost of hiring a human teacher. Computer essay graders are on the rise in Ohio and other states. Advocates say they save money, and grade better than some humans.

Sure, computers may be cheaper and more efficient - but better graders? Les Perelman doesn't think so. He directs the writing program at MIT. Perelman recently tried out a computer essay-grading program made by education giant ETS.

LES PERELMAN: Of the 12 errors noted in one essay, 11 were incorrect. There were a few places where I intentionally put in some comma errors, and it didn't notice them. In other words, it doesn't work very well.

BLOOM: Perelman says any student who can read can be taught to score very high on a machine-graded test. Why? Because software developers build the computer programs by feeding in thousands of student essays that have already been graded by humans. The programs discern which elements of essays human graders seem to like, and look for the same things.

So if human graders give essays with complex sentences high marks, the programs will tend to do so, too. If human graders reward big words, the programs then will - manifest a tantamount predilection for meretricious vocabulary.

Perelman and other critics say that makes the computer systems too easy to game. Of course, if you know the elements of an A essay and are able to combine them, you're probably already a pretty good writer.

But computers have their limits. Mark Shermis heads up the University of Akron's College of Education. Earlier this year, he co-authored a study of nine different essay-grading computer programs. Shermis says that on shorter writing assignments, the computer programs matched grades from real, live humans up to 85 percent of the time. But on longer, more complicated responses, the technology didn't do quite as well.

MARK SHERMIS: It will not identify the next great American novelist. But if what you're trying to do is communicate thoughts and ideas in a very straightforward manner, then the technology is actually a wonderful tool.

BLOOM: Not always. Let's take the Gettysburg Address. Well, Shermis ran it through an earlier computer-grading program, one usually used to evaluate the writing abilities of college freshmen. Sad to say, Abe did not ace that test.

SHERMIS: On a scale of 1 to 6, one of the greatest presidents of the United States was only getting 2s and 3s. And we were, actually, very shocked.

BLOOM: A history professor told Shermis not to worry. The speech is really more famous for its circumstances than for the words themselves. The promise of scoring thousands of student essays in a second or two, without hiring human graders, has caught the attention of school officials trying to cut expenses.

Jeff Pence is already sold on the idea. Pence teaches writing to seventh-graders in a Georgia middle school. With computer graders, the students get feedback on every draft instantly. Pence says there's no way he and his red pen could do that. And quicker responses lead to more writing. And with that, Pence says...

JEFF PENCE: The quantity drives the quality up. It's kind of the old bicycle thing: The best way to learn how to ride a bicycle is to ride a bicycle. And the best way to get better at writing is to write and receive consistent, timely feedback.

BLOOM: Pence says it would be great to have a couple of dozen real, live, human teachers reading every student draft. It would also be nice if his district found the money to hire those extra teachers. Until then, he'll keep his computers.

(Computerized voice) For NPR News, I'm Molly Bloom.

GREENE: That story comes from StateImpact, a collaboration between NPR and member stations examining the effects of state policy on lives and communities. And you're listening to MORNING EDITION, from NPR News. Transcript provided by NPR, Copyright NPR.
