Temps spend just minutes to score state test; A WASL math problem may take 20 seconds; an essay, 2 1/2 minutes
Seattle Times staff reporter
Copyright 2000, The Seattle Times Co.
IOWA CITY, Iowa - In a matter of minutes, a $10-an-hour temp assigns a score to your child's test, a grade that helps determine how money is spent in Washington schools, which courses students take and, before long, who is denied a high-school diploma.
Such weighty decisions rely on the judgment of seasonal workers with 16 hours of training who sift through dozens of exams each day.
Working at assembly-line pace, these college-educated moonlighters spend as little as 20 seconds grading each math question on the crucial Washington Assessment of Student Learning, or WASL.
And scorers plow through as many as 180 writing essays a day, at a rate of 2 1/2 minutes each.
Yet the importance of the test makes the temporary workers who score it - 1,500 miles from the nearest Washington classroom - nearly as important to students as their own teachers.
An up-close look at the scoring reveals:
-- On nine of 10 exams, a single grader scores the work of Washington students. Even when a second grader reads an exam and reaches a different conclusion, the first score stands. Many other states pay extra to have tests read by two scorers, with a third called in if the first two disagree.
-- Some scorers are teachers; many are not. Among those grading Washington's test last month were a small-business entrepreneur with cash-flow problems; an office manager juggling two other jobs; and a new college grad waiting to land a "real" job.
-- Scorers must have college degrees, but not necessarily in the subjects they score.
-- Some scorers skim tests. Others question the consistency of grading and complain of pressure to score quickly at the cost of accuracy.
-- Washington does not verify the qualifications of scorers. State officials are not on hand to monitor the scoring process.
-- The scoring company has safeguards to keep grading reliable. But that hasn't prevented bungles that have affected at least seven states, including Washington.
Scores drive weighty decisions
The 4-year-old WASL has become the ultimate yardstick for measuring schools and students. It is given each spring to 240,000 fourth-, seventh- and 10th-graders, and measures reading, writing, math and listening.
Its results, due in early September, will be dissected by policymakers, parents and educators. Newspapers publish the scores. Parents use them to shop for schools. Real-estate agents use them to sell neighborhoods.
The consequences are growing:
Before this year's fifth-graders can get their high-school diplomas in 2008, they will have to pass the 10th-grade exam. A state committee is developing a system of rewards and sanctions for schools based largely on WASL scores.
But there has been little discussion of how the WASL is scored - or who scores it.
"I had the impression it was a little more thorough and scientific than that," said Charles Hasse, a fourth-grade teacher in SeaTac with 23 years of experience. "The amount of time they spend (scoring) surprises me a lot - I couldn't do it in that length of time."
"The part that bothers me is there's no double-check" of scores, said Stephanie Haskins, principal at Seattle's Madison Middle School. Assigning grades on one person's say-so "clearly would be subjective."
The company that scores Washington's test, National Computer Systems, is considered "the Cadillac of the industry," according to one testing expert.
State schools Superintendent Terry Bergeson said she has full confidence in NCS and in the reliability of scores. Company officials say they hire educated people and train them to follow standards set by the state of Washington, which pays NCS about $5 million a year.
But even the company acknowledges that hand-scoring is a human endeavor open to error.
"This is a very fallible process, and mistakes are going to be made," said George Madaus, a professor of educational testing and public policy at Boston College. "And yet people take those numbers as if they're written in stone."
Center of Excellence
The boxy brick-and-glass building where the WASL is scored has a big name to live up to - NCS calls it the Center of Excellence. Set in an office park surrounded by grassy fields on the outskirts of Iowa City, it is the nerve center of NCS's burgeoning scoring business.
NCS, with nearly 5,000 full-time employees and annual revenues of $630 million last year, began scoring tests in 1962. The company, headquartered in suburban Minneapolis, announced a month ago it is being acquired by Pearson PLC, a media and publishing giant based in the United Kingdom, for $2.5 billion.
NCS scores customized tests from 16 states. This year, it expects to hand-score about 82 million student responses - up from 13 million five years ago - at 15 scoring centers nationwide.
About two-thirds of the Washington test is multiple choice, scored by computer. The rest requires students to write answers, including two long essays and a number of short-answer responses to math and reading questions.
In Iowa City, scorers sit in padded swivel chairs in a room the size of a school cafeteria with the feel of an insurance office - lavender dividers, fluorescent lights, bare walls, high windows and a library-like quiet.
Scorers say the work can be mind-numbing, the intellectual equivalent of flipping burgers. They grade up to 80 math exams during a 7 1/2-hour shift, or about 11 an hour. At that pace, with 16 written responses on each exam, they'd average 21 seconds per question. The company says scorers see as many as 180 writing essays a day, or 24 an hour. That's 2 1/2 minutes each. These times are averages and include questions left blank as well as responses that go on for several pages.
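The per-item times above follow from simple division. Here is a minimal sketch of that arithmetic in Python, using only the figures quoted in this article (the function name is illustrative, not anything NCS uses):

```python
def seconds_per_item(items_per_shift: float, shift_hours: float = 7.5) -> float:
    """Average seconds available per item over one shift."""
    return shift_hours * 3600 / items_per_shift

# Math: up to 80 exams per shift, with 16 written responses on each exam.
math_seconds = seconds_per_item(80 * 16)

# Writing: as many as 180 essays per shift.
essay_minutes = seconds_per_item(180) / 60

print(f"math: about {math_seconds:.0f} seconds per response")
print(f"writing: {essay_minutes:.1f} minutes per essay")
```

Because these are averages over a whole shift, they say nothing about how long any single response actually gets.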
NCS officials and scorers say that good graders can get into a "flow" that allows them to score quickly.
NCS agreed to give a Seattle Times reporter a tour of its Iowa City facility last month. Bergeson sent a representative from her office to oversee the visit.
But NCS officials prevented any contact with scorers on site, citing confidentiality agreements to keep test questions secret.
Later interviews with nearly a dozen scorers, contacted independently, reveal a company that is well-regarded by many workers for its efforts to ensure accuracy.
But the same scorers also describe practices that raise questions about the grading process.
"After doing this work, I know for sure that I don't want my own children to take these kinds of tests," one scorer said.
Workers say scoring the WASL is a good way to supplement their income or tide them over between jobs. Most had answered one of the frequent ads in local newspapers.
"I kind of bop in and out according to their workload," said one, a salesman and former teacher.
Other scorers include an artist, a former aerospace worker, several young college graduates and teachers on hiatus.
One woman said she was saving for a wedding and a car. A former teacher, she scored tests at night after spending 10 hours at two other part-time jobs.
"It's low-maintenance, low-cerebral work," she said. Like others, she asked that her name not be used for fear of jeopardizing her job.
Scorers generally give NCS high marks for the quality of training it provides and the workers it attracts.
"The people that do this are top-notch, high-quality people," said the salesman. "I sit by a gal with a master's degree in microbiology."
Getting it right, he said, is more important than doing it fast.
"We know how important this is," he said. "You connect with that kid somewhere in the state of Washington."
But the former teacher said she sometimes felt confused about how to grade a borderline essay.
"The scoring guide isn't always clear when it's between a 2 and a 3," she said.
Several other scorers confessed to skimming tests.
A new anthropology grad said he could scan 100 math tests on a good day. Another scorer said he could read 300 student essays a day, or one every 90 seconds - nearly twice the company average.
A recent art and history grad said she looked for certain key numbers or phrases in math problems so she wouldn't have to read a whole paper. She described herself as an easier grader, giving students the benefit of the doubt.
"You can tell if they know the answer, even if they don't write it all down," she said.
NCS offers no financial incentives for reading faster than average. Rather, fast graders get small treats like candy or the chance to leave early. During last month's grading, a hand-written poster announced "The Thousand Reads Club" - recognition for 10 graders who had read at least 1,000 seventh-grade essays.
Most scorers said any pressure to grade quickly was indirect - at least until the final week.
Then, "they started hounding us about the pace," said one scorer. Supervisors asked them to "pick up the rate" several times a day, with one announcing: "Don't pay as much attention to accuracy."
That sense of pressure was confirmed by a second scorer.
Bergeson said she was assured by NCS that there was no last-minute push and that any "drift" in scoring quality would be detected by the company's quality-control system.
While she hasn't been to Iowa City herself, Bergeson said she isn't troubled by the speed of the grading; more important is making sure scorers are well trained and embrace Washington's standards.
"I am very confident they can score an individual test with a high degree of reliability and validity," she said.
Trouble finding scorers
NCS scorers hired to grade the WASL must hold a bachelor's degree from an accredited college or university, provide a writing sample and pass a practice scoring test.
With unemployment at record lows this year, NCS has a tough time finding and keeping good scorers and sometimes relies on temporary agencies.
While NCS boasts that it hires a high percentage of scorers with teaching credentials, it could provide no specific data.
In contrast to Washington's requirements, the national Advanced Placement exams, taken by high-school students for college credit, are graded by AP teachers and college faculty with backgrounds in the subject matter. Each essay is scored by two or three readers.
Other states require that teachers or people with teaching credentials score their tests, sometimes even requiring that scorers be teachers from their own states.
State and company officials, as well as a number of national testing experts, say having teachers as scorers makes little difference.
"It's the performance you're after, not the credential," Bergeson said.
Graders bring their own preferences about writing to the job, said Catherine Taylor, an associate education professor at the University of Washington who has studied the WASL.
Some are drawn to grammar and spelling, while some are swayed by ideas, and others give weight to vocabulary and expression.
The scoring standards are set each year by a team of Washington educators and state officials.
The team flies to Iowa City to select a random sample of student essays that represent the range of possible answers. Those samples become training papers intended to keep grading consistent.
"Picking papers that exemplify the level of performance we want sets the whole stage," Bergeson said.
NCS has safeguards built into the grading process, said John Anderson, NCS's project leader.
-- Student papers previously graded by Washington educators and by scoring supervisors are mixed into stacks of ungraded exams. Scorers' decisions are checked against those first grades.
-- Supervisors rescore a random sample of 5 percent to 10 percent of all exams. Workers who don't score consistently are retrained or, in extreme cases, fired.
-- One in 10 exams is graded by a second scorer, but only to identify scorers grading too high or too low.
"It's not a scientific process, but the reliability has been very high," Anderson said.
Question by question, the reliability rate varies. Two scorers grading an essay question might agree 80 percent of the time, company officials said. A more straightforward computation question might be graded the same 100 percent of the time.
For the exams as a whole, though, statistics show a high rate of agreement among scorers.
Two scorers grading the same exam gave the same score as often as 98 percent of the time, according to an analysis by Taylor, hired by the state to analyze the WASL.
"How confident are we that if another NCS scorer read a whole paper, they'd reach the same score?" Taylor said. "We can be very confident."
Others said chance could account for some of the consistency.
Walt Haney is an education professor at Boston College and senior research associate at the Center for the Study of Testing. His research shows that tests graded on a 2- or 4-point scale, as the WASL is, typically get scores in the middle.
An analysis he did of Ohio test scores found 93 percent of exams, scored by 10 graders from another scoring company, received either a 2 or 3 on a 4-point scale.
Haney recalculated the accuracy rate of the scorers on the Ohio test - a figure the scoring company boasted was 99 percent - and found it dropped to about 66 percent when chance was taken into account.
The 98 percent reliability cited by Taylor is "practically unprecedented," Haney said. "It does raise my eyebrows."
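To see how chance can inflate an agreement rate, consider a sketch using Cohen's kappa, one common chance-corrected agreement statistic. The article does not say which method Haney used, so this is an illustration only. The score pairs below are made up and do not come from the WASL, the Ohio test or any other real exam.

```python
from collections import Counter

def agreement_stats(pairs):
    """Return (observed agreement, chance agreement, Cohen's kappa)
    for a list of (first_score, second_score) pairs."""
    n = len(pairs)
    observed = sum(a == b for a, b in pairs) / n
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    # Chance agreement: how often two scorers would match if each
    # scored independently, following their own score frequencies.
    chance = sum(first[s] * second[s] for s in first) / (n * n)
    kappa = (observed - chance) / (1 - chance)
    return observed, chance, kappa

# HYPOTHETICAL data: 100 papers on a 4-point scale, with nearly all
# scores falling on 2 or 3, the middle of the scale.
pairs = (
    [(2, 2)] * 45 + [(3, 3)] * 45   # both scorers agree on a middle score
    + [(2, 3)] * 5 + [(3, 2)] * 3   # scorers disagree
    + [(1, 1)] * 1 + [(4, 4)] * 1   # rare agreement on an extreme score
)

observed, chance, kappa = agreement_stats(pairs)
print(f"observed {observed:.2f}, by chance {chance:.2f}, kappa {kappa:.2f}")
```

In this invented example the two scorers match on 92 percent of papers, but because almost every score is a 2 or a 3, they would match about 48 percent of the time by chance alone. The chance-corrected figure comes out near 0.85, lower than the raw rate suggests.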
Until now, Washington hasn't paid extra to have all exams double-scored because state officials don't yet consider the WASL a true high-stakes exam - no student promotions or diplomas ride on the result.
But next year, Washington will pay an additional $700,000 a year to NCS to have all written essays read by a second scorer, and to resolve any differences between the scores.
The state also is considering creating an appeals process for teachers or parents who suspect scoring errors.
In the past several years, NCS has returned test scores late or riddled with errors to Washington and at least six other states.
Earlier this month, NCS employees put the wrong answers into keys used to score Minnesota's basic-skills test. Almost 8,000 Minnesota students were told erroneously that they'd failed the math portion, including some seniors who were denied diplomas. Several have sued.
Florida officials plan to fine NCS $4 million because test results were delivered almost a month late, in part because the company couldn't find enough scorers. Student promotions, summer-school attendance and eligibility for vouchers hang on those results.
In Washington last year, about 410,000 essays written by Washington students were scored incorrectly, with too many receiving perfect scores for grammar and spelling.
The flub occurred when scorers misinterpreted guidelines set by Washington educators and applied a less-rigorous standard to the essays.
The error cost NCS and testing contractor Riverside Publishing $600,000, required each essay to be rescored and delayed release of writing scores by two months.
"I was not happy they didn't have more security measures in place, but they've corrected it," said Bergeson. "I have a lot of confidence in NCS, or I wouldn't have continued this with them."
In June, Riverside Publishing won a second, five-year contract, worth between $40 million and $50 million, to write and manage the WASL. As the scoring subcontractor, NCS will receive nearly two-thirds of that.
"Stakes have gotten too high"
NCS officials call past scoring mistakes anomalies.
Testing experts say they show that the scoring industry is struggling to keep up as each state develops custom-designed tests and demands results quickly.
Even with safeguards in place, there is no way to factor out human error completely, said Robert Linn, a national expert on testing who serves as an adviser on the WASL.
"The stakes have gotten too high," Linn said. "The tests just can't bear the weight."
"Real-estate agents get involved with it. School-district administrators are under the gun," said Gail Kalinoski, managing editor of the Educational Marketer newsletter, which covers the educational publishing industry. "It's all so political, and it's become big business."
Leaders of the two national teachers unions blasted "testing mania" at their conventions this summer. And the American Educational Research Association listed 12 conditions for high-quality testing programs.
No. 1 on the list: Don't make high-stakes decisions based on a single test.
Even testing companies argue against using scores alone to make decisions about how schools are ranked or who is promoted.
Others question whether information gleaned from the tests is all that useful.
"You can set up rules for scoring that are so restrictive, you can achieve remarkable consistency in scoring," said H.D. Hoover, an author of the popular Iowa Tests of Basic Skills.
But such rigid rules make it impossible to judge student creativity and critical thinking, Hoover said - the very skills policy-makers say they want to encourage.
Scorers are taught to look for canned answers - an opening paragraph, a few sentences in the middle and an ending, with no obvious grammatical errors, said Monty Neill of the National Center for Fair and Open Testing, a Massachusetts testing-watchdog group.
"You're not measuring very much that way, and that's what you end up teaching - not much," Neill said.
Jolayne Houtz's phone message number is 206-464-3122. Her e-mail address is firstname.lastname@example.org.
Sample 4th-grade math question:
A store has a special sale on cassette tapes. These tapes usually cost $1.99 each. The store ad is below:
[Store ad not reproduced. The only surviving text reads: "to hold the tapes"]
Are you really getting the storage box for free? Explain your thinking using words, numbers, or pictures.
Gives a valid conclusion and shows effective reasoning through a complete analysis of the problem.
(Editor's note: "Valid conclusion" indicates correct reasoning, but not necessarily a correct answer. In this case, the correct answer is that the cassette box would cost $2.04.)
I DO NOT THINK your getting box free because if you add then it will come up, to 12 dollors and I don't think there is a $2.98 cent tax on it
Partially flawed reasoning either through incomplete analysis or inadequate/incomplete verification.
YOU ARE NOT getting the box for free you are paying 7 dollars and five cents extra for the tape box
Shows very little or no evidence of mathematical reasoning and shows no evidence of a valid mathematical analysis of the problem.
IT'S A GOOD DEAL but I don't think the box is included, they just say that to make you buy the tapes.
Sample 7th-grade writing assessment question
Directions: Today you will write a story based on the picture below. Take a minute to look at the picture.
Write a story based on this picture.
An effective writer may consider the following points:
-- Who are these young people?
-- Do they live near the creek, or are they visitors?
-- What were they doing before they started crossing the creek?
-- What has the girl noticed in the water?
-- Why is she signaling for her friend to be quiet?
-- What will happen next?
Of a possible 6 points, this response got 4: 3 of a possible 4 points for content, style and organization and 1 of a possible 2 points for "conventions" - i.e. rules of standard English, spelling, capitalization, punctuation, and sentence formation; complete and fluent sentences; indicates paragraphs for the most part
A LONG TIME AGO there was said to be a creek that had purple and blue fishes and all kinds of weird thing's in it.
One day there was a young girl and boy who didn't believe this story so they diceided to go to this creek and have a pinic because they didn't live to far away so they could walk.
As soon as they were done they starting crossing the creek. As soon as the girl started steping one foot in front of the other. The girl rembered that noone dared to go across the creek but she didn't belivie any of what the people had told her. So her cousin followed her.
All of a sudden the girl saw a purple fish start moving. She couldn't belivie it for a moment she thought she was still dreaming. So she turned around to get her cousin and he ran to get the pinic basket to put the fish in there. Her cousin started runing and yelling so she signaled him to be quiet or he would scare the fish away.
Before they could say fish it started changing color to a dark green and she picked it up and it was a frog.
In a few minutes the frog turned into a bird and it started flying it came down on the girls sholder and it told her that a witch had put a spell on him and he had turned into a fish. The Bird told them how thankfull he was to be back to himself agan.
The boy and girl soon left and they had been happy to save the bird.
Sample 10th-grade critical review essay question
Directions: For this writing task, you will have the opportunity to follow all of the steps of the writing process: prewriting, writing a first draft, revising, editing, and writing a final draft. It is okay to cross out words and sentences and try different ways to get your ideas across. You may use a thesaurus and dictionary in print or electronic form. Spell check may not be used. Please note: The only piece of writing that will be scored for this writing task is your final draft.
In literary reviews, critics examine various positive and negative aspects of a literary work to determine the effectiveness and the quality of the work.
Write a critical review for a teacher evaluating a short story, a novel, OR a play you have read.
Answer (rated 4 points for content, organization, and style and 2 points for conventions (contains complete sentences and shows paragraphs clearly; uses words, capitalization, and punctuation correctly; contains grammatically correct sentences; spelling)).
"Flowers For Algernon"
THE STORY ABOUT MEMORIES, grief, stamina, and sympathy are all together in Daniel Keyes "Flowers for Algernon." This is a story of a young mentally challenged adult who gets a chance to be "normal," which is his only dream. Yet he finds that his wish only led him to a type of sadness and loneliness that has been with him all of his life, but he has never realized - This heartwarming story gives us a chance to believe and also to wonder.
Daniel Keyes does an excellent job of exibiting the life of a mentally challenged man through his journal entries. He begins the story with misspelled words and punctuations. Right from the beginning Mr Keyes is telling us exactly the type of person Charlie Gordon (the main character) is like. He does an excellent job of already making us feel sympathy & compaision for his protaganst Throughout the story he creates an image in our head that we know something that Charlie does not. When Charlie is getting laughed at, we know that it is ironic that we know that they are laughing at him, but the character does not. As the story moves forward, Charlie begins to realize what is happening to him. At this point this character makes us feel sorrowful and greatly saddened by his realization This is the key idea that we only felt sorry for him when he realized everything about his life.
Through all of the success of "Flowers For Algernon." I still feel that the story lacks ideas and common sense. The title itself is "Flowers For Algernon" refering to the mouse who was also used as an experimental subject. The title itself does not surmurd the main points of the story, therefore, making us think about the story differently. He presents no point towards the story and actually has nothing to do with any of the confilcts. At the end of the story, Charlie asks his tutor to read his Journal and he says good-bye to her. Ths, I feel, is a very unclear ending that tells us nothing about what will occur. Are we to believe that the two shall never meet again? Lastly, Keyes finishes the story with everybody liking Charlie. This is an unbelievable factor in that we have no information on why they ended up to like him. He gives us no reason to see why that action has come to that conclusion.
As a recommendation, I feel that this book is worth the time and effort to understand. The story exhibits pain and suffering that is obvious to us, yet unaware to the protagonist. This gives us a sense of our lives and all of the mysteries that we do not realize about. It also gives us an insight on people who are made fun of (which is all of us). This story exhibits the theme of strength and will. When Charlie worked hard to get smarter, you feel a sense of yourself in that character Only now, you get a chance to realize it. Lastly, I recommend this book because it exhibits acceptance of memories when Charlie realized how abused he was in the past. When Charlie realized his path, a part of us feels the same way. The other wants us to accept what we have done and what we all are.
"Flowers For Algernon" gives us a sense to succeed and work hard. It teaches us to accept the things and people who are all different. Most importantly, "Flowers For Algernon" gives us the feeling of hope, acceptance, and the idea to reach for that `shooting star' no matter how much you know that you will never reach it.