SAT Scoring "Debacle" Undermines Test-Maker Credibility

University Testing

For more than two decades, FairTest has argued that there is stronger public oversight and control over the food we feed our pets than over the tests administered to our children. No incident has illustrated the truth of that position more vividly than the recent series of events that the Chronicle of Higher Education, the trade paper of record for colleges and universities, labeled the SAT scoring "Debacle" in a front-page headline.

On Saturday, October 8, 2005, 495,000 high school students from across the nation took the SAT I exam. These test-takers paid a basic fee of $41.50 each to the College Board, the exam's sponsor, and expected accurate scores to be delivered within three to four weeks. Though it rained heavily in the northeast that weekend, no unusual test administration problems were reported.

By the time the answer sheets arrived at Pearson Education Measurement in Austin, Texas, for scanning, however, some of the test papers were apparently contaminated with water, making bubbled-in answers hard to read. As a result, the answers on more than 5,000 test forms were not accurately recorded.

So far, there has been no coherent "root cause" analysis of the problem. Answer sheets from students in more than 30 states were impacted, making precipitation in one section of the country an unlikely explanation. Others have suggested that humidity at the Austin test-scanning center may have been the cause, but the exams arrived there during a period of record drought. Besides, it has likely been rainy or humid at hundreds of test centers around the country on almost every Saturday when the SAT has been administered.

Neither the College Board nor Pearson nor the Educational Testing Service, the firm that converts the number of responses scanned as correct into official SAT scores, noticed the error. Soon after receiving their scores in late October, some students began complaining that the results were inaccurate. Several were so sure that errors had occurred that they were willing to request a little-known special form from the College Board, wait days to have it delivered, fill it out, and pay another $50 to have their tests scored by hand.

Additional weeks passed before hand scoring took place. The College Board claims the "industry standard" for this review is three to five weeks, but this time the process appears to have taken longer, again for no clear reason.

Once hand scoring revealed a systematic error, the College Board and Pearson began rescanning answer sheets. Somehow this process took another full month, even though the answer sheets were already in their hands.

During this entire period, as the 2006 admissions season moved into high gear, no warning about this problem was given to test-takers, guidance counselors, or college officials. Not a word was mentioned at any of the College Board's regional meetings with its members, nor to the news media.

Finally, on March 6 and 7, 2006 -- five months after the test was administered -- the College Board told its stakeholders about the problem. But it did not tell them the full truth. The initial set of news stories reported that the errors were "less than 100 points."

Later that week, the College Board changed its story. Rather than "100 points," errors were as large as 400 points. Then the next week came another update: 1,600 answer sheets involved in a still unexplained "special exceptions process" at the Educational Testing Service (ETS) had not been rescanned. And four days later, yet another correction: approximately 27,000 additional, still unscanned answer sheets had been found by Pearson five and a half months after the tests were administered and at least six weeks after the College Board claims it first noticed "something odd" in the scoring process.

Ultimately, the test-makers admitted that 4,411 test-takers had initially received scores that were too low, and 613 had received scores that were too high. Erroneously high results were not corrected, as what the College Board calls "a matter of fairness."

Consequences and Next Steps

Reviewing this chronology in a memo to his association's members, College Board president Gaston Caperton likened this series of events to a "sharp rock in my shoe." Test-takers, admissions officers, guidance counselors, assessment reform advocates, and policymakers use much stronger terms, such as "fiasco," to describe the situation.

Speaking after a New York State Senate Higher Education Committee hearing called to investigate the SAT scoring problem, committee chairman Sen. Kenneth LaValle concluded, "The industry cannot regulate itself." LaValle, the "father" of New York's landmark "Truth-in-Testing" law, which forces university admissions exam producers to make public previously administered copies of their tests as well as studies about their accuracy and validity, invited five witnesses to testify before his committee: a test-taker whose scores were wrong by nearly 170 points; the presidents of the College Board, Pearson Education Measurement and the Educational Testing Service; and FairTest. In well-received wrap-up testimony, FairTest called on the Senate Higher Education Committee to require test-makers to automatically return scored exams to test-takers and to establish an oversight panel to monitor industry performance (testimony available here).

A bill drafted by the Higher Education Committee embraced FairTest's recommendations, proposing creation of a board to oversee testing within the New York State Attorney General's office and calling for the return of scored answer sheets plus a copy of all questions when requested by test-takers. The legislature is expected to consider the proposals later this year. Historically, laws regulating testing adopted in New York have become national practice, since test-makers do not want to establish separate procedures for just one state, particularly one that is a major market for their products.

The SAT mis-scoring saga has also had impacts at the national level. U.S. Secretary of Education Margaret Spellings responded by convening a meeting of test company leaders to ask whether they really had the technical capacity to handle the explosion of standardized exams required by the federal "No Child Left Behind" law. Not surprisingly, they answered in the affirmative, though a series of other scoring errors and delays (see articles, this issue) demonstrates their claim is not true.

A class-action lawsuit has also been filed on behalf of test-takers who received erroneously low scores. The case, handled by the same law firm that successfully sued Pearson several years ago on behalf of Minnesota high school students who were falsely told they had failed the state's graduation test (see Examiner, Fall 2002), seeks damages from both the College Board and Pearson. The plaintiffs charge the defendants with breach of contract, negligence, and violations of consumer protection laws, among other claims. In addition to compensation for damages, the litigation seeks the immediate correction of erroneously high scores. In response, the College Board and Pearson have asked that the case be thrown out, despite the five-month delay in identifying and publicizing their errors, because accurate results were ultimately reported before final admissions and financial aid decisions were made. FairTest is providing the plaintiffs' attorneys with research support for the case.

Given the "deep pockets" of test-makers (see article, this issue), the court process is likely to be protracted. No matter how the lawsuit turns out, the credibility of the College Board, Pearson and the entire standardized testing industry has suffered serious, lasting damage.