Proposition 227 and Skyrocketing Test Scores: An Urban Legend from California

Stephen Krashen
The Journal of the Texas Association for Bilingual Education (in

"An Urban Legend is usually a (good / captivating / titillating
/ engrossing / incredible / worrying) story that has had a wide
audience, is circulated spontaneously, has been told in several
forms, and which many have chosen to believe (whether actively
or passively) despite the lack of actual evidence to substantiate
the story."
(Urban Legends Research Centre, www.

I wish to add another Urban Legend to those that already exist,
one that in my opinion ranks with the legend of alligators
in the sewers of New York City.1 It is the "Skyrocket Legend":
As a result of dropping bilingual education, test scores in California
skyrocketed.

This legend has had serious consequences. The Skyrocket Legend
was interpreted by many as a demonstration of the superiority
of immersion over bilingual education, and has encouraged anti-bilingual
education advocates to eliminate bilingual education in other
states. I will argue here that there are more reasonable explanations
for the test score increase, and will review evidence showing that
bilingual education helps children acquire English.

Bogus means of increasing test scores

Why did test scores go up in California? Proposition 227 took
effect in 1998, at the same time the new SAT9 test was introduced.
Research (Linn, Graue, and Sanders, 1990) has shown that after
new tests are introduced, test scores rise, which is why commercial
tests need to be recalibrated every few years. Typical test score
inflation is about 1.5 to 2 points per year, which accounts
for a great deal of the gains seen in California. "Test inflation"
is especially prevalent in California where the same test has
now been given for four years in a row, punishments for lower
scores are severe, and rewards for higher scores are generous.
This pressure has resulted in districts using unusual and extraordinary
means for raising test scores, some of which have nothing to do
with increased competence.
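
The size of the expected inflation can be sketched with simple arithmetic. This is only an illustration: the rates of 1.5 and 2 points per year are the figures cited above, while the function name and the reading of "four years in a row" as three year-to-year intervals (1998 through 2001) are assumptions made for the example.

```python
def expected_inflation(points_per_year: float, administrations: int) -> float:
    """Cumulative inflation expected across repeated uses of one test.

    N administrations of the same test give N - 1 year-to-year intervals.
    """
    if administrations < 1:
        raise ValueError("need at least one administration")
    return points_per_year * (administrations - 1)


# Assumption: the same test was given four years in a row (1998-2001).
low = expected_inflation(1.5, 4)
high = expected_inflation(2.0, 4)
print(f"Expected inflation alone: {low} to {high} points")
```

Under these assumptions, inflation by itself would be expected to add roughly 4.5 to 6 points over the period, before any change in instruction is considered.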

Among the bogus means of increasing test scores are extensive
training in certain test-taking skills and selective testing,
i.e., excluding low-scoring children from taking the test. Asimov
(2000) suggests that selective testing may have occurred in California.
She reported that in many cases in which SAT9 scores increased
from year to year, the number of students tested decreased. According
to Asimov, "questionable pairings" appeared in 22 San
Francisco Area school districts. And of course some test-taking
skills will raise scores without an increase in competence: If
there is no penalty for guessing, for example, simply encouraging
guessing will raise scores. Use of these means to raise scores
is like claiming to raise the temperature of the room by lighting
a match under the thermometer.
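
Both effects can be shown with a small calculation. The sketch below is hypothetical throughout: the student, the class scores, the four answer choices per item, and the function names are invented for illustration and are not data from California.

```python
from statistics import mean


def expected_correct(known: int, total: int, options: int, guess: bool) -> float:
    """Expected number right when a student knows `known` of `total` items.

    With no penalty for wrong answers, each blind guess on an unknown
    item is right with probability 1 / options.
    """
    unknown = total - known
    return known + (unknown / options if guess else 0.0)


def mean_after_exclusion(scores: list[float], n_excluded: int) -> float:
    """Average score after leaving out the n lowest-scoring children."""
    kept = sorted(scores)[n_excluded:]
    return mean(kept)


# Hypothetical student: knows 20 of 40 items, four answer choices each.
blank = expected_correct(20, 40, 4, guess=False)
guessed = expected_correct(20, 40, 4, guess=True)

# Hypothetical class: same children, same competence, fewer test-takers.
scores = [10, 15, 20, 30, 35, 40]
everyone = mean(scores)
selected = mean_after_exclusion(scores, 2)
print(blank, guessed, everyone, selected)
```

In both cases the reported number rises while what the children actually know stays exactly the same.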

There is no evidence linking test score increases to dropping
bilingual education. Stanford professor Kenji Hakuta and his associates
found, in fact, that test scores rose in districts in California
that kept bilingual education, as well as in districts that never
had bilingual education (Orr, Butler, Bousquet, and Hakuta, 2000;
Hakuta, 2000).

Thompson, DiCerbo, Mahoney, and MacSwan (2002) examined gains
in SAT9 scores in California, adjusting for the number of students
in different schools and using scaled scores, and found that
the gains made by limited English proficient students and English
proficient students were nearly identical (Table 1).
(Scaled scores have equal intervals: a ten point increase, for
example, from 525 to 535 represents the same gain as a ten point
increase from 605 to 615. This is not the case for percentiles.)
Thompson et al. reported similar results in an analysis of gains
made by cohorts, e.g., students who were second graders in 1998,
third graders in 1999, and fourth graders in 2000.
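
The difference between scaled scores and percentiles can be illustrated with a short sketch. It assumes, purely for the example, that scaled scores follow a normal distribution with a mean of 600 and a standard deviation of 40; these values are hypothetical and are not SAT9 norms.

```python
from statistics import NormalDist

# Hypothetical score distribution; the mean and spread are NOT from the SAT9.
dist = NormalDist(mu=600, sigma=40)


def percentile(scaled_score: float) -> float:
    """Percentile rank (0-100) of a scaled score under `dist`."""
    return 100 * dist.cdf(scaled_score)


low_gain = percentile(535) - percentile(525)    # far below the mean
mid_gain = percentile(615) - percentile(605)    # near the mean
print(f"525 -> 535: {low_gain:.1f} percentile points")
print(f"605 -> 615: {mid_gain:.1f} percentile points")
```

The same ten-point scaled gain moves a student near the middle of this distribution several times as many percentile points as a student near the bottom, which is why gains are better compared on the scaled-score metric.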

Table 1: Gains in SAT9 scores

grade 1998-99 99-2000 2000-01
2 LEP 7.84 4.2 12.9
2 ALL 5.88 5.39 11.21
2 EP   4.4
3 LEP 8.3 3.12 11.58
3 ALL 5.02 4.57 9.6
3 EP   4.72
4 LEP 6.47 2.16 7.6
4 ALL 2.69 3.79 6.48
4 EP   3.58
5 LEP 3.81 1.27 4.54
5 ALL 1.65 1.83 3.44
5 EP   1.88
6 LEP 3.56 1.96 4.18
6 ALL 2.12 1.51 3.68
6 EP   1.97

LEP = limited English proficient
ALL = all students combined
EP = English proficient only
From: Thompson, DiCerbo, Mahoney and MacSwan (2002)


A great deal of attention has been directed to Oceanside, a district
that dropped bilingual education and embraced English immersion.
As illustrated in Table 2, test scores for limited English proficient
children in Oceanside have certainly increased since Prop. 227.
But they were unusually low to begin with, and have only risen
to the state average.

Table 2: SAT9 Scores for Oceanside: grade 2

            1998 (pre-227)   1999   2000   2001
CA          19               23     28     31
Oceanside   12               26     32     32

Source: State of California
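
The two rows of Table 2 can be compared directly. The sketch below uses only the figures printed in the table; the variable names are illustrative.

```python
years = [1998, 1999, 2000, 2001]
california = [19, 23, 28, 31]
oceanside = [12, 26, 32, 32]

# Oceanside minus the state figure, year by year.
gap = {y: o - c for y, o, c in zip(years, oceanside, california)}

# Change from the 1998 (pre-227) score to 2001.
ca_gain = california[-1] - california[0]
oceanside_gain = oceanside[-1] - oceanside[0]

print(gap)
print(ca_gain, oceanside_gain)
```

On these figures Oceanside started 7 points below the state in 1998 and ended 1 point above it in 2001, so nearly all of its larger gain (20 points versus 12) corresponds to closing the initial deficit.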

Why were Oceanside's pre-227 scores in 1998 so low? We have
several reasons to suspect that Oceanside's previous bilingual
program, the one that was dropped, was poorly conceived. In fact,
it was not a bilingual education program: It was a Spanish-only
program. In an article in the Washington Post (Sept. 3, 2000),
Oceanside Superintendent Ken Noonan stated that before Proposition
227 Oceanside's bilingual program was all-Spanish, lasting "for
up to four years, even longer for some. Only after being designated
fluent in English would a child's learning in English begin in
earnest" (Noonan, 2000).

Properly organized bilingual programs, by contrast, introduce
children to English from day one, and academic subjects are taught
in English as soon as they can be made comprehensible. Failing
to provide any English instruction will naturally lead to miserable
results on English-language achievement tests. This explains why
Oceanside's test scores showed substantial improvement, especially
for the youngest children, when English was introduced.

In addition, in an article in the San Diego Union-Tribune on
October 6, 2000 (Parnet, 2000), it was revealed that before Prop.
227 books were in very short supply in at least one Oceanside
school with a significant number of limited English proficient
students: Before 227, "a lot of students (at Laurel Elementary
School) didn't even have books." The Union-Tribune article
also gives the reader the impression that virtually any activity
unrelated to test preparation was dropped from the school day,
an impression confirmed by a more recent article (Parnet, 2001), which
stated that at one school, "morning assemblies were eliminated.
Noneducational field trips were canceled. Teacher training workshops
shelved. All done so the elementary schools could concentrate
on language arts and math." In addition, strong carrots (financial
rewards) and sticks (threats of school closure) were instituted.

It thus appears that Oceanside dropped an inadequate bilingual
program, and at the same time focused nearly all its energy on
test preparation. In addition, gains for Oceanside's English learners
were "not remarkable" but were similar to gains made
in many California districts that retained bilingual education
(Hakuta, 2000; Orr, Butler, Bousquet, and Hakuta, 2000).

Even apart from the problems with the SAT9, the results do not show
that dropping bilingual education is responsible for test score
increases. In fact, in another state that voted to dismantle bilingual
education, Arizona, limited English proficient students in bilingual
education have outscored those in all-English programs on SAT9
tests of English reading for the last three years (Crawford, 2000).
Despite numerous efforts to publicize this result, it was rarely
reported by the media.

Why bilingual education helps English language development

Briefly, quality bilingual programs introduce English right
away and teach subject matter in English as soon as it can be
made comprehensible, but they also develop literacy in the first
language and teach subject matter in the first language in early
stages. Developing literacy in the first language is a short cut
to English literacy. It is much easier to learn to read in a language
one understands, and once a child can read in the primary language,
reading ability transfers rapidly to English. Teaching subject
matter in the first language stimulates intellectual development
and provides students with valuable knowledge that will help the
child understand instruction when it is presented in English.

What controlled studies say

The only valid way to determine the effect of bilingual education
is to perform controlled studies. In these studies, programs are
compared in which the only difference is the use of the first
language. SAT9 test score comparisons are not controlled studies.
SAT9 comparisons often include English learners who are not in
bilingual programs, and such comparisons do not consider a host
of other factors that impact performance, such as poverty.

Scientifically valid controlled studies have been done, and
they consistently show that students in properly organized bilingual
programs acquire at least as much English as comparison students
in all-English programs, and usually acquire more. The most recent
review of this research is by Greene (1997; see also Willig, 1985),
who used statistical tools far more precise than those used in
previous reviews. Greene concluded that the use of the native
language in instructing limited English proficient children has
"beneficial effects" and that "efforts to eliminate
the use of the native language in instruction ... harm children
by denying them access to beneficial approaches." 2

Studies from other countries are very consistent with results
from the United States. Children in well-organized bilingual programs
acquire as much of the second language as those in "immersion"
programs, or more. Studies confirming this have been done with
Turkish- and Urdu-speaking children in Norway, Punjabi-speaking
children in England, Turkish- and Arabic-speaking children in the
Netherlands, Finnish-speaking children in Sweden, Gupapuyngu-speaking
children in Australia, and Tzeltal- and Tzotzil-speaking children
in Mexico (Krashen, 1999a).

Rossell and Baker (1996) have also reviewed the research on
bilingual education but concluded that bilingual programs are
not as effective as all-English immersion programs. Their review,
however, inappropriately excluded a number of valid studies, and
inappropriately included studies that were not valid comparisons,
such as comparisons of different types of Canadian Immersion programs
(Krashen 1996, 1999b). Even so, Rossell and Baker conclude that
"additional, methodologically sound research needs to be
conducted in order for the courts and policymakers to make intelligent
decisions" (p. 39) and that "we are struck by how small
the differences are ... between programs with very different amounts
of English instruction" (p. 43). The Rossell and Baker review
is by far the most negative review of bilingual education published;
in fact, it is the only one I know of that claims that all-English
alternatives are better. Yet even it concludes that the differences
are not large and that more research is necessary in order to make
"intelligent decisions."

Clearly, the published research is not consistent with claims
that dropping bilingual education causes scores to "skyrocket"
and does not support movements to make bilingual education illegal.

Conclusion

There is no question that test scores went up in California,
but dropping bilingual education had nothing to do with the increase.
Test score increases in California appear to be a result of the
usual "test score inflation" that occurs when new tests
are introduced. In California, inflation has been particularly
strong because of intense pressure to raise scores. Analysis of
gains in individual districts shows that those that kept bilingual
education improved and those that never had bilingual education
improved; everybody improved in California, and there were no obvious
differences between the gains made by English learners and English
proficient children. Missing from nearly all discussions of the
effectiveness of bilingual education is the consistent finding
that controlled studies show that bilingual education works. The
Skyrocket Urban Legend is false.

Notes

1. Other Urban Legends include: Humphrey Bogart was the original
Gerber baby in the company's baby food ads, the FBI monitors public libraries
and notes who reads "subversive" books, and my favorite:
If the entire population of China jumped up at the same time,
the US would be swamped by a tidal wave. None of these are true.
See for many others.

2. The most recent study of the effectiveness of bilingual
education was done by a research team headed by K. Oller and R. Eilers
(in press). At grade five, students in a bilingual program (60%
English, 40% Spanish) did as well as comparison students in an all-English
program (with an optional 10% of the day in Spanish) on tests
of English literacy, and did far better on tests of Spanish.

References

Asimov, N. 2000. Test Scores Up, Test-Takers Down: Link between
participation, improvement on school exam prompts concern. San
Francisco Chronicle, Saturday, July 22, 2000.
Crawford, J. 2000. Stanford 9 scores show a consistent edge for
bilingual education.
Greene, J. 1997. A Meta-Analysis of the Rossell and Baker review
of bilingual education research. Bilingual Research Journal 21(2,3):
Hakuta, K. 2000. Points on SAT-9 Performance and Proposition 227.
Krashen, S. 1996. Under Attack: The Case Against Bilingual Education.
Culver City, CA: Language Education Associates.
Krashen, S. 1999a. Condemned without a Trial: Bogus Arguments
Against Bilingual Education. Portsmouth, NH: Heinemann.
Krashen, S. 1999b. Why Malherbe (1946) is NOT evidence against
bilingual education. NABE News 22(7): 25-26.
Linn, R., Graue, E., and Sanders, N. 1990. Comparing state and
district test results to national norms: The validity of claims
that "everyone is above average." Educational Measurement:
Issues and Practice 10: 5-14.
Noonan, K. 2000. I Believed That Bilingual Education Was Best
. . . Until The Kids Proved Me Wrong. Washington Post, Sunday,
September 3, 2000.
Oller, K. and Eilers, R. (Eds.). In press. Language and Literacy in
Bilingual Children. Multilingual Matters.
Orr, J., Butler, Y., Bousquet, M., and Hakuta, K. 2000. What can
we learn about the impact of Proposition 227 from SAT9 scores?
Parnet, S. 2000. Test-score gains fill schools with pride. San
Diego Union-Tribune, October 6, 2000.
Parnet, S. 2001. Gainful change: Oceanside schools seeing steady
improvement on state performance rankings. San Diego Union-Tribune,
November 25, 2001.
Rossell, C. and Baker, K. 1996. Bilingual Education in Massachusetts:
The Emperor has No Clothes. Boston: The Pioneer Institute for
Public Policy Research.
Thompson, M., DiCerbo, K., Mahoney, K., and MacSwan, J. 2002.
Exito in California? A validity critique of language program evaluations
and analysis of English learner test scores. Education Policy
Analysis Archives, 10(7).
Willig, A. 1985. A meta-analysis of selected studies on the effectiveness
of bilingual education. Review of Educational Research 55: 269-316.