As we enter into a national debate on school improvement and greater public school accountability with a heavy emphasis on testing, educators are concerned that a solitary focus on testing ignores important opportunities to help all students achieve at high levels. Overreliance on testing could have the unintended consequence of hurting more than helping.
NEA proposes Testing Plus -- well-crafted accountability measures that gauge and promote real achievement. The issue is not -- to test or not to test. Students are subject to an array of standardized and teacher-developed tests each year. The key question is, "How do we help students achieve, rather than hurting them?" Instead of just applying more tests, NEA calls for smarter testing that also provides students and schools the tools they need to succeed. Such a program includes:
More thorough measures. Schools are complex organizations. It takes improvement in many areas to make an effective school. Multiple indicators (dropout rates, absenteeism, number of students taking advanced placement courses, parental involvement, etc.) would help guide progress toward becoming an excellent school.
Improved tests and assessments. While the state of the art in testing has progressed tremendously, tests remain less than perfect measures of teacher or school effectiveness. Test development should include teacher input. The classroom teacher is the person who knows the achievement of each student best. Classroom assessment practices, including the use of portfolios, projects, and performance assessment, should be enhanced through professional development and included in the comprehensive state assessment system.
Comprehensive reporting to parents. Currently school report cards and pupil report cards are the primary methods of communicating with parents. We call for the expansion of school report data to include (and use) information on the multiple indicators of success and not rely solely on standardized test scores. This could be supplemented by regular reports based upon teacher observation, judgment, and student work samples.
Alternatives to a test as the sole means of accountability. Examples of other accountability systems include school accreditation, visiting teams, and displays of student work to the public. States, school districts, and schools should use multiple indicators of school effectiveness to make improvement decisions at each level. Test scores give little data to improve school operation.
Targeted professional development. In order to succeed in a standards-based program, teachers need continued professional development targeted toward specific skills and knowledge.
Focused investments. Teachers report inadequate availability of instructional materials that are aligned to the standards. Focused investments on materials, professional development, and supplemental programs of instruction are necessary to assist teachers and students.

Testing Checklist

Testing Plus
Real accountability incorporates the tenets of Testing Plus defined on page one. Testing Plus provides a broader picture of student progress by using multiple indicators of learning. Most important, Testing Plus provides students, parents, and teachers with the support and resources required to meet higher standards. A school system that is not accountable for providing continuous, high quality resources for school communities has no business holding students and teachers accountable for performance on tests.

Protection Against High-Stakes Decisions Based on a Single Test
A 50-state survey shows that 11 states identify low-performing schools solely on the basis of test scores. Decisions that affect individual students' life chances or educational opportunities must not be made on the basis of test scores alone. As New York Times columnist Richard Rothstein notes, a baseball player's batting average is never computed based on one game. Accuracy requires that students have multiple opportunities to pass any test when the test results are used to make high-stakes decisions, such as promotion to the next grade or graduation from high school. More importantly, when there is valid evidence that a test score may not accurately reflect a student's true proficiency, predetermined alternatives to demonstrate ability to meet standards should be available to students. Absent such protections, school districts have suffered high dropout rates and degradation in the quality of curriculum. "We could realize significant progress in public education if proponents of standards-based reform joined hands with critics of high-stakes testing and effectively outlawed the use of high-stakes tests as sole indicators of student success," says Panasonic Foundation executive Scott Thompson in a Phi Delta Kappan article.

Adequate Resources and Opportunity to Learn
It's important that the testing cart not be placed before the curriculum and opportunity-to-learn horse. When content standards and tests are introduced as a reform to improve current practice, teachers must have ample opportunity to access professional development and appropriate resources before schools, teachers, or students are sanctioned for failing to meet the new standards. According to a National Assessment of Title I study, only 40 percent of the schools identified as needing improvement last school year reported receiving additional teacher professional development or other help. Extra learning opportunities and remediation programs for students are imperative to achieving true accountability.

Clarity in Passing Scores and Achievement Levels
Because the law permits each state to create its own accountability system and its own definition of progress, huge differences in the numbers and percentages of schools identified as low-performing exist across states. For example, Texas identifies only 1 percent of its Title I schools as "in need of improvement," while Michigan identifies 76 percent. Worse, many schools are unaware that they have been identified as low-performing. Only four in 10 principals of schools identified as needing improvement reported their status as such.
The purpose and meaning of passing scores or achievement levels must be clearly stated and understood. Terms like "passing" and "proficient" must be clear and defined according to specific goals. Setting scoring levels such as "minimum competency," "grade level achievement," and "world-class" should be based on educational principles. The consequences of failing to meet levels should be clear.

Relevant Tests that Are Updated to Reflect Curriculum
Tests that are valid for one use may well not be valid for another. Each separate use of a high-stakes test, whether for individual achievement, school evaluation, curriculum improvement or any other purpose, must be evaluated in order to determine the strengths and limitations of the testing program and the test itself. In addition, before simply adding another required test for all students in all schools, a systematic inventory of current testing programs and tests should be conducted. The information gathered would enable the development of a coherent, coordinated system that will routinely let teachers, students and parents know how students are doing academically and provide teachers, school administrators and elected decision-makers with accurate data upon which to base policy.

Full Disclosure of Negative Consequences
Before imposing new required testing, policy makers should be aware of the likely unintended negative side effects of any given testing program. Test developers and users have a responsibility to explain the possible harmful effects in all cases where solid scientific evidence exists that a given type of program may produce undesirable results, such as higher dropout rates. It is essential to assure ongoing evaluation of both intended and unintended consequences. Fairness suggests that the governmental body that mandates the test should also provide resources to help all kids meet high standards.

Alignment of Curriculum and Tests to Standards
Standards, curriculum, instruction and assessment must be aligned in order to produce valid and reliable results. In its Quality Counts 2001 study, Education Week reports that an analysis by Achieve, a nonprofit group based in Cambridge, Mass., shows that the state standards and tests are not closely enough aligned. Current state tests "tend to measure some standards but not others and to emphasize the less demanding knowledge and skills in state standards."

Recognition of Differences and Disabilities
In the interest of assessment accuracy, testing programs must take into account student differences. For students who are learning English, a test written in English becomes, to one degree or another, a test of language proficiency. The degree of English language proficiency must be considered in deciding to administer the test. In testing students, the effects of their differences must be appropriately weighed in drawing conclusions from the test results.

Explicit Rules for Determining Students to be Tested
There must be clear policies identifying which students are to be tested and under what circumstances students may be exempted. Without such policies, there cannot be any meaningful comparison of schools, districts or states when changes are tracked over time. The American Educational Research Association states, "Such policies must be uniformly enforced to assure the validity of score comparisons. In addition, reporting of test score results should accurately portray the percentage of students exempted."

American Educational Research Association (AERA) -- www.aera.net
Consortium for Policy Research in Education -- www.gse.upenn.edu/cpre/frames/pubs.html
Fair Test -- www.fairtest.org
Quality Counts 2001: A Better Balance: Standards, Tests and the Tools to Succeed -- www.edweek.org/sreports/qc01/
"The First Annual School Improvement Report: Executive Order for Turning Around Low-Performing Schools" -- www.ed.gov/offices/OESE/LPS/sirptfinal.pdf
"High Standards for All Students: A Report from the National Assessment of Title I on Progress and Challenges Since the 1994 Reauthorization" -- www.ed.gov/offices/OUS/PES/finalNATIreport.pdf
Richard Rothstein -- 562-945-8950; rothstei@oxy.edu
W. James Popham, UCLA -- 808-742-2045; wpopham@ucla.edu
Linda McNeill -- 713-527-4826

Countless state and local success stories prove that a high-quality education can be delivered to all students -- even those deemed least likely to succeed. The yet-to-be-met challenge is taking such success to scale. That takes a shift from a focus on quantity to a focus on quality, and from a concern with outside, bureaucratic dictates to a community focus on improving the performance of every student. The following examples demonstrate "what works."

AURORA, COLORADO. Superintendent David Hartenbach knew a top-down mandate would not produce the results students needed. To increase shared decision making and investment, thousands of administrators, teachers, parents, and community members were invited to develop solutions.
The result: stakeholders agreed on five learning goals that would drive all instruction and curriculum. Students are asked to become: 1) self-directed learners, 2) collaborative workers, 3) complex thinkers, 4) community contributors, and 5) quality producers. The first class of students to use these goals will graduate this year. Unfortunately, educators report that these goals have become overshadowed by a sole emphasis on a high-stakes state test.
The district uses a broad array of measures to assess progress in meeting these goals. For example, in one elementary school, students may work in small groups to test various hypotheses on static electricity. Across the hall, another classroom of students may listen to and critique essays delivered by classmates on drugs and gangs. Throughout these exercises, teachers observe, take notes charting student progress in achieving certain benchmark skills, and guide students rather than lecture. Students, too, record their progress in journals. Ultimately, teachers distill all of this information in a portfolio that tracks student progress. The portfolios are arranged in accordance with the five learning goals.
This innovative style is anchored with significant investments in lifelong teacher professional development. Courses and workshops are available to help teachers sharpen their skills and share ideas about inspiring new ways to learn. Professional development opportunities also include mentoring and coaching. The district hopes this approach will not be abandoned in favor of a sole focus on a high-stakes state test. (Strategies, August, 1998.)

CALIFORNIA. The California Teachers Association recognized a high degree of teacher stress surrounding standards and testing. CTA produced a "Survival Guide for Standards and Testing" to help teachers make sense and better use of the standards-based reform in their state. The Survival Guide provides background on how and why tests and standards were developed. It also provides practical advice and tools that can be used in the classroom immediately to improve instruction in a standards-based and test-driven educational system.
Tools include: a checklist that teachers may complete to determine whether they have the resources and support necessary to meet the standards; information on how to bring grading into alignment with standards; helpful tips for test preparation; a list that can be shared with parents and community members; and a glossary of terms. (www.cta.org/survival_guide/index.html)

MASSACHUSETTS. The teachers union backed a bill for a new comprehensive assessment system to replace the current state MCAS test graduation requirement. The bill would require students to take a limited number of MCAS tests at various grade levels, and the test results would be used for diagnostic purposes, not for decisions about promotion or graduation. The proposed bill calls for each district to establish a multi-faceted assessment system to measure student performance relative to state standards. This could include portfolios, performance evaluation, work projects and other classroom-based measures of achievement. Further, the bill calls for every school in Massachusetts to become accredited.

NEW JERSEY. The state implemented the Minimum Basic Skills test over two decades ago. After a transition period, scores improved. Along came the High School Proficiency Test. Again, after a transition period, scores improved and officials worked to make the test more challenging. A new test, the Core Curriculum Content Standards, is now being implemented. Currently, there are concerns that the elementary school proficiency test, for 4th graders, is too long. In fact, it takes longer to take that test than the SAT. Teachers complain about confusing directions, topics that are not age-appropriate, subjective scoring, and inadequate training on all aspects of the test.

RHODE ISLAND. Rhode Island boasts an example of a state accountability system that balances the public's need for results with schools' need for support and for a genuine measure of autonomy in achieving those results. Rhode Island's SALT (School Accountability for Learning and Teaching) gathers extensive qualitative as well as quantitative data on school quality for the purpose of supporting continuous, standards-based school improvement. Based on a self-assessment, each school in the state develops a school improvement plan. Periodically, a team of teachers, parents, and administrators from outside the district spends a full week in the school, shadowing students, visiting classes, and interviewing teachers, parents, and administrators. The results of the external review, including findings, recommendations, and commendations, are read to the entire faculty on the Monday following the visit. (Phi Delta Kappan, January 2001)


Q. How many states have standards and tests?
A. According to Education Week's recent Quality Counts 2001 study, 49 states have academic standards in at least some subjects. All states test students, but these tests are not correlated to standards. In 27 states, there is some type of system to hold schools accountable for results, by rating school performance or identifying low-performing schools.
· 45 states issue report cards;
· 18 states require students to pass a test in order to receive a diploma;
· 28 states provide no incentives for schools or students subject to high stakes tests;
· According to the Education Commission of the States, 15 states test every student in reading and math in at least every grade from three through eight.

Q. What would an ideal testing program look like?
A. An ideal testing program would use a variety of assessments to provide information about student progress at the national, state and local level. The components of the program would be tied to curriculum and instruction and would offer teachers and students information about what's known and where instructional help is still needed. Likewise, state tests should provide sufficient information about school performance that schools can learn what they're teaching well and where they need assistance in creating meaningful learning for kids.

Q. How many states give tests aligned with their standards?
A. The Wisconsin Center for Education Research found that fewer than a dozen states can claim their high-stakes tests are aligned with their standards. That disconnect provokes a persistent outcry from teachers and parents for a better system.

Q. How often are tests given?
A. It depends on where you live and which test you are talking about. Large-scale tests are usually administered once a year (with many other tests given throughout the year). In addition, local school districts administer their own standardized tests.

Q. Do teachers like tests and standards?
A. Surveys show that teachers believe the drive to raise academic standards is a move in the right direction. However, there is enormous concern about the level of emphasis on testing and test-preparation activities.

Q. Do parents like tests?
A. Only 11 percent of parents said their children's schools give too many standardized tests; 18 percent said "real learning" takes a back seat to test preparation. Still, 78 percent of parents agree "it's wrong to use the results of just one test to decide whether a student gets promoted or graduates." (Public Agenda, Oct. 2000) Sixty-three percent of adults said standardized tests are not an accurate way to measure a student's academic progress. (American Association of School Administrators, June 2000)

Q. How much do tests currently cost?
A. According to the American Educational Research Association, total annual spending for K-12 tests among the 50 states has doubled in the past four years, from $165 million in 1996 to $330 million in 2000.

Q. How much would an ideal testing program cost, and at what cost to other programs?
A. Few states have invested in assessments that go beyond simplistic multiple-choice exams by using student portfolios or more involved projects, even though such assessments have been shown to produce the largest gains in student achievement. Portfolio tests are much more time-consuming and costly than off-the-shelf, norm-referenced exams. In Iowa, for example, the cost of administering the fill-in-the-bubble Iowa Tests of Basic Skills is 93 cents per student. (Quality Counts 2001, www.edweek.com)
"Basically, we haven't made the case to the political folks that they should be spending $12 or $14 a test for a student, rather than $2 or $3 a test," says Marshall S. Smith, a professor of education at Stanford University who was the acting deputy secretary of the U.S. Department of Education under President Clinton. "The irony here is that the amount of money is so small compared to the amount of money that states spend educating a student." In 1999, the average total per-pupil expenditure in the United States was $6,408.
The skills demanded on the portfolio exams have valuable real-life applications, such as communicating thoughts in writing, graphing and interpreting data, and synthesizing information. Unfortunately, few states have found them worth the investment.

Q. How should test information be used?
A. It's important to remember that no one test can do everything. Some tests are designed to diagnose instructional needs; others are designed to rank students in order to make decisions about program effectiveness or college admission. Taking results from a test that's designed to evaluate a program and using those results to draw conclusions about individual students is just plain unsound assessment practice. Testing experts agree that using a single test score to make important decisions about individual students (such as promotion, retention or access to a particular program, e.g., gifted and talented programs) is indefensible. See the 1999 Standards for Educational and Psychological Testing developed by the American Educational Research Association, the American Psychological Association and the National Council on Measurement in Education.

Q. What are some of the "unintended consequences" we might anticipate from an emphasis on high-stakes testing?
A. The original intent of the standards movement was to ensure that all students would have access to a much more challenging curriculum and, therefore, would perform at higher levels of academic achievement. High-stakes testing is, in many cases, narrowing the curriculum in ways not envisioned previously -- as teachers, students and administrators try to focus on the limited set of knowledge and skills they think will be covered on high-stakes tests. Enrichment programs like music and art have been cut from school curriculum choices. The use of high-stakes tests might also discourage students who fail to perform well and thus lead to higher dropout rates and, ultimately, lower achievement for many students. Ironically, the students who are now most at risk of dropping out are precisely those who were the initial focus of reform efforts like the standards movement.

Q. What if a community tries everything and a school still manages to underperform?
A. Establishing standards to raise achievement and measuring students' progress against those standards is just half the solution. We must be just as scrupulous in guaranteeing children the opportunity to learn the required material as we are in fashioning the standards and measures.
School systems should know well before the end of a 3-year period whether a specific school is making progress in helping students achieve. By applying research-based "essential supports," educational programs can be designed to fit the needs of a school. The Learning First Alliance, a coalition of 11 national organizations ranging from the PTA to the Council of Chief State School Officers and National School Boards Association, identified five key areas that need to be addressed in designing effective accountability systems. These include:
· Use of instructional programs and curricula that support state and district standards and of high quality testing systems that accurately measure achievement of the standards through a variety of measurement techniques
· Professional development to prepare all teachers to teach to the standards
· Commitment to providing remedial help to children who need it and sufficient resources for schools to meet the standards
· Better communication to school staff, students, parents and the community about the content, purposes and consequences of standards
· Alignment of standards, assessment and curricula, coupled with appropriate incentives for students and schools that meet the standards
In the unlikely event that all of these efforts, including a change in school leadership, fail over a 3-year period to "turn the school around," drastic action is required. The school should then be thoughtfully reconstituted -- completely overhauled -- according to a tailored plan designed by administrators, teachers, parents and the community. Of course, if student achievement is monitored and problems are identified, correctly diagnosed and appropriately addressed, this "solution" will never be needed.

Q. What are the National Assessment of Educational Progress (NAEP) tests? Why can't they substitute for state tests?
A. Since 1969, the National Assessment of Educational Progress, "the Nation's Report Card," has assessed the academic performance of fourth, eighth, and twelfth graders in a range of subjects. Since 1990, NAEP assessments have also been conducted on a voluntary basis at the state level. The 1998 state NAEP assessed writing at grade 8 and reading at grades 4 and 8. In 2000, state assessments were done at grades 4 and 8 in mathematics and science. The next state assessment will be in 2002 and will again assess grades 4 and 8 in reading and writing.
NAEP tests and scores are designed to benchmark progress toward meeting a nationally defined goal set by educators and policymakers from across the nation. The tests are generally much more rigorous than state tests and international tests. NAEP tests are not designed to be a substitute for state tests, because such a practice would risk establishing a national curriculum. It has been a long and valued tradition in the United States to allow state and local communities to make decisions about curriculum and teaching methods.
NAEP tests, unlike many state tests, are not "norm-referenced" -- designed in a manner that spreads student scores and sorts students along a spectrum of achievement levels. Typically, "norm-referenced" tests are of the fill-in-the-bubble, multiple choice variety. The NAEP tests are not. Test questions allow students to demonstrate what they know through a range of activities, including open-ended questions and oral responses. They also include higher level thinking activities such as: describing interpretations, explaining reactions, drawing conclusions, or supporting critical evaluations.