Teacher Evaluation in Practice

RESEARCH REPORT

SEPTEMBER 2013

Implementing Chicago’s REACH Students

Executive Summary
Introduction
Chapter 1: The Classroom Observation Process
Chapter 2: The Use of Student Growth in Evaluation
Chapter 3: Training and Communication

Acknowledgements

The authors gratefully acknowledge the support of the Joyce Foundation and its continuing interest in this important line of work. This analysis, based on one district’s experience with a dramatically different teacher evaluation system, could not have happened without Joyce’s steadfast support and long-term commitment. In addition, we would like to acknowledge the support of Chicago Public Schools and the Chicago Teachers Union. We have gained particular insights from Matt Lyons, Paulette Poncelet, Sheila Cashman, Elizabeth Press, Didi Schwartz, Amanda Smith, Susan Kajiwara-Ansai, and Meghan Zefran from CPS central office and from Carol Caref of the CTU, as well as other teachers from the CPS-CTU Joint Committee who have provided valuable feedback as the work has progressed.

In addition we thank the teachers and administrators of Chicago Public Schools who shared their thoughts through multiple surveys, and those who provided their valuable time in one-on-one interviews. This report was made possible through their participation. We also learned from those individuals serving as Instructional Effectiveness Specialists and thank them for the extra time they spent sharing with us.

We are also indebted to the Research Assistants who helped in countless ways to make this report possible: Patrick Wu, Elc Estrera, Jen Cowhy, Catherine Alvarez-McCurdy, Josie Glore, and Gabrielle Friedman. Their contributions have been invaluable. We thank members of the University of Chicago Consortium on School Research’s Steering Committee who commented on an earlier draft of this report, especially Peter Godard, Karen Lewis, and Luis Soria. It is always humbling to have our work critiqued by those with on-the-ground experience, and we thank them for sharing that experience with us. Their thoughts helped shape this final product. Finally, we owe a deep debt of gratitude to members of the UChicago CCSR staff who worked with us as thought partners and readers through numerous iterations to arrive at this final report: the communications staff, Emily Krone and Bronwyn McDaniel, who kept us focused and offered excellent feedback; fellow researchers Elaine Allensworth, Jenny Nagaoka, Melissa Roderick, Penny Sebring, Marisa de la Torre, and Stuart Luppescu; and our superb final technical readers Lauren Sartain and Paul Moore. Our deep discussions with all of them have added to this report as well as our understanding of the promise and challenge of changing the teacher evaluation system in Chicago.

Chapter 4: How Principals Manage Classroom Observation Workloads
Chapter 5: Questions to Consider
References
Appendices
Endnotes

TABLE OF CONTENTS

This report was produced by UChicago CCSR’s publications and communications staff: Emily Krone, Director for Outreach and Communication; Bronwyn McDaniel, Communications and Research Manager; and Jessica Puller, Communications Specialist.

Graphic Design: Jeff Hall Design

Photography: Cynthia Howe and David Schalliol

Editing: Ann Lindner

Executive Summary

This report focuses on the perceptions and experiences of teachers and administrators during the first year of REACH implementation, which was in many ways a particularly demanding year. These experiences can be helpful to CPS and to other districts across the country as they work to restructure and transform teacher evaluation.

Historically, teacher evaluation in Chicago has fallen short on two crucial fronts: It has not provided administrators with measures that differentiated among strong and weak teachers (in fact, 93 percent of teachers were rated as Excellent or Superior), and it has not provided teachers with useful feedback they could use to improve their instruction.1

Chicago is not unique: teacher evaluation systems across the country have experienced the same problems.2 Recent national policy has emphasized overhauling these systems to include multiple measures of teacher performance, such as student outcomes, and structuring the evaluations so they are useful from both talent management and teacher professional development perspectives. Principals and teachers need an evaluation system that provides teachers with specific, practice-oriented feedback they can use to improve their instruction, and school leaders need to be able to identify strong and weak teachers. Required to act by a new state law and building off lessons learned from an earlier pilot of an evidence-based observation tool,3 Chicago Public Schools (CPS) rolled out its new teacher evaluation system, Recognizing Educators Advancing Chicago’s Students (REACH Students), in the 2012-13 school year.

The REACH system seeks to provide a measure of individual teacher effectiveness that can simultaneously support instructional improvement. It incorporates teacher performance ratings based on multiple classroom observations together with student growth measured on two different types of assessments. While the practice of using classroom observations as an evaluation tool is not completely new, REACH requires teachers and administrators to conceptualize classroom observations more broadly as being part of instructional improvement efforts as well as evaluation; evaluating teachers based on student test score growth has never happened before in the district.

REACH implementation was a massive undertaking. It required a large-scale investment of time and energy from teachers, administrators, CPS central office staff, and the teachers union. District context played an important role and provided additional challenges, as the district was introducing other major initiatives at the same time as REACH. Furthermore, the school year began with the first teacher strike in CPS in over 25 years. Teacher evaluation was one of several contentious points in the protracted negotiation, and the specific issue of using student growth on assessments to evaluate teachers received considerable coverage in the media.

This study uses data collected from fall 2012 through spring 2013, including:

• Two surveys of all 1,195 principals and assistant principals, administered in December 2012 and April/May 2013, respectively

• Two surveys of teachers, one administered in January 2013 to a sample of 2,000 classroom teachers and one administered in March 2013 to all teachers in the district

• Interviews with a random sample of 31 classroom teachers and six principals from six schools, conducted in spring 2013

• Interviews with nine central office staff members (Instructional Effectiveness Specialists), conducted in November 2012

Summary of Main Findings:

Teachers and administrators find the observation process useful for improving instruction

• Overwhelming majorities of teachers and administrators believe the observation process supports teacher growth, identifies areas of strength and weakness, and has improved the quality of professional conversations between them.

• Most administrators feel confident in their ability to gather evidence and assign ratings; a large majority of teachers believe their evaluator is fair and unbiased and able to assess their instruction.

• Some teachers expressed concern that classroom observation ratings are too subjective to be used in high-stakes evaluations, while others feel apprehensive about revealing instructional weaknesses for fear of being penalized on their evaluations.

Teachers are hesitant about the use of student growth on assessments to evaluate their classroom performance

• Over half of teachers surveyed believe REACH relies too heavily on student growth.

• Special education teachers are particularly critical and find the assessments to be inappropriate measures of their students’ learning and their instruction.

Communication with teachers is an area for improvement; administrators want support on coaching and providing useful feedback

• The frequency and quality of training and communication received by teachers varies widely.

• Teachers are confused about how student growth factors into their final rating. Both teachers and administrators need clarity about score calculations and how they will be used for personnel decisions.

• Most administrators list coaching and providing useful feedback as high priorities for their own professional development.

REACH places demands on administrator time and capacity

• Administrators reported spending about six hours per formal observation cycle, including the observation, pre- and post-observation conferences, and data management. Based on the amount of time administrators reported spending on observations, and the average number of observations performed, the typical elementary school administrator spent approximately 120 hours (or two full weeks) solely on observations that were part of the teacher evaluation system. The typical high school administrator spent approximately three full weeks.

• Administrators are expected to train teachers about the system, conduct classroom observations, hold meaningful conversations with teachers about their instruction, and complete required paperwork while balancing their other job responsibilities.
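The workload figures above follow a simple multiplication: hours per observation cycle times the number of cycles. The sketch below is only a back-of-the-envelope illustration of that arithmetic. The six-hour and 120-hour values come from the findings above; the function names are hypothetical, and the implied cycle count is derived here rather than reported.

```python
# Back-of-the-envelope workload arithmetic (illustrative sketch only).
# Reported values: about 6 hours per formal observation cycle and about
# 120 hours in total for the typical elementary school administrator.
# The function names below are hypothetical, not part of REACH.

HOURS_PER_CYCLE = 6           # reported: about six hours per cycle
ELEMENTARY_TOTAL_HOURS = 120  # reported: approximate elementary total


def observation_hours(n_cycles, hours_per_cycle=HOURS_PER_CYCLE):
    """Total hours spent on n_cycles formal observation cycles."""
    if n_cycles < 0 or hours_per_cycle < 0:
        raise ValueError("inputs must be non-negative")
    return n_cycles * hours_per_cycle


def implied_cycles(total_hours, hours_per_cycle=HOURS_PER_CYCLE):
    """Number of cycles implied by a total, given hours per cycle."""
    if hours_per_cycle <= 0:
        raise ValueError("hours_per_cycle must be positive")
    return total_hours / hours_per_cycle


if __name__ == "__main__":
    cycles = implied_cycles(ELEMENTARY_TOTAL_HOURS)
    print(f"Implied observation cycles: {cycles:.0f}")
    print(f"Hours for that many cycles: {observation_hours(cycles):.0f}")
```

Under these reported averages, the elementary total corresponds to roughly 20 observation cycles per administrator; this is an inference from the two rounded figures, not a number stated in the survey results.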

This report is the first in a series of studies on Chicago’s REACH teacher evaluation system. Subsequent work will investigate the consistency in observation ratings, the multiple measures of student growth, and the relationships among these variables. As the initiative continues to unfold, future work will also examine changes in these measures over time.

Introduction

In the fall of 2012, Chicago Public Schools (CPS) instituted a sweeping reform of its teacher evaluation system with the introduction of REACH Students. REACH Students replaces CPS’s former 1970s–era checklist policy by incorporating a detailed classroom observation process and student growth measures into teachers’ effectiveness scores (i.e., formal or summative evaluation ratings).4

With this policy, Chicago joins other states and districts across the country in developing new systems to evaluate teacher performance. More than 40 states now incorporate student test scores or other achievement measures into their teacher evaluations.5 Over the next few years, several large urban districts (e.g., Los Angeles, Philadelphia, and New York) will be piloting or implementing similar new teacher evaluation systems required by their states.

This report provides an initial look at the first-year implementation of REACH (Recognizing Educators Advancing Chicago) Students (hereafter referred to as “REACH”). Recent reports on teacher evaluation have highlighted the problems that systems like Chicago’s attempt to correct, but there is still much to learn about districts’ implementation experiences and their early successes and challenges. We begin by describing the REACH evaluation system and the specific questions that guided our study.

Purpose and Design of REACH

Recent efforts to revamp teacher evaluation systems reflect the education field’s increasing shift in focus from schools to individual teachers.6 A growing number of studies are examining how student learning is related to teacher effectiveness. This work shows student achievement gains vary significantly across teachers. Furthermore, teacher effectiveness accounts for more variation in student outcomes than any other school factor.7 Policymakers have responded to these research findings: federal policy under the U.S. Department of Education’s Race to the Top grant competition encourages states to identify strong and weak teachers by incorporating multiple measures of teacher performance in state evaluation requirements.8 Combined, developments in education research and policy have put teacher effectiveness front-and-center of efforts to improve students’ educational outcomes.

The possibility of receiving a federal Race to the Top grant prompted the Illinois State Board of Education to pursue key goals for providing students with access to high-quality teachers and leaders, and it incentivized Illinois legislators to pass the Performance Evaluation Reform Act (PERA) in 2010. PERA requires every district in Illinois to adopt new teacher evaluation systems that assess both teacher practice and student growth.9 The teacher practice measures required by PERA must include multiple formal classroom observations, as well as support for teacher improvement. For student growth, the law defines various qualifying assessment types and combinations of assessments that must be used. Teacher performance and student growth ratings must then be combined to create a single, summative rating of teacher performance.

To comply with PERA requirements and to build off a generally successful pilot of an evidence-based observation rubric (see CPS’s Experiment with Teacher Evaluation: EITP), CPS rolled out its new teacher evaluation system, REACH, in the 2012-13 school year. The main components of REACH in 2012-13 include:

• Multiple classroom observations: Non-tenured teachers must be observed four times per year, and observations must last for at least 45 minutes and include a pre- and post-observation conference. REACH requires administrators to provide feedback to teachers after each observation.

• An explicit observation rubric: REACH utilizes a modified version of the Charlotte Danielson Framework for Teaching.10 In this rubric, teachers are rated on four areas, or domains, of teaching practice: Planning and Preparation, Classroom Environment, Instruction, and Professional Responsibilities. Each of the domains is further broken down into 4-5 components in which expectations for each level of performance are described in detail.

• Trained evaluators: REACH requires all administrators to be certified by completing a series of training modules and passing two assessments. It further employs trained specialists who work with administrators on calibration and assigning evidence-based ratings aligned with the rubric.

• Student growth measures: REACH utilizes two different measures of student growth (Performance Tasks and either value-added or expected gains).

Although REACH is intended to provide a more accurate measurement of teacher practice, CPS has been clear that the system should also be a vehicle for professional growth. The CPS observation rubric (hereafter referred to as “the Framework”) provides a common language about what constitutes effective teaching and a structure for having conversations focused on supporting instructional improvement (see Appendix B). Recent research on such process-based observation systems suggests that they can lead to improved student learning.11 Furthermore, while test score data are intended to provide an additional measure of teacher effectiveness, they are also intended to inform teachers’ choices about appropriate instructional content for their students.

REACH implementation was a massive undertaking. It required a large-scale investment of time and energy from teachers and administrators alike, in the form of training for administrators to be certified as observers, more frequent and time-intensive observations and conferences for both teachers and administrators, and overall training on a new and complex system. By the end of this year, the observation process had resulted in over 36,000 observations for about 6,000 non-tenured teachers and 13,000 tenured teachers. REACH also required the district to create a whole new set of assessments, since many teachers do not teach in grade levels or subject areas that are captured on typical standardized assessments. In order to link students and teachers to provide accurate student growth information, the CPS central office had to redesign the way data on teachers and students are collected.

TABLE 1

CPS School and Personnel Statistics (2012-13)

Schools*                     578
  Elementary Schools         472
  High Schools               106
Non-Tenured Teachers       5,743
Tenured Teachers          15,109
Administrators**           1,195

Source: CPS Stats and Facts, Administrative records
* Does not include charter or contract schools
** Only includes principals and assistant principals

The 2012-13 school year was a particularly difficult time to launch such a large-scale and complex teacher evaluation system: The school year began with the first teacher strike in more than two decades; the CEO of CPS resigned in October, ushering in the third leadership change in four years; all schools had a longer day and year; and CPS began transitioning to the Common Core State Standards for teaching and learning. On top of all of this, debates about school closings, enrollment declines, and budget shortfalls began in the fall. A series of heavily attended and emotional public hearings were held throughout the year, and a controversial decision was made in the spring to close 49 schools.

Guiding Questions

The increased attention to teacher evaluation from policymakers and practitioners has been accompanied by increased attention from researchers seeking to evaluate implementation of these new systems. Many studies have focused on technical aspects, such as the reliability of the measurement tools.12 Another important, but smaller, body of work has examined the use of new teacher evaluation systems in schools and districts.13 Building on this early research, this report provides information on the first year of REACH implementation, answering questions about teachers’ and administrators’ perceptions of the system and their experiences with the new system. The specific questions and issues explored in this report include:

QUESTION 1: What are the benefits and drawbacks of observation systems designed for both teacher development and evaluation?

One of the benefits of using classroom observations in evaluation systems is that they have the potential to meet schools’ dual needs of supporting professional growth and differentiating teacher practice.14 Observations can create structures for providing teachers with timely and individualized feedback on their classroom practice. This information can guide coaching and professional development activities, as well as help teachers develop goals for improvement. In addition, observation ratings provide administrators with standardized and defensible evidence for making personnel decisions.

Yet, using observations for both purposes may also create a number of tensions. One study suggests that some teachers may be less likely to seek instructional support from administrators if exposing their weaknesses could result in a poor evaluation.15 Furthermore, teachers may not respond positively to encouragement from administrators after receiving low ratings or disciplinary actions from them.16 Finally, classroom evaluators who are responsible for supporting teacher growth and formally assessing effectiveness may introduce bias into the accountability process.17 In short, if not implemented well, the benefits of using classroom observations may devolve into dueling purposes with each cancelling the benefits of the other.

UChicago CCSR’s study of CPS’s earlier pilot program, which was called the Excellence in Teaching Pilot, found that teachers and principals thought their discussions about instruction were more reflective and objective using the Danielson Framework than the CPS checklist.18 Observations conducted under the pilot, however, did not count toward teachers’ official evaluation score. In Chapter 1, we ask: How fair and useful do teachers and administrators find REACH classroom observations as a means of improving instruction? Does using school administrators as both coaches and evaluators raise any concerns or challenges?

QUESTION 2: How do teachers view the use of student growth on standardized assessments in their evaluation?

The incorporation of student growth measures into teachers’ evaluations has been a contentious issue, both in Chicago and nationally. While supporters maintain teachers should be held accountable for student learning, critics contend that metrics designed to assess student progress are poor measures of teacher performance.19 Additionally, opponents fear that adding stakes to student assessments increases the likelihood that teachers will narrow their curriculum or “teach to the test” so as to avoid a negative evaluation. Despite these issues, states and districts have moved forward with including student growth measures in teachers’ evaluations.

Addressing teacher and administrator skepticism of student growth measures is critical for leveraging the full potential of the system to improve instruction. In Chapter 2, we ask: To what extent do teachers perceive student growth measures can provide an accurate assessment of their performance? How, if at all, are teachers using the assessment data produced by REACH?

QUESTION 3: What are the successes and challenges related to training and communication?

While REACH addresses many of the limitations of the previous teacher checklist system, thoughtful design is not enough to guarantee success.20 Before REACH can improve teacher evaluation or instructional practice, it first has to move from a written policy document to a system embedded in the work of teachers and school administrators. Implementation is critical to achieving intended outcomes.21

The task of implementing a teacher evaluation system of this scale and complexity should not be underestimated. REACH involves over 20,000 teachers and other school staff and 1,200 administrators in nearly 600 schools. Principals and assistant principals had to be certified and trained on using the new observation rubric. Teachers had to be informed about the goals of the new system, trained on how to engage in the new observation process, and taught how their summative evaluation score would be calculated. Observations and the pre- and post-observation conferences had to be scheduled and completed.

Understanding the experiences of teachers and administrators as they implemented such a complex and time-intensive system, in addition to all their other responsibilities, is a critical first step toward understanding any potential effects that REACH might have. If teachers and administrators are not informed of REACH’s goals and do not understand its various elements, they may not implement the policy as intended. Insufficient training and resources are reasons for implementation failure.22 In Chapter 3, we ask: How knowledgeable were teachers and administrators about REACH? How did they describe their training experiences? What aspects of implementation did participants identify as needing improvement?

QUESTION 4: How do principals understand and describe their capacity to manage classroom observation workloads?

In the last decade, principals have been increasingly called upon to be instructional leaders in their schools, especially through supporting effective instructional practices.23 Given this emphasis on principals as instructional leaders, many assume that it is the principal who should be responsible for conducting observations and evaluating teacher practice.

It is not clear, however, whether principals have the time and capacity to manage the observation workload created by new evaluation systems. To increase the reliability of ratings, most systems call for teachers to be observed multiple times a year. Each observation typically involves scheduling and conducting the observation, writing up evidence and entering it into a database, having pre- and post-observation discussions with teachers, and coaching teachers on areas for improvement. The entire process for a single teacher can take several hours. While assistant principals in CPS also became certified evaluators, it still fell to the principals to ensure that all of the observations and pre- and post-conferences required by REACH were conducted.

Previous studies conducted by UChicago CCSR researchers have highlighted some of the capacity issues created by the introduction of new teacher evaluation systems. For example, workload demands contributed to lower engagement in the new system for some principals, while others reported giving less attention to tenured teachers in order to complete all of their required evaluations.24 In Chapter 4, we ask: How much time did administrators spend on classroom observations during the first year of REACH? How do they feel about the demands the new REACH system places on them?

In This Report

Chapters 1 and 2 of this report describe the observation and student growth elements of REACH and provide participants’ perceptions about the value of this initiative as both an evaluation and development tool. Chapters 3 and 4 describe participants’ experiences with implementation, focusing on communication, training, and time demands. Finally, in Chapter 5, we present some questions to consider as implementation continues. Additional reports in this series will investigate observation ratings, student growth ratings, and the relationship between them.

What Goes Into a Teacher’s Evaluation Score?

A teacher’s REACH summative evaluation score consists of a teacher practice score and up to two measures of student growth. The teacher practice component consists of classroom observations completed by a certified administrator utilizing the CPS Framework for Teaching, a modified version of the Danielson Framework for Teaching. Student growth measures are discussed in detail in Chapter 2. In 2012-13 only non-tenured teachers were to receive a summative evaluation score.

FIGURE 6

2012-13 summative evaluation scores for non-tenured elementary school teachers.

[Pie charts. Components, in legend order: Teacher Practice (CPS Framework for Teaching); Student Growth (REACH Performance Tasks); Student Growth (Value-Added).]

Elementary Teachers in Tested Subjects/Grades (receive individual value-added): 75%, 10%, 15%
Elementary Teachers in Untested Subjects/Grades (receive schoolwide value-added in literacy): 75%, 15%, 10%
High School Teachers in Core Subject Areas: 90%, 10%
(Panel title missing in source): 100%

Source: Chicago Public Schools
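The percentages in the figure can be read as weights in a weighted average of component scores. The sketch below illustrates that reading only; it is not CPS's scoring procedure. It assumes that each percentage pairs with the legend entry in the same position (teacher practice, performance tasks, value-added), and the component scores and their scale are hypothetical placeholders.

```python
# Illustrative weighted-average sketch of a summative score.
# ASSUMPTION: each percentage is paired with the legend entry in the same
# position (teacher practice, performance tasks, value-added).
# ASSUMPTION: component scores share one common scale; the scale used in
# the example at the bottom is hypothetical.

WEIGHTS = {
    "elementary_tested": {
        "teacher_practice": 0.75,
        "performance_tasks": 0.10,
        "value_added": 0.15,
    },
    "elementary_untested": {
        "teacher_practice": 0.75,
        "performance_tasks": 0.15,
        "value_added": 0.10,  # schoolwide value-added in literacy
    },
    "high_school_core": {
        "teacher_practice": 0.90,
        "performance_tasks": 0.10,
    },
}


def summative_score(group, component_scores):
    """Weighted sum of component scores for one teacher group."""
    weights = WEIGHTS[group]
    missing = set(weights) - set(component_scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(w * component_scores[name] for name, w in weights.items())


if __name__ == "__main__":
    # Hypothetical component scores, for illustration only.
    example = {
        "teacher_practice": 300,
        "performance_tasks": 200,
        "value_added": 100,
    }
    print(summative_score("elementary_tested", example))
```

Because the weights in each group sum to 100 percent, the result stays on the same scale as the component scores, and the teacher practice score dominates in every group.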

Research Activities

To answer our questions we used multiple sources of information, including surveys and interviews. Surveys provide a broad picture of participants’ perceptions; interviews provide deeper insights into participants’ experiences.

• Winter 2012-13 Surveys: We surveyed all 1,195 principals and assistant principals in December 2012, receiving 733 responses (a 61 percent response rate). We surveyed a random sample of 1,000 non-tenured teachers and 1,000 tenured teachers in January 2013.A We received 901 responses (a 45 percent response rate). The entire content of this survey administration was related to REACH.

• Spring 2013 Surveys: We included survey items as part of CPS’s annual My Voice, My School survey. This survey was administered to all teachers in March 2013 and had a response rate of 81 percent. Then, we surveyed all principals and assistant principals in April/May 2013, receiving 687 responses (a 57 percent response rate). Some questions were the same as in the winter 2012 survey to gauge changes in perception; others were different because the initiative was more mature. Survey content was shared with other topics.

• Spring 2013 Principal and Teacher Interviews: We randomly selected three high schools and five elementary schools for our interview sample. We then randomly selected teachers from within those schools to interview. We were able to interview six principals and 31 classroom teachers from six schools.

• Fall 2012 CPS Central Office Staff (Instructional Effectiveness Specialists) Interviews: We interviewed nine specialists (about half of the staff in this position) in November 2012. These specialists were charged with providing technical assistance to administrators in conducting classroom observations.
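The response rates quoted above are simply responses divided by the number of people surveyed. The short sketch below recomputes them from the counts given in the text as a consistency check; the function and variable names are hypothetical. The spring teacher survey is omitted because only its rate, not its counts, is reported.

```python
# Recompute survey response rates from the counts reported in the text.
# The function and variable names are hypothetical.


def response_rate(responses, surveyed):
    """Response rate as a percentage of those surveyed."""
    if surveyed <= 0:
        raise ValueError("surveyed must be positive")
    return 100.0 * responses / surveyed


# (survey label, responses received, number surveyed), all from the text.
SURVEYS = [
    ("Winter administrator survey (December 2012)", 733, 1195),
    ("Winter teacher survey (January 2013)", 901, 2000),
    ("Spring administrator survey (April/May 2013)", 687, 1195),
]

if __name__ == "__main__":
    for label, responses, surveyed in SURVEYS:
        rate = response_rate(responses, surveyed)
        print(f"{label}: {rate:.1f} percent")
```

Rounded to whole percentages, the recomputed rates match the 61, 45, and 57 percent figures reported for these three surveys.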

CPS’s Experiment with Teacher Evaluation: EITP

Between 2008 and 2010 CPS implemented the Excellence in Teaching Pilot (EITP), a pilot teacher evaluation program that used the Charlotte Danielson Framework for Teaching to guide the evaluation of classroom instruction. EITP provided an alternative system to the teacher evaluation checklist CPS had used for 30 years. Over the two-year period, a total of 100 elementary schools participated in the pilot.

CPS’s current REACH evaluation system resembles EITP in many ways. Like the pilot, trained administrator evaluators observe teachers’ classroom instruction using a modified version of the Danielson Framework. Some observations are unannounced and others are planned in advance and include a pre- and post-observation conference. The new system, however, differs from EITP on several important dimensions. First, although the pilot had no stakes attached, REACH is the official evaluation system for non-tenured teachers in its first year, and will expand to include stakes for all teachers over time. Second, while administrator training for the smaller-scale pilot was done in-person, training for the new system was provided statewide via an online platform. And finally, the pilot provided measures of performance based only on observations, while the new system includes a student growth component for all teachers regardless of the grade or subject they teach.

UChicago CCSR’s two-year study of the pilot found most principals and teachers were supportive of EITP and found it beneficial for their practice. Specifically, principals and teachers reported using the Danielson Framework and evidence from classroom observations made their conversations about instruction more objective and reflective. In addition, the study found principals’ ratings of teachers were both valid and reliable.B The pilot also uncovered some challenges. For example, many principals lacked the instructional coaching skills required to have deep discussions about teaching practice. Where principals were less proficient at conferencing with teachers, teachers were less positive about the new system and more doubtful of their principal’s ability to use the Framework accurately or rate them fairly. A later follow-up study of EITP found that the pilot had a positive effect on both reading and math scores. Higher-achieving schools and those serving fewer low-income students were the primary beneficiaries.C

CHAPTER 1

The Classroom Observation Process

The main element of the REACH evaluation system is the observation process used to rate teacher practice. The process is centered around the CPS Framework for Teaching (the Framework), a classroom observation rubric based on the Charlotte Danielson Framework (see Appendix B). REACH also establishes a set of procedures for how evaluators should conduct classroom observations, collect evidence about what was observed, and discuss the evidence and ratings with teachers (see Table 2).

The teacher practice component is intended to serve two functions. Drawing on its roots in the Danielson Framework, the classroom observation process is structured to provide teachers with information they can use to improve their teaching practices. It includes a pre- and post-observation conference to create a forum for evaluators to provide constructive feedback to teachers on their practice and offer support for improvement. In addition, the teacher practice component is intended to provide school administrators with a means to evaluate the instructional effectiveness of teachers in their building. Ratings across classroom observations are combined with test score gains to give each teacher an official evaluation score.

TABLE 2

What does the formal observation process include in 2012-13?

Pre-Observation Conference
• A brief 15-20 minute conference with a focus on Domain 1 (Planning and Preparation)
• The teacher and administrator decide which lesson will be evaluated

Classroom Observation
• The administrator observes teacher for about 45 minutes
• Observation primarily focuses on the components in Domain 2 (Classroom Environment) and Domain 3 (Instruction)
• The administrator gathers evidence and assigns ratings

Post-Observation Conference
• The teacher and administrator discuss the classroom observation
• The teacher's self-reflection is evaluated for Component 4A (Reflecting on Teaching and Learning)
• Ends with suggestions for improving teacher practice

Source: Modified from REACH Students Teacher Practice

Note: In 2012-13 administrators were required to conduct at least four formal observations for each non-tenured teacher and at least one formal observation for each tenured teacher.

In this chapter we examine teachers’ and administrators’ perceptions of the teacher practice component. Our findings draw on both survey and interview data. Survey data show the extent to which teachers and principals across the district have positive or negative views about the observation process. Overall, both groups find the process to be a useful means of helping teachers improve their instructional practice. Teachers appreciate the feedback they receive from their evaluators and believe the rating process is transparent. Administrators think the observation process will lead to improvements in teaching and student learning. Interview data provide insight into how the observation process supports instructional improvement. In addition, they highlight teachers’ descriptions of how the coordination of the evaluation process can undermine the value of the observations as an improvement tool.


The Observation Process Supports Professional Growth

Administrators and teachers expressed positive views of the teacher practice component’s potential to support teacher growth and professional development. On the survey, 76 percent of teachers said the evaluation process at their school encourages their professional growth. Similarly, 76 percent of administrators reported believing that the observation process would result in instructional improvement at their school, and 82 percent reported noticeable improvements in half or more of the teachers they had observed over the school year (see Figure 1).

In interviews, teachers identified three ways in which the observation component supports teacher learning. First, they remarked that the Framework rubric sets clear expectations about quality instruction. As one teacher succinctly put it: “The observation rubric describes what really good teaching looks like. It gives me a clear description of what teaching looks like at each level.” Responses on surveys indicate that many teachers and administrators agree with this sentiment: 75 percent of teachers and 91 percent of administrators reported that the Framework provides a common definition of high-quality and effective teaching. Clear descriptions of quality instruction help teachers transcend their own individual opinions about teaching and begin to compare their practice to others. One teacher explained:

You get into your own practices and form your habits and methods. But because everyone is working within the REACH system, you can start to see where you are in the system. Everyone is breathing the system language. If they all are reflecting the same language, you have to think about others and all the other teachers.

Because it creates explicit and shared expectations of quality instruction, teachers and administrators commented that the rubric also provides clear guidance about what teachers need to address in order to improve their practice:

I always thought there needed to be higher standards in teaching, and I think the observation rubric has made the standards higher. [Before] it was up to the principal’s discretion of how he or she felt. [Now] it’s clearer about what that means, how to grow, how to improve. —CPS Teacher

[In post-conferences] instead of just saying, ‘You got a 3 here and a 2 here,’ we can say, ‘What is the difference between a basic and proficient [rating]? I didn’t see this, I didn’t see this.’ And it was a really clear thing, ‘Start doing that,’ or, ‘Stop doing something else.’ —CPS Principal

Administrators were virtually unanimous on this point: 96 percent responded on the winter survey that the Framework helps them identify areas where teachers can improve.

FIGURE 1

Most administrators report at least half of their teachers have incorporated feedback and improved

Of the teachers you have observed this year, how many...

• Have incorporated your feedback into their teaching? (n=622): All 10%, Most 63%, About Half 19%, A Few or None 8%
• Have made noticeable improvements over this year? (n=621): All 5%, Most 48%, About Half 29%, A Few or None 17%

Source: Spring Administrator Survey, May 2013

Note: Percentages may not add up to 100 due to rounding.

Second, teachers reported that the teacher practice component has potential to improve instruction because it creates opportunities to discuss teaching with administrators and colleagues. In particular, the pre- and post-conferences were a way of getting needed feedback and support:

I love being able to refine what I do and talk about it with somebody. So the idea is that I get to sit down every month or so and say ‘this isn’t really working for me’ and my administrator will find something that can help me. That is really beneficial. —CPS Teacher

I think that it is nice to have someone in the classroom frequently to really see how you’re doing and what you’re doing and give you feedback in a way that is not really an attack. It’s more like a positive, constructive criticism on different aspects of teaching. —CPS Teacher

These comments highlight that teachers value feedback on their instruction. They also show that conversations with administrators tend to be respectful and supportive. In fact, only 6 percent of teachers on the winter survey said feedback was delivered in a hurtful manner. Across the district, 82 percent of teachers indicated they have professional conversations with their administrators focused on instruction, 89 percent said their evaluator supports their growth, and 76 percent reported that their evaluator’s feedback was useful. Among administrators, 94 percent thought the Framework has improved the quality of their conversations with teachers about instruction.

Finally, teachers also noted the conversations helped them intentionally reflect on their own classroom practice. “I think it’s good to see what you did and how you can improve,” one teacher said. “I can’t see myself teach, and I love to hear how I can improve.” By creating opportunities to examine their own practice, the observation process helps teachers identify their strengths and weaknesses, as well as prioritize areas on which to focus their improvement efforts. For some, the reflection habit carries outside of the formal observation structure to their teaching more generally: “[The Framework] causes us to be more conscious of our planning and the words coming out of our mouth. It causes us to really look at what we are doing in our classrooms.”

Administrators agreed: 92 percent of principal and assistant principal survey respondents thought the Framework encourages teachers in their school to reflect on their instructional practice (see Figure 2). On the winter survey, 81 percent of teachers said it helps them identify areas where their teaching is strong, and 82 percent said it helps them identify areas where they can improve.

Most Teachers Believe Administrator Ratings Were Accurate and Fair

Teachers were generally positive about the accuracy of the ratings they received from school administrators. On the spring survey, 87 percent of teachers said their evaluator was fair and unbiased, and 88 percent said their evaluator was able to assess their instruction accurately (see Figure 3). On the winter survey, 72 percent of teachers said their ratings were about the same or higher than they thought they should have been.

One reason teachers were positive about their ratings is they believe the specificity of the Framework helps make ratings more concrete:

I like that they [the Framework] actually specify what it is that we are being evaluated on, versus the old system where your principal essentially gave you a rating and some comments about what you’ve been doing. With this [system], you’re either doing this or you’re not. If you’re not, then you’re not meeting [the standards]. If you are, then you’re proficient.

Another reason teachers feel ratings tend to be generally objective is that administrators have to collect and present evidence about what they specifically saw during observations. “[The Framework] holds a lot of accountability,” one teacher said. “Not only for the teachers but also for the administrator; they have to prove everything they’ve found.”


FIGURE 2

Nearly all administrators report the CPS Framework is useful for instructional improvement

The CPS Framework...

• Is a useful tool for identifying teacher effectiveness in this school (n=622)
• Is a useful tool for providing targeted support for teachers (n=623)
• Encourages teachers in this school to reflect on their instructional practice (n=622)
• Has provided a definition of effective teaching in this school (n=623)
• Has improved the quality of my conversations with teachers in this school about instruction (n=620)

Response scale (percent of administrators): Strongly Agree, Agree, Disagree, Strongly Disagree

Source: Spring Administrator Survey, May 2013

FIGURE 3

Most teachers believe their evaluator has the capacity to assess instruction fairly and accurately

My evaluator...

• Is fair and unbiased (n=13,600)
• Is able to accurately assess teachers’ instruction (n=13,622)
• Knows my strengths and weaknesses (n=13,624)
• Knows what’s going on in my classroom (n=13,627)

Response scale (percent of teachers): To a Great Extent, Some, A Little, Not At All

Source: Spring Teacher Survey MVMS, April 2013


While having positive perceptions of their rating, some teachers thought using multiple raters could improve the reliability of the ratings and how they are used in personnel decisions. One concern is that being observed by only one evaluator may lead to inaccurate ratings. Recent research seems to validate this view, finding that multiple observers produce more reliable ratings.25 “I think it would be something to think about…having each observation done by a second person,” one teacher stated. “Yes, ratings are evidence based, but [evaluators] do make interpretations based on previous knowledge of you. So I think that ties into how they view what you are doing and if you are doing it well.”

In addition, some teachers worry administrators may use observation ratings to remove or deny tenure to staff they do not like. Thus, the ratings one receives may have little to do with what actually happens during an observation. “I just feel like it shouldn’t be the administration who is doing the observation on you,” one teacher said. “Because the bottom line is that if they don’t want to keep you in the school, they are not going to. They are going to give you a bad observation rating.” Some cities have addressed this problem by incorporating governance structures that support personnel systems. These systems typically include the use of expert mentor teachers as evaluators and coaches, as well as review structures for personnel decisions that involve teachers and administrators in the decision-making process.26

Administrators’ Dual Role Can Undermine Professional Learning Benefits

REACH’s reliance on administrators both to officially evaluate teacher practice and to provide instructional coaching may undermine the learning potential of the observation process. Because observation ratings have such a big impact on summative evaluation scores, teachers are highly motivated to demonstrate their professional competence when they are observed. Since the administrator giving official ratings is simultaneously providing instructional support, however, teachers are forced to weigh the costs and benefits of using the observation process as an opportunity to share their instructional weaknesses. For some teachers, the risk of receiving a poor rating is too great. As one teacher explained:

Because there is such an emphasis placed on assessing the quality of teachers, there is no incentive for teachers to admit insecurity or talk about areas in which he or she struggles. I felt like I had to mask the things that I didn’t do as well and try to explain why they didn’t go well because, at the end of the day, I’m being rated. So there is more of an incentive to present myself favorably than to have an honest discussion about instruction.

Several teachers across our interview schools described instances when they perceived that attempts to get support for addressing weaknesses led to negative consequences on their evaluation. For example, one teacher said he was very honest at the start of the year about his practice by highlighting for his evaluator “things in my daily teaching that I need to strive to fix.” After being informally warned that his evaluator would pay more attention to those areas, he felt as though the evaluator:

…ended up putting a laser focus on the things that I do want to fix, but are hard to fix. Instead of being rewarded for being self-aware and honest about improvements, I feel like I’m actually being penalized.

Another teacher recounted going to her principal for help regarding classroom management issues in one class. Instead of receiving support, she felt the request led to her receiving a low observation rating:

My principal joked around and said he’ll do my next observation in that classroom [in which I was struggling]. It was a joke, and then he actually did it. When I said I really don’t know if it is appropriate for me to be judged based on that classroom, when there are so many other classrooms and grades that I teach where I’ve already been observed, I was scolded and told that I am essentially saying that I can’t do my job…It makes me feel like I can’t even come to my own administrator for help, because that information was essentially used against me in the observation process.


These two comments highlight the potential risks involved in asking an evaluator for support. If teachers present a realistic view of their teaching, they may be rated as less skilled compared to others who put on a “performance” during a scheduled observation. It is important to keep in mind that we do not have evidence about how widespread instances like the ones above are. Nonetheless, these cautions show how the learning opportunities created by the observation process could, in the long run, be undermined when the evaluators giving ratings are also primary instructional coaches.


CHAPTER 2

The Use of Student Growth in Evaluation

Prior to REACH, teachers in Chicago were not held formally accountable for the performance of their students. The use of student growth to measure teacher performance breaks new ground. A teacher’s student growth score summarizes the change in his or her students’ standardized test scores between two time periods: under REACH, the beginning and end of a school year.27 PERA requires that student growth be a “significant factor” in teachers’ evaluations, though CPS and the CTU agreed to phase in this requirement so that student growth accounted for no more than 25 percent of a teacher’s evaluation score in the initial year. The weight given to student growth in a teacher’s final evaluation varies according to the subject and grade level of the teacher (see What Goes Into a Teacher’s Evaluation Score? on p. 7). Student growth is calculated differently depending on the assessment that is used. These assessments also vary by grade and subject, but they can include:

• A gain score on district-developed Performance Tasks, which are written or hands-on assessments specifically designed for the grade and subject of the course and are most often scored by the teacher
• A value-added score on the NWEA MAP, an adaptive, computer-based test administered to students in grades 3-8 in reading and math
• An expected gains score on the subject area EXPLORE, PLAN, or ACT (EPAS) assessments administered to students in grades 9-11 in English, reading, math, and science28
• A measure of average schoolwide literacy growth from either the NWEA MAP or the EPAS

The Student Growth Component of REACH box on page 20 further describes the measures of student growth used in REACH.

As set forth by PERA, student growth is incorporated strictly for evaluation purposes. However, CPS has been clear that they expect REACH to positively affect teacher development and student learning. If the student growth component is to be useful beyond teachers’ evaluations, it must provide teachers with information that can inform their instruction. A student growth score alone does not provide teachers with information that is timely or detailed enough to guide improvements in their instructional practice; it is one number that summarizes changes in test scores across a group of students over a given period of time. In contrast, students’ performance on the individual assessments used to calculate student growth might inform teachers’ instruction by providing them with information on their students’ skills or level of understanding.

In this chapter, we describe teachers’ responses to the use of student growth in their evaluations, as well as how useful teachers found the assessments for their instruction. We find apprehension among teachers about the incorporation of student growth metrics into their evaluation. Teachers were generally positive about the potential instructional value of the assessments used to measure student growth, though the perceived usefulness varied considerably by the assessment.

Teachers Are Apprehensive About the Use of Student Growth in Their Evaluation

Given that student growth is a new addition to teachers’ evaluation, it is not surprising that many teachers expressed concerns over its use in measuring teacher performance and in personnel decisions, or that many were misinformed or confused about how student growth factors into their evaluation. Additionally, some teachers raised concerns about the potential for bias when applying the student growth measures across different classroom contexts.

On our survey, 57 percent of teachers said that they believe or strongly believe that the REACH system relies too heavily on standardized tests (see Figure 4). Another 30 percent said that they somewhat believe this, while only 13 percent of teachers said that they do not believe that REACH relies too heavily on standardized tests. We asked teachers an open-ended question about what they found most problematic about the REACH system. Nearly one-third of the 552 teachers who responded to this question identified the student growth component and the assessments used to measure student growth, making these the most frequently cited problematic aspects of REACH.29 While some of these teachers maintained that test scores should never be used in teachers’ evaluations, others identified more specific concerns. These concerns included the narrow representation of student learning that is measured by standardized tests, the numerous influences on student performance that are outside of a teacher’s control, and an increase in the already heavy testing burden on teachers and students.30

Teachers’ responses to our interviews and an open-ended survey item revealed that many of them were misinformed or unclear on how much student growth contributes to their summative evaluation. For example, one teacher wrote, “I am concerned about my effort as a teacher completely relying on the test scores of my students.” In fact, student growth does not account for more than 25 percent of any teacher’s evaluation this year, and student growth will not account for more than 30 percent once REACH is fully implemented; therefore, no teacher’s evaluation will completely rely on test scores. Some teachers further attributed the incorporation of student growth to the district, rather than to the state law. As we show in the next chapter, most teachers reported receiving information about REACH from their school administration. Yet only 45 percent of principals and assistant principals reported having a strong or very strong understanding of how student growth factors into a teacher’s summative rating, so it is not surprising that teachers are also unclear.

Several teachers expressed concerns that measures of student growth are unfair to teachers in more challenging schools because student growth, and therefore a teacher’s evaluation score, is related to the supports that students may or may not receive outside of the classroom. One teacher explained this concern:

… I’m not going to want to work in a [struggling] school if my evaluation is tied to test scores, because there are things that I can’t control. I can’t stop gang violence. I can’t stop poverty. I can’t stop the parents who don’t care if their kids go to school. I think the part that I find unfair is that so much of what goes on in these kids’ lives is affecting their academics, and those are things that a teacher cannot possibly control.

Related to the issue of fairness, many teachers expressed apprehension over how the student growth measures would be used by the district, in particular that they would be used to fire teachers or to institute merit pay. For example, one teacher explained that she had “grave concerns” that her students’ performance could negatively impact her job security, in part because there are so many other factors outside of the classroom that influence student growth.

FIGURE 4

Most teachers believe or strongly believe REACH relies too heavily on standardized tests

Please indicate the extent to which you believe that REACH, overall, relies too heavily on standardized tests.

Percent of teachers (n=731): Strongly Believe 33%, Believe 24%, Somewhat Believe 30%, Do Not Believe 13%

Source: Winter Teacher Survey, January 2013

Two groups of teachers, special education teachers and non-core subject teachers, were particularly critical of the student growth component. Special education teachers raised concerns that the REACH Performance Tasks, NWEA MAP, and EPAS assessments were inappropriate measures of their instruction and of their students’ learning. One special education teacher explained: “The grade level REACH Performance Tasks were nearly impossible for my special education students, and it will be difficult to show improvement for many students who are four and five grade levels behind.” This teacher’s concern was echoed by many special education teachers who believed that holding their students (and, therefore, their own evaluation) to the same standard as regular education students and teachers was unfair. Many teachers were unclear on what accommodations could be provided for their special education students as they took the assessments, which were the same assessments that were given to regular education students.31

Some non-core subject teachers (e.g., art, music, and physical education) were troubled by the incorporation of schoolwide literacy growth into their evaluation. These teachers disliked being held accountable for the work of other teachers and for a content area that they were not necessarily prepared to teach. For example, a high school art teacher explained his feelings about the schoolwide literacy measure:

The comment has been made that I will get judged on reading scores because we are all teachers of literacy, and there’s a part of me that agrees with that. But...there is no part of my certification or training that says I need to learn how to teach a student how to read.

Teachers Found Beginning-of-Year REACH Performance Tasks Useful

The REACH Performance Tasks were developed by teachers and district specialists as Type III assessments (see The Student Growth Component of REACH on p. 20). As defined by PERA, Type III assessments are rigorous, aligned to the course’s curriculum, and measure student learning in that course. Performance Tasks are “a written or hands-on demonstration of mastery, or progress towards mastery, of a particular skill or standard,”32 which make them very different from traditional multiple-choice assessments. The primary purpose of the REACH Performance Tasks is to provide a measure of student understanding at the beginning and end of the school year so that a growth score can be calculated and incorporated into teachers’ evaluations. In the best case, however, the beginning-of-year Performance Tasks would also provide teachers with information that is useful for their instruction, such as information about their students’ skills or about the district’s expectations for what content should be covered in their class.

Among teachers who administered a beginning-of-year REACH Performance Task, 70 percent reported that it was somewhat or very useful for their instruction (see Figure 5). In interviews, teachers reported using the Performance Tasks as an indication of what material they needed to cover. Moreover, teachers seemed to appreciate the more comprehensive set of skills that students could demonstrate on the Performance Tasks: 72 percent agreed that the tasks provided information that is not measured on traditional multiple-choice assessments.

While the Performance Tasks provided teachers with insight into what material they needed to cover, few teachers used the Performance Tasks as measures of student understanding. Two-thirds of teachers (67 percent) agreed that the Performance Tasks were rigorous assessments of student learning, but they may have been too rigorous; nearly the same proportion (66 percent) indicated that the tasks were too challenging for beginning-of-year assessments. Because the Performance Tasks often covered material that the students had not yet been exposed to, they did not provide a measure of students’ understanding of that material. Rather than test students’ prior knowledge, the Performance Tasks assessed students on content that they had not been taught. One teacher explained why the level of challenge is particularly a problem for a beginning-of-year assessment:

…[my students] were really upset by it. Not only because it was something they had never seen before, but they didn’t know me, so it was kind of like they didn’t know me and I was giving them something and challenging them in a way that it was unfair for them and it made me feel really bad.


This teacher explained that once she has gotten to know her students, she can motivate them to persevere through a challenging assessment. But without the time to build those relationships, it was difficult for her to help her students overcome their frustration.

There was fairly widespread confusion among teachers about the administration of the Performance Tasks. Just 41 percent of teachers who had administered a beginning-of-year Performance Task indicated that they were clear on how the tasks should be scored. Over one-third (36 percent) of teachers indicated they did not have adequate time to score the tasks, and one-third (35 percent) indicated they had difficulty recording the scores on the district’s internal site. Just under half (43 percent) of teachers indicated that they were not at all clear on what accommodations could be made for students with IEPs who were taking a Performance Task.

Apart from their instructional value, several teachers raised concerns about how easy it would be to game the scoring of the Performance Tasks. Teachers score their own students’ Performance Tasks at both the beginning and end of the year. In an interview and on an open-ended survey item, teachers noted that if they wanted to maximize their student growth score, they could simply give all students a low score on the beginning-of-year task and a higher score at the end of the year. While we have no measures of how frequently, if at all, this practice occurred, it has the potential to undermine how teachers and administrators perceive the accuracy of the evaluation ratings.

NWEA MAP Provided Timely and Useful Data

The NWEA MAP is a series of computer-based, adaptive assessments administered to CPS students in grades 3-8 at the beginning and end of the school year.33 Seventy-eight percent of teachers who administered a beginning-of-year NWEA MAP assessment found it somewhat or very useful for their instruction (see Figure 5) and a similar proportion (75 percent) agreed that the NWEA MAP helped them to target their instruction to meet students’ individual needs.

FIGURE 5

Teachers report assessments vary in instructional usefulness

How useful is the following for your instruction?

• Performance Tasks (n=13,253): Very Useful 38%, Somewhat Useful 32%, A Little Useful 17%, Not Useful 14%
• NWEA MAP (n=6,316): Very Useful 44%, Somewhat Useful 34%, A Little Useful 14%, Not Useful 8%
• EPAS (n=3,225): Very Useful 16%, Somewhat Useful 34%, A Little Useful 29%, Not Useful 22%

Source: Spring Teacher Survey MVMS, April 2013

Note: Responses only include teachers who conducted the assessment(s) in the fall of 2012. Only high school teachers in core subjects administered EPAS. Only elementary school teachers in grades 3-8 reading and math administered NWEA MAP. All elementary school teachers and a subset of high school teachers in core subjects administered Performance Tasks. Percentages may not add up to 100 due to rounding.

The rigor of the NWEA MAP assessments and the timeliness with which teachers receive their students’ results may help to explain the NWEA MAP’s instructional value. Eighty-five percent of teachers who had administered the beginning-of-year NWEA MAP found it to be a rigorous assessment of student learning. Additionally, the majority of teachers agreed that the results provided by the NWEA MAP are both timely (92 percent) and easy to understand and use (72 percent).

While the computerized nature of the NWEA MAP assessments likely contributed to their instructional usefulness, it also created problems for some teachers. Over two-thirds of teachers who had administered the beginning-of-year NWEA MAP reported that they experienced technical difficulties, such as issues with computer hardware or internet access. One teacher explained how technical problems can affect student performance:


At our school, our technology isn’t up-to-date. The computers themselves are about nine or 10 years old….When everybody was taking the test at once, that was an issue because our routers couldn’t handle the amount of traffic. So the internet would go out. I think that really skewed our test results because students, especially on reading, would have to read the story and then go back to it and then they were stuck and they would have to go back. The students won’t reread what they read, so they might forget a part, and then they’re asked questions.

How to use the NWEA MAP with particular populations of students was again a concern for teachers: 30 percent were not at all clear on what accommodations were acceptable for special education students and 39 percent were not at all clear on whether ELL students should take the NWEA MAP.

EPAS Results Not Timely or Detailed

High school teachers were less positive about the value of the beginning-of-year EPAS assessments for their instruction than elementary teachers were about the NWEA MAP assessments. EPAS assessments are given in grades 9-11, as part of ACT’s testing system. While 71 percent of teachers who had administered a beginning-of-year EPAS assessment agreed that the test was a rigorous assessment of student learning, only 50 percent of those teachers reported that it was somewhat useful or very useful for their instruction (see Figure 5).

One issue limiting the instructional value of the beginning-of-year EPAS assessments is the timeliness with which teachers receive their results. Unlike the computer-based NWEA MAP that provides teachers with results immediately following the assessment, teachers do not receive their students’ EPAS scores for several months. Just 50 percent of teachers who administered a beginning-of-year EPAS assessment indicated that they had received their students’ results in a timely manner. In our interviews, we heard why this delay is problematic for teachers:

We didn’t get the results back until basically almost January, so it’s kind of like the data is dead...[it] reflected what they knew three months ago. If I had gotten an item analysis, that would have been more helpful. But I just got a raw score so I know that they scored a 14….I know I want to improve that score, but I don’t know why they’re getting that score.

Moreover, as the teacher above explained, the results of the EPAS exams are not detailed enough to guide teachers’ instruction.34 Teachers receive their students’ subject scores, but not an item analysis. Just 44 percent of teachers indicated that the beginning-of-year EPAS helped them to pinpoint individual students’ strengths and weaknesses.

Since high schools had administered EPAS as paper and pencil exams for a number of years before the implementation of REACH, their administration caused less widespread confusion among teachers than the NWEA MAP or REACH Performance Tasks. However, the issue of how to use the tests with special education and ELL students remained: 25 percent of teachers indicated that they were not at all clear on what accommodations were acceptable for students with IEPs and 35 percent were unclear on whether their ELL students should take the assessment.

As this chapter and the one that precedes it show, the implementation of the observation process and student growth component of REACH required substantial effort from teachers and administrators. In the next chapter, we explore the challenges and successes of training and communication to support this effort.


The Student Growth Component of REACH

Illinois’ Performance Evaluation Reform Act (PERA) defines three different assessment types:

• Type I assessments can be the typical multiple-choice standardized assessment that “measures a certain group or subset of students in the same manner with the same potential assessment items, is scored by a non-district entity, and is administered either state-wide or beyond Illinois.”

• Type II assessments can be “any assessment developed or adopted and approved for use by the school district and used on a district-wide basis by all teachers in a given grade or subject area.”

• Type III assessments are “rigorous, aligned to the course’s curriculum,” and are determined by the teacher and qualified evaluator to measure student learning in that course.D

PERA stipulates that all teachers must be evaluated using at least one Type I or Type II assessment and at least one Type III assessment.E To meet this requirement, CPS has identified two different types of student assessments to be used as part of REACH:

REACH Performance Tasks

As its Type III assessment, CPS utilizes REACH Performance Tasks, which were administered in the fall and the spring and are intended to measure change in student mastery over one or two skills or standards. The REACH Performance Tasks were developed by over 150 teachers organized into teams aided by content area specialists from central office. These teams developed over 90 Performance Tasks that covered all elementary teachers, including those teaching in areas such as art, music, physical education, and library studies that are not traditionally covered by standardized tests, and a subset of teachers in high school core courses. Each Performance Task fall/spring pair took approximately 40 hours to draft, revise, and pilot.

Value-Added and Expected Gains Measures

For its Type I assessment in elementary schools, CPS has chosen to compute teachers’ value-added score on the math and reading NWEA MAP. A value-added score from the fall to spring administrations of the NWEA MAP will be computed for teachers who teach grades three through eight reading or math. All other elementary school teachers will receive a schoolwide literacy growth score.

For its Type I assessment for high school teachers, CPS is exploring using the EPAS suite of tests (EXPLORE, PLAN, and ACT) to measure expected student gains. In 2012-13, the EPAS assessments were administered without stakes. EXPLORE was administered twice to ninth-graders, PLAN twice to tenth-graders, and ACT twice to eleventh-graders. While these scores will not count towards teachers’ evaluation this year, the data will be used to develop an expected gains metric for possible use in the 2013-14 school year.


Training and Communication

CHAPTER 3

As with the implementation of any major policy initiative, REACH required extensive communication and training efforts at both the district and school levels. In Chicago, almost 1,200 administrators and over 20,000 teachers needed to be informed about and trained on the new system in this first year of implementation.

In this chapter, we describe teacher and administrator experiences with training and communication. We draw upon our surveys and interviews of teachers and administrators to understand how well-informed and prepared participants felt in this first year of implementation, and to explore what areas they felt still needed improvement. We find that, while administrators received extensive training, training and information for teachers varied widely both across and within schools. Finally, teachers and administrators alike expressed a need for transparency not only about how final summative scores would be calculated but also about how teacher evaluation would ultimately be utilized in personnel decisions.

Administrators Felt Prepared to Conduct Observations and Assign Ratings

More than 80 percent of administrators reported their proficiency as strong or very strong in recording and aligning evidence and determining observation ratings. Administrators received extensive training in these areas. Prior to conducting any observations, administrators had to complete an online certification process that included video-based scoring practice and an assessment of their rating accuracy. On average, administrators reported spending over 30 hours on this certification process. Administrators who did not pass the assessment portion of this certification after two attempts were required to attend additional in-person training. Administrators who did not pass the assessment portion after four attempts did not conduct observations. As of November 2012, almost 90 percent of CPS administrators had been certified.35

Beyond certification, the district required administrators to attend four half-day, foundational REACH professional developmen