CLEARING A PATH: GENERATING AND VALIDATING A CONSTRUCT DEFINITION OF AND MEASURES FOR TEACHER WORKING CONDITIONS

Becca Merrill

A dissertation submitted to the faculty at the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Education.

Chapel Hill 2019

ABSTRACT

Becca Merrill: Clearing A Path: Generating and Validating a Construct Definition of and Measures for Teacher Working Conditions

(Under the direction of Lora Cohen-Vogel)

Teacher working conditions (TWCs), meaning the circumstances of schooling under which teachers perform their jobs, have been linked to teacher retention and student learning. These connections make TWCs important to understand. However, the literature on TWCs lacks cohesiveness, with widely varying definitions, conceptualizations, and measures of TWCs. Contributing to this lack of cohesion is the absence of an accepted construct definition delineating what TWCs are and how to measure them. This dissertation fills that gap by developing a construct definition of TWCs through a two-step process. First, I developed a draft construct definition of TWCs through a systematic review and narrative synthesis of the literature, with a focus on defining TWCs narratively and organizing their components into a typology.

Second, I employed the Delphi method, new to education policy research, to revise and validate the TWCs construct definition. Using this method, I administered two rounds of iterative questionnaires to teachers in North Carolina to obtain their feedback, suggestions, and input on the draft construct definition. Additionally, to address an overreliance on subjective measures in the literature, the questionnaire was designed to generate observable measures of TWCs.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

CHAPTER 1: INTRODUCTION

Research Questions and Summary of Findings

Significance of Research

Motivation for Studying TWCs with Review of Literature

What is missing from literature on TWCs

TWCs and student learning

TWCs and teacher retention

Background and Historical Context

Early working conditions

Major features of schooling

Teacher labor market dynamics

Cycles of regulation

Summative comments

Paradigmatic Approach

Research Approach

Methods

Phase 1: sample selection

Search strategy

Critical appraisal

Search strategy redux: reference list check

Phase 2: element extraction

Phase 3: data synthesis

Step 1: line-by-line coding

Step 2: translation and thematic coding

Step 3: construct definition configuration

Findings

A definition of TWCs

Applying the TWCs definition to the categories of TWCs from the sample

A typology of TWCs

Actors

Constructs

Interrelations of TWCs categories

Variation in concepts of TWCs

Discussion and Conclusions

Research Questions

Delphi as a Technique for Studies Requiring Judgment

Choice of Delphi for this Study

Background on the Delphi Method

Delphi method overview

Major design elements of Delphi

Philosophical theory and assumptions

Function of Delphi method

History and use of the Delphi method

Critiques

Strengths of the Delphi Method

Summative Comments

Methods

Overview of study elements

Round 1

Round 1 sample

Round 1 sample eligibility criteria and selection

Round 1 study initiation

Round 1 study participation

Round 1 sample description

Round 1 Questionnaire

Round 1 analysis

Round 1 Questionnaire qualitative analysis

Round 1 Feedback for participants

Round 2

Round 2 sample

Round 2 sampling frame

Round 2 sample selection

Round 2 panel participation

Round 2 Questionnaire

Round 2 personal workspace

Round 2 analysis

Round 2 feedback for participants

Round 1 and Round 2 Joint Analysis

Narrative definition

Typology of categories and constituent components

Threshold decisions

Re-organizations

Counts, correlation, and comparison

Comparison to NC TWCS

Subjectivity of measures

Single component analysis

CHAPTER 4: FINDINGS

What are TWCs?

Typology of categories and components

How can TWCs be measured?

Total Count of Measures and Distinct Measures

Teacher Ranks of Importance of TWCs components

Comparing counts, percent, and ranks

Comparing NC TWCS to study data

Items only in Distinct Measures

Items only on NC TWCS

Comparing counts, percent, and ranks with items only in Distinct Measures

Subjectivity of measures

Single component focus

CHAPTER 5: DISCUSSION

Limitations

Defining TWCs

Measuring TWCs

Conclusions

APPENDIX A: NARRATIVE SYNTHESIS SAMPLE ARTICLE CITATIONS

APPENDIX B: STUDY TIMELINE

APPENDIX C: ROUND 1 LETTER OF INTRODUCTION AND SUPPORT

APPENDIX D: INVITATIONS TO PARTICIPATE

APPENDIX E: REMINDER EMAILS TO PARTICIPANTS

APPENDIX F: ROUND 1 QUESTIONNAIRE TEMPLATE

APPENDIX H: ROUND 1 FEEDBACK TO PARTICIPANTS

APPENDIX I: ROUND 2 FEEDBACK TO PARTICIPANTS

APPENDIX J: TABLE OF DISTINCT MEASURES COMPARED with

LIST OF TABLES

Table 1. Relevance Rubric

Table 2. Theories Represented in TWCs Sample Articles

Table 3. Excerpt from TWCs Concept Decomposition Spreadsheet

Table 4. Narrative Definitions and Key Elements of TWCs from Sample Articles

Table 5. Teacher Working Conditions Typology

Table 6. Definitions of TWCs Typology Categories, Components, and Sub-Components

Table 7. Summary Statistics of Frequency of TWCs Categories Among Sample Articles

Table 8. Count of the Number of Measures Per Construct with Heat Map Overlay

Table 9. Number of Teacher Generated Measures Recategorized from Each Category

Table 10. Teacher Suggestions from Questionnaire 1 for Additional or Missing Components to Add to the Teacher Working Conditions Typology

Table 11. Rubric for Germaneness of Responses

Table 12. Rubric for Nuance in Responses

Table 13. Round 1 Percent Agreement Statistics by TWCs Category

Table 14. Revisions to Teacher Working Conditions Typology

Table 15. Teacher Working Conditions Final Typology

Table 16. Number of Total Measures and Distinct Measures Generated

Table 17. Count of Ranks of TWCs Components from Round

Table 18. Comparison of Top Components by Method of Measurement

Table 20. Comparison of Top Components by Method of Measurement with

LIST OF FIGURES

Figure 1. Process for narrative synthesis of researcher concepts of teacher working conditions

Figure 2. Workflow map of teacher working conditions systematic document sample selection

Figure 3. Synthesis of decomposed researcher concepts

Figure 4. Area graph of teacher working conditions category overlap

Figure 5. Major elements of Delphi and related decision points for this study

Figure 6. Delphi process overview

Figure 7. Round 1 sample selection process

Figure 8. Round 1 Questionnaire study participation

Figure 9. Round 1 participation by years of teaching experience

Figure 10. Round 1 participation by gender

Figure 11. Round 1 participation by ethnicity or race

Figure 12. Round 1 participation by teaching tested and non-tested subjects

Figure 13. Round 1 participation by teaching electives

Figure 14. Round 1 participation by school setting

Figure 15. Round 1 participation by urbanity or rurality

Figure 16. Round 1 participation by school level

Figure 17. Excerpt from Questionnaire 1, Budget & Spending item

Figure 18. Analysis of open-ended teacher-suggested measures

Figure 19. Round 2 process for composing sampling frame

Figure 20. Round 2 participation by years of experience

Figure 22. Round 2 participation by ethnicity or race

Figure 23. Round 2 participation by teaching tested or non-tested subject

Figure 24. Round 2 participation by teaching an elective

Figure 25. Round 2 participation by school setting

Figure 26. Round 2 participation by urbanity or rurality

Figure 27. Round 2 participation by school level

Figure 28. Excerpt from Facilities tab of personal workspace

LIST OF ABBREVIATIONS

TWCs Teacher working conditions

CHAPTER 1: INTRODUCTION

“[These observations] lead to the proposition that ‘everything,’ presumably the quality of education provided by a school, depends on the interaction between teachers–more or less competent, more or less satisfied–and the circumstances of schooling.”

-Goodlad, 1984, p. 178

The importance of TWCs is evident in numerous studies that show associations between TWCs and both teacher retention and student learning (e.g., Hirsch & Emerick, 2006, 2007; Johnson, Kraft, & Papay, 2012). As working conditions improve, the rate of teacher turnover decreases (Boyd et al., 2011; Ingersoll & May, 2011, 2012; Johnson, Berg, & Donaldson, 2005; Johnson et al., 2012; Kraft, Marinell, & Yee, 2016; Ladd, 2011; Podolsky, Kini, Bishop, & Darling-Hammond, 2016). In one study, Kraft and colleagues (2016) examined a set of working conditions at middle schools in New York City and found that a one standard deviation increase in teacher working conditions in a school at the 50th percentile in the distribution of measured working conditions is associated with a 25% reduction in the likelihood of teacher turnover. Evidence further shows that TWCs are associated with teacher stress (Geving, 2007; Grayson & Alvarez, 2007; Pearson & Moomaw, 2005; Skaalvik & Skaalvik, 2009), teacher exhaustion, personal accomplishment (Pearson & Moomaw, 2005; Skaalvik & Skaalvik, 2009), and overall job satisfaction (Basak & Ghosh, 2011; Hirsch & Emerick, 2007; Liu & Ramsey, 2008).

Furgeson, Strauss, and Vogt (2006) operationalized TWCs as the levels of student achievement, student income, and crime that characterize the school environment in which a teacher works, whereas Boyd and colleagues (2011) approached TWCs using very different categories. They constructed narrower categories of working conditions, focusing on six salient features from the literature: teacher influence, administration, staff relations, student behavior, facilities, and safety. More recently, using NYC middle school survey data, Kraft and colleagues (2016) used factor analysis to develop yet another set of working conditions along four dimensions of TWCs: leadership and professional development; academic expectations; collegial and social relationships; and school safety. These studies exemplify the lack of agreement among researchers about how to define TWCs and the components that make up TWCs.

The lack of agreement in defining and parameterizing TWCs contributes to the disorganization of the literature. Further, there is evidence to suggest that the requirements in No Child Left Behind (NCLB), signed into law in 2002, materially changed the school environment (Dee, Jacob, & Schwartz, 2013; Nichols & Berliner, 2008). The influence of NCLB on the school environment necessitates contemporary documentation. Currently, there is no common definition of what constitutes a “teacher working condition.” Further, no list, system, or framework for detailing the components of TWCs is present in current literature (Berry, Smylie, & Fuller, 2008). With every researcher, or even every study, defining TWCs differently, how can a cohesive body of knowledge about TWCs emerge? What are policymakers to make of this work? How can practitioners and school leaders learn from it?

objective and observable methods for measuring TWCs. With a construct definition of TWCs, research will be able to use a common language to study and measure TWCs, moving beyond their importance to their interrelations and avenues for creating productive working conditions.

Improvements in TWCs should improve student learning on two fronts. First, given the associations of TWCs with teacher turnover, improvements in TWCs can be expected to decrease rates of teacher turnover and, subsequently, increase the number and quality of candidates willing to teach. Second, research by Johnson and colleagues (2012) and Kraft and Papay (2014) connects TWCs with student learning and teacher growth. Hence, improvements in TWCs are expected to improve the learning conditions of students and the caliber of their teachers.

In the remainder of this chapter, I outline my research questions, explain the significance of the research I have conducted, and review the literature on TWCs integrated with my motivation for studying TWCs. Then, I provide historical background to inform the development of TWCs over time and explain the paradigmatic and research approaches of my work.

Research Questions and Summary of Findings

I leaned on the literature and researcher concepts of TWCs in configuring a construct definition, learning from previous attempts at defining TWCs. From the research on TWCs, I generated a database of researcher concepts of TWCs from which I synthesized a narrative definition. Second, I hypothesized that a survey of researcher concepts of TWCs would provide the breadth of concepts necessary from which to build a comprehensive typology of TWCs. The construct definition generated from the narrative synthesis includes a narrative definition and typology. It defines TWCs overall; defines broad categorizations of TWCs; details components of each category; and synthesizes the theories that underpin research on TWCs. I present a detailed account of this process in Chapter 2.

Second, I improved and validated the construct definition by utilizing teacher judgment in a Delphi study that employed a random sample of North Carolina teachers as well as a purposive selection of teachers from the random sample. The resulting literature-based and teacher-vetted construct definition contributes a clear and purposeful definition of TWCs. I also provide a typology that is more comprehensive than concepts used in contemporary quantitative research as well as at least one widely used survey of TWCs in current use, the North Carolina Teacher Working Conditions Survey (NC TWCS). An additional purpose of the Delphi study is to address the overuse of subjective measures of TWCs by generating objective and observable measures, which I accomplish through iterative questionnaires to teachers. Teachers generated a wide variety of measures, the bulk of which are objective and observable.

thematic analysis of the measures in each component of the TWCs typology show promise for revealing insights into how teachers conceive of each component, the methods for measurement that teachers find valid, and important areas for researchers to focus on in future studies of TWCs.

Significance of Research

The development of a research-based construct definition of TWCs is necessary to change the present course of research on TWCs, a course that has thus far resulted in a myriad of definitions of TWCs, heterogeneous understandings of what TWCs are, and a lack of cohesiveness. A construct definition of TWCs provides a touchstone for researchers to frame studies, define which working conditions they are studying, situate their work among the broader context of TWCs, and aid in designing programs to improve TWCs.

without a common understanding of what TWCs are and how to measure them. Therefore, the work of this dissertation, to build a construct definition of TWCs that defines and provides methods of measuring them, is necessary for progress in improving the working conditions of teachers and learning conditions of students.

The main product of this dissertation is a research-based and teacher-vetted construct definition of TWCs. With the input of and exposure to stakeholders in TWCs, this dissertation promises to develop a useful and necessary resource for future research, practice, and policy.

Motivation for Studying TWCs with Review of Literature

Though history provides some motivation for studying TWCs, research further catalyzes the need to generate a common construct definition of TWCs. The next three sub-sections outline my motivations for studying TWCs and the need to define and parameterize the study of TWCs. In the first section, I discuss what is missing from the literature on TWCs: a common construct definition and observable measures. Next, I discuss the associations between student learning and TWCs, and in the last section, I review what is known about teacher retention and TWCs. These three sections together reveal that TWCs are related to various aspects of student learning and teacher retention, but we are hindered in answering questions about why these associations exist or the mechanisms that drive them due to the lack of a common construct definition of TWCs and observational methods for measuring them.

that researchers differentially conceive of TWCs, consider the different ways that these three researchers measure and describe them. Bacolod (2007) uses three measures to study TWCs: pupil/teacher ratio, the percent minority students, and the percent of students eligible for free and reduced-price lunch. Barnes, Crowe, and Schaeffer (2008) also cite three topics in relation to TWCs but these three topics are: administrative support, teacher satisfaction, and teaching assignment. As a third point of reference, Dagli (2012) uses a concept of TWCs that includes 10 topics: administrative support, classroom autonomy, influence in school curricular decisions, influence in global school policy decisions, job satisfaction, colleague support, professional support, interference, satisfaction with salary, and burnout. These 10 topics are measured using 38 survey items. There is no overlap of topics common to all three studies and only one topic common between two studies (Barnes et al., 2007; Dagli, 2012), which is administrative support. Are all of these studies of TWCs? Which topics should be included to be able to claim that a study is of TWCs or to capture integral elements of TWCs? With no construct definition defining and detailing the components of TWCs, the answers to these questions are unknown.
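The overlap comparison in the preceding paragraph can be checked with a short sketch. The topic labels below are transcribed from the three study descriptions; the dictionary keys and variable names are illustrative only, and matching by exact label is an assumption (so "teacher satisfaction" and "job satisfaction" count as different topics, consistent with the paragraph's count of one shared topic).

```python
from itertools import combinations

# Topic labels transcribed from the three studies described above.
# Assumption: topics match only when their labels are identical, so
# "teacher satisfaction" and "job satisfaction" are treated as distinct.
topics = {
    "Bacolod (2007)": {
        "pupil/teacher ratio",
        "percent minority students",
        "percent free and reduced-price lunch",
    },
    "Barnes et al.": {
        "administrative support",
        "teacher satisfaction",
        "teaching assignment",
    },
    "Dagli (2012)": {
        "administrative support",
        "classroom autonomy",
        "influence in school curricular decisions",
        "influence in global school policy decisions",
        "job satisfaction",
        "colleague support",
        "professional support",
        "interference",
        "satisfaction with salary",
        "burnout",
    },
}

# Topics that appear in every study.
shared_by_all = set.intersection(*topics.values())

# Topics shared by each pair of studies.
pairwise_overlap = {
    (a, b): topics[a] & topics[b] for a, b in combinations(topics, 2)
}

for pair, shared in pairwise_overlap.items():
    print(pair, sorted(shared))
print("Shared by all three:", sorted(shared_by_all))
```

Under exact-label matching, the only non-empty pairwise overlap is between Barnes et al. and Dagli (2012), and no topic is shared by all three studies.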

Hirsch and Emerick (2007) find evidence that teachers and principals conceive of or at least perceive the working conditions present in schools differently. In their North Carolina survey of teacher working conditions, principals report statistically significantly more positive conditions than teachers, especially regarding school leadership. Hirsch and Emerick (2007) report that these findings have been replicated in Arizona, Ohio, and Kansas, suggesting that this pattern holds across several state education systems.

of measure they use is clear from the author’s description. I differentiate between observational measures (e.g. demographic measures or percent free and reduced-price lunch) and professional opinion, meaning how a teacher perceives their working conditions (e.g. the adequacy of support or effectiveness of communication). Fifty of the 69 articles employ professional opinion items for 90% or more of their measures, with the majority using opinion items for all of their measures. Eight articles use a more or less even mixture of observational and opinion measures, and 11 articles use only observational measures, which are usually economic measures (e.g. percent free and reduced-price lunch or a binary variable indicating whether or not the school is designated as Title I) or demographic measures (e.g. percent of minority students). These results illustrate the reliance of research on professional opinion items to measure TWCs.
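As a quick arithmetic check on the article counts reported above, the following sketch converts them to shares of the 69-article sample. The category labels are shorthand introduced here for illustration, not terms from the study.

```python
# Article counts by measure type, as reported in the paragraph above.
# The dictionary keys are illustrative shorthand, not the study's terminology.
counts = {
    "opinion_items_90_percent_or_more": 50,
    "mixed_observational_and_opinion": 8,
    "observational_only": 11,
}

total = sum(counts.values())

# Share of the sample in each group, as a percentage rounded to one decimal.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print("Total articles:", total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The three groups account for all 69 articles, and roughly 72% of the sample relies on professional opinion items for nearly all of its measures.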

Though the professional opinions of teachers and principals are important in their own right, Hirsch and Emerick’s (2007) findings, describing the disagreement in perceptions of TWCs between teachers and principals, illustrate the challenge in relying heavily on these measures. They conclude that the lack of agreement between teachers and principals exposes a serious hindrance to prioritizing TWCs in policy. Their solution is to produce “school-based data-driven working conditions conversations” (Hirsch & Emerick, 2007, p. 21). While these conversations may need to occur, their solution cannot come to fruition without: 1) a common understanding of what TWCs are and 2) measures of working conditions that do not have the same teacher-principal variation as professional opinions do.

employed as a common point of reference in dialogue with teachers, principals, researchers, and policymakers. Next, a second product of the Delphi study is the observational measures of TWCs generated by a select panel of teachers. The two products of this study, the construct definition of TWCs and observational measures, together produce the opportunity to develop a common set of measures for working conditions that can be compared across schools. Following Hirsch and Emerick’s (2007) logic, these products will aid in creating the conditions necessary for TWCs to become a sustained priority in policy.

TWCs and student learning. Despite the wide variation in defining and measuring TWCs, there is research that supports their importance in education. In the next two sections, I relate what is known about TWCs regarding two major policy and societal concerns: student learning and teacher retention.

evidence that schools that have poorer TWCs and show improvement, have stronger gains in student achievement. Conversely, schools with higher measures of TWCs show stronger gains relative to schools with moderate and low measures of TWCs. Kraft et al.’s (2016) findings corroborate the other three papers, including the finding that the quality of TWCs exists on a continuum, and that where a school measures on that continuum affects the degree to which TWCs affect student achievement. Last, Kraft and Papay (2014), employing the NC TWCs Survey, illustrate that improvements in TWCs are associated with the rapidity of novice teacher gains in quality. They find that teachers, in school contexts that had high measures of TWCs, experience more rapid improvement in teaching quality.

In addition to studies that provide an overall measure of TWCs, there are several studies that investigate single or cross-sections of TWCs. Among these studies, teacher autonomy (Lee & Smith, 1996); collaboration; cooperative effort (Kraft et al., 2015; Lee & Smith, 1996); learning climate (Lee & Smith, 1996); student behavior (Houchens et al., 2017; Kraft et al., 2015); leadership; parental involvement; feedback; and time (Kraft et al., 2015) all have positive associations with student achievement. Jackson and Brugemann (2009) provide evidence that a teacher’s colleagues are a working condition that affects a teacher’s professional development and skill acquisition. They find that teachers’, especially new teachers’, student achievement scores increase in the presence of colleagues with high value-added scores. Whether investigating a single working condition or a more comprehensive concept of TWCs, these studies agree that TWCs are positively associated with student learning.

2012; Kraft et al., 2016; Ladd, 2011; Podolsky et al., 2016). Some studies have found that TWCs explain the majority of variation in both teacher satisfaction (Shen, Leslie, Spybrook, & Ma, 2012) and teacher retention (Ingersoll & May, 2012). In the case of teacher retention, the effect of TWCs on the construct definition reduced the coefficient on student SES to statistical insignificance (Ingersoll & May, 2012).

et al., 2012; Waddell, 2010). The host of components of TWCs that are associated with teacher retention motivates the need for a better understanding of what TWCs are and how to study them, which mirrors my research questions asking what TWCs are and how to measure them.

Background and Historical Context

In the previous section, I discussed my motivations for developing a contemporary construct definition and measures of TWCs through the first part of this study, which was a narrative synthesis of the literature. In part two, I followed up the narrative synthesis with a Delphi study, which is an iterative investigation. In this section, I provide historical background that delineates how features of present-day TWCs can be traced in history. This background also sheds light on the important relationship between teacher retention and TWCs. TWCs have evolved alongside policy changes, cultural shifts, economic conditions, and educational goals. Four major studies of TWCs provide background and descriptions of the development of TWCs in the United States: Elsbree (1939), Goodlad (1984), Johnson (1990), and Lortie (1975). This background illustrates some of the historical, cultural, and political characteristics that contribute to the landscape of contemporary TWCs.

Elsbree (1939) describes the conditions and context of TWCs starting in the colonial period and spanning to his contemporary time. Following Elsbree (1939), Lortie (1975) published his landmark study of school teachers, describing their daily routines, activities, and attitudes. Next, Goodlad (1984) published A Place Called School. Though the text covers a broad swathe of topics, Goodlad focuses a chapter specifically on the work of teachers. He concludes that:

of the workplace that appear to them not to be within their control, it is reasonable to expect frustration and dissatisfaction to set in. Undoubtedly, teacher effectiveness, in turn, is constrained and the very problems frustrating teachers are exacerbated. Students’ perceptions of the quality of the education being provided decline. It is reasonable to assume that the actual quality of this education declines also. (1984, p. 180)

The last major contribution describing the working conditions and routines of teachers is Johnson (1990). Similar to Goodlad (1984), Johnson’s work develops from in-depth interviews with teachers. However, where Goodlad’s (1984) study includes the perspectives of school leadership, parents, students, and teachers, Johnson’s (1990) research focuses solely on teachers. Diverging from the earlier account of Lortie (1975), who reports teachers preferring isolation from their colleagues, Johnson (1990) provides one of the first accounts of the importance of teacher collaboration in schools, noting that “cultural bonds generate rather than consume energy” (p. 218). Taken together, these studies pan across centuries of historical developments in TWCs, providing context to present-day TWCs.

In the next section, I briefly describe how historical conditions shape the contemporary landscape of TWCs. Working conditions for teachers changed alongside economic developments, urbanization, industrialization, and the political structures of the day. Therefore, in this section I present: an overview of early TWCs in the United States, the development of features of schooling that define major structural TWCs, dynamics of the teacher labor market that respond to TWCs, and cycles of regulation that dictate features of policy in TWCs.

today’s full-time teaching positions, these colonial-period teachers were hired to teach for a stipulated period and held other jobs simultaneously.

Supervision of colonial-period teaching was scarce, and the teacher’s schoolhouse was, to a large extent, his or her domain. As schools moved into cities alongside residential urbanization, teachers lost the privacy of isolated schoolhouses in rural locations. In cities, citizen governing boards more closely supervised teachers’ duties. Further, within cities, larger student populations grew, so that city schools employed more than one teacher. However, one-teacher schoolhouses survived in rural areas far into the 20th century. Tyack (1967) documents that in 1956 there were still 34,964 one-teacher schools in the United States (as cited in Lortie, 1975).

The general structure of the hierarchy of authority in schools has not changed for over 200 years (Lortie, 1975). School boards, superintendents, and principals dominate decision trees in contemporary school systems as they did two centuries ago. This structural authority means that teachers lost the autonomy to govern their daily routines and duties that historians suggest they had during the colonial period and leading up to industrialization (Elsbree, 1939; Lortie, 1975).

Major features of schooling. Some major features of schools that remain prominent today were introduced in the early 1900s and changed the conditions of teachers’ work (Elsbree, 1939). In response to school boards’ unchecked powers to certify, hire, determine pay, and fire at will, teachers advocated for tenure (legislation to prevent dismissal after a teacher has successfully completed a probationary period) and the single-salary schedule (Lortie, 1975).

According to these scholars, women could be paid far less than men to do the same job, so hiring women made economic sense. On the supply side, women had few socially acceptable or viable options in occupation. The burgeoning and acceptable teaching occupation was, therefore, attractive. To explain the dwindling number of men in teaching, scholars turn to the developing industrial and commercial technology, which presented more lucrative avenues than teaching, avenues generally not available to women. Further, other early scholars suggest that as more women took up teaching, men were less encouraged to enter the profession (Lortie, 1975). However, Lortie (1975) points out that this pattern did not hold for high school teachers, where men and women taught in equal proportions. As teaching became a female-dominated profession and administration remained a male-dominated profession, the contemporary landscape of working conditions for teachers became more recognizable.

Teacher labor market dynamics. Concomitant with demographic changes in the teaching profession, changes in the teacher labor market arose. Teacher shortages, meaning a lack of qualified candidates to fill teaching positions, and teacher turnover, meaning the rate at which teachers leave their schools and the profession, have been endemic to teaching for decades (see Anderson & Eliassen, 1943; Dworkin, 1980; Keeler, 1973 for historical perspectives). This concern remains a contemporary challenge.

History provides some insight into the teacher turnover phenomenon but does not completely account for it. Technology often enables a workforce to shrink, replacing people with machinery. Consider the self-checkout systems of grocery stores, online shopping, and the use of robotics in manufacturing that have replaced human workers. However, teaching has the


Historically, both the candidates that teaching attracts and laws against married women teaching, a policy prevailing into the 1900s, defined the stability, or rather instability, of the workforce. Without married women, the teaching pool was composed mostly of young single women, the majority of whom planned to marry and therefore necessarily quit teaching. This designated teaching as a temporary job, not a career, and instituted a systematic cycle of turnover within the profession (Lortie, 1975).

In response to this reality, the structure of teaching was designed to be independent, rather than interdependent (Lortie, 1975). School administrators structured teacher exits and entries to accommodate the school year. Additionally, single classrooms were employed, where new students moving into new classrooms could seamlessly be paired with a newly hired teacher. Lortie (1975) points out that if teaching became interdependent, the high level of teacher turnover would cripple the school system. Thus, even as schools grew, classrooms were organized and partitioned off by age (Lortie, 1975), isolating teachers even though colleagues were only across the hallway. These features define the structural working conditions for teachers, working conditions that were designed to accommodate a churning workforce, not the preferences of teachers or the learning conditions of students.

Though teachers left, school leaders had a pool of qualified candidates to replace them, due to the aforementioned narrow number of alternate occupations for women. As a result of the steady supply of qualified teaching candidates, school administrators did not need to be


May, 2011; 2012; Johnson et al., 2005; Johnson et al., 2012; Kraft et al., 2016; Ladd, 2011; Podolsky et al., 2016).

Since Lortie’s (1975) work, however, not only has the number of occupations open to women expanded, but with greater social acceptance of women in the workforce, women have begun and continue to challenge male-dominated occupations. Rather than the teaching profession competing with laundering, factory work, and nursing, today’s teaching profession competes with medicine, law, aeronautics, research, and engineering (Johnson, 1990). Though the labor market has changed dramatically for women, the dominant sex in the teaching workforce, the working conditions of teachers have largely remained unaddressed.

Johnson (1990) lays out a domino effect linking the expansion of labor market options for women, TWCs, and teacher retention. She connects women’s expanding ability to reject poor school working and teaching conditions in favor of better working conditions in other fields to the deterioration of aptitude in the teaching profession. She goes on to cite the


Both Johnson’s (1990) and Keeley’s (1973) conclusions support the importance of TWCs, and thus the necessity of understanding TWCs in an effort to improve them. According to Johnson’s (1990) logic, improvements in TWCs will increase the retention of high-aptitude teachers, which will improve outcomes for schooling systems and, finally, produce a more educated citizenry. I argue that improved and more attractive working conditions will not only increase retention of high-aptitude teachers already interested in teaching but will also attract a richer pool of teaching candidates. Yet, researchers do not currently even agree on what TWCs are, much less how to improve them. My work aims to remedy the lack of agreement in defining TWCs by providing a literature-based and expert-vetted construct definition of TWCs and methods for measuring them, through a narrative synthesis of the literature followed by a Delphi study.


The cycle began anew in early 2002 with the passage of No Child Left Behind (NCLB), a reauthorization of the 1965 Elementary and Secondary Education Act (ESEA). NCLB re-enacted prescriptive regulations and standardized student testing in reading and math. It also instituted sanctions for schools not meeting required improvement and performance targets. NCLB has been harshly criticized by states as heavy-handed federal encroachment into education, a state domain. In response, Congress crafted a less prescriptive bill in the most recent incarnation of the ESEA, the Every Student Succeeds Act (ESSA), which passed with bipartisan support in 2015. ESSA relaxed some testing provisions and reallocated some decision-making authority to states, representing less prescription from the federal government.

These cycles of regulation and reform do not wash over schools and teachers with no effect. Administrators, teachers, and students experience the effects of regulation and reform, both intended and unintended. For example, recent research into the effects of NCLB suggests that the law had some negative effects on school environments (Dee, Jacob, & Schwartz, 2013; Nichols & Berliner, 2008). The uncertainty of the political climate towards education is a working condition that looms over and directly affects teachers. A better understanding of the interrelations of TWCs and policy shifts would guide future policy formulations. However, a foundation for studying TWCs needs to be formulated first, which is the purpose of this study.

Summative comments. Since schools followed populations moving into cities and became supervised and regulated by school boards, the working conditions of teachers have remained largely unchanged. They are overseen by administrators, who dictate the distribution of power in the school; they teach in single classrooms, which can isolate them from their


unaddressed. Johnson (1990) suggests that the price the teaching profession paid for this inattention is the loss of high aptitude women, who sought better working conditions. The attention I am giving to TWCs is an effort to begin to reverse this trend. I conducted a narrative synthesis of the literature to develop a construct definition of TWCs. Next, I will conduct a Delphi study, where panels of experts will vet the construct definition and generate methods for measuring TWCs.

Paradigmatic Approach


(Noblit & Hare, 1988), a procedure I applied while conducting the narrative synthesis and analyzing results from the Delphi study.

Due to the interpretive nature of translation, I follow an interpretivist paradigm, because I “seek an explanation for social or cultural events based upon the perspectives and experiences of the people being studied” (Noblit & Hare, 1988, Ch. 1, p. 3). In the narrative synthesis I conducted, “the people being studied” were researchers or the concepts of TWCs they use in studies of TWCs. The narrative synthesis was a translation across studies of the definitions of TWCs and how researchers decompose the concept of TWCs. By synthesizing across studies, I developed a construct definition of TWCs, including a definition and typology that details the major categories and components of TWCs. Additionally, during the narrative synthesis, I documented the theories that underpin studies of TWCs, which adds a further layer of understanding of how researchers conceive of TWCs. Next, in the Delphi study, I vetted the construct definition of TWCs with practicing teachers and generated methods for measuring TWCs.

Research Approach

In this section, I delineate my research approach as it fits within the interpretive paradigmatic approach described above. My aim in conducting the narrative synthesis was to use the studies of TWCs as cases and to do the comparative work described above, looking across cases in configuring researcher concepts of TWCs into a cohesive construct definition. This construct definition was then utilized as a starting point in the second part of my dissertation, the Delphi study.


authors demarcate between aggregative and configurative syntheses, presenting the two types as ends on a continuum. Aggregative syntheses most closely resemble a quantitative version of the reciprocal synthesis outlined by Noblit and Hare (1988), where studies are synthesized through an analysis of their agreement. These can be narrative syntheses or meta-analyses. In contrast, configurative syntheses aim to generate and explore by piecing together data from multiple sources roughly on the same topic (Gough et al., 2012). Gough and colleagues (2012) describe the process below:

Configuring data involves some form of interpretive conceptual analysis and is thus particularly associated with reviews where: (i) the concepts are the data for analysis…Good examples are…thematic analysis (Thomas and Harden, 2008), where a range of concepts from primary studies is interpreted and configured to create higher order or meta-concepts. (p. 52)

I classify the narrative synthesis I conducted as configurative (Gough et al., 2012) because it configured a typology of TWCs by translating, according to the interpretivist paradigm, across researcher concepts of TWCs from peer-reviewed studies. Further defining the synthesis as configurative, I employed thematic analysis (Thomas & Harden, 2008) to organize the concepts into higher categories and detail each category’s components.


TWCs became points on a map of TWCs, with various areas of overlap. By sourcing from an exhaustive set of peer-reviewed quantitative literature on TWCs, I translated across researcher concepts of TWCs to identify themes among these concepts, which then became the components in the draft typology of TWCs. Therefore, though studies may differ markedly in the ways in which they conceive of TWCs, the differences are not conceptualized in this synthesis as needing to be reconciled; rather, they need to be configured, which entails translation and synthesis.

In a departure from common synthesis methods, the current synthesis did not investigate the findings of the studies but rather the elements present in each study’s concept of TWCs, regardless of research design. Therefore, the studies’ concepts of TWCs were under scrutiny, rather than their research methods. Specifically, I synthesized three elements of researcher concepts: 1) theory underpinning the concept, 2) narrative definitions of TWCs, and 3) TWCs concept decomposition, meaning how researchers break down the TWCs concept to study it. The resulting construct definition generated from the synthesis provides a starting construct definition of TWCs in K-12 public school contexts in the United States. From this starting point, I began the Delphi study, where a select panel of teachers vetted and revised the construct definition, as well as generated measures of TWCs.


CHAPTER 2: SYSTEMATIC NARRATIVE REVIEW OF CONCEPTS OF TEACHER WORKING CONDITIONS: CONFIGURING A CONSTRUCT DEFINITION

The product developed in this chapter, a construct definition of TWCs, is designed to provide the groundwork on which the second part of this dissertation, the Delphi study, is based. Through narrative synthesis methods, I developed a draft construct definition of TWCs. Though in traditional dissertations, a formal methods section is not expected to precede the literature review, or in this case the narrative synthesis, I present a detailed methods section, findings, and brief discussion in the tradition of a stand-alone review that might be published, for example, in AERA’s Review of Educational Research. My justification for doing so is three-fold. First, the credibility and validity of the construct definition of TWCs I develop will rest on rigorous and replicable narrative synthesis methods. Second, as I discussed in Chapter 1, the construct definition will subsequently be the starting point of the study proposed in Chapter 3 of this dissertation. Thus, to a significant degree, the quality of the study proposed in Chapter 3 also relies on the quality of the methods employed in the development of the TWCs construct definition. Last, my orientation to this narrative synthesis is not common. Rather than synthesizing findings from studies of TWCs, I propose to synthesize the concepts of TWCs presented in peer-reviewed articles. In this chapter, I provide the methods, findings, and a brief discussion to support the credibility of the product of the narrative synthesis (the TWCs construct definition), the starting point of the Delphi study.


narrative definition and typology. However, even though each study in the synthesis indicated that it was studying TWCs as a set, rather than individual components, I have no assurance that the resulting construct definition is complete. Therefore, the construct definition, consisting of the definition and typology, is considered a draft that will be tested and revised in the Delphi study. Further, I find that though researchers who provide narrative definitions of TWCs are in general conceptual agreement, there is wide variation in the ways that researchers decompose these concepts to study them. Additionally, the ways that researchers measure TWCs imply that they conceive of the categories that make up TWCs (that emerged from the synthesis) as interrelated. This is evidenced by overlap of the categories within the measures researchers use to study TWCs. The disagreement among researchers in how to decompose and study TWCs suggests that further investigation and validation of the TWCs construct definition is needed, and the findings regarding overlap of TWCs categories within researcher measures of TWCs are evidence of the interrelated nature of TWCs. Last, the synthesis findings document that the majority of studies in the synthesis measure TWCs through teacher reports. While teacher reports or perceptions of their working conditions are valuable, this finding suggests an over-reliance on one method for measuring TWCs. To remedy this over-reliance, one of the main products of the Delphi study, which will follow the narrative synthesis, will be participant-generated observable measures of TWCs.


Methods

In designing the narrative synthesis, I followed Gough and colleagues’ (2012) three-phase procedure, outlined in Figure 1. In Phase 1, I conducted a search of the literature for relevant and appropriate articles that investigate TWCs. From the search records, I created a sample of studies that meet inclusion and relevance criteria. In Phase 2, I extracted data from each sampled study, coding for three foundational elements of each study’s concept of TWCs: 1) its theory underpinning TWCs; 2) definitions of TWCs overall; and 3) TWCs concept decomposition, meaning how researchers break down concepts of TWCs to study them. Each of these elements was broken into smaller sub-elements for coding. Last, in Phase 3, I synthesized the data elements extracted in Phase 2 through translation (Noblit & Hare, 1988) and thematic analysis (Thomas & Harden, 2008). Each phase is discussed in detail below.

Phase 1: sample selection. In this phase, I created the database of studies from which the synthesis was conducted. First, I conducted a search for studies on TWCs with a detailed search strategy. After the search was complete and studies were gathered, I refined the resulting dataset by removing duplicate studies. Last, to critically appraise the sample, I applied an adapted form of Gough and colleagues’ (2012) Weight of Evidence (WOE) framework. In applying the WOE framework, I omitted inappropriate studies through application of inclusion criteria and ensured relevance of remaining studies through an author-generated relevance rubric.


Figure 1. Process for narrative synthesis of researcher concepts of teacher working conditions. [Figure: three phases. Phase One: Sample Selection (search strategy; critical appraisal; appropriateness; relevance). Phase Two: Element Extraction (concept elements coded: theory, definitions, concept decomposition). Phase Three: Data Synthesis (line-by-line coding; translation and thematic coding; construct definition building; definition synthesis; construct definition configuration).]


Academic Search Premier. All database article abstracts and titles were searched using the following search term: “teach* AND work* AND condition*.” Next, I utilized database thesauruses to match keyword search terms to databases. Keywords are words or phrases that indicate the topics of articles. Articles are tagged with keywords by a computer algorithm or human reviewers, depending on the search engine. Through keyword indexing, I was able to find articles on TWCs that I may not have found through the search term, because the authors use different phrasing. Each search engine indexes articles under different keywords, so I conducted keyword searches specific to each search engine.
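As a rough illustration of what the truncated search term asks for, the sketch below checks whether a record's title and abstract contain a word beginning with each of the three stems. It is a minimal approximation only: the function name `matches_search_term` is hypothetical, and each database applies its own truncation and field rules, which this sketch does not reproduce.

```python
import re

# Stems taken from the search term "teach* AND work* AND condition*".
STEMS = ("teach", "work", "condition")

# One pattern per stem: the stem must start a word and may be followed
# by any word characters (an approximation of the "*" truncation).
PATTERNS = [re.compile(rf"\b{stem}\w*", re.IGNORECASE) for stem in STEMS]


def matches_search_term(title: str, abstract: str) -> bool:
    """Return True when every stem starts some word in the title or abstract.

    Hypothetical helper for illustration; not the databases' own matching.
    """
    text = f"{title} {abstract}"
    return all(pattern.search(text) for pattern in PATTERNS)
```

For example, a record titled "Teacher working conditions and retention" satisfies all three stems, while a record that mentions teaching and workload but never conditions does not.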

The processes of compiling, screening, and deleting are documented in a work flow map (Gough et al., 2012) in Figure 2. The work flow map tracks the number of articles imported from the initial search, the number of duplicates deleted, the number of articles omitted due to each of the inclusion criteria, documentation of reasons for omitting any other articles, and the number of articles included in the final sample for the systematic review. In developing the work flow map, I referenced recent systematic reviews published in the last four issues of the Review of Educational Research (Adesope, Trevisan, & Sundararajan, 2017; Peltier & Vannest, 2017; Sabey, Charlton, Pyle, Lignugaris-Kraft, & Ross, 2017; Singer & Alexander, 2017; Surr et al., 2017). The results from term and keyword searches, totaling 7326 records, were imported into software specifically developed for systematic review, EPPI-Reviewer 4 (Gough et al., 2012). The software has the functionality to import references in a variety of formats; label the date the citation is returned and the search engine each citation is returned from; and detect duplicates.


Figure 2. Workflow map of teacher working conditions systematic document sample selection


the level of similarity between the master article and each other article. Remembering that these articles are grouped because they are likely duplicates, ratings in my review did not drop below 0.78. A rating of 1.00 indicates an exact match in every search field, with few exceptions, and a 100% chance that the article is a duplicate of the master (EPPI Reviewer 4 User Manual, 2015). Search fields include author(s), title, year of publication, journal name, and page numbers.

Though the user manual states what a score of 1.00 indicates, it does not provide information regarding how EPPI Reviewer’s algorithm scores articles. Therefore, I hand-coded approximately 200 articles to gauge the sensitivity of the algorithm and found that a rating between 0.99 and 0.90 indicates small differences in punctuation, case, or formatting. I found that ratings between 0.90 and 0.85 indicate a major difference in one field but major agreement in all other fields. Often, the article title in one record was typed in all caps, while the title in the second record capitalized only the first letter of the first word. Articles with ratings of 0.84 or below had a major difference in two fields, the major differences again being those of capitalization or abbreviated versus full journal names.


hand-coded the remaining 527 likely-duplicate records, with ratings ranging between 0.84 and 0.78. Of these low-rated records, only two non-duplicates were found. One was an errata report of the master record and the other was a review article of the master record. The precision of the algorithm, evidenced through the initial hand-coding and hand-coding of the lowest-rated records, supports the decision to mark records with 0.85 ratings and higher as duplicate records.

After duplicate articles were deleted, the titles, keywords, and abstracts of the remaining records were hand-screened to detect articles that meet the inclusion criteria described below, ensuring appropriateness of the articles remaining. During screening of record titles and abstracts, I identified 59 additional duplicates not recognized through the software’s algorithm. These records often contained differences in author name formatting, such as including the entire first name rather than author initials, or a difference in the ordering of authors. These differences made the records “too different” for the algorithm to detect as likely duplicates. The manual detection of more duplicates not found by the software and the very low number of non-duplicates (two) found during hand-coding suggest that the software’s algorithm is conservative in its identification of duplicates, further supporting my decision to auto-assign articles with a duplicate likeliness rating of 0.85 and higher as duplicates. In total, 1650 duplicates were removed from the sample through auto-assignment, hand-coding of likely duplicates, and manual detection during the screening of titles, keywords, and abstracts.
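The triage rule described above can be sketched as follows. Because the user manual does not document how EPPI-Reviewer scores records, the `similarity` function here is only a stand-in that averages field-wise string similarity from Python's `difflib`; it will not reproduce the software's ratings. The thresholds (0.85 for auto-assignment, 0.78 as the lowest rating observed) come from the text, while the function names, field names, and the label returned below 0.78 are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Search fields compared between a master record and a candidate record.
FIELDS = ("authors", "title", "year", "journal", "pages")

AUTO_DUPLICATE = 0.85  # ratings at or above this were auto-assigned as duplicates
REVIEW_FLOOR = 0.78    # lowest rating observed among likely-duplicate records


def similarity(master: dict, candidate: dict) -> float:
    """Mean field-wise string similarity in [0, 1].

    Stand-in scoring for illustration; NOT EPPI-Reviewer's algorithm.
    """
    scores = [
        SequenceMatcher(
            None, str(master.get(field, "")), str(candidate.get(field, ""))
        ).ratio()
        for field in FIELDS
    ]
    return sum(scores) / len(scores)


def triage(rating: float) -> str:
    """Map a duplicate-likeliness rating to the handling described in the text."""
    if rating >= AUTO_DUPLICATE:
        return "auto-duplicate"
    if rating >= REVIEW_FLOOR:
        return "hand-code"
    # No ratings this low occurred in the review; the label is an assumption.
    return "outside observed range"
```

Under this rule, a record rated 0.84 would be routed to hand-coding, which mirrors the 527 low-rated records that were checked manually.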


methods because they have no bearing on the outcome of the synthesis. I adapted the WOE framework by applying only Steps Two and Three: appropriateness and relevance criteria.

Appropriateness of studies (inclusion criteria). The WOE framework’s section on appropriateness of studies aligns with sections often labeled as “inclusion criteria.” My search had six inclusion criteria, which were applied through a review of the title and abstract. If ambiguity remained after reviewing these two sources, I left the record in the sample for further review. First, I only included studies of working or school conditions that focus on the working conditions of teachers, as opposed to principals or students. A total of 440 non-teacher-related articles were removed. In applying the TWCs criterion, I was guided by the language about TWCs used by the authors of the study. Specifically, I searched for articles that self-describe the purpose of the study as investigating “working conditions,” “workplace conditions,” or other variations of the topic. Through application of this criterion, I removed 3368 articles with purposes that do not include the investigation of TWCs.

Second, only articles published in English and studying K-12 public schools in the United States remained in the sample. Respectively, 2, 662, 112, and 809 records were removed under the criteria for language of publication (English), level of schooling (K-12), funding of schooling (public), and country in which the study took place (United States).


employed search engine filters to omit pre-2002 articles, so I do not know the exact number omitted.

Fourth, following quality assurance criteria applied by researchers conducting reviews of the literature in current academic journals (see Gast, Schildkamp, & Van der Veen, 2017; Muenks & Miele, 2017; Singer & Alexander, 2017; Gierl, Bulut, Guo, & Zhang, 2017; Welsh, 2017), I kept in the sample only academic articles published in peer-reviewed journals. Though search engine filters for peer-reviewed journals were employed, greatly limiting the amount of non-scholarly material, 64 opinion and editorial pieces were removed, as were two historical articles. Fifth, 69 studies of virtual public schools were omitted from the sample, as the working conditions in these schools are materially different from those in physical schools (Muirhead, 2000).

Last, the research question of this synthesis inquires after the conceptions of researchers about TWCs. Inductively derived concepts of TWCs are more representative of participant conceptions of TWCs, which are interesting in their own right but are not within the scope of this synthesis. Therefore, only articles that contain a pre-conceived (a priori) articulation of TWCs were included in the sample; applying this criterion, I removed 52 articles from the sample. Resulting from the article search and exclusion process, a total of 87 full-text articles remained in the sample for relevance analysis.

Relevance of studies. At this point, I explain the nomenclature I use as I move to different areas of analysis, software analysis platforms, and levels of analysis. I apply these terms


Relevance Rubric. In this stage, the data were in larger “chunks” of the major elements of the coding schema: Theory, TWCs Definitions, and TWCs Concept Decomposition, meaning how researchers break down the TWCs concept to study it. After initial coding, the coded “chunks” of the elements and sub-elements were transferred to spreadsheets or datasheets (I use these terms interchangeably). Theory and TWCs definitions codes were relatively small in number when compared to the TWCs Concept Decomposition data. Therefore, the terminology for these two elements needs no further explanation.

The TWCs Concept Decomposition data consist of the labels, sub-labels, and measures of TWCs labels or sub-labels extracted from articles. For clarity concerning what is being analyzed and how I moved through the analysis, I use one set of words for how article authors organize their concepts of TWCs and another set for how I synthesized those concepts. I reserve the word “label” for words from sample articles describing the broadest elements of TWCs. Thus, articles include labels, sub-labels, and measures of TWCs. I iteratively organized these labels, sub-labels, and measures. In the resulting typology from the analysis and synthesis of concept decomposition data, categories are the broadest concepts, followed by components and then the narrowest concepts are sub-components. As an example of the usage of these terms, Burkhauser (2017) includes this label “Human capital,” which the author breaks into five sub-labels


After the application of inclusion criteria to ensure appropriateness of the studies in the sample, I analyzed the relevance of each study against the review’s research question. Relevance criteria are applied to sample articles to ascertain whether they are “fit for the purpose” of this narrative synthesis (Petticrew & Roberts, 2006, p. 131). This review’s question inquires after how researchers conceptualize TWCs, including the theory underpinning the study, as well as how TWCs are defined, broken into labels or sub-labels, and measured.

In an effort to systematically gauge the relevance of articles to this review’s question, I developed a rubric for rating sample articles. The following sections describe the rubric, including rationale for weights allocated to each element, provide a template of the rubric in Table 1, and explain how the rubric rating was applied to omit irrelevant articles and limit the number of poorly contributing articles to the current synthesis.

Relevance rubric description and weighting rationale. The criteria that make up the rubric mirror the coding schema of the three TWCs concept elements I code during the data extraction phase. These elements are 1) theory underpinning TWCs; 2) definitions of TWCs overall; and 3) TWCs concept decomposition, specifically the labels, sub-labels, and measures of TWCs used in each study. Each element that an article provides increased the relevance of that article to this review’s research question. The rubric criteria were differentially weighted relative to the contribution of each element to the development of the TWCs construct definition.


Table 1

Relevance Rubric

Elements of TWCs Conceptualization   Sub-Elements                                 Value Points   Supporting Text
Theory underpinning TWCs             Name/Description of theory                   0.5
                                     Application of theory to TWCs                0.5
Definition of TWCs                   Narrative definition of TWCs                 2.5
                                     Origin of TWCs definition                    0.5
TWCs Concept Decomposition           Labels of TWCs                               2
                                     Narrative definitions of TWCs labels         2
                                     Measures of TWCs labels or proxy variables   2
Total:                                                                            10             0

i. Theory underpinning TWCs. This element captures any theory that authors cite in sample studies that pertains to TWCs. The “theory” element is broken into two sub-elements in the rubric: 1) name and description of theory and 2) application of theory to TWCs. Articles that name or describe the theory that underpins their study were given the points for this sub-element. Concerning the second sub-element, articles that explain how the theory applies to TWCs were given the point value for the second sub-element. For example, Ingersoll (2002) cites organization theory and explains the role of TWCs in the school context through the lens of organization theory. Therefore, this article received credit for including both sub-elements in its conceptualization of TWCs.


ii. Definition of TWCs. This element is sub-divided into two sub-elements: 1) narrative definition of TWCs and 2) origin of definition of TWCs. The first sub-element captures a sample author's narrative definition of TWCs as an overall concept, not of individual sub-components of TWCs. For example, Bascia and Rottmann (2011) received the points for this sub-element for including this passage: “By ‘teaching conditions,’ we refer to factors that repeatedly have been identified by teachers as critical to the quality of their work” (p. 789). An article was not credited with including this sub-element if only individual TWCs labels are defined without defining TWCs as a group.

The second sub-element captures a sample author’s explanation of the origin of that definition. For example, this sub-element would be present in an article if the author explains that the definition of TWCs produced was derived from the combined work of Johnson (1990) and Goodlad (1984). This element provides a sense of how researchers arrive at their concepts of TWCs.

An article author’s narrative definition of TWCs contributes directly to the development of the TWCs construct definition and provides detail through the narrative nature of the definition. Therefore, I allocated this sub-element the highest relative weight possible on the rubric, 2.5 points. The “origin of TWCs definition” sub-element contributes to the TWCs construct definition in a fashion similar to the “theory underpinning TWCs” element, in that it provides valuable background and context but does not contribute directly to construct definition development. Thus, it is allotted the same point value as the theory sub-elements in the rubric, 0.5 points.


variables. Each of these sub-elements is critical to the development of the TWCs construct definition and is weighted with two points. For example, an article received credit for inclusion of this sub-element if the article author lists “facilities, resources, professional development, time usage, and leadership” as their conceptualization of TWCs. Alone, this sub-element offers valuable information on how sample authors compartmentalize TWCs, which contributes directly to the development of the TWCs construct definition. Articles did not receive additional points for breaking down labels into sub-labels or measures.

The sub-element “narrative definition of TWCs sub-components” is the description of TWCs labels and sub-labels provided by sample authors. If the same example author above goes on to define “professional development” as, for example, “the available learning opportunities for teachers that are focused on the improvement of pedagogy,” that author received credit for both sub-elements. In practice, if an author provides a narrative definition of one TWCs label or sub-label, the author systematically provides definitions of all TWCs labels they list. However, in the context of this synthesis, the inclusion of even one narrative definition of a TWCs label is valuable and increases the relevance of the article. Therefore, an article with only one label definition received full credit for this sub-element.

Last, the sub-element “measures of TWCs labels or proxy variables” represents how sample authors measure TWCs labels and sub-labels. This sub-element most often takes the form of an item on a survey but can also take the form of proxy variables (e.g., percent of students on free and

(54)

by indicating a sample author’s non-narrative definition of TWCs labels, providing insight into the author’s conceptualization of those TWCs labels. For each article rated, I provide excerpts from the article to support the rubric rating of each sub-element. The spreadsheet containing the application of the Relevance Rubric is available upon request.¹

Application of rubric ratings. I applied rubric ratings in two ways. First, articles with a rubric rating of zero contribute nothing to the study and are irrelevant; they were, therefore, omitted from the study. Second, in order to preserve the credibility of synthesis results, articles with rubric ratings at or below 1.5 were omitted from the sample. This threshold was selected because a rating at or below 1.5 includes studies that receive credit for a maximum of three sub-elements: “name/description of theory,” “application of theory to TWCs,” and “origin of TWCs definition.” These sub-elements provide background information but do not contribute directly to construct definition development and so have limited applicability to this review’s research question. In practice, no articles received a rating between 0.5 and 1.5. All authors who include an explanation of theory or the origin of their definition of TWCs also include at least one other sub-element from the other two rubric elements, increasing the articles’ ratings beyond the 1.5 threshold. However, 18 articles were removed from the sample due to relevance ratings of zero, leaving 69 articles in the sample.
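The screening rule described above can be illustrated with a short Python sketch. It is offered only as an illustration and was not part of the study's procedure. The point values are those stated in the text; the dictionary keys are shorthand invented here, and the key `twcs_labels` in particular is a hypothetical name for the first concept-composition sub-element, whose rubric wording does not appear in this section.

```python
# Illustrative sketch of the Relevance Rubric screening rule.
# Point values follow the text; key names are assumed shorthand.
RUBRIC_POINTS = {
    "name_description_of_theory": 0.5,
    "application_of_theory_to_twcs": 0.5,
    "narrative_definition_of_twcs": 2.5,
    "origin_of_twcs_definition": 0.5,
    "twcs_labels": 2.0,  # hypothetical name for the label-listing sub-element
    "narrative_definition_of_sub_components": 2.0,
    "measures_of_labels_or_proxy_variables": 2.0,
}

# Articles rated at or below this value are omitted from the sample.
EXCLUSION_THRESHOLD = 1.5


def relevance_rating(present_sub_elements):
    """Sum the points for each sub-element found in an article."""
    present = set(present_sub_elements)
    unknown = present - set(RUBRIC_POINTS)
    if unknown:
        raise ValueError(f"Unknown sub-elements: {sorted(unknown)}")
    return sum(RUBRIC_POINTS[name] for name in present)


def is_retained(present_sub_elements):
    """Return True when the rating exceeds the exclusion threshold."""
    return relevance_rating(present_sub_elements) > EXCLUSION_THRESHOLD


# An article with only the three background sub-elements sits exactly
# at the threshold and is therefore excluded.
background_only = [
    "name_description_of_theory",
    "application_of_theory_to_twcs",
    "origin_of_twcs_definition",
]
```

Under these assumptions, an article coded with only the three background sub-elements rates 1.5 and is excluded, while adding any one sub-element worth 2 or 2.5 points lifts it above the threshold.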

Search strategy redux: reference list check. I checked the reference lists of all 69 articles, looking broadly for titles that appear to investigate TWCs. A graphical representation of this process is included in Figure 2, the work flow map. I found an additional 65 articles through this process, seven of which were duplicates. After applying inclusion criteria to the 58 non-duplicate articles, 27 articles remained in the reference check sample, which then had full-text

(55)

articles retrieved for review. I removed eight additional articles for lack of relevance (rubric ratings of zero), leaving 19 articles. I combined the reference check articles with the online database-derived sample to arrive at the final sample of 88 articles. A citation list of the articles that make up the sample for the narrative synthesis is available in Appendix A.
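The article counts reported for the reference list check can be tallied as follows. This Python sketch is illustrative only; the function and variable names are invented here, and every figure is taken from the text.

```python
# Illustrative tally of the reference list check, using counts from the text.
def reference_check_flow(database_sample, found, duplicates,
                         passed_inclusion, zero_rated):
    """Return the intermediate and final counts for the sample flow."""
    non_duplicates = found - duplicates
    if passed_inclusion > non_duplicates:
        raise ValueError("More articles passed inclusion than were screened.")
    reference_sample = passed_inclusion - zero_rated
    return {
        "non_duplicates": non_duplicates,
        "reference_sample": reference_sample,
        "final_sample": database_sample + reference_sample,
    }


flow = reference_check_flow(
    database_sample=69,   # articles retained from the database search
    found=65,             # titles located through reference lists
    duplicates=7,         # already present in the database sample
    passed_inclusion=27,  # remaining after inclusion criteria
    zero_rated=8,         # removed for rubric ratings of zero
)
```

The tally reproduces the 58 non-duplicate articles, the 19 retained reference check articles, and the final sample of 88 reported above.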

Phase 2: element extraction. Theory, definitions, and concept composition of TWCs together represent how authors conceptualize and define TWCs. These elements are the building blocks from which I developed a literature-based construct definition of TWCs. Using NVivo qualitative software, I hand-coded the 88 articles that comprise the sample for this study and organized them by the elements listed in the Relevance Rubric.

I analyzed the full text of each article in the sample for the elements in the rubric. In this phase, I coded large “chunks” of excerpts from the sample articles. I coded chunks at this point because I wanted to isolate and capture each element in its entirety before decomposing it, to make sure I understood the author’s intent and message. The same codes were applied in both the Relevance Rubric and the element extraction analysis, though for different purposes. Thus, the Relevance Rubric became the coding scheme. When applying the codes for the Relevance Rubric, I checked for the presence of elements or sub-elements within the potential article. During this phase, I applied the same coding scheme to extract the elements and sub-elements in preparation for further analysis.

In practice, application of the Relevance Rubric for each article aided in the data extraction phase. For efficiency, while analyzing potential articles for relevance, I