
Removing the contextual lens: A multinational, multi-setting comparison of service evaluation models

Michael K. Brady^{a,∗}, Gary A. Knight^{a,1}, J. Joseph Cronin Jr.^{a,2}, G. Tomas M. Hult^{b,3}, Bruce D. Keillor^{c,4}

^{a} Florida State University, College of Business, Tallahassee, FL 32306-1110, USA

^{b} Graduate School of Management, Michigan State University, N370 Business College Complex, E. Lansing, MI 48824, USA

^{c} The University of Akron, College of Business Administration, Akron, OH 44325-4804, USA

Abstract

Four service evaluation models are identified from the literature that are commonly offered to depict the relationships amongst the primary service evaluation constructs of sacrifice, service quality, service value, satisfaction, and behavioral intentions. We comparatively test the models using samples of service consumers in Australia, Hong Kong, Morocco, the Netherlands, and the United States, as well as across varied temporal and service settings. The results of the comparative analyses reveal that one conceptualization, the “comprehensive” model, best captures the identified relationships. This model is the best fitting across all countries and settings, which indicates it has the greatest external validity. These findings are discussed relative to the limitations associated with the use of less generalizable models.

Keywords: Contextual lens; Service evaluation models; Comprehensive model

Introduction

Research related to marketing services has gained considerable momentum over the past 25 years (e.g., Fisk, Brown, & Bitner 1993; Shostack 1977). The result of this discourse is a growing knowledge base relative to how service customers conceptualize, perceive, and evaluate service delivery, as well as how these factors influence purchasing behavior. Researchers and practitioners now have better information concerning such things as how service quality is evaluated, how customers derive value from a service offering, what drives customer satisfaction, and who will become loyal patrons of a service provider. Indeed, considering the migration of traditional physical goods producers such as GE, Microsoft, Xerox, and IBM towards value-added service provision, even physical goods producers have become assiduous services marketers.

^{∗} Corresponding author. Tel.: +1 850 644 7853; fax: +1 850 644 4098.

E-mail addresses: [email protected] (M.K. Brady), [email protected] (G.A. Knight), [email protected] (J.J. Cronin Jr.), [email protected] (M. Hult), [email protected] (B.D. Keillor).

^{1} Tel.: +1 850 644 1140; fax: +1 850 644 4098.

^{2} Tel.: +1 850 644 7858; fax: +1 850 644 4098.

^{3} Tel.: +1 517 353 6381; fax: +1 517 432 1112.

^{4} Tel.: +1 330 972 8839; fax: +1 330 972 5798.

A critical focus of service research to date is the identification and study of those factors that “drive” consumers’ service purchases. Practitioners and researchers are particularly interested in uncovering the factors instrumental to understanding service evaluations. The list of such factors is growing, but five constructs are particularly prevalent: sacrifice, service quality, service value, customer satisfaction, and behavioral intentions^{1} (e.g., Anderson, Fornell, & Lehmann 1994; Cronin & Taylor 1992; Parasuraman, Zeithaml, & Berry 1988; Zeithaml 1988; Zeithaml, Berry, & Parasuraman 1996). These five constructs have been studied individually, but are more often depicted in subsets or models of service evaluation processes (e.g., Fornell, Johnson, Anderson, Cha, & Bryant 1996; Heskett, Jones, Loveman, Sasser, & Schlesinger 1994).

^{1} Behavioral intentions is a construct that captures multiple outcome dimensions (cf. Zeithaml et al., 1996) and is employed here to account for the range of outcome measures (e.g., loyalty, word-of-mouth) that appear in extant service evaluation studies (e.g., Bolton & Drew 1991; Fornell et al. 1996; Heskett et al. 1994; Ostrom & Iacobucci 1995).

There are numerous examples of service evaluation models that include these five core constructs (e.g., Athanassopoulos 2000; Bolton & Drew 1991; Chenet, Tynan, & Money 1999; Cronin, Brady, & Hult 2000; Fornell et al. 1996; Ostrom & Iacobucci 1995). While there is some agreement about how these constructs relate to each other, there is very little agreement about how service quality, sacrifice, value, and satisfaction collectively relate to behavioral intentions and other outcome measures. Moreover, little effort has been made to resolve the discrepancies in the relationships between these critical “drivers” of service evaluations. Thus, no one of the models suggested to date can be viewed as more generalizable than the others. This is unfortunate because attempts to develop more complex models of the service evaluation process are being undertaken without this knowledge (e.g., Johnson, Anderson, & Fornell 1995; Johnson, Gustafsson, Andreassen, Lervik, & Cha 2001; Oliver 1997).

It was recently suggested that services research is now ready to embark upon a new phase that reflects the maturity or “commoditization” of many service industries (Pine & Gilmore 1998). There is a similar push to advance service evaluation models (Johnson et al. 2001). Much of the new research in conceptualizing service evaluations is directed toward resolving specification errors. There is evidence that constructs such as equity, disconfirmation, and affective commitment influence service evaluations (Fornell et al. 1996; Johnson et al. 2001; Oliver 1997). However, in the absence of widespread agreement as to the nature of the relationships among the constructs investigated here, much of this research may be premature.

In this paper, our aim is to elucidate a standard conceptual service evaluation model using a thorough empirical analysis of several currently recognized models. Because this involves uncovering the most robust of a collection of competing models, all of which have support in the literature, we will test and compare these models in two major studies, with a view to maximizing external validity. The studies will assess all the models in terms of their appropriateness and fit across a range of demanding conditions that reflect extremes of national business environment, industry type, and temporal setting.

We first draw on the literature to specify four plausible and competing models of the service evaluation process. We then comparatively analyze the models to identify the most externally valid conceptual structure of the five constructs using consumer samples from five diverse countries (Australia, China/Hong Kong, Morocco, the Netherlands, and the United States). To extend our assessment of external validity, we also assess the models within a range of service settings using a second data sample from the United States. Lastly, resultant findings are discussed with regard to their scholarly and managerial implications.

Conceptual background

“A number of both national and international customer satisfaction barometers or indices have been introduced in the last decade. For the most part, these satisfaction indices are embedded within a system of cause and effect relationships or satisfaction model. Yet there has been little in the way of model development. Of critical importance to the validity and reliability of such indices is that the models and methods used to measure customer satisfaction and related constructs continue to learn, adapt, and improve over time.” (Johnson et al. 2001, p. 217)

The above quotation highlights two critical points. First, as addressed in the present paper, there exists a critical need for a clearer understanding of service evaluation models. Second, the quotation reveals how models of the service encounter can be oriented around specific constructs. For example, while other scholars have emphasized the direct effects of service quality and value on behavioral intentions, research by Johnson et al. (2001) portrays satisfaction alone as playing this role. While this latter conceptualization might accurately reflect some service evaluations, it may be too parsimonious and lack external validity in other contexts. In general, the literature supports the view that variables in addition to satisfaction play key direct roles in shaping consumers’ service evaluations. In this paper, we seek to extend understanding of how customers evaluate service encounters by comparatively analyzing a collection of theoretically and empirically plausible models that portray various configurations of antecedents to behavioral intentions.

The models

The importance of the five investigated constructs is evident in the size of the research streams devoted to studying them. Some of this research is focused on only a particular construct (e.g., Parasuraman et al. 1988; Yi 1990), but analyses that include several of these constructs are far more common (e.g., Bitner & Hubbert 1994; Bolton & Lemon 1999; Heskett et al. 1994; Ostrom & Iacobucci 1995). In the latter case, the constructs are arranged in a conceptual model wherein theory and empirical analysis drive the specified relationships. A review of these models suggests that there is similarity in how some of the relationships are specified, yet there is also considerable dissimilarity with respect to other relationships. We begin with a discussion of the theory driving the consistent relationships and then turn to a discussion of the disparity.

The relationships among the five investigated constructs that are consistently modeled in the literature have to do with the antecedents of value. Theory and research on value integration suggest that service value is determined by the difference between gains and losses or, in the case of services, the difference between service quality and sacrifice (Sirohi, McLaughlin, & Wittink 1998; Sweeney, Soutar, & Johnson 1999; Zeithaml 1988). Service evaluation models that include service value are therefore consistent in modeling service value in this way. Sacrifice is specified as antecedent to value, which accounts for the “loss” side of the value integration process. Similarly, service quality is also specified as antecedent to value, but as the “gain” side of value integration. Taken together, there are consistent and direct relationships between sacrifice, service quality, and value.

There is less agreement, however, on the intervening mechanisms that lead to behavioral intentions, with particular disparity regarding the antecedent relationship of service quality and satisfaction. Some service evaluation models (e.g., Bitner 1990; Bolton & Drew 1991) specify satisfaction as antecedent to service quality based on the premise that service quality is a general evaluation similar to an attitude, and therefore is superordinate to satisfaction. However, other service evaluation models (e.g., Anderson & Fornell 1994; Anderson et al. 1994; Anderson & Sullivan 1993; Gotlieb, Grewal, & Brown 1994) follow the appraisal-response-coping sequence (Lazarus 1991) or the cognitive-emotive causal order (Oliver 1997), which position satisfaction as superordinate to service quality. Thus, specification of the relationship between service quality and satisfaction has been debated, and both positions will therefore be investigated here.

Additional dissimilarity in service evaluation models emerges when outcome measures (e.g., behavioral intentions) are added to the models. Theory and empirical evidence support an antecedent link to behavioral intentions from all three of service quality, service value, and satisfaction (e.g., Chang & Wildt 1994; Fornell et al. 1996; Zeithaml et al. 1996). However, it is rarely the case that all three direct links are specified. It is much more common that one construct is presented as the locus of service evaluation models. Moreover, the construct selected as the central mediator or “lens” tends to be congruent with the context of the research. For example, if the focus of the research is satisfaction, links to behavioral intentions tend to be mediated by satisfaction (e.g., Andreassen 1998; Chenet et al. 1999; Fornell et al. 1996; Patterson & Spreng 1997). Similar mediating conceptualizations can be found in the value and service quality literatures (e.g., Athanassopoulos 2000; Chang & Wildt 1994; Sirohi et al. 1998; Sweeney et al. 1999; Wakefield & Barnes 1996). This has resulted in several competing conceptualizations of service evaluations (see Fig. 1) that differ primarily in the respective constructs positioned as the key drivers of behavioral intentions. It has also resulted in ambiguity concerning the appropriate conceptual structure of service evaluation models that include the five constructs.

The specification of the four models draws from contemporary attitude theory, which is the theoretical basis for each model. One of the goals of attitude theory is to determine how attitudes drive intentions. Several theories exist, but the theory of reasoned action (Ajzen & Fishbein 1980) is perhaps the most prominent contemporary attitude theory. The theory of reasoned action suggests that intentions are the direct outcome of attitude (and subjective norms) such that there are no intervening mechanisms between the attitude and the intention. More recent work in attitude theory (e.g., Bagozzi 1992), however, challenges this perspective and contends that attitude theories “trade specificity for parsimony” (Bagozzi 1992, p. 201), meaning that there may be other potential links to intentions that are not included in the theory of reasoned action. Bagozzi (1992) recommends refining attitude theory to consider intervening mechanisms that may better explain intentions.

A similar course is followed in the service literature with respect to the specification of service evaluation models. Service evaluation models tend to be parsimonious in terms of the relationships leading to behavioral intentions, and this view has been challenged (Cronin et al. 2000). Most service evaluation models specify a single variable that leads to behavioral intentions and that also acts as an intervening variable for the effects of the other constructs in the model. An example of such a model is the “satisfaction model” (e.g., Fornell et al. 1996) depicted in Fig. 1. The satisfaction model specifies satisfaction as a central mediating variable such that the effects of service quality, sacrifice, and value on behavioral intentions are mediated by satisfaction. The rationale for such a model is that, since satisfaction is primarily an affective variable whereas quality and value are cognitive evaluations (Oliver 1997), a direct link from satisfaction to intentions is justified by theoretical models that specify a cognition-affect causal ordering (e.g., Bagozzi 1992; Lazarus 1991). That is, satisfaction is positioned as an affective-oriented mediator that follows from quality and value evaluations.

There is little argument that satisfaction influences behavioral intentions. The primary point of contention in service evaluation models is whether satisfaction directly affects behavioral intentions and whether it is the only direct effect, as is specified in the satisfaction model. Indeed, there are competing service evaluation models that specify value as the central construct so that all paths to behavioral intentions are mediated by value. These value-centric models appear in the value literature (e.g., Chang & Wildt 1994; Grewal, Monroe, & Krishnan 1998) and contend that value is the lone direct determinant of behavioral intentions. There are also models in the service quality literature that position service quality in the central mediating role (e.g., Zeithaml et al. 1996). The direct effects of value and service quality in the latter two models are theoretically justified with attitude theory (e.g., Fishbein & Ajzen 1975), since value and service quality are similar to an attitude (Parasuraman, Zeithaml, & Berry 1985) and attitude theory suggests a direct link between attitude and intentions.

Fig. 1. Four service evaluation models. Abbreviations: SAC = sacrifice; SQ = service quality; SAT = satisfaction; VAL = value; BI = behavioral intentions.

The result of this discourse is that there are several theoretically justified service evaluation models that are encountered in the literature and there is no agreement on which is the most appropriate model. We identify four competing service evaluation models that are commonly used to depict the antecedents to behavioral intentions (see Fig. 1). The models are named according to their specifications and according to the literatures in which they are encountered. The first model, the “value” model, positions value as the central mediating construct. It is conceptually similar to service evaluation models that appear in the value literature (e.g., Chang & Wildt 1994; Gale 1994; Parasuraman & Grewal 2000; Sirohi et al. 1998; Sweeney et al. 1999; Wakefield & Barnes 1996). The second “service quality” model positions service quality as a central driver of behavioral intentions. It is similar to the service evaluation models identified in the service quality literature (e.g., Athanassopoulos 2000; Boulding, Kalra, Staelin, & Zeithaml 1993; Lee & Cunningham 2001; Zeithaml et al. 1996). The third conceptualization is the “satisfaction” model and is similar to models that position satisfaction as the key determinant of behavioral intentions (e.g., Anderson & Fornell 1994; Andreassen 1998; Clow & Beisel 1995; Fornell et al. 1996; Hallowell 1996; Heskett et al. 1994; Mohr & Bitner 1995). The fourth “comprehensive” model adopts a different stance that is in line with Bagozzi’s (1992) comments about specificity in attitude models. The comprehensive model specifies that service quality, service value, and satisfaction are all directly related to behavioral intentions and is therefore similar to service evaluation models that specify multiple direct links to behavioral intentions (e.g., Anderson & Sullivan 1993; Cronin et al. 2000). It is comprehensive in the sense that all three of service quality, value, and satisfaction are suggested to influence behavioral intentions directly and jointly.
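The distinction among the four models can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: it records the direct antecedents of behavioral intentions that the text attributes to each model and the two value antecedents (sacrifice and service quality) that the text describes as consistently modeled. It does not reproduce the remaining paths drawn in Fig. 1, and the names `SHARED_PATHS`, `DIRECT_ANTECEDENTS_OF_BI`, `direct_paths_to_bi`, and `bi_paths_subset_of` are hypothetical.

```python
from __future__ import annotations

# Construct abbreviations as used in Fig. 1.
CONSTRUCTS = ("SAC", "SQ", "VAL", "SAT", "BI")

# Assumption: paths treated as common to all four models, following the text's
# statement that sacrifice and service quality are antecedents of value.
SHARED_PATHS = {("SAC", "VAL"), ("SQ", "VAL")}

# Direct antecedents of behavioral intentions (BI) as described in the text.
# Other paths shown in Fig. 1 are deliberately not encoded here.
DIRECT_ANTECEDENTS_OF_BI = {
    "value": {"VAL"},
    "service quality": {"SQ"},
    "satisfaction": {"SAT"},
    "comprehensive": {"SQ", "VAL", "SAT"},
}


def direct_paths_to_bi(model: str) -> set[tuple[str, str]]:
    """Return the (antecedent, 'BI') paths specified by the named model."""
    return {(construct, "BI") for construct in DIRECT_ANTECEDENTS_OF_BI[model]}


def bi_paths_subset_of(restricted: str, general: str) -> bool:
    """True if every direct path to BI in `restricted` also appears in `general`.

    This compares direct paths to BI only; it says nothing about whether the
    full models are nested, which depends on paths not encoded here.
    """
    return DIRECT_ANTECEDENTS_OF_BI[restricted] <= DIRECT_ANTECEDENTS_OF_BI[general]


if __name__ == "__main__":
    for name in DIRECT_ANTECEDENTS_OF_BI:
        print(name, sorted(direct_paths_to_bi(name)))
```

Seen this way, each single-mediator model specifies one direct path to behavioral intentions, whereas the comprehensive model specifies all three at once.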

A multinational setting

The first set of analyses presented here is conducted within a multinational setting using samples from five diverse countries. The recent growth in the importance of services in international markets highlights the need for a cross-national investigation of the variables and the relationships that are the focus of our study (e.g., Hult 1999). This growth is due largely to the globalization of markets, declining trade barriers, and the emergence of modern information technologies that facilitate cost-effective international services operations (Fisk 1999; Knight 1999). Despite this trend and the growing role of the international-services trade, research in this area is sparse.

Of the few studies that have examined international services issues, most are related to “traditional” international business topics such as mode of entry (e.g., Erramilli 1990, 1991), trade barriers (e.g., Dahringer 1991), cultural issues (e.g., Stauss & Mang 1999), and the role of services in different world regions (e.g., Kassem 1989). However, no study has examined relationships among key elements of service evaluations and their linkage to behavioral intentions in a broad range of national settings. This is an important gap because findings from research conducted in a range of national settings tend to have higher reliability and external validity than findings from single-country studies.


The four models are comparatively analyzed in samples from Australia, Hong Kong, Morocco, the Netherlands, and the United States. With the exception of Morocco, all of these represent developed countries. Although consumers in developed countries may view services somewhat differently from those in developing countries, it is not our intention to predict the effect of economic conditions or other nation-level factors on the linkages and models investigated here. The purpose of testing the models across five country settings is to assess external validity of the models investigated and to assess the costs and benefits of specifying service evaluation models through the “contextual lens” of a particular study.

The service setting

One of the initial contributions of service research was to distinguish service products from physical goods (e.g., Zeithaml, Parasuraman, & Berry 1985). Four distinguishing characteristics were identified that refer to elements such as the intangibility, inseparability, and variability of service products. Classification schemes across service industries also have been developed that recognize the relative presence of these characteristics within service industries (e.g., Lovelock 1983). That is, while services are generally intangible, some service products are more intangible than others. Similarly, most services require interaction between the service provider and customers, but this is more prevalent in professional services than in convenience services such as fast food restaurants. It is possible that the processes that govern service evaluations are different depending on the nature of the service investigated. Thus, the models are analyzed using data from a range of service industries. Fast food and retail grocery stores represent services with prevalent tangible products involved in the exchange, which require little interaction between buyer and seller, and that have relatively little variability. Airlines and physicians represent services that are less tangible, with more interaction required, and that have more variability in the service delivered.

The temporal setting

A third contingency that may influence the conceptual structure of the four models is the temporal setting of a service encounter. It is believed that alternative processes may apply depending on whether a service exchange is viewed at the global level or in reference to a specific service encounter (Oliver 1997). An implication is that the key drivers of behavioral intentions may change depending on whether a service provider is assessed across all experiences or just the last or present experience. For example, it is suggested that the relationship between service quality, satisfaction, and behavioral intentions is contingent on temporal setting (Oliver 1997).

In an encounter-specific setting, prior research demonstrates that satisfaction mediates the relationship between service quality and behavioral intentions (i.e., sq → sat → bi; Gotlieb et al. 1994). Theory suggests that the opposite order (i.e., sat → sq → bi) may apply in global settings, although this has not been tested (Oliver 1997). A temporal assessment that includes other key service evaluation constructs (i.e., sacrifice and service value) has also not been reported. The four models are therefore comparatively analyzed in both encounter-specific and global temporal settings.

Methods

Two studies were used to test external validity of the four models within the diverse settings described above. The first study was aimed at testing the models across a diverse sample of countries and the second study varied the service and temporal settings in samples obtained from U.S. service customers.

Study 1

For the multinational samples, the four models were assessed in a global temporal setting (i.e., respondents were directed to consider all of their experiences with the restaurant or store) using samples from fast food restaurants and retail grocery stores in each of five countries: Australia, Hong Kong (a territory of China), Morocco, the Netherlands, and the United States. These particular countries were chosen because they provide a considerable degree of variety in terms of national conditions related to cultural, economic, and political circumstances (e.g., World Bank 2000). According to Hofstede’s (1980) classification of countries based on cultural dimensions (power distance, uncertainty avoidance, individualism, and masculinity), large-scale differences exist among these dimensions across the countries. In addition, substantial variations in competitive conditions, economic systems, and political/legal environments are prevalent among these countries and they represent Asia, Europe, and North America, which are the three most significant world regions in terms of their roles in international trade. The fast food and retail grocery store industries were selected as the venues for the study because these services are familiar to, and used for the same purposes by, consumers in the five countries (cf. Douglas & Craig 1983; Kumar 2000).

Trained marketing researchers were used to develop the surveys in all the countries and to collect the data. Professional marketing research firms were used to conduct the studies in Morocco, and professional colleagues were employed for this purpose in Australia, Hong Kong, the Netherlands, and the United States. Initially, an English-language questionnaire was created and pretested among a sample of 100 graduate students native to the countries sampled. Following minor refinements, the foreign-language versions of the questionnaires were then developed using appropriate methods (Douglas & Craig 1983; Kumar 2000). For example, in creating the Chinese version, the instrument was first translated by a native Chinese professional translator and reviewed for linguistic and functional equivalence. The resulting questionnaire was then back translated into English by a bilingual whose native language is English. Next, the Chinese version was refined so that it was comprehensible to native speakers, while being equivalent in both languages. The other foreign-language questionnaires were created in a similar fashion. Throughout, great care was taken to ensure that the English and foreign-language versions were functionally and semantically equivalent (Douglas & Craig 1983; Kumar 2000).

Study 2

To assess the models in varied service settings, the survey used in Study 1 was reworded to apply to airlines and physicians services, and was administered to a sample of service consumers in the United States. The global temporal setting was maintained and the wording was adjusted to reflect the new industries. For example, a satisfaction question in the airline survey was “I am happy with the service I receive at this airline” (see Appendix A).

To investigate conceptual distinctions associated with temporal setting, the global surveys used in the fast food and grocery store samples in Study 1 were reworded to reflect an encounter-specific temporal setting. Specifically, items were reworded to the past tense and respondents were directed to consider only their last encounter. The surveys were then distributed to a sample of U.S. fast food and retail grocery store consumers. The sample sizes and demographic characteristics for both studies are presented in Table 1.

All respondents were self-selected using the mall intercept method (Bush & Hair 1985), but were disqualified if they had not had a grocery or fast-food service encounter during the previous month (Study 1) or an airlines or physicians service encounter within the previous three months (Study 2). This ensured that respondents’ memories of the service encounters would be recallable and reliable. Additionally, each respondent participated in a survey on only one of the industries.

Measurement

A nine-point Likert-type response format ranging from “strongly disagree” to “strongly agree” was used for all indicators in an effort to maximize respondent specificity, as opposed to employing the more commonly used five or seven-point response format (cf. Fornell 1992; Kumar 2000). The measures used to assess the five constructs are presented in Appendix A.

The service quality construct (“SQ” in Fig. 1) has been the subject of much discussion and debate (e.g., Brady & Cronin 2001; Cronin & Taylor 1992; Parasuraman, Zeithaml, & Berry 1994). However, in light of the need to predict behavioral intentions, the predominant view supports the use of performance perceptions when measuring service quality (Brady, Cronin, & Brand 2002; Cronin & Taylor 1992; Parasuraman et al. 1994). Because of the need to ensure construct and measurement equivalence across the multinational settings, it was especially important to use a broad range of scale items that are applicable across the five countries, while keeping the items to a manageable number. As such, we used a 10-item service quality scale based on Parasuraman et al.’s (1985) 10 original dimensions of service quality (items 1–10 in Appendix A). Similar scales are used by Gotlieb et al. (1994), Hartline and Ferrell (1996), and Voss, Parasuraman, and Grewal (1998).

The perceived service value construct (“VAL”) was devised in light of Zeithaml’s (1988) “get versus give” and Grewal et al.’s (1998) “net gain” definitions. We used three appropriate indicators to measure value across national cultures (items 11–13). The items are similar to those used by Sweeney et al. (1999) and Sirohi et al. (1998).

Consumer satisfaction (“SAT”) has received considerable attention in the literature (cf. Giese & Cote 2000; Yi 1990) in light of its strong effect on behavioral intentions (e.g., Anderson & Fornell 1994; Anderson & Sullivan 1993; Fornell 1992). In order to capture both the evaluative and emotion-based qualities of satisfaction (Oliver 1997), we employed two kinds of satisfaction indicators. The “evaluative” satisfaction indicator (item 14 in Appendix A) is based on Oliver’s (1997) satisfaction measures, and the emotion-based measures (items 15–16) are derived from the work of Westbrook and Oliver (1991).

Table 1

Sample demographics^{a}

| Sample | n | Age <21 | Age 21–30 | Age 31–40 | Age 41–50 | Age >50 | Male | Female | Elem. | HS | College |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Overall Study 1 | 1138 | 5.8 | 60.5 | 19.6 | 9.1 | 5.0 | 50.4 | 49.6 | 7.3 | 24.3 | 68.4 |
| Australia | 234 | 0 | 59.0 | 38.5 | 2.6 | 0 | 51.7 | 48.3 | 0 | 6.0 | 94.0 |
| Hong Kong | 198 | 3.6 | 96.4 | 0 | 0 | 0 | 36.5 | 63.5 | 2.5 | 13.2 | 84.3 |
| Morocco | 242 | 17.6 | 23.0 | 17.6 | 30.1 | 11.8 | 48.5 | 51.5 | 22.0 | 20.8 | 57.2 |
| The Netherlands | 207 | 4.8 | 53.1 | 31.4 | 6.8 | 3.8 | 59.8 | 40.2 | 4.8 | 39.6 | 55.6 |
| USA | 257 | 2.8 | 77.7 | 10.4 | 4.8 | 4.4 | 59.0 | 41.0 | 0 | 42.3 | 57.7 |
| Overall Study 2 | 329 | 18.1 | 22.5 | 21.4 | 32.6 | 5.4 | 47.1 | 52.9 | 24.5 | 53.2 | 22.3 |

^{a} n: sample size; <21: percentage of respondents age 20 and younger; 21–30: percentage of respondents age 21–30; 31–40: percentage of respondents age 31–40; 41–50: percentage of respondents age 41–50; >50: percentage of respondents age 51 and older; male: percentage of males in the sample; female: percentage of females in the sample; Elem.: percentage of respondents having completed elementary school; HS: percentage of respondents having completed high school; college: percentage of respondents having completed college.

Sacrifice (“SAC”) is defined as that which is given up or sacrificed to acquire a service. This is consistent with the definitions of Heskett, Sasser, and Hart (1990) and Zeithaml (1988), as well as the multidimensional conceptualizations offered in the literature (e.g., Dodds, Monroe, & Grewal 1991; Zeithaml 1988). The construct was measured using items that reflect consumers’ perceptions of the monetary and non-monetary costs of obtaining and using a service (items 17–19).

Finally, the measure for behavioral intentions (“BI”) is based on the work of Zeithaml, Berry, and Parasuraman (1996). Their study identifies several factors as outcomes of a positive service exchange. Service providers that deliver good service are suggested to have customers who are loyal, will recommend the service, and say positive things about the provider. In assessing behavioral intentions, we used measures related to these factors (items 20–23).

Following data collection in both studies, all scales were subjected to a purification process involving a series of dimensionality, reliability, and validity assessments (Anderson & Gerbing 1988). The psychometric properties of the five constructs were evaluated via confirmatory factor analysis (CFA) using LISREL. All the items were included in the analysis and the observed variables were restricted to load on their respective latent factors.
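Confirmatory factor models of this kind are typically judged with fit indices such as the CFI and RMSEA used in this paper. As a rough illustration of how these indices relate to the chi-square statistic, the sketch below implements one common textbook form of each. It is an assumption that these forms match what LISREL reports (software may differ in details such as N versus N − 1), and the function names and the example numbers are hypothetical, not values from the study.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation, one common form:
    sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int,
        chi2_baseline: float, df_baseline: int) -> float:
    """Comparative fit index: 1 minus the ratio of the target model's
    noncentrality to the larger of the target and baseline noncentralities."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, 0.0)
    denominator = max(d_model, d_baseline)
    if denominator == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / denominator


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; not results from the paper.
    print(round(rmsea(chi2=250.0, df=80, n=1138), 3))
    print(round(cfi(250.0, 80, 5000.0, 105), 3))
```

Because both indices are driven by the excess of chi-square over its degrees of freedom, they allow competing models estimated on the same data to be compared on a common footing.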

Results for the refined scales are presented in Tables 2 and 3. The model fits were evaluated using the comparative fit index (CFI) and the root mean square error of approximation (RMSEA). The chi-square statistic with corresponding degrees of freedom was also included in order to compare the various models (Anderson & Gerbing

Table 2

Intercorrelations of study variables in Study 1 and Study 2

Items       SQ1    SQ2    SQ3    SQ4   SAT1   SAT2   SAT3 VALUE1 VALUE2 VALUE3   SAC1   SAC2    BI1    BI2    BI3

Study 1 (n = 1138)ᵃ
SQ1        1.00
SQ2         .74   1.00
SQ3         .66    .72   1.00
SQ4         .77    .73    .70   1.00
SAT1        .66    .70    .68    .66   1.00
SAT2        .67    .69    .63    .66    .86   1.00
SAT3        .63    .61    .55    .58    .73    .84   1.00
VALUE1      .48    .47    .42    .48    .51    .56    .58   1.00
VALUE2      .43    .38    .35    .40    .48    .52    .52    .75   1.00
VALUE3      .46    .41    .38    .44    .48    .56    .54    .76    .81   1.00
SAC1        .38    .31    .31    .36    .41    .42    .36    .36    .37    .45   1.00
SAC2        .32    .29    .30    .37    .39    .40    .31    .31    .35    .37    .62   1.00
BI1         .39    .39    .36    .41    .45    .43    .42    .51    .48    .49    .31    .29   1.00
BI2         .55    .55    .56    .57    .67    .69    .64    .60    .57    .59    .40    .44    .71   1.00
BI3         .53    .52    .52    .56    .63    .65    .62    .59    .57    .58    .39    .42    .71    .86   1.00
Mean       5.37   5.22   5.83   5.51   5.96   5.63   4.94   5.48   5.47   5.40   5.83   6.06   5.40   5.72   5.62
SD         1.92   1.85   1.91   1.86   1.78   1.95   2.11   1.88   1.86   1.81   1.91   1.75   2.22   1.85   2.01

Study 2 (n = 339)ᵃ
SQ1        1.00
SQ2         .77   1.00
SQ3         .73    .83   1.00
SQ4         .80    .69    .73   1.00
SAT1        .74    .72    .72    .69   1.00
SAT2        .75    .72    .73    .69    .96   1.00
SAT3        .76    .73    .72    .65    .87    .90   1.00
VALUE1      .57    .49    .50    .51    .66    .66    .63   1.00
VALUE2      .45    .42    .41    .46    .54    .53    .52    .81   1.00
VALUE3      .53    .48    .46    .42    .61    .61    .60    .82    .90   1.00
SAC1        .42    .36    .35    .39    .46    .47    .44    .45    .45    .45   1.00
SAC2        .43    .39    .41    .41    .49    .50    .45    .38    .37    .41    .77   1.00
BI1         .56    .56    .50    .50    .66    .66    .65    .55    .49    .55    .40    .41   1.00
BI2         .69    .69    .65    .64    .82    .83    .78    .66    .56    .61    .48    .52    .83   1.00
BI3         .62    .62    .59    .58    .78    .77    .71    .63    .53    .57    .49    .49    .76    .88   1.00
Mean       6.88   6.91   7.60   7.09   7.35   7.18   6.68   6.92   6.74   6.75   6.49   6.77   6.80   7.16   7.18
SD         2.41   2.40   2.29   2.30   2.38   2.44   2.65   2.18   2.18   2.25   2.34   2.25   2.62   2.47   2.57

ᵃ All intercorrelations are significant at the p < .05 level. All items used a nine-point Likert-type scale ranging from strongly disagree (1) to strongly agree (9).


Table 3

Results of the CFA analyses

Sample                                  Study 1ᵃ  Australia  Hong Kong  Morocco  The Netherlands  USA      Study 2
n                                       1138      234        198        242      207              257      339
χ²                                      570       520        123        408      281              178      338
df                                      80        80         80         80       80               80       80
CFI                                     .97       .88        .98        .89      .88              .98      .96
RMSEA                                   .07       .05        .05        .13      .11              .07      .09

Service quality (SQ; 4 items)
  Composite reliability                 .90       .93        .79        .89      .82              .95      .92
  Average variance extracted (percent)  70        76         49         66       54               82       74
  Parameter estimates rangeᵇ            .81–.87   .85–.91    .69–.81    .74–.89  .73–.80          .91–.92  .84–.88

Satisfaction (SAT; 3 items)
  Parameter estimates rangeᵃ            .86–.96   .89–.96    .87–.94    .73–.93  .61–.86          .90–.98  .91–.99

Value (VAL; 3 items)

Sacrifice (SAC; 2 items)

Behavioral intentions (BI; 3 items)
  Parameter estimates rangeᵃ            .75–.94   .65–.96    .70–.91    .72–.95  .74–.84          .85–.98  .84–.98

ᵃ t-values for the various measurement indicators are all significant (p < .01).
ᵇ Standardized coefficients are given for presentation purposes.

1988; Jöreskog & Sörbom 1996). Specific items were evaluated based on their error variance, modification index (<3.84), and residual covariation (<|2.58|) (Anderson & Gerbing 1988; Fornell & Larcker 1981; Jöreskog & Sörbom 1996). Overall, the five measurement scales and the 15 purified scale items (of the 23 original items) were found to be reasonably reliable and valid across the two studies comprising seven individual samples.

The CFA models for the aggregated samples provided excellent fits to the data (see Table 3).² While the chi-square statistics were significant (p < .01), this statistic is known to be highly sensitive to large sample sizes, such as the ones used here (Jöreskog, 1993). The CFI estimates were .97 (Study 1) and .96 (Study 2). The RMSEA estimates were .07 (Study 1) and .09 (Study 2).
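As a rough check on the reported fit, the RMSEA point estimate can be recomputed from the chi-square, degrees of freedom, and sample size given in Table 3. The sketch below assumes the commonly used formula sqrt(max(χ² − df, 0) / (df × (N − 1))); the function name `rmsea` is illustrative, and the exact computation in LISREL may differ slightly, so small rounding discrepancies against the published values are possible.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """RMSEA point estimate under the (N - 1) convention (an assumption here)."""
    # Misfit in excess of the degrees of freedom, floored at zero.
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n - 1)))


# Study 1 aggregated CFA values as reported in Table 3.
study1_rmsea = rmsea(chi_square=570, df=80, n=1138)
print(round(study1_rmsea, 2))
```

For the Study 1 aggregated sample this gives a value that rounds to .07, in line with the table.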

Composite reliability was calculated using the procedures outlined by Fornell and Larcker (1981). We also examined the parameter estimates and their associated t-values, and assessed the average variance extracted for each construct (Anderson & Gerbing 1988; Bagozzi & Yi 1988). The composite reliabilities for the five constructs ranged from .77 to .94 in Study 1 and from .87 to .97 in Study 2. The factor loadings ranged from .75 to .96 (Study 1, p < .01) and .84 to .99 (Study 2, p < .01). The average variances extracted ranged from 63 to 84 percent (Study 1) and 74 to 92 percent (Study 2).
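The two reliability statistics can be sketched as follows, using the standard Fornell and Larcker (1981) formulas for standardized loadings, where each indicator's error variance is taken as 1 − λ². The four loadings below are hypothetical values chosen to lie within the Study 1 service quality range reported in Table 3 (.81–.87); the individual loadings are not reported in the paper, and the function names are illustrative.

```python
def composite_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    error = sum(1 - lam ** 2 for lam in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """Sum of squared loadings / (sum of squared loadings + sum of error variances)."""
    squared = sum(lam ** 2 for lam in loadings)
    error = sum(1 - lam ** 2 for lam in loadings)
    return squared / (squared + error)


# Hypothetical standardized loadings for a four-item scale (not the authors' data).
sq_loadings = [0.81, 0.83, 0.85, 0.87]
cr = composite_reliability(sq_loadings)
ave = average_variance_extracted(sq_loadings)
print(round(cr, 2), round(ave, 2))
```

With these assumed loadings the composite reliability is roughly .91 and the average variance extracted roughly 71 percent, the same order of magnitude as the Study 1 service quality figures in Table 3.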

² It should be noted that a test for invariance of mean vectors was not performed prior to data aggregation.

Corresponding statistics for the sample-specific CFA models can be found in Table 3.

Discriminant validity was assessed using the procedure recommended by Anderson (1987) and Bagozzi and Phillips (1982). This entailed analyzing all possible pairs of constructs in a series of two-factor CFA models. Each model was run twice: once constraining the phi coefficient (φ) to unity and once freeing this parameter. A chi-square difference test was then performed on the nested models to assess if the chi-square values were significantly lower for the unconstrained models (Anderson & Gerbing 1988). The critical value (χ²(1) > 3.84) was exceeded in all cases.

To assess cross-national validity of the items from Study 1, we followed the measurement validation procedure outlined in Steenkamp and Baumgartner (1998) and demonstrated in Steenkamp, Batra, and Alden (2003). This process involved testing for configural and metric invariance using multigroup confirmatory factor analysis. Configural invariance was supported as the CFA demonstrated good fit to the data (CFI = .97, RMSEA = .07) and all parameter estimates were significant (p < .001) and greater than .70 (cf. Steenkamp et al. 2003). To test for metric invariance, we compared the fit of a model wherein the factor loadings were constrained to be equal to the fit of a freely estimated model. The results are consistent with Steenkamp et al. (2003) in supporting metric invariance. Overall, the cross-national validity of the items from Study 1 is supported.


Table 4

Overall SEM Results for Study 1 and Study 2

Model                   Path      Loading  t-value  Significance  R²          Fit indices             Rankingᵃ,ᵇ

Study 1 (n = 1138)
Value model             SAC→VAL   .27      7.86     p < .01                   χ² = 921.29, df = 84
                        SQ→VAL    .05      0.92     ns            .52 (VAL)   CFI = .94               #4
                        SQ→SAT    .83      22.23    p < .01       .69 (SAT)   RMSEA = .09
                        SAT→VAL   .52      9.78     p < .01       .56 (BI)
                        VAL→BI    .75      20.12    p < .01

Service quality model   SAC→VAL   .38      10.13    p < .01                   χ² = 735.56, df = 84
                        SQ→VAL    .40      11.74    p < .01       .45 (VAL)   CFI = .96               #2
                        SAT→SQ    .84      22.24    p < .01       .71 (SQ)    RMSEA = .08
                        VAL→BI    .45      14.14    p < .01       .64 (BI)
                        SQ→BI     .45      13.78    p < .01

Satisfaction model      SAC→VAL   .35      9.21     p < .01                   χ² = 775.01, df = 84
                        SQ→VAL    .39      11.11    p < .01       .76 (SAT)   CFI = .95               #3
                        VAL→SAT   .29      11.49    p < .01       .41 (VAL)   RMSEA = .08
                        SQ→SAT    .67      19.16    p < .01       .60 (BI)
                        SAT→BI    .77      20.27    p < .01

Comprehensive model     SAC→VAL   .36      9.32     p < .01                   χ² = 594.46, df = 82
                        SQ→VAL    .38      11.03    p < .01       .73 (SAT)   CFI = .96               #1
                        SQ→SAT    .68      19.18    p < .01       .41 (VAL)   RMSEA = .07
                        VAL→SAT   .27      10.44    p < .01       .66 (BI)
                        VAL→BI    .38      12.13    p < .01
                        SQ→BI     .17      4.09     p < .01
                        SAT→BI    .36      7.67     p < .01

Study 2 (n = 329)
Comprehensive model     SAC→VAL   .30      4.86     p < .01                   χ² = 352.02, df = 82
                        SQ→VAL    .42      6.71     p < .01       .76 (SAT)   CFI = .95               #1
                        SQ→SAT    .71      11.74    p < .01       .39 (VAL)   RMSEA = .09
                        VAL→SAT   .24      5.82     p < .01       .76 (BI)
                        VAL→BI    .17      4.24     p < .01
                        SQ→BI     .17      2.72     p < .01
                        SAT→BI    .59      7.73     p < .01

ᵃ Using the χ²-difference test outlined by Anderson and Gerbing (1988), the “comprehensive” model outperforms the other three models (smallest Δχ² = 141.10, df = 2, p < .01), followed by the “service quality” model, the “satisfaction” model, and the “value” model.

ᵇ Using the χ²-difference test outlined by Anderson and Gerbing (1988), the “comprehensive” model outperforms the other three models (smallest Δχ² = 26.48, df = 2, p < .01), followed by the “satisfaction” model, the “service quality” model, and the “value” model.

Results

Results of the analyses of the four models for the two aggregated samples (Studies 1 and 2) are provided in Table 4. Corresponding results for the disaggregated multinational samples are given in Tables 5 and 6 (Study 1). Results for the varied service settings (airlines and physicians) and the encounter-specific temporal setting are presented in Table 7 (Study 2). Table 8 summarizes the results of the model tests in all samples.

In both studies, the structures of the four models were tested using LISREL. The differences among the models were determined by comparing changes in the chi-square values (Anderson & Gerbing 1988). This is an appropriate test because all the comparisons are conducted across nested models and because none of the models are fully specified (Anderson & Gerbing 1988). A significant improvement in a chi-square value therefore suggests that a particular model or added parameter better represents the data. Because the data analysis involved numerous model tests, the following discussion focuses on some of the more notable findings.

The aggregated sample

All four models fit the data well in the analyses using the aggregated samples. Moreover, only the path between service quality and value in the “value model” was found to be insignificant. In the chi-square difference tests, Model 4 (the “comprehensive model”) exhibited the best overall fit in both the Study 1 and Study 2 aggregated samples. The three other models exhibited significantly higher chi-square values (p < .01), and thus do not fit as well as Model 4 (see Table 4). This suggests that service quality, service value, and satisfaction all have a significant effect on behavioral intentions when assessed collectively. However, because large samples can produce statistically significant results where substantive differences may not actually exist, and because specific countries and varied settings can produce unique findings, it is useful to


Table 5

Study 1: results from the value and service quality models

                          Australia        Hong Kong        Morocco          The Netherlands  USA
n                         234              198              242              207              257

Value model
SAC-VAL                   .26 (3.11)**     .16 (2.60)**     .23 (1.88)       .23 (2.86)**     .31 (4.61)**
SQ-VAL                    .05 (ns)         .31 (2.64)**     .00 (ns)         −.11 (ns)        .00 (ns)
SAT-VAL                   .40 (3.11)**     .49 (4.10)**     .52 (6.05)**     .79 (4.45)**     .52 (3.49)**
VAL-BI                    .67 (9.21)**     .78 (6.98)**     .81 (9.20)**     .80 (6.21)*      .66 (9.21)**
R² (VAL)                  .39              .66              .44              .62              .49
R² (SAT)                  .75              .64              .43              .65              .83
R² (BI)                   .45              .61              .66              .64              .43
χ² (df)                   582.34 (84)      172.53 (84)      506.67 (84)      376.82 (84)      365.67 (84)
CFI                       .86              .95              .86              .83              .94
RMSEA                     .16              .07              .14              .13              .11
Model ranking in country  #3               #3 (tied)        #3               #4               #4

Service quality model
SAC-VAL                   .30 (4.03)**     .18 (2.58)**     .68 (6.37)**     .30 (3.37)*      .37 (5.09)**
SQ-VAL                    .39 (5.31)**     .70 (6.36)**     .05 (ns)         .42 (4.56)**     .40 (5.76)**
SAT-SQ                    .87 (10.39)**    .85 (8.19)**     .65 (8.84)**     .85 (6.78)**     .92 (11.53)**
VAL-BI                    .42 (6.44)**     .47 (3.93)**     .74 (8.83)**     .35 (4.37)**     .22 (4.27)**
SQ-BI                     .45 (6.41)**     .35 (3.09)**     .15 (2.76)**     .67 (5.12)**     .70 (9.21)**
R² (SQ)                   .76              .72              .42              .73              .85
R² (VAL)                  .35              .60              .50              .34              .42
R² (BI)                   .58              .58              .66              .82              .70
χ² (df)                   558.09 (84)      172.63 (84)      467.33 (84)      315.95 (84)      215.93 (84)
CFI                       .87              .95              .87              .87              .97
RMSEA                     .16              .07              .14              .12              .08
Model ranking in country  #2               #3 (tied)        #2               #3               #3

Entries are parameter estimates with t-values in parentheses.
* Denotes p ≤ .05. ** Denotes p ≤ .01.

Table 6

Study 1: results from the satisfaction and comprehensive models

                          Australia        Hong Kong        Morocco          The Netherlands  USA
n                         234              198              242              207              257

Satisfaction model
SAC-VAL                   .26 (3.04)**     .19 (2.58)**     .58 (3.52)**     .27 (2.96)**     .37 (5.07)**
SQ-VAL                    .38 (4.59)**     .65 (6.73)**     .05 (ns)         .36 (4.17)**     .38 (5.68)**
VAL-SAT                   .18 (3.79)**     .38 (4.22)**     .40 (6.04)**     .40 (2.59)**     .16 (4.22)**
SQ-SAT                    .78 (9.40)**     .53 (5.58)**     .48 (7.15)**     .73 (2.67)**     .83 (10.87)**
SAT-BI                    .65 (8.28)**     .76 (7.78)**     .74 (9.95)**     .90 (2.44)*      .84 (10.62)**
R² (SAT)                  .79              .70              .56              .95              .86
R² (VAL)                  .33              .51              .38              .27              .40
R² (BI)                   .43              .58              .54              .81              .71
χ² (df)                   609.64 (84)      162.35 (84)      519.38 (84)      309.57 (84)      196.53 (84)
CFI                       .86              .96              .85              .87              .97
RMSEA                     .16              .07              .15              .11              .07
Model ranking in country  #4               #2               #4               #2               #2

Comprehensive model
SAC-VAL                   .25 (2.92)**     .22 (3.04)**     .58 (3.46)**     .27 (2.92)**     .37 (5.09)**
SQ-VAL                    .38 (4.53)**     .64 (6.71)**     .05 (ns)         .35 (4.09)**     .38 (5.67)**
VAL-SAT                   .15 (3.16)**     .36 (3.98)**     .38 (5.78)**     .39 (4.65)**     .15 (3.94)**
SQ-SAT                    .78 (9.60)**     .54 (5.55)**     .49 (7.19)**     .63 (5.95)**     .83 (10.94)**
SQ-BI                     .68 (5.31)**     −.05 (ns)        −.09 (ns)        .68 (3.92)**     .20 (1.92)
VAL-BI                    .45 (6.58)**     .41 (3.75)**     .54 (7.09)**     .42 (3.59)**     .16 (3.31)**
SAT-BI                    .28 (2.45)*      .48 (3.84)**     .46 (6.16)**     −.05 (ns)        .55 (4.70)**
R² (SAT)                  .76              .69              .55              .76              .85
R² (VAL)                  .32              .53              .38              .27              .41
R² (BI)                   .60              .63              .72              .82              .72
χ² (df)                   525.34 (82)      145.70 (82)      437.94 (82)      287.00 (82)      183.20 (82)
CFI                       .87              .97              .88              .88              .98
RMSEA                     .14              .06              .13              .11              .07
Model ranking in country  #1               #1               #1               #1               #1

* Denotes p ≤ .05. ** Denotes p ≤ .01.


Table 7

Results from Study 2

            Value model            Service quality model  Satisfaction model     Comprehensive model

Airlines/physicians sample, n = 166
SAC-VAL     .13 (ns)               .24 (2.96)**           .27 (2.99)**           .27 (2.91)**
SQ-VAL      .05 (ns)               .57 (5.87)**           .47 (4.96)**           .47 (4.94)**
SAT-VAL     .76 (5.75)**           –                      –                      –
SAT-SQ      –                      .91 (8.70)**           –                      –
VAL-SAT     –                      –                      .39 (6.06)**           .39 (5.94)**
SQ-SAT      .85 (8.88)**           –                      .60 (7.36)**           .60 (7.31)**
VAL-BI      .78 (8.59)**           .22 (3.47)**           –                      .14 (2.33)*
SAT-BI      –                      –                      .90 (8.76)**           .58 (5.26)**
SQ-BI       –                      .74 (6.24)**           –                      .24 (2.86)**
R² (VAL)    .63                    .54                    .45                    .45
R² (SQ)     –                      .83                    –                      –
R² (SAT)    .72                    –                      .81                    .80
R² (BI)     .61                    .82                    .81                    .83
χ² (df)     378.15 (84)            316.95 (84)            262.09 (84)            249.43 (82)
CFI         .90                    .92                    .94                    .94
RMSEA       .15                    .13                    .11                    .11
Ranking     #4                     #3                     #2                     #1

Encounter-specific sample, n = 163
SAC-VAL     .24 (2.58)**           .28 (3.15)**           .27 (2.90)**           .28 (2.93)**
SQ-VAL      .18 (ns)               .37 (4.09)**           .35 (3.76)**           .35 (3.74)**
SAT-VAL     .27 (2.04)*            –                      –                      –
SAT-SQ      –                      .85 (9.51)**           –                      –
VAL-SAT     –                      –                      .14 (2.39)*            .13 (2.34)*
SQ-SAT      .83 (9.45)**           –                      .77 (8.69)**           .76 (8.68)**
VAL-BI      .57 (6.79)**           .21 (3.21)**           –                      .16 (2.73)**
SAT-BI      –                      –                      .81 (9.34)**           .63 (5.89)**
SQ-BI       –                      .63 (7.23)**           –                      .11 (ns)
R² (VAL)    .35                    .31                    .31                    .31
R² (SQ)     –                      .72                    –                      –
R² (SAT)    .69                    –                      .71                    .70
R² (BI)     .32                    .58                    .65                    .68
χ² (df)     325.17 (84)            251.37 (84)            217.37 (84)            207.28 (82)
CFI         .91                    .94                    .95                    .96
RMSEA       .13                    .11                    .10                    .10
Ranking     #4                     #3                     #2                     #1

* Denotes p ≤ .05. ** Denotes p ≤ .01.

examine the setting-specific findings individually (Bearden, Sharma, & Teel 1982).
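The chi-square difference comparisons for the Study 1 aggregated sample can be retraced from the fit statistics in Table 4. The sketch below is an illustration rather than the authors' LISREL procedure: it subtracts the comprehensive model's chi-square and degrees of freedom from those of each rival model and evaluates the difference against a chi-square distribution. The helper `chi_square_sf` is a hypothetical name and uses closed-form tail probabilities that hold only for 1 or 2 degrees of freedom, which covers the comparisons reported here.

```python
import math


def chi_square_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed forms for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch handles only df = 1 or df = 2")


# (chi-square, df) pairs for Study 1 as reported in Table 4.
comprehensive = (594.46, 82)
rivals = {
    "value": (921.29, 84),
    "service quality": (735.56, 84),
    "satisfaction": (775.01, 84),
}

results = {}
for name, (chi2, df) in rivals.items():
    delta_chi2 = chi2 - comprehensive[0]
    delta_df = df - comprehensive[1]
    p_value = chi_square_sf(delta_chi2, delta_df)
    results[name] = (round(delta_chi2, 2), delta_df, p_value)
    print(f"{name}: delta chi2 = {delta_chi2:.2f}, delta df = {delta_df}, p < .01: {p_value < 0.01}")
```

The smallest difference, for the service quality model, reproduces the 141.10 (df = 2) reported in the note to Table 4.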

The setting-specific results

We found consistent results for all the settings investigated. The findings again indicate that Model 4, the comprehensive model, outperforms the other three models. The comprehensive model fits the data very well in the samples from Hong Kong, the U.S., and in both service and temporal settings (CFI ranged from .94 to .98, RMSEA ranged from .06 to .11). The model fit was not as good in the samples from Australia, Morocco, and the Netherlands (CFI ranged from .87 to .88, RMSEA ranged from .11 to .14).

Table 8

A ranking summary of the model fits

Sample               Sample size  Value model  Service quality model  Satisfaction model  Comprehensive model
Study 1 overall      1138         4            2                      3                   1
Australia            234          3            2                      4                   1
Hong Kong            198          3 (tied)     3 (tied)               2                   1
Morocco              242          3            2                      4                   1
The Netherlands      207          4            3                      2                   1
USA                  257          4            3                      2                   1
Study 2 overall      329          4            3                      2                   1
Airlines/physicians  166          4            3                      2                   1
Encounter-specific   163          4            3                      2                   1


An examination of the individual paths across the samples suggests some notable distinctions relative to the determinants of behavioral intentions. Most of these differences emanated from the multinational samples (Study 1). For example, Australia and the United States were the only multinational samples where service quality (p < .10 in the U.S.), service value, and satisfaction had strong effects on behavioral intentions. In the Netherlands sample, the effect of satisfaction on behavioral intentions was insignificant. Similarly, the effect of service quality on behavioral intentions was insignificant in the samples from Hong Kong and Morocco. Value, however, had a significant effect on behavioral intentions in all five multinational samples.

The results for the airlines and physicians samples in Study 2 were similar to those for the U.S. fast food and retail grocery store samples in Study 1. This is logical since both of these samples were obtained from U.S. consumers in a global temporal setting. It suggests that the same evaluation processes apply regardless of the nature of the service setting.

The encounter-specific samples in Study 2 yielded interesting results (Table 7). The path from service quality to behavioral intentions was insignificant in this sample, whereas the same path was significant in a global temporal setting. This supports the literature that suggests different evaluation processes apply depending on the temporal setting. These findings also corroborate encounter-specific evaluation models that specify an indirect relationship between service quality and outcome measures (e.g., Bolton & Drew 1991).

From a broader perspective, the encounter-specific results shed light on a continuing debate concerning whether and how temporal setting may influence evaluation processes. The analyses in both temporal settings support the comprehensive model. This corroborates the service quality → satisfaction → behavioral intentions conceptual order as specified by Gotlieb et al. (1994). This order is specified in the comprehensive model and in the satisfaction model, which is ranked second in both temporal settings.

Discussion

Services research still represents a relatively new frontier for marketers. It is therefore critical at this stage to understand the fundamental processes that influence how customers evaluate service firms. However, in this literature to date, little externally valid knowledge has emerged. We sought to address this gap by removing a contextual lens from our analyses. That is, the purpose was not to study any one of the antecedent constructs per se, but rather how these constructs collectively lead to behavioral intentions. In all, 36 separate tests were conducted across a range of cultural, temporal, and industrial settings. The results are clear in supporting one model.

The comprehensive model

A key implication of the results is that service quality, satisfaction, and service value all appear to play a direct role in influencing service consumers’ behavioral intentions. However, our findings reveal that individual constructs and linkages vary in differing national and temporal settings, which underscores the complexity of the service encounter and calls into question the tendency to assume that customers behave similarly, regardless of the national or temporal context. For example, in the results for the service quality model in samples from Hong Kong and Morocco, the relationships between service quality and behavioral intentions are significant (parameter estimate = .35 and .15; p < .01). On the other hand, this linkage is not significant among Chinese and Moroccan consumers in the comprehensive model. Thus, without the more holistic perspective provided by this model, scholars or practitioners studying service evaluations in Hong Kong and Morocco might over-emphasize the role of service quality at the expense of other critical antecedents. A similar example is evident in the sample from the Netherlands. In the satisfaction model, satisfaction is a significant antecedent to behavioral intentions. However, when all constructs are allowed to influence behavioral intentions simultaneously, the satisfaction-to-behavioral intentions path is not significant. Lacking this more complete picture, scholars and practitioners might over-value satisfaction’s role in the Netherlands.

In Study 2, findings reveal there is no difference in the comprehensive model across major contextual settings. More specifically, consumers appear to process the constructs and associated linkages in much the same way regardless of the level of tangibility, the degree of interaction, or the inherent variability of the particular service offering. This finding should particularly interest practitioners who provide a range of services with varying degrees of tangibility and customer–employee interaction. Regardless of the nature of the offering, it is important to emphasize all of the antecedents investigated here.

We also considered the temporal setting within which consumers view the service encounter. Scholars suggest that consumer processing of given constructs and associations may vary as a function of the temporal setting in which the processing takes place. Our findings suggest the comprehensive model again provides the best fit to the data regardless of the temporal setting. In addition, the findings indicate that the relationships between service quality and behavioral intentions may differ across temporal settings (Oliver 1997).

There are other similar examples within our findings that highlight the importance of examining the antecedent relationships within a holistic perspective, as provided by the comprehensive model. In this regard, our findings cast doubt on the external validity of conceptual models that, a priori, portray any one of the antecedent constructs as having a more important role in behavioral intentions than any of the others in service encounters. Consequently, practitioners would be remiss to emphasize one of the antecedents if doing so implies neglecting another.

Validation of key relationships

The results of the comparative tests highlight service quality, service value, and satisfaction as critical antecedents to behavioral intentions in the service encounter. While the antecedent role of these constructs individually has been recognized in the attitude and services literatures, our findings provide strong evidence of their collective role and of the manner in which they are related to each other. First, we have validated the notion of service value as a construct embedded in the relationship between sacrifice and service quality. Second, we reveal a linkage between service quality and each of value and satisfaction. Third, we highlight a relationship between value and satisfaction. Most importantly, our findings imply that service quality, service value, and satisfaction have a strong, collective influence on behavioral intentions. Our results support these linkages more robustly than any previous study.

A notable outcome is the consistency with which value is a driver of behavioral intentions. Our findings provide the strongest evidence to date about the importance service consumers place on the value inherent in a service encounter. Not only do value perceptions directly influence behavioral intentions, they also strongly affect customer satisfaction. This both underscores the importance of value as a strategic objective for practitioners and justifies recent scholarly interest in the construct.

Another important finding is that in four out of the five countries (Study 1) and in both of the exchange settings (Study 2), satisfaction emerged as a significant antecedent of behavioral intentions. This outcome is strongly consistent with past conceptualizations (cf. Oliver 1997). Satisfaction affects behavioral intentions directly, but also operates as perhaps the most important mediating variable, linking behavioral intentions to both value and service quality. These results lend support to cognition → affect → intention models and to the suggestion that intervening mechanisms should be considered between attitudinal variables and intentions (Bagozzi, 1992).
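To make the mediating role concrete, the direct and indirect effects of service quality on behavioral intentions can be traced through the Study 1 loadings of the comprehensive model in Table 4. This is an illustrative path-tracing calculation and not an analysis reported in the study; it assumes the loadings are standardized coefficients in a recursive model, so that each indirect effect is the product of the loadings along its path. The variable names are hypothetical.

```python
# Study 1 comprehensive-model loadings as reported in Table 4.
paths = {
    ("SQ", "VAL"): 0.38,
    ("SQ", "SAT"): 0.68,
    ("VAL", "SAT"): 0.27,
    ("VAL", "BI"): 0.38,
    ("SQ", "BI"): 0.17,
    ("SAT", "BI"): 0.36,
}

# Direct effect of service quality on behavioral intentions.
direct = paths[("SQ", "BI")]

# Indirect effects: multiply the loadings along each mediated route.
via_satisfaction = paths[("SQ", "SAT")] * paths[("SAT", "BI")]
via_value = paths[("SQ", "VAL")] * paths[("VAL", "BI")]
via_value_then_satisfaction = (
    paths[("SQ", "VAL")] * paths[("VAL", "SAT")] * paths[("SAT", "BI")]
)

indirect = via_satisfaction + via_value + via_value_then_satisfaction
total = direct + indirect
print(f"direct = {direct:.2f}, indirect = {indirect:.2f}, total = {total:.2f}")
```

Under these assumptions the route through satisfaction alone (about .24) exceeds the direct path (.17), which is consistent with the description of satisfaction as an important mediator.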

Findings also support the importance of sacrifice. Its role as a key antecedent of value is strongly supported in all of the models portrayed here. In emphasizing the creation of value and satisfaction, service providers must account for the significance of sacrifice in determining consumers’ perceptions of value.

Of the four major antecedents to behavioral intentions modeled here, the results for service quality are perhaps the most surprising. In the comprehensive model, the relationship between service quality and behavioral intentions is insignificant in three out of the five countries in Study 1, as well as in the encounter-specific setting in Study 2. Overall, this implies that the direct antecedent role of service quality appears to be relatively less important than those of value and satisfaction.

Accordingly, if forced to choose, value and satisfaction are the antecedents to emphasize in service encounters. Moreover, the role of service quality appears to be more complex than previously thought. Its precise role in services should be the focus of future research.

Managerial implications

Managers of service organizations can draw several useful implications from this study. First, the configuration of the comprehensive model gives direction on how services should be positioned for maximal consumer appeal. The model implies that managerial focus on all of the direct antecedents of behavioral intentions – service quality, service value, and satisfaction – may be needed.

The findings add further to the call for managers to consider the importance of service value and customer satisfaction as strategic goals. Regardless of the setting, service consumers appear to place considerable weight on these two variables in evaluating their options. This suggests that promotion, pricing, location, service environment, personnel, operating hours, customer service policies, and all other strategic options should be evaluated based on their likely impact on consumers’ value perceptions and customer satisfaction. Training programs need to emphasize the development of skills that maximize value and satisfaction. Moreover, while our results do not suggest that providing superior service quality is unimportant, they do suggest that it is less important than the other constructs examined here.

Finally, while the comprehensive model fits the data well in all five national samples, managers should consider potential moderators of these relationships in their local markets. For instance, local economic or political conditions might influence the choice of appropriate managerial goals, as in the case of a country that has experienced a severe economic downturn. Under such conditions, sacrifice might assume even greater importance in the determination of value perceptions. Alternatively, affluent consumers might require quality-based strategies as a means to higher value perceptions. Such customers may weight non-monetary sacrifices more heavily than the required monetary expenditures. Some consumers may be more affective in their evaluations and base their purchase intentions on satisfaction evaluations. Obviously, service managers need to adjust strategies to best fit the characteristics of their target markets and the specific offerings that they provide in those markets.

Limitations and directions for future research

Given the complexity of multinational and multi-industry research, offering some qualifications on this work is useful. First, multinational marketing research comprises various complexities that we have sought to address in the present research. Despite using state-of-the-art methods (e.g., Steenkamp & Baumgartner 1998), some methodological or procedural errors common to international research may have occurred (cf. Douglas & Craig 1983; Kumar 2000). Accordingly, the process of reaching multinational survey equivalence may account for some inconsistencies in the results. Specifically, the process of item pruning across the samples may be regarded as a limitation. Best practice suggests that scale purification be done via a pretest within the same target market. This was not possible in our study given the geographic diversity of the samples.

Given the length and complexity of the preceding study, it was not possible to explore a range of associated areas that can contribute to increased knowledge about service evaluations. For example, national character, culture, economic status, and similar variables probably have an influence beyond our macro-level analysis on the relationships studied here (cf. Clark 1990; Hofstede 1980). Future research should more explicitly assess the effects of national culture and economic status on the linkages investigated here.

Future research also should focus on emerging service economies. For example, in countries such as China and Russia, unprecedented opportunities may exist to investigate how cognitive processing of services evolves in the wake of a transformation to market-based economies, which includes increased consumption experience, emergent entrepreneurial activities, and other such variables.

Acknowledgements

This study was funded by the Federal Express Corporation, the FSU College of Business, and The Institute for Global Business at The University of Akron. We appreciate the input provided by Katherine N. Lemon, Robert F. Hurley and Richard A. Spreng during various phases of the research process, and the data collection efforts of Donnel A. Briley, Moussa F’touh, Katarina Lagerström, and Vicky Stogiannis.

Appendix A

Measurementsᵃ

Service quality

1. Their employees offer the personal attention I need from them (SQ1)
2. The behavior of their employees instills confidence in me (SQ2)
3. Their employees are courteous (SQ3)
4. I receive enough individual attention from their employees (SQ4)
5. I can depend on receiving prompt service from their employees*
6. I feel safe conducting business with their employees*
7. Their employees are able to answer my questions*
8. Their employees are never too busy to respond to my requests*
9. Their employees have my best interests at heart*
10. Their employees understand my specific needs*

Value

11. Their products are an excellent value (VAL1)
12. At this organization, you get a great deal for your money (VAL2)
13. What I get from this organization, and its cost, makes it a great value (VAL3)

Satisfaction

14. I am satisfied with the service I receive from the organization (SAT1)
15. I am happy with the service I receive from the organization (SAT2)
16. I am delighted with the service I receive from the organization (SAT3)

Sacrifice

17. Their prices are low (SAC1)
18. The time needed to make a purchase at this organization is low (SAC2)
19. The effort required to make a purchase at this organization is low*

Behavioral intentions

20. I would classify myself as a loyal customer of this organization (BI1)
21. If asked, I would say good things about their organization (BI2)
22. I would recommend their organization to a friend (BI3)
23. My usage of this organization has been high*

All items were measured on nine-point scales anchored by 1 = strongly disagree to 9 = strongly agree.

ᵃ An item marked with “*” was deleted during the measurement purification process.

References

Ajzen, Icek, & Fishbein, Martin. (1980). Understanding attitudes and predicting social behavior. Englewood Cliffs, NJ: Prentice-Hall.

Anderson, Eugene W., & Sullivan, Mary. (1993). The antecedents and consequences of customer satisfaction for firms. Marketing Science, 12, 125–143.

Anderson, Eugene W., & Fornell, Claes. (1994). A customer satisfaction research prospectus. In R. T. Rust & R. L. Oliver (Eds.), Service quality: New directions in theory and practice (pp. 241–268). Thousand Oaks, CA: Sage Publications.

Anderson, Eugene W., Fornell, Claes, & Lehmann, Donald R. (1994). Customer satisfaction, market share, and profitability: Findings from Sweden. Journal of Marketing, 58(3), 53–66.

Anderson, James C. (1987). An approach for confirmatory measurement and structural equation modeling of organizational properties. Management Science, 33(April), 525–541.

Anderson, James C., & Gerbing, David W. (1988). Structural equation modeling in practice: A review and recommended two step approach. Psychological Bulletin, 103(May), 411–423.

Andreassen, Tor Wallin. (1998). Customer loyalty and complex services. International Journal of Service Industry Management, 9(1), 178–194.

Athanassopoulos, Antreas D. (2000). Customer satisfaction cues to support market segmentation and explain switching behavior. Journal of Business Research, 47, 191–207.

Bagozzi, Richard P. (1992). The self regulation of attitudes, intentions, and behavior. Social Psychology Quarterly, 55, 178–204.

Bagozzi, Richard P., & Phillips, Lynn W. (1982). Representing and testing organizational theories: A holistic construal. Administrative Science Quarterly, 27(September), 459–489.

Bagozzi, Richard P., & Yi, Youjae. (1988). On the evaluation of structural equation models. Journal of the Academy of Marketing Science, 16(2), 74–94.

Bearden, William O., Sharma, Subhash, & Teel, Jesse E. (1982). Sample size effects on chi square and other statistics used in evaluating causal models. Journal of Marketing Research, 19(November), 425–430.

Bitner, Mary Jo. (1990). Evaluating service encounters: The effects of physical surroundings and employee responses. Journal of Marketing, 54(2), 69–81.

Bitner, Mary Jo, & Hubbert, A. R. (1994). Encounter satisfaction versus overall satisfaction versus quality. In R. T. Rust & R. L. Oliver (Eds.), Service quality: New directions in theory and practice (pp. 72–84). New York: Sage Publications, Inc.