

4.1 Results of Research Phase 1

4.1.2 Phase 1: data reliability and validity test results

The results of the data reliability and validity tests are discussed according to the PLS-SEM technique described in subsection 3.2.2.

The data were divided into three sets, “All data”, “EA data”, and “MM data”, and were first tested against the evaluation criteria for the reflective measurement model. Appendixes 15–17 present the outer loading, AVE, composite reliability, and HTMT test results in more detail for the All, EA, and MM data sets.

The questions in the original TAM3 model (Venkatesh and Bala, 2008) clearly involve some degree of rephrasing. Because the TAM3 model is applied in this robotics FFE study, the same rephrasing carries over to the questions used here. A few values in the data set exceed 0.90 in the composite reliability test of internal consistency, but none exceeds the 0.95 limit. The composite reliability test results are presented in Appendix 16.

For the reflective variables RES4 (I would have difficulty explaining why using the robot may or may not be beneficial), CSE3 (I could investigate the lunch menus using the robot . . . if someone showed me how to do it first), CSE4 (. . . if I had used similar robots before this one to do the same task), and CPLAY4 (The following questions ask you how you would characterize yourself when you use robots: . . . unoriginal), the outer loadings (Appendix 15) were too small (<0.400). In such cases, the PLS-SEM convergent validity guidelines recommend omitting the variables from the model, provided they are not significant for the model. Appendix 9 shows that even when RES4, CSE3, CSE4, and CPLAY4 are omitted from the TAM3 construct, “Result demonstrability”, “Robot self-efficacy”, and “Robot playfulness” continue to be very well explained by the remaining RES, CSE, and CPLAY variables. After these four variables were eliminated, all remaining loadings were above 0.400 in all data sets (All, EA, and MM).

In addition, the variable CANX1 (Robots do not scare me at all) interfered with the EA data, causing the AVE and composite reliability tests to fall below the recommended values. After CANX1 was eliminated, the TAM3 model construct performed reliably in the EA data set as well as in the All and MM data sets. The AVE and composite reliability test results are presented in Appendix 16.
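The effect of dropping a weakly loading indicator on AVE and composite reliability can be illustrated with a minimal sketch. It assumes the usual definitions based on standardized outer loadings; the loadings below are hypothetical and are not taken from Appendix 15 or 16.

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability computed from standardized outer loadings."""
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)  # indicator error variances
    return total ** 2 / (total ** 2 + error)


# Hypothetical loadings for one construct; the last indicator loads weakly (<0.40).
with_weak_item = [0.82, 0.79, 0.75, 0.30]
without_weak_item = with_weak_item[:-1]

for label, loadings in [("with weak item", with_weak_item),
                        ("without weak item", without_weak_item)]:
    print(f"{label}: AVE = {ave(loadings):.3f}, "
          f"CR = {composite_reliability(loadings):.3f}")
```

In this illustration the AVE moves from below to above the 0.50 limit once the weak indicator is removed, while the composite reliability stays within the preferred 0.60–0.90 range.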

To determine discriminant validity, the HTMT test was run for all three data sets under evaluation. All data passed the criterion except for one construct pair, “PEOU → BI”, in the EA data set, with a value of 1.017, which is only slightly above the <1.00 criterion. Discriminant validity HTMT test results are presented in Appendix 17.
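As a rough illustration of how an HTMT ratio can end up above 1.00, the sketch below computes the point estimate from item correlations. It assumes the standard definition (mean between-construct item correlation divided by the geometric mean of the average within-construct item correlations) and positive correlations; the item names and correlation values are hypothetical, and the confidence-interval step is not shown.

```python
from itertools import combinations, product
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def htmt(corr, items_a, items_b):
    """HTMT point estimate for two constructs with at least two items each.

    corr maps an unordered item pair (a frozenset) to its correlation.
    """
    hetero = mean(corr[frozenset(pair)] for pair in product(items_a, items_b))
    mono_a = mean(corr[frozenset(pair)] for pair in combinations(items_a, 2))
    mono_b = mean(corr[frozenset(pair)] for pair in combinations(items_b, 2))
    return hetero / sqrt(mono_a * mono_b)


# Hypothetical item correlations for two constructs, A and B.
items_a, items_b = ["A1", "A2"], ["B1", "B2"]
distinct = {
    frozenset(("A1", "A2")): 0.70, frozenset(("B1", "B2")): 0.60,
    frozenset(("A1", "B1")): 0.40, frozenset(("A1", "B2")): 0.35,
    frozenset(("A2", "B1")): 0.45, frozenset(("A2", "B2")): 0.40,
}
# Same within-construct correlations, but stronger between-construct correlations.
overlapping = dict(distinct)
for pair in product(items_a, items_b):
    overlapping[frozenset(pair)] = 0.66

htmt_distinct = htmt(distinct, items_a, items_b)
htmt_overlapping = htmt(overlapping, items_a, items_b)
print(f"distinct: {htmt_distinct:.3f}, overlapping: {htmt_overlapping:.3f}")
```

The second case shows that the ratio exceeds 1.00 as soon as items correlate more strongly across the two constructs than within them.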

Table 12 below summarizes the test criteria and results for the three different TAM3 data sets in the PLS-SEM reflective measurement model.

Table 12. Summary of the reflective measurement model

| Data set | Convergent validity: loadings (>0.4) | Convergent validity: AVE (>0.5) | Internal consistency: composite reliability (0.60–0.90 preferred, not >0.95) | Discriminant validity: HTMT confidence interval (<1.00) |
|---|---|---|---|---|
| All data | OK | OK | OK, exceptions: CANX = 0.901, BI = 0.927 | OK |
| EA data | OK | OK | OK, exceptions: BI = 0.917, PE = 0.948 | OK, exception: PEOU → BI = 1.017 |
| MM data | OK | OK, exception: RES = 0.492 | OK, exceptions: BI = 0.927, CANX = 0.908 | OK |

AVE = Average Variance Extracted, HTMT = Heterotrait-monotrait ratio, CANX = Robot Anxiety, BI = Behavioral Intention, PE = Perceived Enjoyment, PEOU = Perceived Ease of Use, RES = Result Demonstrability.

All outer loadings are above the recommended 0.40 limit. All AVE values are above the recommended 0.50 limit, except for the “Result demonstrability” construct in the MM data set, whose value of 0.49 is very close to the limit. In the composite reliability test, all values are below the strongly recommended 0.95 limit, and the vast majority fall within the preferred range of 0.60–0.90, with only a few exceptions. In the HTMT test, all data meet the <1.00 criterion except for one construct pair (“Perceived Ease of Use” → “Behavioral Intention”), whose value of 1.017 is just slightly above the limit. Thus, it can be stated that the data fit the model.

Now that the data have been tested against the reflective measurement model evaluation criteria, the next step is to evaluate the TAM3 structural models. This includes studying the explained variance, the predictive relevance, the size and significance of the path coefficients, and the effect sizes.

The collinearity check was performed by examining the VIF values. Detailed outer model and inner model VIF values are listed in Appendixes 18 and 19. All VIF values clearly pass the <5.00 criterion. Further, 5,000 rounds of bootstrapping were run for all the data sets. Significance was assessed with a two-tailed test using the t and p values. All values that met the 5% significance criterion (t value > 1.96 and p value < 0.05) are highlighted in Appendixes 20–22 among the path coefficients, outer loadings, and coefficients of determination (R²).
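The logic of the bootstrap significance test can be sketched as follows. This is only an illustration under stated assumptions: a simple Pearson correlation stands in for a PLS path coefficient, the data are simulated, the p value uses a normal approximation, and the function names are hypothetical rather than part of any PLS-SEM software.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation, used here as a stand-in for a path coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_t(xs, ys, rounds=5000, seed=1):
    """Return the estimate, its bootstrap t value, and a two-tailed p value."""
    rng = random.Random(seed)
    n = len(xs)
    estimate = pearson(xs, ys)
    draws = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        draws.append(pearson([xs[i] for i in idx], [ys[i] for i in idx]))
    std_error = statistics.stdev(draws)  # bootstrap standard error
    t_value = estimate / std_error
    p_value = math.erfc(abs(t_value) / math.sqrt(2))  # normal approximation
    return estimate, t_value, p_value


# Simulated (hypothetical) data with a genuine positive relationship.
data_rng = random.Random(0)
x = [data_rng.gauss(0, 1) for _ in range(120)]
y = [0.5 * xi + data_rng.gauss(0, 1) for xi in x]

estimate, t_value, p_value = bootstrap_t(x, y)
significant = abs(t_value) > 1.96 and p_value < 0.05
print(f"estimate = {estimate:.3f}, t = {t_value:.2f}, p = {p_value:.4f}, "
      f"significant at 5%: {significant}")
```

The two conditions in `significant` mirror the t > 1.96 and p < 0.05 criteria used for highlighting values in Appendixes 20–22.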

Table 13 below summarizes the coefficients of determination (R²) for all data sets to describe the predictive power of the structural model.

Table 13. Coefficients of determination (R²)

| Construct | All data | EA data | MM data |
|---|---|---|---|
| Behavioral Intention | 0.437 (*) | 0.5 (**) | 0.415 (*) |
| Image | 0.16 | 0.167 | 0.158 |
| Perceived Ease of Use | 0.45 (*) | 0.551 (**) | 0.424 (*) |
| Perceived Usefulness | 0.43 (*) | 0.619 (**) | 0.352 (*) |

There are no hard rules for acceptable R² values, as these depend on model complexity and research discipline. Marketing research considers 0.75 substantial (***), 0.50 moderate (**), and 0.25 weak (*) (Hair et al., 2017).
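The star markers in Table 13 follow directly from these thresholds, as the short sketch below shows. The R² values are copied from Table 13; the label "below weak" for values under 0.25 is an assumption made here for illustration, since the source leaves such values unmarked.

```python
# Thresholds quoted from Hair et al. (2017) in the text above.
R2_THRESHOLDS = [(0.75, "substantial"), (0.50, "moderate"), (0.25, "weak")]


def classify_r2(r2):
    """Return the label of the highest threshold that r2 reaches."""
    for cutoff, label in R2_THRESHOLDS:
        if r2 >= cutoff:
            return label
    return "below weak"  # assumed label; Table 13 leaves these values unmarked


# R2 values from Table 13, in the order (All data, EA data, MM data).
table_13 = {
    "Behavioral Intention": (0.437, 0.5, 0.415),
    "Image": (0.16, 0.167, 0.158),
    "Perceived Ease of Use": (0.45, 0.551, 0.424),
    "Perceived Usefulness": (0.43, 0.619, 0.352),
}

labels = {construct: tuple(classify_r2(v) for v in values)
          for construct, values in table_13.items()}

for construct, row in labels.items():
    print(f"{construct}: {row}")
```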

According to Table 13, the coefficients of determination (R²) differ somewhat among the All, EA, and MM data. Perceived usefulness shows the largest difference, with R² ranging from 0.35 (MM) to 0.62 (EA). Overall, all constructs other than Image are explained rather well (R² > 0.4) in all data sets, except for Perceived Usefulness in the MM data (0.352). The coefficients of determination are further discussed in subsection 4.1.3.

Table 14 below presents the f² effect size values for all data sets.

Table 14. f² effect size

| Path | All data | EA data | MM data |
|---|---|---|---|
| Image -> Perceived Usefulness | 0.099 (*) | 0.171 (**) | 0.068 (*) |
| Output Quality -> Perceived Usefulness | 0.138 (*) | 0.175 (**) | 0.164 (**) |
| Perceived Ease of Use -> Behavioral Intention | 0.021 (*) | 0.116 (*) | 0.004 (†) |
| Perceived Ease of Use -> Perceived Usefulness | 0.002 (†) | 0.068 (*) | 0.000 (†) |
| Perceived Enjoyment -> Perceived Ease of Use | 0.411 (***) | 0.553 (***) | 0.215 (**) |
| Perceived Usefulness -> Behavioral Intention | 0.448 (***) | 0.210 (**) | 0.539 (***) |
| Result Demonstrability -> Perceived Usefulness | 0.075 (*) | 0.324 (**) | 0.031 (*) |
| Robot Anxiety -> Perceived Ease of Use | 0.057 (*) | 0.023 (*) | 0.116 (*) |
| Robot Playfulness -> Perceived Ease of Use | 0.049 (*) | 0.017 (†) | 0.003 (†) |
| Robot Self-Efficacy -> Perceived Ease of Use | 0.035 (*) | 0.059 (*) | 0.021 (*) |
| Subjective Norm -> Behavioral Intention | 0.019 (†) | 0.059 (*) | 0.009 (†) |
| Subjective Norm -> Image | 0.191 (**) | 0.201 (**) | 0.188 (**) |
| Subjective Norm -> Perceived Usefulness | 0.000 (†) | 0.000 (†) | 0.002 (†) |

f² values below 0.02 indicate no effect (†), values of 0.02 or more indicate a small effect (*), 0.15 or more a medium effect (**), and 0.35 or more a large effect (***). All the f² values are in line with the earlier findings related to the t and p values (Appendixes 20–22). In subsection 4.1.3, the f² values are assessed when comparing the TAM3 construct differences between EA and MM category representatives.
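For reference, the sketch below shows how an f² value is usually obtained and classified. It assumes the common definition, in which the R² of an endogenous construct is compared with and without a given predictor; the two R² inputs in the worked example are hypothetical, while the classified values are taken from Table 14.

```python
def f_squared(r2_included, r2_excluded):
    """Effect size of one predictor on an endogenous construct.

    r2_included: R2 of the construct with the predictor in the model.
    r2_excluded: R2 of the construct after removing the predictor.
    """
    return (r2_included - r2_excluded) / (1 - r2_included)


def classify_f2(f2):
    """Map an f2 value to the effect-size labels used in Table 14."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "none"


# Hypothetical R2 values: dropping the predictor lowers R2 from 0.50 to 0.40.
example = f_squared(0.50, 0.40)
print(f"f2 = {example:.2f} ({classify_f2(example)})")

# Values from Table 14 (All data column) reproduce the table's markers.
for path, value in [("Perceived Enjoyment -> Perceived Ease of Use", 0.411),
                    ("Subjective Norm -> Image", 0.191),
                    ("Image -> Perceived Usefulness", 0.099),
                    ("Subjective Norm -> Behavioral Intention", 0.019)]:
    print(f"{path}: {classify_f2(value)}")
```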

Table 15 below presents the predictive relevance, measured as construct cross-validated redundancy (Q²), for all three data sets.

Table 15. Q² values

| Construct | All data | EA data | MM data |
|---|---|---|---|
| Behavioral Intention | 0.321 | 0.328 | 0.296 |
| Image | 0.087 | 0.096 | 0.068 |
| Perceived Ease of Use | 0.220 | 0.288 | 0.170 |
| Perceived Usefulness | 0.288 | 0.392 | 0.226 |

As evident from Table 15 above, all four endogenous constructs have Q² values clearly above zero, which supports the model’s predictive relevance regarding the endogenous latent variables.
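For orientation, and assuming the Q² values were obtained with the usual blindfolding procedure (the source does not describe the computation), the statistic can be written as:

```latex
Q^2 = 1 - \frac{\sum_{D} \mathrm{SSE}_D}{\sum_{D} \mathrm{SSO}_D}
```

Here D indexes the omission rounds, SSE_D is the sum of squared prediction errors for the data points omitted in round D, and SSO_D is the sum of squares of those omitted observations. A value above zero therefore means that the model predicts the omitted data points with less error than this benchmark.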

Table 16 below summarizes the q² effect sizes for all three data sets. Appendix 23 also presents the q² effect sizes for all three data sets. All values below the 0.02 threshold are marked with †.

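Table 16. q² effect size values

| Predictor | BI: All | BI: EA | BI: MM | IMG: All | IMG: EA | IMG: MM | PEOU: All | PEOU: EA | PEOU: MM | PU: All | PU: EA | PU: MM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BI | | | | | | | | | | | | |
| IMG | | | | | | | | | | 0.06 | 0.07 | 0.04 |
| OUT | | | | | | | | | | 0.07 | 0.07 | 0.09 |
| PEOU | 0.01† | 0.05 | 0.00† | | | | | | | -0.01† | 0.02 | -0.01† |
| ENJ | | | | | | | 0.15 | 0.21 | 0.07 | | | |
| PU | 0.28 | 0.12 | 0.33 | | | | | | | | | |
| RES | | | | | | | | | | 0.04 | 0.12 | 0.02† |
| CANX | | | | | | | 0.02 | 0.00† | 0.04 | | | |
| CPLAY | | | | | | | 0.02† | 0.00† | 0.00† | | | |
| CSE | | | | | | | 0.01† | 0.01† | 0.00† | | | |
| SN | 0.01† | 0.02† | -0.01† | 0.10 | 0.11 | 0.07 | | | | 0.00† | -0.01† | 0.00† |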

† = fails the test threshold criteria. BI = Behavioral Intention, IMG = Image, PEOU = Perceived Ease of Use, PU = Perceived Usefulness, OUT = Output Quality, ENJ = Perceived Enjoyment, RES = Result Demonstrability, CANX = Robot Anxiety, CPLAY = Robot Playfulness, CSE = Robot Self-Efficacy, SN = Subjective Norm.

A summary of the q², f², t, and p values is presented in Appendix 24.

When comparing the “failing” q² effect sizes (Table 16 above) to the corresponding path coefficients (Appendixes 20–22), it is evident that the links with a q² effect size below the 0.02 threshold are the same links that have low path coefficient values, typically <0.20.
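The q² effect size is commonly defined in the same way as f², with Q² in place of R². The expression and the worked example below assume that definition; the two Q² inputs are hypothetical and are not taken from Table 15 or Appendix 23.

```latex
q^2 = \frac{Q^2_{\text{included}} - Q^2_{\text{excluded}}}{1 - Q^2_{\text{included}}}

% Hypothetical example: removing a predictor lowers Q^2 from 0.30 to 0.23.
q^2 = \frac{0.30 - 0.23}{1 - 0.30} = \frac{0.07}{0.70} = 0.10
```

Under this definition, a predictor whose removal barely changes Q² yields a q² near zero, which is consistent with the observation that links with low path coefficients fall below the 0.02 threshold.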

Now that all the data reliability and validity checks have been performed, the data results are presented and the three research questions answered in the following subsections.