
4 Determinants of Firm’s Centrality–based Network Capability: An Empirical Perspective

4.2 Research Methods

4.2.3 Statistical Methods

As stated in Hypothesis 1, a firm’s efficiency is predicted to raise its centrality–based capability in the pharmaceutical biotechnology industry. A firm that is more efficient in choosing collaboration partners may be better able to place itself in a central position within the network. From an econometric point of view, this relationship requires us to consider the potential existence of “reverse causality”, where the efficiency of a company is likely to be determined simultaneously with its centrality–based partnering capability. Some authors (e.g., Hagedoorn and Duysters, 2002; Hagedoorn et al., 2006) suggest that a company with a central position in an inter–firm network has more information about the positions of other firms in the network and their information flows, which enables it to use its centrality–based capability to eliminate duplicate partners and thus gain efficiency. Therefore, a firm’s ability to obtain a central position within the network may partially determine how efficiently it selects suitable partnerships. If the traditional ordinary least squares (OLS) method in a linear regression model¹¹

$$y_i = X_i \beta + u_i \qquad (10)$$

is adopted in the presence of reverse causality between a firm’s efficiency and its centrality–based capability, the OLS estimator is very likely to be inconsistent: because a firm’s efficiency is endogenous, changes in its efficiency level are associated not only with changes in the dependent variable (the firm’s centrality–based capability) but also with changes in the error term of the model (see equation (10)). What is needed in this case is a method that generates only exogenous variation in the firm’s efficiency level. An obvious way to do this is through an experiment, which is rarely an option with observational firm–level data. The instrumental variables (IV) method provides a way to obtain consistent parameter estimates if suitable instruments exist. Various IV methods can be applied to a model’s endogeneity problem. In this study, two–stage least squares (2SLS) and the optimal generalized method of moments (GMM) were used; the latter is also known as two–step feasible efficient GMM.
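The inconsistency of OLS under endogeneity can be made explicit with a short, standard derivation. The following is a sketch in the notation of equation (10), assuming the usual regularity conditions (a law of large numbers applies and the second-moment matrix of the regressors is invertible); it is textbook material rather than part of the study’s own derivations.

$$\hat{\beta}_{OLS} = (X'X)^{-1} X'y = \beta + (X'X)^{-1} X'u$$

$$\operatorname{plim}\, \hat{\beta}_{OLS} = \beta + \left[ E(X_i' X_i) \right]^{-1} E(X_i' u_i)$$

The second term vanishes only if $E(X_i' u_i) = 0$. When a regressor such as the firm’s efficiency is correlated with the error term, this moment is nonzero and the OLS estimate does not converge to $\beta$, no matter how large the sample is.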

¹¹ y_i denotes the centrality–based partnering capability of a focal firm i, X_i denotes the matrix of regressors including the firm’s efficiency level and other explanatory variables, β is the coefficient vector, which ranges from 0 to 1, and u_i is the error term.

The 2SLS estimator, as its name indicates, is obtained by two consecutive OLS regressions. In the first stage, the fitted value of a firm’s efficiency is obtained from an OLS regression of the firm’s efficiency on a constant, the included explanatory variables and the instrumental variables. In the second stage, the dependent variable (the firm’s centrality–based capability) is regressed by OLS on a constant, the included explanatory variables and the fitted value of the firm’s efficiency obtained in the first stage, which gives a consistent 2SLS estimator.

Although the 2SLS estimator is consistent, it can be inefficient in the presence of heteroskedasticity, that is, when the error terms do not have constant variance. This problem can be partially addressed by using robust standard errors in the 2SLS model, for which a consistent estimate of the variance–covariance matrix can be derived (Baum et al., 2003 and 2007). Another solution, introduced by Hansen (1982), is the optimal GMM method, which makes use of the orthogonality conditions to allow for efficient estimation in the presence of heteroskedasticity of unknown form. If heteroskedasticity is indeed present, the optimal GMM estimator is more efficient than the 2SLS estimator, whereas if the errors are in fact homoskedastic, the 2SLS estimator is preferable to the optimal GMM estimator (Angrist and Pischke, 2009; Cameron and Trivedi, 2005). For this reason, a test for the presence of heteroskedasticity is necessary for deciding whether the 2SLS model or the optimal GMM is called for. In the model chosen here, the presence of heteroskedasticity is not completely clear: standard tests such as the Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker test statistics show that heteroskedasticity is present in the model, while the Pagan–Hall test statistic (Pagan and Hall, 1983) indicates that the error terms are homoskedastic. Therefore, both the 2SLS estimator and the GMM estimator will be applied successively to the model. The results from 2SLS, 2SLS with robust standard errors and optimal GMM will be presented in this paper.
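To illustrate what a heteroskedasticity test of this kind computes, the following minimal Python sketch implements the Koenker (studentized Breusch–Pagan) n·R² statistic. It is a sketch under stated assumptions, not the procedure used in the study: for simplicity it is applied to residuals of a plain OLS regression on simulated data, the function names `koenker_statistic` and `simulate_heteroskedastic` are hypothetical, and the IV–specific Pagan–Hall variant is not implemented.

```python
import numpy as np


def koenker_statistic(X, residuals):
    """n * R^2 from regressing squared residuals on X (X must contain a constant)."""
    e2 = residuals ** 2
    coef, *_ = np.linalg.lstsq(X, e2, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((e2 - fitted) ** 2)
    ss_tot = np.sum((e2 - e2.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return len(e2) * r_squared


def simulate_heteroskedastic(n=2000, seed=0):
    """Simulated data (an assumption for illustration): error s.d. grows with x."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 5.0, size=n)
    u = x * rng.normal(size=n)
    y = 1.0 + 0.5 * x + u
    X = np.column_stack([np.ones(n), x])
    return X, y


X, y = simulate_heteroskedastic()
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
stat = koenker_statistic(X, y - X @ beta)

# Under homoskedasticity the statistic is asymptotically chi-squared with
# one degree of freedom here (one non-constant regressor); the 5% critical
# value is about 3.84.
print("Koenker n*R^2:", stat, "reject homoskedasticity:", stat > 3.84)
```

A large statistic relative to the chi-squared critical value points towards heteroskedasticity, in which case the optimal GMM estimator would be the more efficient choice.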

To make the methods used in this research explicit, the 2SLS estimator and the optimal GMM estimator are given below in matrix form. Consider again the model in equation (10), where X is an n × K matrix of regressors and n is the number of observations. As discussed above, a firm’s efficiency in selecting partnerships is considered to be endogenous, so we need to generate exogenous variation in the firm’s efficiency level by choosing an n × L matrix of instrumental variables Z. In this case, the number of instruments excluded from the equation exceeds the number of included endogenous variables (L > K), so the applied model is overidentified. 2SLS is a common procedure for overidentified models. It combines the L instruments into a set of K instruments,

$$\hat{X} = Z (Z'Z)^{-1} Z' X = P_Z X,$$

where $P_Z = Z (Z'Z)^{-1} Z'$ denotes the projection matrix. The 2SLS estimator can then be written as

$$\hat{\beta}_{2SLS} = (\hat{X}'\hat{X})^{-1} \hat{X}' y = (X' P_Z X)^{-1} X' P_Z y.$$
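The equivalence between the two consecutive OLS regressions and the projection–matrix form of the 2SLS estimator can be checked numerically. The sketch below uses simulated data with one endogenous regressor and two excluded instruments; the data-generating process, the true coefficient of 1.5 and all function names are assumptions made for illustration and are unrelated to the study’s data.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def tsls_two_stage(y, exog, endog, instruments):
    """2SLS as two consecutive OLS regressions."""
    Z = np.column_stack([exog, instruments])
    endog_hat = Z @ ols(Z, endog)            # first stage: fitted endogenous regressor
    X_hat = np.column_stack([exog, endog_hat])
    return ols(X_hat, y)                     # second stage


def tsls_closed_form(y, X, Z):
    """2SLS as (X' P_Z X)^{-1} X' P_Z y, without forming the n x n matrix P_Z."""
    PZ_X = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    return np.linalg.solve(PZ_X.T @ X, PZ_X.T @ y)


def simulate(n=5000, seed=0):
    """Simulated data: x is correlated with the error u through v."""
    rng = np.random.default_rng(seed)
    z1, z2, v, e = rng.normal(size=(4, n))
    u = 0.8 * v + e
    x = 1.0 + z1 + 0.5 * z2 + v
    y = 2.0 + 1.5 * x + u
    return y, np.ones(n), x, np.column_stack([z1, z2])


y, const, x, z = simulate()
X = np.column_stack([const, x])   # regressors: constant and endogenous x
Z = np.column_stack([const, z])   # instruments: constant and z1, z2

b_ols = ols(X, y)
b_two_stage = tsls_two_stage(y, const, x, z)
b_closed = tsls_closed_form(y, X, Z)

print("OLS slope (biased):   ", b_ols[1])
print("2SLS slope, two-stage:", b_two_stage[1])
print("2SLS slope, closed:   ", b_closed[1])
```

In this simulated setting the OLS slope is pulled away from the true value by the correlation between x and u, while both 2SLS computations agree with each other and lie close to 1.5. Note that the standard errors from a naive second-stage OLS would not be correct; only the point estimates coincide.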

The standard 2SLS estimator can be regarded as a special case of the GMM estimator. Under the exogeneity assumption for the instrumental variables, $E(Z_i u_i) = 0$, we now briefly derive the linear GMM estimator. GMM requires that a number of moment conditions be specified for the model; here the L instruments generate a set of L moment conditions,

$$g_i(\hat{\beta}) = Z_i' \hat{u}_i = Z_i' (y_i - X_i \hat{\beta}),$$

where $g_i$ is an L × 1 vector. These moment conditions are functions of the model parameters and the data, such that their expectation is zero at the true values of the parameters: $E(g_i(\beta)) = 0$. Since each of the L moment equations corresponds to a sample moment, we can write these L sample moments as

$$\bar{g}(\hat{\beta}) = \frac{1}{n} \sum_{i=1}^{n} g_i(\hat{\beta}) = \frac{1}{n} \sum_{i=1}^{n} Z_i' (y_i - X_i \hat{\beta}) = \frac{1}{n} Z' \hat{u}$$

(Wooldridge, 2002). The GMM method then minimizes a certain norm of the sample averages of the moment conditions,

$$J(\hat{\beta}) = n \, \bar{g}(\hat{\beta})' \, W \, \bar{g}(\hat{\beta}),$$

where W is an L × L symmetric weighting matrix (Angrist & Pischke, 2009). The GMM estimator is consistent for any symmetric positive definite weighting matrix W. However, its efficiency is not guaranteed for an arbitrary W, which may lead to an inefficient GMM estimator. Hansen (1982) chooses the optimal weighting matrix $W = S^{-1}$, where S is the covariance matrix of the moment conditions, to produce the most efficient, or optimal, GMM estimator, which can be written as

$$\hat{\beta}_{OGMM} = (X' Z \hat{S}^{-1} Z' X)^{-1} X' Z \hat{S}^{-1} Z' y$$

(Cameron and Trivedi, 2005; Wooldridge, 2002).
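The two–step logic behind the optimal GMM estimator can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the ivreg2 implementation: the first step uses the 2SLS weighting matrix, S is estimated with the heteroskedasticity–robust form (1/n) Σ û_i² Z_i′Z_i, and the simulated data and function names are hypothetical.

```python
import numpy as np


def two_step_gmm(y, X, Z):
    """Two-step efficient GMM for a linear IV model; returns (beta, J statistic)."""
    n = len(y)
    XZ = X.T @ Z                      # K x L
    Zy = Z.T @ y                      # L

    # Step 1: 2SLS, i.e. GMM with weighting matrix (Z'Z)^{-1}.
    W1 = np.linalg.inv(Z.T @ Z)
    b1 = np.linalg.solve(XZ @ W1 @ XZ.T, XZ @ W1 @ Zy)
    u1 = y - X @ b1

    # Estimate S = covariance of the moment conditions (robust form).
    S_hat = (Z * (u1 ** 2)[:, None]).T @ Z / n
    S_inv = np.linalg.inv(S_hat)

    # Step 2: beta = (X'Z S^{-1} Z'X)^{-1} X'Z S^{-1} Z'y.
    b2 = np.linalg.solve(XZ @ S_inv @ XZ.T, XZ @ S_inv @ Zy)

    # Hansen's J: n * g_bar' S^{-1} g_bar at the second-step estimate.
    g_bar = Z.T @ (y - X @ b2) / n
    J = n * (g_bar @ S_inv @ g_bar)
    return b2, J


def simulate(n=5000, seed=1):
    """Simulated data with an endogenous x and heteroskedastic errors."""
    rng = np.random.default_rng(seed)
    z1, z2, v, e = rng.normal(size=(4, n))
    u = 0.8 * v + e * (0.5 + np.abs(z1))   # variance depends on z1
    x = 1.0 + z1 + 0.5 * z2 + v
    y = 2.0 + 1.5 * x + u
    X = np.column_stack([np.ones(n), x])
    Z = np.column_stack([np.ones(n), z1, z2])
    return y, X, Z


y, X, Z = simulate()
beta_gmm, j_stat = two_step_gmm(y, X, Z)
print("optimal GMM slope:", beta_gmm[1])
print("Hansen J statistic:", j_stat)
```

With L = 3 instruments and K = 2 regressors there is one overidentifying restriction, so the J statistic returned here would be compared with a chi-squared distribution with one degree of freedom.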

Even when 2SLS or optimal GMM is judged to be the appropriate estimation technique, we may still question its validity in a given application. “Good instruments” should be both valid and relevant (Baum et al., 2003). To evaluate the validity of the instruments, we can examine whether the instruments are independent of the unobservable error process in the context of an overidentified model. If the orthogonality conditions are satisfied, that is, if the instruments are uncorrelated with the error term, the instrumental variables are valid. This validity can be tested with an overidentification test: for GMM, the overidentifying restrictions may be tested via the commonly employed J statistic of Hansen (1982), while Sargan’s statistic (Sargan, 1958), which uses an estimate of the error variance from the IV or 2SLS regression with the full set of overidentifying restrictions, is widely used for the 2SLS method. The endogeneity test, which essentially tests whether the instrumental variable method is required to estimate the model, can be implemented by computing the difference–in–Sargan or difference–in–J statistic (Baum et al., 2003). This test is equivalent to estimating the same regression while treating the regressor as exogenous, and then testing the corresponding orthogonality conditions. The null hypothesis of this test is that the specified endogenous regressor can actually be treated as exogenous.

To address the relevance of the instrumental variables, we need to consider whether the instruments are correlated with the endogenous regressor. If the instruments are both correlated with the endogenous variable and orthogonal to the error process, the 2SLS estimator or GMM estimator will be consistent. However, to ensure that the IV estimator performs well in practice, it should also be considered whether the instruments are weak. Instruments are weak when the correlations between the endogenous regressor and the excluded instruments are nonzero but small (Cameron and Trivedi, 2005). Thus, if the correlation between the instruments and the endogenous variable being instrumented is low, the model is said to be weakly identified. To test for the presence of weak instruments, the Stock–Yogo test (Stock and Yogo, 2005), which makes use of the F–statistic form of the Cragg and Donald (1993) statistic, is commonly carried out.
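The two diagnostics described above can be sketched as follows. The Sargan statistic is computed as n times the share of the 2SLS residual variation explained by the instruments, and instrument strength is summarised by the first–stage F statistic for the excluded instruments (with a single endogenous regressor and homoskedastic errors, this coincides with the F form of the Cragg–Donald statistic). The simulated data, the `direct_effect` parameter used to create an invalid instrument, and all function names are assumptions for illustration; Stock–Yogo critical values are not reproduced here.

```python
import numpy as np


def tsls(y, X, Z):
    """2SLS coefficients (X' P_Z X)^{-1} X' P_Z y."""
    PZ_X = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    return np.linalg.solve(PZ_X.T @ X, PZ_X.T @ y)


def sargan_statistic(y, X, Z):
    """n * u'P_Z u / u'u with u the 2SLS residuals; chi-squared with L - K d.o.f."""
    u = y - X @ tsls(y, X, Z)
    PZ_u = Z @ np.linalg.solve(Z.T @ Z, Z.T @ u)
    return len(y) * (u @ PZ_u) / (u @ u)


def rss(X, y):
    """Residual sum of squares of an OLS regression of y on X."""
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid


def first_stage_f(endog, exog, instruments):
    """F test that the excluded instruments are jointly zero in the first stage."""
    n, q = instruments.shape
    Z = np.column_stack([exog, instruments])
    rss_restricted = rss(exog, endog)
    rss_unrestricted = rss(Z, endog)
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - Z.shape[1]))


def simulate(n=5000, seed=2, direct_effect=0.0):
    """direct_effect != 0 lets z2 enter y directly, which makes z2 invalid."""
    rng = np.random.default_rng(seed)
    z1, z2, v, e = rng.normal(size=(4, n))
    u = 0.8 * v + e
    x = 1.0 + z1 + 0.5 * z2 + v
    y = 2.0 + 1.5 * x + direct_effect * z2 + u
    const = np.ones((n, 1))
    instruments = np.column_stack([z1, z2])
    return y, x, const, instruments


y, x, const, inst = simulate()
X = np.column_stack([const, x])
Z = np.column_stack([const, inst])
sargan_valid = sargan_statistic(y, X, Z)
f_stat = first_stage_f(x, const, inst)

y_bad, x_bad, const_bad, inst_bad = simulate(direct_effect=1.0)
sargan_invalid = sargan_statistic(
    y_bad, np.column_stack([const_bad, x_bad]), np.column_stack([const_bad, inst_bad])
)

print("Sargan (valid instruments):  ", sargan_valid)
print("Sargan (one invalid):        ", sargan_invalid)
print("First-stage F:               ", f_stat)
```

A Sargan statistic that is small relative to the chi-squared critical value is consistent with valid instruments, whereas the deliberately misspecified case produces a very large statistic. A first–stage F close to zero would instead signal the weak–instrument problem.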

The 2SLS and GMM methods and the tests for “good instruments” discussed above can be implemented in Stata 11 (StataCorp, 2009), a statistical software package for data management, data analysis, and graphics. In this study, the data computed with Ucinet 6 (Borgatti et al., 2002) from the Recombinant Capital database (see Section 4.2.2) were imported into Stata and processed with the relevant commands. In particular, the “ivreg2” package (Baum et al., 2002) was applied, which extends Stata and is suitable for panel data.

4.2.4 Choice of Instrumental Variables

As discussed in Section 4.2.3, due to reverse causality, a firm’s efficiency, which is likely to be endogenous, needs to be instrumented with exogenous variables. To generate sufficient exogenous variation, several instruments were considered to address the endogeneity problem. One candidate instrument is the firm’s clustering coefficient, which is calculated as the number of cooperative ties that exist among the firms in a focal firm’s neighbourhood divided by the number of ties that could possibly exist among them. This concept can be further interpreted as the probability that two collaboration partners of one firm in the network are connected to each other (Cantner and Rake, 2011), which is simply an indicator of the density of a firm’s local neighbourhood (Hanneman and Riddle, 2005). The clustering coefficient is likely to influence a firm’s efficiency in choosing partners in the sense that dense contacts among a firm’s partners may produce redundant information and consequently reduce the firm’s efficiency. This instrument can be calculated with the software Ucinet 6 (Borgatti et al., 2002).

The second candidate instrument is the sector dummy for biotechnology. Biotechnology, driven by innovation and discovery, is widely used in manifold industrial manufacturing processes. Its advanced technical processes, which reduce environmental impact, improve process efficiency and lower production costs, have advantages over traditional pharmaceutical processes (EU, 2007). In a study on the Canadian biotechnology industry, Baum et al. (2000) found that biotechnology firms that were better able to leverage alliances, in particular R&D alliances, grew at higher rates than others. Similar results were found in a comprehensive EU study on the biotechnology industry in Europe (EU, 2002). Whether a firm belongs to the biotechnology sector matters for its efficiency, because biotechnology firms with newly updated, non–redundant information can efficiently choose suitable partners by themselves in a rapidly developing technological environment.
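The clustering coefficient introduced above as the first candidate instrument can be illustrated with a small sketch. The code assumes an undirected, unweighted network stored as a dictionary of neighbour sets; the toy network and the function name `local_clustering` are hypothetical, and Ucinet 6 may apply additional options (for example, weighting) that are not reproduced here.

```python
from itertools import combinations


def local_clustering(adjacency, firm):
    """Share of pairs of a firm's partners that are themselves connected."""
    partners = adjacency[firm]
    k = len(partners)
    if k < 2:
        return 0.0  # fewer than two partners: no pair could be connected
    existing = sum(1 for a, b in combinations(partners, 2) if b in adjacency[a])
    possible = k * (k - 1) / 2
    return existing / possible


# Toy network: A cooperates with B, C and D; B and C also cooperate.
network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

for firm in sorted(network):
    print(firm, local_clustering(network, firm))
```

For firm A, one of the three possible ties among its partners exists (B and C), giving a coefficient of 1/3; for firm B, its only pair of partners (A and C) is connected, giving 1.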

In addition, a firm’s national–geographic origin may also influence its efficiency to the extent that firms from different countries may have different levels of efficiency in gaining technological information. For instance, in the US, biotechnology is characterised by a high degree of concentration of firms in a restricted number of geographic regions. A similar process of clustering has taken place across Europe, with examples such as the biotech region of Munich and the Medicon Valley shared by Sweden and Denmark. However, in comparison with the US company structure, the majority of European biotechnology clusters do not seem to be big enough to compete effectively with those in the US (EU, 2007). To control for these effects, a set of national origin dummies was considered: US, England, Germany, Denmark, Switzerland, Sweden and Ireland. Among these national dummies, US, Germany and Denmark show closer correlations with the efficiency of a firm. Therefore, these three country dummies were included in the list of instrumental variables. Another candidate instrument is the firm’s age, which can be calculated simply from the firm’s foundation year. A firm’s age is related to its efficiency in that older firms with a higher number of cooperative arrangements are more experienced in the industry than younger firms (Rothaermel, 2002). Hence, older firms are more likely to detect opportunities for building up non–redundant contacts in the network and are thus more efficient in choosing partnerships.

Therefore, clustering coefficient, firm’s age, sector dummy of biotechnology, and national dummies of US, Germany and Denmark were used in this study as instrumental variables to generate exogenous variation for the firm’s efficiency, which is considered to be endogenous. All these instruments are likely to influence a firm’s efficiency level, but will not directly determine the dependent variable of a firm’s centrality–based partnering capability.

In order to obtain data for instrumental variables, we collected information on the national origin, foundation year and industrial sector provided by each firm in our population. Various sources of information were used such as the Institute for Biotechnology Information (BioSpace, BioCentury, and Funding Universe), US Small Business Innovation Research (SBIR) / Small Business Technology Transfer (STTR), Bloomberg Businessweek and Washington Post’s Linkages.

4.3 Results

Table 5 provides descriptive statistics of the explanatory variables for the 160 observations in the sample and a correlation matrix. The variances were relatively low for all variables, since the sample only represents the prominent firms in the pharmaceutical biotechnology industry over the years, so large differences between observations are not to be expected. As would be expected, the dependent variable (the logarithm of betweenness centrality) was highly correlated with the normalized number of a firm’s partnerships. Table 6 displays the estimation results of the instrumental variable panel models obtained with Stata 11 (StataCorp, 2009). In model 1 and model 2 the standard 2SLS procedure was used, in which the independent error terms are assumed to be homoskedastic, while in model 3 and model 4 the 2SLS estimator was also used but with standard errors that are robust to heteroskedasticity. The optimal GMM method, which allows for efficient estimation in the presence of heteroskedasticity, was used for models 5 to 7. It can be seen from Table 6 that the p–value in the endogeneity test was less than 1% in all models, so we can reject the null hypothesis that the firm’s efficiency may be treated as exogenous. Thus, the firm’s efficiency is endogenous and instrumental variable methods are the appropriate estimation technique in our setting. As Table 6 also shows, the p–values in the overidentification test for models 1 to 7 were all larger than 10%; hence, we cannot reject the null hypothesis that the instrumental variables are uncorrelated with the residuals, which implies that the instrumental variables we chose are valid. However, the F statistics in the Stock–Yogo test (Stock and Yogo, 2005) suggest that these instrumental variables may be weak instruments and that the models are therefore weakly identified.

In model 1, model 3 and model 5, we estimated a firm’s centrality–based partnering capability as a function of its efficiency level, its dependency on complementary resources, its experience at managing partnerships, its duration in the partnerships and time effects (1996–1998). As the time effects of 1996 and 1998 did not seem to affect a firm’s ability to be central (see Table 6), we dropped both of them in model 2, model 4 and model 6. After dropping the time effects of 1996 and 1998, the firm’s efficiency no longer appeared to significantly influence its partnering capability in model 6, even though it did so in model 5. We therefore reestimated model 6, excluding the clustering coefficient and the country dummy Denmark from the list of instrumental variables. As a result, the significance level of the firm’s dependency on complementary resources changed from 10% to 5%, and the firm’s efficiency again had an impact on its centrality–based capability, as shown in model 7 (Table 6). Also, as we respecified the list of instruments, the p–value in the overidentification test became larger in model 7 than in model 6. With the same set of variables, the models estimated by 2SLS and by 2SLS with robust standard errors had exactly the same coefficient estimates, but different standard errors due to the potential presence of heteroskedasticity. However, after we reestimated the overidentified model using the optimal GMM method, the coefficients’ point estimates changed slightly and the standard errors decreased (Table 6), which generally indicates a more efficient model.

We hypothesized that a firm’s efficiency level has a positive impact on its centrality–based network capability (Hypothesis 1). The estimates of the indicator for the firm’s efficiency (efficiency size) were positive and significantly different from zero in the 2SLS models, in the 2SLS models with robust standard errors and also in the optimal GMM models (only models 5 and 7). Thus, the instrumental variable methods provide evidence that a firm’s efficiency in choosing suitable partners is an important determinant of its centrality–based partnering capability. Hypothesis 2 argues that the more a firm depends on its complementary resources, the higher its centrality–based network capability is. Models 1 to 7 all show that the estimates of the indicator for dependency (hierarchy measure¹²) have a significant, negative effect on the firm’s betweenness centrality. A firm’s dependency on its complementary resources was thus identified in the present study as another crucial factor in determining its centrality–based partnering capability. Hypothesis 3 predicted that a firm with more experience at managing partnerships tends to have a larger centrality–based network capability. The estimates of the indicator for the firm’s partnering experience (normalized number of the firm’s partnerships) are positive and differ significantly from zero in all of the models in Table 6, which implies a positive impact on a firm’s ability to act centrally. A firm’s partnering experience is therefore also an essential determinant of its centrality–based network capability. In sum, all three hypotheses are largely supported by the instrumental variable panel models, and the results clearly indicate that a firm’s efficiency, its dependency on its complementary resources and its experience at managing its partnerships are relevant determinants of its centrality–based partnering capability.

4.4 Discussion

Firms that are centrally positioned within a network can better control and exploit worthwhile opportunities for obtaining information through links to other firms, and in turn gain competitive advantages in the marketplace. This central position plays an especially important role in high technology industries with substantial innovation and knowledge transfer between different sectors. From a managerial perspective, positioning the firm centrally requires capable managers to improve their information efficiency and their skills in choosing suitable partners. The results of the present study suggest that it is beneficial for managers to get access to information through a number of diverse contacts while avoiding duplicate contacts, which lead to inefficient networks. Managers should also keep in mind that a partner may either unpredictably free–ride by limiting its dedication to an inter–firm cooperation or simply adopt opportunistic behaviours, which could cause informational hurdles in the cooperation network (Gulati, 1995). Thus, good access to market information is essential for finding an appropriate partner. Firms can learn about a potential partner’s capability and reliability from many sources, one of which is their network of prior collaborators, which enhances trust both by providing information about each other’s

¹² The hierarchy measure is negatively related to a firm’s dependency on its complementary resources, as discussed in Section 4.2.2.

In document Inter-firm R&D networks in pharmaceutical biotechnology : what determines firm's centrality-based partnering capability? (Page 34-48)