The sample frame for this research on Scottish high technology was developed using a multi-stage sampling technique involving four stages. In assembling this data set, an attempt was made to confine the sample to the hi-tech sectors, since the focus of this research was on hi-tech performance, innovation and networks. The first stage thus comprised cluster sampling, which narrowed the research focus to the key hi-tech sectors within the hi-tech population in Scotland as a whole. The five sectors of optoelectronics, microelectronics, life sciences, digital media and software were selected in the light of Scottish Enterprise’s cluster policy initiatives on key technology clusters (see chapter 4, section 4.4 and chapter 1, section 1.5 for a detailed account). This method was efficient in terms of search cost, and these five clusters were in many ways the exemplars of Scottish high technology.
Next, five databases were constructed, each containing the complete list of firms in one of these five sectors. These represented the five strata, and the proportion of firms in each stratum was noted. Thus the variable ‘sector’ was used as the stratification factor. In the next stage, the firms in each of the five databases were further stratified into a number of sub-strata depending on the composition of each sector. This is illustrated in section 6.3. The information required was obtained by contacting various organisations such as Biotech Scotland, the Scottish Optoelectronics Association (SOA), Scottish Microelectronics, Scotland IS for software and Interactive Tayside for digital media. A number of other websites also provided details.
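The noting of stratum proportions described above can be sketched in a few lines of Python. This is only an illustrative sketch: the firm names, the records and the function name `stratum_proportions` are hypothetical and are not drawn from the actual databases used in this research.

```python
from collections import Counter

# Hypothetical firm records (name, sector); invented purely for illustration.
firms = [
    ("Firm A", "optoelectronics"),
    ("Firm B", "software"),
    ("Firm C", "software"),
    ("Firm D", "life sciences"),
    ("Firm E", "digital media"),
    ("Firm F", "microelectronics"),
    ("Firm G", "software"),
    ("Firm H", "life sciences"),
]


def stratum_proportions(records):
    """Return the share of firms falling in each sector (stratum)."""
    counts = Counter(sector for _, sector in records)
    total = sum(counts.values())
    return {sector: n / total for sector, n in counts.items()}


proportions = stratum_proportions(firms)
for sector, share in sorted(proportions.items()):
    print(f"{sector}: {share:.3f}")
```

Using the sector as the grouping key mirrors the use of ‘sector’ as the stratification factor; the same function could be applied within a sector to a sub-stratum label.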
In the final step, only technology-based firms were extracted from the databases. Firms in SIC-based sectors were isolated using the SIC codes that come under the hi-tech definitions of R. L. Butchart (1987) and C. Thompson (1987). The SIC codes are presented in Appendix 5. For non-SIC-based sectors, other sources such as the Department of Trade and Industry (DTI 2000, 2001) were used. In the end, 836 technology-based firms constituted the sample frame for this empirical study.
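The extraction step can be illustrated with a minimal Python sketch, under stated assumptions. The SIC codes below are placeholders and are not the codes listed in Appendix 5; the `Firm` record, the `flagged_by_other_source` field and both function names are hypothetical devices introduced only to show the two routes (SIC-based and non-SIC-based) by which a firm might be retained.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder codes for illustration only; the actual hi-tech SIC codes
# are those presented in Appendix 5.
HI_TECH_SIC_CODES = {"0001", "0002"}


@dataclass
class Firm:
    name: str
    sector: str
    sic_code: Optional[str] = None         # None where SIC does not capture the activity
    flagged_by_other_source: bool = False  # hypothetical flag from non-SIC sources


def is_technology_based(firm, hi_tech_codes=HI_TECH_SIC_CODES):
    """SIC-based route where a code exists, otherwise rely on other sources."""
    if firm.sic_code is not None:
        return firm.sic_code in hi_tech_codes
    return firm.flagged_by_other_source


def build_sample_frame(firms):
    """Keep only the firms judged to be technology-based."""
    return [f for f in firms if is_technology_based(f)]


candidates = [
    Firm("Firm A", "microelectronics", sic_code="0001"),
    Firm("Firm B", "software", sic_code="9999"),  # e.g. a recruitment consultancy
    Firm("Firm C", "optoelectronics", flagged_by_other_source=True),
    Firm("Firm D", "life sciences"),              # no supporting evidence
]
sample_frame = build_sample_frame(candidates)
print([f.name for f in sample_frame])
```

In practice the non-SIC route involved judgement based on several sources rather than a single flag, so the boolean field should be read as a simplification.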
6.2.1 Sampling Problems in Constructing the Sample Frame
A number of problems were addressed in the process of identifying firms. Chapter 1, section 1.3 described the definition of clusters and their components. Clusters are essentially ‘a concentration of competing, collaborating and interdependent companies and institutions which are connected by a system of market and non-market links’ (DTI 1998). A cluster thus contains a number of varied components such as academic institutions, research organisations and trade bodies. In this research, the focus is only on firms or businesses in these clusters or sectors. Hence, the first step involved cleaning the databases by deleting all other components and retaining only the relevant firms. Moreover, not all the firms extracted from the five hi-tech sectors were technology-based: those remaining included non-technology-based firms such as recruitment consultancies and marketing firms. The second problem in sampling thus involved isolating the technology-based firms from the rest. Butchart’s new definition of UK high technology (Butchart 1987) was based on UK SIC codes, and Thompson’s definition of high technology (Thompson 1987) was based on US SIC codes. In both cases the 4-digit level was used (see Appendix 5).
But even at the 5-digit level, SIC categories were not able to capture the activities of firms in certain hi-tech sectors. The selection was not much of a problem for optoelectronics, microelectronics and digital media, as most of the firms in these sectors satisfied the criteria. The selection for life sciences and software was more time-consuming, as only a fraction of the firms in these two sectors were in fact technology-based. Various sources of information had to be consulted, including the methods used by the DTI to identify firms in the non-SIC-based clusters. In its cluster mapping, the DTI used local information and other sources (Dun & Bradstreet) to identify the non-SIC-based clusters, such as optoelectronics and biotechnology (DTI 2001 vol. 3; see chapter 4, section 4.3).
Moreover, the software industry is complex and encompasses a very broad range of industrial classifications. Software sector activity is itself subsumed under a wide range of industries. Currently there is no single specific definition or formal industrial classification of the ‘software industry’ (McNicoll & Kelly 2003). Given the complexities surrounding any definition of software sector activity, too narrow an identification runs the risk of omitting considerable areas of ‘software sector’ activity. Conversely, too broad an identification can render any derived definition meaningless, since it would in effect comprise the whole economy: software is used, adapted or developed in almost every industry in one way or another. In this research, the report to Scottish Enterprise (SE), ‘Economic Frameworks for policy relevant analysis of the Software Sector in Scotland’ (McNicoll & Kelly 2003), served as a guide in identifying the software firms.