Evaluating a Sampling Frame of Private Security Firms


Bonnie E. Shook-Sa¹ and Marcus Berzofsky¹

¹RTI International, PO Box 12194, Research Triangle Park, NC 27709

Abstract

This paper presents an approach to evaluate the quality and coverage of a frame of business establishments. Typically coverage cannot be assessed for surveys that utilize commercial databases of establishments because no gold standard list exists outside of the Census Bureau. However, this study was able to utilize a state-based list that served as a proxy gold standard. As a part of a design study for the Bureau of Justice Statistics, we developed and evaluated a frame of contract security firms for the National Private Security Survey. This frame combined lists of private security industry companies from D&B and InfoUSA into a single “superframe.” Frame quality was evaluated through the percentage of establishments that were incorrectly classified as security companies and the number of duplicate companies. Over 81% of superframe companies were unique companies and were correctly classified. Coverage of the target population was evaluated using licensing lists from 17 states that required contract security firms to be licensed. We discuss the difficulties of comparing lists from multiple sources, our attempts to resolve these difficulties, and the calculation of a coverage rate.

Key Words: coverage, frame quality

1. Introduction

Private security is essential to ensuring the security and safety of persons and property, as well as intellectual property and sensitive corporate information. Private security officers are responsible for protecting many of the nation’s institutions and critical infrastructure systems, including industry and manufacturing, utilities, transportation, and health and educational facilities.

Some components of the private security industry have not been studied in detail, whereas others have been studied but the existing data are either inconsistent or outdated (Strom et al. 2010). Given this, there is a clear need for additional research in this area.

In order to expand its data collections on law enforcement to include private security statistics, the Bureau of Justice Statistics (BJS), the statistical arm of the U.S. Department of Justice, conducted the design phase for the National Private Security Survey (NPSS). If implemented, the NPSS would provide much-needed information on various types of security professionals, including contract security officers, employers of proprietary security officers and the officers themselves, and contract security personnel who provide primary functions other than guarding.

One of the primary goals of the design phase of the NPSS was to develop and evaluate a sampling frame of private security companies. Developing a sampling frame of private security firms is challenging because no gold standard list of companies exists outside of the Census Bureau’s business register, and the resources available to evaluate the sampling frame are limited. Therefore, our analysis focused on two key questions:

1. What are the potential frame sources that would allow inference to the target population?

2. What is the quality and coverage of the frames that meet our inference requirements?

Our findings from these two questions helped inform our recommendation to BJS on how the frame for any implementation of the NPSS should be constructed.

2. Target Population and Sampling Population

While private security officers work in a wide range of industries, the target population for this analysis was restricted to contract security companies whose primary function is to provide security guard services, that have been in business for 4 or more years, and that have five or more total employees, and the private security officers that work for them. We determined, based on past experience with establishment surveys, that sampling companies that have been in business for less than 4 years or have fewer than five employees would require significant effort for relatively small returns. According to the Bureau of Labor Statistics (2012), only half of new businesses survive for 4 or more years. Because one goal of the NPSS is to track estimates over time, we want to create a target population that will remain relatively stable, allowing comparisons to be made about changes in private security rather than changes in the composition of the target population. Furthermore, companies with few employees pose several data collection challenges that increase cost. These companies are more likely to be ineligible due to frame error and are less likely to participate.

Although making these restrictions on the target population increases the efficiency of the sample design, it does not greatly diminish the coverage of the private security population. Based on counts from the Dun & Bradstreet (D&B) frame, restricting the target population based on the size of the company and the number of years in business reduces the coverage of companies to 33.1% of all contract security firms; however, it reduces the coverage of private security officers only to 88.7% of all private security officers (i.e., there are a lot of small companies, but they employ a relatively small number of the private security officers employed in the U.S.).

The sampling population for this study is all companies whose primary industry is North American Industry Classification System (NAICS) code 561612 (security guard and patrol services) and that have been in business for 4 or more years, have five or more employees, are not identified as a branch location, and are not considered a subsidiary.
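The sampling population definition above can be read as a record-level filter. The following is a minimal sketch, assuming a hypothetical record layout; the field names and sample records are invented for illustration and do not reflect any vendor's actual file format.

```python
from dataclasses import dataclass

GUARD_NAICS = "561612"  # security guard and patrol services


@dataclass
class FrameRecord:
    # Hypothetical field names; real vendor files use their own layouts.
    name: str
    naics: str
    years_in_business: int
    employees: int
    is_branch: bool
    is_subsidiary: bool


def in_sampling_population(rec: FrameRecord) -> bool:
    """Apply the five sampling-population conditions stated in the text."""
    return (
        rec.naics == GUARD_NAICS
        and rec.years_in_business >= 4
        and rec.employees >= 5
        and not rec.is_branch
        and not rec.is_subsidiary
    )


# Invented records, one failing each of several conditions.
records = [
    FrameRecord("Acme Guard Services", "561612", 10, 40, False, False),
    FrameRecord("New Patrol LLC", "561612", 2, 12, False, False),      # too new
    FrameRecord("Acme Guard Branch", "561612", 10, 8, True, False),    # branch
    FrameRecord("Tiny Watch Co", "561612", 6, 3, False, False),        # too small
]

eligible = [r.name for r in records if in_sampling_population(r)]
print(eligible)
```

Only the first record satisfies every condition; each of the others is dropped by exactly one rule.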

3. Determining Potential Frame Sources

The NPSS sampling frame needed to accommodate the goals of the study while maintaining reasonable levels of coverage and efficiency, and controlling costs. We evaluated several potential frames based on data quality, accessibility of data, and the potential to accommodate the sampling goals.

To compare potential frame sources, a set of evaluation factors was developed based on measures that could affect the quality of the data (data sources, data updates, and coverage of target population), the ease of accessibility of the data (whether or not the source would allow us to draw our own samples), and the potential to accommodate various sampling approaches. These factors included the following:

• whether the frame was organized by the NAICS or Standard Industrial Classification (SIC) system,

• whether the frame would allow us to subset to only headquarters and single location companies (i.e., exclude branches),

• whether the frame allowed for stratification by company size, and

• whether the frame provided detailed contact information for sampled companies.

Potential sources were identified through the literature review, by a review of frames from other establishment studies, and extensive Web searches. The final set of sources that were considered were Dun & Bradstreet (D&B), InfoUSA, Experian, American Business Information (ABI), Mailing Lists Direct, Martin Worldwide, Ward’s Business Directory, and state licensing board lists.

Except for the state licensing boards, all the frame sources are commercial companies that maintain databases covering all industries. The state licensing boards maintain and either publish or release lists of licensed security firms in the state. State licensing board lists are only available in the states that license contract security firms and are willing to release this information. In addition, these lists contain varying levels of contact information for licensed companies, and most do not include the size of the business or the number of years licensed. Therefore, licensing boards do not have adequate coverage of the target population. However, because they provide a comprehensive list of contract security guarding companies in states that furnish licensing lists, they can be used to evaluate the coverage of a potential sampling frame (see Section 4.3 for further information).

Most of the commercial vendors considered as potential frame sources for the NPSS obtained national lists of companies from similar sources, including telephone books, credit reports, public records, and annual reports. Database updates ranged from bimonthly to annually. Update frequency is an important consideration when selecting a frame: the most up-to-date lists are less likely to contain companies that are no longer in business and more likely to contain newly formed companies.

The biggest difference between the commercial sources is their ability to accommodate various sampling approaches. Of all of the frames considered, only D&B, InfoUSA, and Ward’s Business Directory allow for sampling by NAICS code. Sampling by NAICS rather than SIC code is more desirable since the NAICS industries are more clearly defined and since they coincide with BLS and Census published estimates. Additionally, most sources would allow us to subset to headquarters and single location companies only. This allows for the exclusion of branch offices from the frame. With the exception of Ward’s Business Directory, all of the commercial sources allow for stratification by company size. Furthermore, all commercial sources can provide detailed contact information for sampled companies.

Based on an evaluation of the frame sources, the two most appropriate vendors for this study’s sampling frame are D&B and InfoUSA. These two vendors allow for sampling at the NAICS industry level. They also allow exclusion of branch locations, companies with fewer than five employees, companies with fewer than 4 years in business, and companies outside of the target industries. Data from these sources are nationally representative and have been used as sampling frames in other national establishment studies such as the U.S. Department of Labor’s O*NET Data Collection Program. Because of the speed with which businesses open and close, no frame, commercial or otherwise, can perfectly cover the target population. All frames will contain under coverage (i.e., businesses in the target population, but not identified as such on the frame), over coverage (i.e., businesses not in the target population, but identified as such on the frame), and duplicate businesses (i.e., multiplicities). In the case of the NPSS, examples of these types of frame error include a misspecified number of employees or years in business, or the same company being listed under two different names. In order to minimize frame error, we considered three potential frame options:

1. D&B only

2. InfoUSA only

3. A ‘superframe’ that takes the union of the D&B and InfoUSA frames

If the overlap between the D&B and InfoUSA frames is small, a superframe would be the best option for the NPSS, as it would increase the coverage of the target population while introducing few multiplicities when combining the frames. However, if the overlap is large, it would not be cost effective or efficient to combine the frames. A complex matching process would be required and multiplicities would be introduced with little coverage gain. Our evaluation allowed us to select the most appropriate sampling frame for the NPSS based on frame quality, the overlap between the two frames, and frame coverage.

4. Frame Evaluation Methods

We evaluated the quality and coverage of the potential NPSS frame sources. From D&B and InfoUSA we purchased the list of companies whose primary industry was contract security guarding. We limited the lists to the 18 states¹ for which we expected to obtain state licensing lists with detailed contact information (name, address, and phone number). The comparison of the commercial lists and the state licensing lists would be used to evaluate frame coverage.

4.1 Frame Quality Evaluation

Because company frames are known to have some misclassifications (i.e., companies classified in one industry that are actually in another), the business names of the purchased companies were evaluated and classified based on their eligibility for the study:

1) Guard Companies

2) Non-Guard Security Companies

3) Non-Security Companies

4) Duplicates

Guard companies were those with names that were either clearly contract-guard companies or where their primary industry could not be determined based on the business name. Non-guard security companies were those with names associated with detective agencies, armored car companies, or security systems companies. Non-security companies were those whose names indicated that they were not in the security industry (e.g., Sally’s Beauty Salon).

Duplicates were identified by removing all blanks and hyphens from the business names and then evaluating which companies had the exact same business name. While this approach catches most of the potential duplicates, it misses duplicate companies with abbreviations in one entry and the name spelled out in another, or companies that are spelled slightly differently in different records on the frame.
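The duplicate check described above can be sketched as follows. This is an illustration only: the company names are invented, and the case-folding step is an added assumption (the text mentions only the removal of blanks and hyphens).

```python
from collections import defaultdict


def name_key(name: str) -> str:
    """Strip blanks and hyphens, as described in the text.

    Upper-casing is an assumption added here so that differences in
    capitalization do not hide an otherwise exact match.
    """
    return name.replace(" ", "").replace("-", "").upper()


def find_duplicates(names):
    """Group names that share the same normalized key."""
    groups = defaultdict(list)
    for n in names:
        groups[name_key(n)].append(n)
    return [g for g in groups.values() if len(g) > 1]


# Invented example names.
names = [
    "Safe-Guard Security",
    "SafeGuard Security",
    "Safe Guard Security",
    "Safeguard Sec.",        # abbreviation: not caught
    "Sentinel Patrol Inc",
    "Sentinal Patrol Inc",   # misspelling: not caught
]

dups = find_duplicates(names)
print(dups)
```

The first three names collapse to one key and are flagged. The abbreviated and misspelled entries are not, which is the limitation noted in the text.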

In a true implementation of the study, both guarding companies and non-guard security companies would be fielded. However, companies that are clearly non-security companies would not be fielded. Duplicates would be removed from the frame to reduce multiplicities.

After classifying the companies on the D&B and InfoUSA frames into these four categories, we applied the distribution based on the 18 states evaluated to the national frame count to obtain national totals of the number of companies we would expect in each category. We then compared the quality of the two potential frames based on these rates.
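As a rough illustration of this projection, the sketch below applies the Source A distribution reported in Section 5.1 (69% guard, 13% non-guard security, 9% non-security, 9% duplicates) to the approximate national count of 4,000 companies. Because the paper reports rounded percentages and approximate totals, the resulting counts are illustrative rather than the authors' exact figures, and the function name is hypothetical.

```python
def project_to_national(national_total, state_shares):
    """Apply the category shares observed in the evaluated states
    to the national frame count."""
    return {cat: round(national_total * share)
            for cat, share in state_shares.items()}


# Rounded Source A shares and approximate national total from Section 5.1.
source_a_shares = {
    "guard": 0.69,
    "non_guard_security": 0.13,
    "non_security": 0.09,
    "duplicate": 0.09,
}

national = project_to_national(4000, source_a_shares)

# Guard and non-guard security companies would be fielded.
fieldable = national["guard"] + national["non_guard_security"]
print(national, fieldable)
```

The fieldable share (guard plus non-guard security) comes to 82% of the projected total, which agrees with the eligibility rate reported for Source A.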

4.2 Frame Overlap

After evaluating frame quality separately for each source, we combined the D&B and InfoUSA lists into a single superframe for the states in the evaluation. The amount of overlap indicates whether a single or dual frame would be most appropriate for this study. While combining two frames that are not entirely overlapping increases coverage of the target population, it also increases inefficiencies by introducing duplicate companies and ineligibles from both frames. An extensive matching procedure and frame cleaning process would be necessary to remove the duplicates and ineligibles, so the added coverage benefits would need to be substantial to justify the increased costs of creating a superframe. If the overlap between the frames is significant (e.g., 80% to 90%), a single frame should be selected because the addition of the second frame would not add much additional coverage while increasing costs significantly. However, if the overlap is small, a dual frame should be considered because neither frame alone has adequate coverage of the target population.

Matching was done independently by each source, and then discrepancies in the matching were resolved manually. Companies were matched based on the name, address, and phone number provided by each source. Companies were matched to the entire frame (not just those in the guarding industry) so that differences in classification could be evaluated as well. If either source indicated that the company was eligible based on industry, size, or years in business, the company was included on the superframe, regardless of how it was classified on the other frame. Companies identified as non-security companies or duplicates in the frame quality evaluation were excluded from the superframe. After combining the sources, we evaluated the level of overlap between the frames and determined whether a single frame source could be used or whether the superframe would be required.
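The inclusion rule in this paragraph can be sketched as a small predicate. The record structure, field names, and example companies are hypothetical; the sketch assumes matching has already produced one row per distinct company.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchedCompany:
    # Hypothetical layout: one row per distinct company after matching.
    name: str
    eligible_a: Optional[bool]  # eligibility per Source A; None if absent
    eligible_b: Optional[bool]  # eligibility per Source B; None if absent
    quality_class: str          # "guard", "non_guard_security",
                                # "non_security", or "duplicate"


def on_superframe(c: MatchedCompany) -> bool:
    """Keep a company if either source says it is eligible, unless the
    frame quality review flagged it as non-security or a duplicate."""
    if c.quality_class in {"non_security", "duplicate"}:
        return False
    return bool(c.eligible_a) or bool(c.eligible_b)


# Invented examples.
companies = [
    MatchedCompany("Alpha Guard", True, False, "guard"),      # eligible on A only
    MatchedCompany("Beta Patrol", None, True, "guard"),       # found on B only
    MatchedCompany("Gamma Salon", True, True, "non_security"),
    MatchedCompany("Delta Guard", False, False, "guard"),     # eligible on neither
]

superframe = [c.name for c in companies if on_superframe(c)]
print(superframe)
```

A disagreement between the sources resolves in favor of inclusion, so the superframe errs toward coverage at the cost of possible ineligibles.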

4.3 Frame Coverage

For many establishment surveys it is impossible to assess frame error because no gold standard (i.e., complete) list of companies exists. However, in the case of private security, 39 states require that all contract security guard or patrol companies be licensed by the state. Thirty-three of these states provide a list of these companies through their state website or through an information request. Since all companies operating in these states are required to be licensed, these lists represent the complete set of security guard and patrol companies legally operating in those states. These lists can be used as a gold standard comparison to a frame source that has all the necessary information to conduct data collection.

Frame coverage was evaluated by matching the sampling frame to state licensing lists for 17 states². State licensing lists were first cleaned by removing duplicates (based on company name), companies with expired or inactive licenses, branch offices, and non-guard security companies. Because the size and years in business for companies on the state licensing lists were not provided and could not be determined, we had to evaluate superframe coverage of companies with any number of employees or years in business (not just those in the target population). Because of the error inherent in this matching process, the match rate between the licensing lists and the superframe was treated as a lower bound for the coverage of the superframe. An upper bound for the coverage of the superframe was calculated based on the frame counts.

5. Frame Evaluation Results

5.1 Frame Quality Evaluation

Frame Source A³ contained approximately 4,000 companies in the target population (i.e., five or more employees and 4 or more years in business⁴). Approximately 65% (2,600) of these companies were contained in the 18 states used in the frame quality evaluation. As discussed in Section 4.1, these companies were evaluated and classified based on their eligibility for the study: guard companies, non-guard security companies, non-security companies, and duplicate companies. The distribution of these 2,600 companies by classification is presented in Figure 1 below. In a true implementation of the study, 82% of companies on the Source A frame would be eligible to be fielded. The duplicates and non-security companies would be excluded prior to sample selection.

² NY was included in the frame quality evaluation but not the coverage evaluation due to the availability of the state licensing list.

³ The identities of the D&B and InfoUSA frames are masked to avoid direct comparisons between the sources.

Figure 1. Distribution of Source A Companies by Eligibility Status

Guard: 69%
Non-guard security: 13%
Non-security: 9%
Duplicates: 9%

Frame Source B contained approximately 3,100 companies in the target population, 2,000 of which were contained in the 18 states used in the frame quality evaluation. Like Source A, these companies were evaluated and classified into the four eligibility categories. The distribution of these 2,000 companies by classification is presented in Figure 2 below. In a true implementation of the study, 88% of companies on Source B would be eligible to be fielded. The duplicates and non-security companies would be excluded prior to sample selection.

Figure 2. Distribution of Source B Companies by Eligibility Status

Guard: 76%
Non-guard security: 12%
Non-security: 8%
Duplicates: 4%

5.2 Frame Overlap

After evaluating the quality of the D&B and InfoUSA frames individually, we then combined the frames and evaluated the level of overlap, as discussed in Section 4.2. At the national level, the superframe is the union of the Source A companies and the Source B companies identified in the frame evaluation as either guarding or non-guard security. To determine the level of overlap between the frames, we matched D&B companies in the target population to the entire InfoUSA frame, and InfoUSA companies in the target population to the entire D&B frame. For each company on the D&B frame that matched to the InfoUSA database, the two frames could either agree or disagree on the industry classification (i.e., guard vs. non-guard) and on the eligibility classification based on the number of employees and years in business. Using the match results, we classified each company into one of five categories:

• Source B only (category 1)

• Source B classified as guard and Source A classified as non-guard (category 2)

• Source B and Source A classified as guard (category 3)

• Source A classified as guard and Source B classified as non-guard (category 4)

• Source A only (category 5)

The distribution of companies across the five categories is depicted in Figure 3. The superframe contains approximately 5,000 guard and non-guard security companies that would be eligible to be sampled. Of the guard and non-guard security companies, 61% of the companies on the superframe are present in both databases. However, only 27% of superframe companies are classified as guards on both frames. This means that if only companies in the guarding industries were purchased and combined, there would only be a 27% overlap between the two frames.
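The five overlap categories can be expressed as a small classification function. This is a sketch under assumptions: the labels "guard" and "non_guard" and the use of None for a company absent from a source are hypothetical encodings, and the example pairs are invented. On this reading, categories 2 through 4 would correspond to the companies present in both databases, and category 3 alone to those classified as guard on both frames.

```python
from collections import Counter
from typing import Optional


def overlap_category(class_a: Optional[str], class_b: Optional[str]) -> int:
    """Return the overlap category (1-5) for one superframe company.

    class_a / class_b are "guard", "non_guard", or None when the company
    was not found on that source (hypothetical encoding).
    """
    if class_a is None and class_b is None:
        raise ValueError("company must appear on at least one source")
    if class_a is None:
        return 1  # Source B only
    if class_b is None:
        return 5  # Source A only
    if class_a == "guard" and class_b == "guard":
        return 3  # guard on both
    if class_b == "guard":
        return 2  # guard on B, non-guard on A
    if class_a == "guard":
        return 4  # guard on A, non-guard on B
    raise ValueError("company is not classified as guard on either source")


# Invented (Source A, Source B) classifications.
pairs = [
    (None, "guard"),
    ("non_guard", "guard"),
    ("guard", "guard"),
    ("guard", "guard"),
    ("guard", "non_guard"),
    ("guard", None),
]

counts = Counter(overlap_category(a, b) for a, b in pairs)
in_both = counts[2] + counts[3] + counts[4]
print(dict(counts), in_both)
```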

Figure 3. Distribution of D&B and InfoUSA Companies by Frame Classification

5.3 Frame Coverage

The state licensing lists for the 17 states evaluated were combined to obtain a list of approximately 12,500 licensed contract security companies. The list of state-licensed companies was matched to the D&B and InfoUSA frames based on company name, address, and phone number using D&B’s and InfoUSA’s matching algorithms. For each match, D&B and InfoUSA provided the frame ID number corresponding to each matched state company so that we could map state companies to the superframe.

After obtaining the results of the matching, duplicate and ineligible companies from the state list were removed. Duplicates were identified based on the name of the company and through the D&B and InfoUSA matching process. Ineligible companies included those whose licenses had expired as of the date we obtained the licensing list, those identified either by the states or by D&B or InfoUSA as branches, those that were not active or subject to review, and those that were identified by the states as non-guard security companies. After removing duplicates and ineligible companies, there were a total of approximately 6,500 eligible licensed contract security guarding companies associated with the 17 states.

The superframe associated with these 17 states was composed of approximately 8,400 companies (including guarding companies of all sizes and years in business). After resolving differences between the D&B and InfoUSA match results, a final set of matches from the state lists to the superframe was obtained. The results are presented in Figure 4.

Figure 4. Overlap Between the State Licensing Lists and the Superframe

D&B and InfoUSA identified 1,880 matches between the state licensing lists and the superframe. However, these counts are subject to errors associated with the matching process. Because the companies being matched came from different sources, slight discrepancies in the company name and address (e.g., abbreviations or misspellings) could cause true matches to be missed. For this reason, the coverage calculated based on the matching can be considered a lower bound for the true superframe coverage. This lower bound for the coverage was 28.8% (1,880 / (4,644 + 1,880)).

Because of the potential for matching error, the frame evaluation of the superframe was used to determine which of the approximately 6,500 superframe companies that could not be matched were potentially contract security guard companies listed on the state licensing lists. Companies identified as guarding companies were considered potential matches to the state license list. Those that matched to expired or ineligible state-licensed companies and those that were identified during the evaluation as non-guard security or non-security companies were classified as over coverage. Figure 5 below breaks down the 6,541 companies on the superframe that did not match to the state lists by potential match status.

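Figure 4 values: 4,644 companies on the state licensing lists only, 1,880 matched companies, and 6,541 companies on the superframe only.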

Figure 5. Overlap between the State Licensing Lists and the Superframe by Potential Match Status

Based on the frame evaluation, 4,387 of the 6,541 superframe companies that could not be matched to the state license list are potentially security guard companies that correspond to the 4,644 unmatched companies on the state license lists, but failed to do so due to matching error. If all 4,387 companies truly match to the state licensing lists, the superframe coverage would be 96.1% ((1,880 + 4,387) / (1,880 + 4,644)). This can be considered the upper bound for the superframe coverage in the 17 states evaluated. Therefore, superframe coverage of contract security guarding companies is between 28.8 and 96.1%.

Even without any matching error, there is over coverage on the superframe. At a minimum, the over coverage is 25.6% (2,154 / (1,880 + 6,541)). However, 1,306 of the 2,154 companies (60.6%) that are classified as over coverage would not be in the sampling population due to size or years in business. This suggests that over coverage of the target population (i.e., companies in business 4 or more years that have at least five employees) would be lower than the over coverage of companies of all sizes and years in business.
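The coverage bounds and over coverage rate can be reproduced directly from the counts reported in this section. The sketch below is only a worked check of that arithmetic; the variable names are illustrative.

```python
# Counts reported in Section 5.3.
matched = 1880               # state-list companies matched to the superframe
state_unmatched = 4644       # eligible state-list companies with no match
superframe_unmatched = 6541  # superframe companies with no match
potential_matches = 4387     # unmatched superframe companies judged to be guard firms
over_coverage = 2154         # unmatched superframe companies classed as over coverage
over_cov_outside_pop = 1306  # over coverage excluded by size or years in business


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


state_total = matched + state_unmatched
superframe_total = matched + superframe_unmatched

lower_bound = pct(matched, state_total)
upper_bound = pct(matched + potential_matches, state_total)
over_coverage_rate = pct(over_coverage, superframe_total)
outside_pop_share = pct(over_cov_outside_pop, over_coverage)

print(lower_bound, upper_bound, over_coverage_rate, outside_pop_share)
# → 28.8 96.1 25.6 60.6
```

The potential matches and the over coverage together account for all 6,541 unmatched superframe companies (4,387 + 2,154), so the two groups partition the unmatched portion of the superframe.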

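Figure 5 values: 4,644 unmatched companies on the state licensing lists, 1,880 matched companies, 4,387 potential matches on the superframe, and 2,154 superframe companies classified as overcoverage.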

6. Conclusions

If the NPSS is implemented, we recommend developing a superframe based on the combination of the D&B and the InfoUSA frames. The overlap between the D&B and InfoUSA frames is quite low, so the superframe provides a significant increase in coverage compared to the D&B frame alone or the InfoUSA frame alone.

We recommend purchasing the entire D&B and InfoUSA frames for the industries of interest for the defined target population. The D&B list will then be matched against the entire InfoUSA database and the InfoUSA list will be matched against the entire D&B database in order to correctly define the overlap between the frames and remove duplicates. After combining the frames, the superframe will be cleaned extensively to identify potential misclassifications within security frames and companies that are not part of the security industry.

Unfortunately, due to the inherent error in the matching procedure, a definitive coverage rate could not be determined for the superframe. Instead, we established upper and lower bounds for coverage based on the 17-state evaluation. Better matching techniques are necessary not only to establish a reliable coverage estimate, but also to minimize multiplicities on the superframe when combining the D&B and InfoUSA frames.

Acknowledgements

The authors would like to acknowledge Lynn Langton and Brian Reaves of the Bureau of Justice Statistics, U.S. Department of Justice, for their contributions to this research. The authors would also like to acknowledge RTI staff members Susan Kinsey and Amanda Lewis-Evans.

References

Bureau of Labor Statistics. (2012). Survival of private sector establishments by opening year. Retrieved from http://www.bls.gov/bdm/us_age_naics_00_table7.txt

Strom, Kevin, Marcus Berzofsky, Bonnie Shook-Sa, Kelle Barrick, Crystal Daye, Nicole Horstmann, and Susan Kinsey. 2010. “Private Security Industry: A Review of the Definitions, Available Data Sources, and Paths Moving Forward.” National Criminal Justice Reference Service. Retrieved from
