Analyzing Qualitative Data with Computer Software

Eben A. Weitzman

Objective. To provide health services researchers with an overview of the qualitative data analysis process and the role of software within it; to provide a principled approach to choosing among software packages to support qualitative data analysis; to alert researchers to the potential benefits and limitations of such software; and to provide an overview of the developments to be expected in the field in the near future.

Data Sources, Study Design, Methods. This article does not include reports of empirical research.

Conclusions. Software for qualitative data analysis can benefit the researcher in terms of speed, consistency, rigor, and access to analytic methods not available by hand. Software, however, is not a replacement for methodological training.

Key Words. Qualitative research, qualitative data analysis, software, CAQDAS, mixed-methods research

As health services researchers increasingly turn to qualitative and mixed (qualitative and quantitative) research methods, there is increasing interest in finding and using new tools and methods for the analysis of qualitative data. As recently as the mid-1980s, most qualitative researchers were carrying out the mechanics of their analyses by hand: typing up field notes and interviews, photocopying them, marking them up with markers or pencils, cutting and pasting the marked segments onto file cards, sorting and shuffling the cards, and typing up their analyses. Some were beginning to use word processors for the typing work, and just a few were beginning to experiment with database programs for storing and accessing their text. Most qualitative methods textbooks at the time (e.g., Bogdan and Biklen 1982; Goetz and LeCompte 1984; Lofland and Lofland 1984; Miles and Huberman 1984) made little, if any, reference to the use of computers. A couple of programs designed specifically for the analysis of qualitative data were just beginning to appear (Drass 1980; Seidel and Clark 1984; Shelly and Sibert 1985).

HSR: Health Services Research 34:5 Part II (December 1999)

But the landscape has changed dramatically. There has been an outpouring of journal articles, a series of international conferences on computers and qualitative methodology, thoughtful books on the topic (Fielding and Lee 1991, 1998; Kelle 1995; Tesch 1990; Weitzman and Miles 1995b), and special journal issues (Mangabeira 1996; Tesch 1991). At the time the late Matt Miles and I wrote Computer Programs for Qualitative Data Analysis (Weitzman and Miles 1995b), we reviewed no fewer than 24 different programs that were useful for analyzing qualitative data. Half of those programs had been developed specifically for the analysis of qualitative data, while the other half had been developed for more general-purpose applications such as text search and storage. Since then, the field has continued to grow rapidly. Programs are being revised at a regular rate, and new programs appear on the scene at the rate of one or two a year. Yet the software varies widely, and it remains very much the case that there is no one best program.

In this context, a researcher needs to be able to make a principled choice of software: one that matches the capabilities of the software to the specific needs of the researcher and the project. To that end, this article provides an overview of the range of uses of computer software in qualitative research; describes the major types of software available; provides an approach to needs assessment that can match software choice to specific researchers and projects; addresses some common questions about whether and why researchers should use software; and lays out some of the future developments that are either hoped for or clearly on the horizon.

THE ROLE OF COMPUTERS IN QUALITATIVE RESEARCH

In order to be able to discuss the role of computers in qualitative research, it will be helpful to identify the major steps in the analysis process. The model in Figure 1, and the description that follows, are intended as a general representation of those steps, one that sees many and wide variations in practice, and not as any sort of prescriptive model.

Portions of this article are adapted from Miles and Weitzman (1996) and Weitzman and Miles (1995b). Most of the specific programs alluded to here are reviewed in detail (sometimes in slightly earlier versions) in Weitzman and Miles (1995b).

Address correspondence to Eben A. Weitzman, Ph.D., Assistant Professor, Graduate Programs in Dispute Resolution, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, MA 02125. This article, submitted to Health Services Research on February 8, 1999, was revised and accepted for publication on June 24, 1999.

THE ANALYSIS PROCESS: FROM RESEARCH QUESTIONS TO CONCLUSIONS

In general, researchers begin with a set of research questions and move toward reaching conclusions. Data are collected in order to answer the research questions, and in qualitative studies the data are often voluminous. The researcher then faces the task of somehow reducing the data into a form in which it can be examined for patterns and relationships.

In most approaches, the researcher will go about this through developing some sort of coding scheme: a set of tags or labels representing the conceptual categories into which to sort the data. These may be developed either a priori from the conceptual framework driving the study, inductively as the analysis proceeds and the analyst begins to identify issues in the data, or by some combination of the two. Segments of the data are marked with relevant codes (coded).

In many cases, researchers write memos as they code, recording emerging ideas and early conclusions about both theory and methods. As insights accrue, it often becomes useful to search back through the data for places where specific words or phrases are used, and to locate related phenomena in the text, in order to both code these new chunks and check the validity of emerging conclusions. It may sometimes be useful to create pointers (or links) between different places in the text where the same issues arise, as, for example, when in one interview a patient describes an episode that is elsewhere also described by his or her caregiver. This whole process will in many cases give rise to modification of any a priori coding scheme, and many researchers follow an iterative process, making repeated coding passes through the data.

Figure 1: From Research Questions to Conclusions

[Flow diagram. Elements: Research Questions; Conceptual Framework; Data Collection; Coding Scheme; Code Chunks; Memoing; Word Searching; Data Linking; Retrieve Chunks; Reduce Data; Enter in Displays (an example matrix of the conditions Commitment, Understanding, Mastery, etc., by Users and Administrators, and an example network diagram); Draw Conclusions/Verify. Copyright © 1995 by Eben A. Weitzman and Matthew B. Miles.]

The next step in reducing the data is often to retrieve the chunks of text associated with particular codes, reading these passages to refine the understanding of that conceptual category. It may also be desirable to retrieve text according to combinations of codes: for example, to see where a concept having to do with a particular caregiver attitude, say, "empathizes with patient," coincides with a particular context, say, "extended one-on-one caregiver-patient contact." The researcher would then be able to read all of the chunks (text passages) where both of these codes had been applied, to see what, if any, relationship might exist between the two. Depending on the nature of the study, it might also be desirable to identify all of the cases where both of these codes apply, even if the two concepts do not happen to emerge from the same text chunk.
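The two kinds of retrieval just described, by codes that coincide in the same chunk and by codes that coincide anywhere in the same case, can be sketched in a few lines of Python. This is only an illustration of the logic, not a description of how any of the programs discussed in this article works; the Segment class, the function names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A chunk of text from one case, tagged with zero or more codes."""
    case: str
    text: str
    codes: set = field(default_factory=set)


def retrieve(segments, *codes):
    """Return the segments to which every one of the given codes was applied."""
    wanted = set(codes)
    return [s for s in segments if wanted.issubset(s.codes)]


def cases_with_all(segments, *codes):
    """Return the cases in which every code appears somewhere,
    not necessarily in the same segment."""
    wanted = set(codes)
    seen = {}
    for s in segments:
        seen.setdefault(s.case, set()).update(s.codes)
    return sorted(case for case, found in seen.items() if wanted.issubset(found))


# Hypothetical data; the two code names follow the example in the text.
EMPATHY = "empathizes with patient"
CONTACT = "extended one-on-one caregiver-patient contact"

segments = [
    Segment("patient_01", "She sat with me for an hour and really listened.",
            {EMPATHY, CONTACT}),
    Segment("patient_02", "The nurse seemed to understand how scared I was.",
            {EMPATHY}),
    Segment("patient_02", "He stayed through the whole night shift.",
            {CONTACT}),
    Segment("patient_03", "Visits were short and rushed."),
]

both = retrieve(segments, EMPATHY, CONTACT)
print([s.case for s in both])                      # same-chunk co-occurrence
print(cases_with_all(segments, EMPATHY, CONTACT))  # same-case co-occurrence
```

In this made-up data set, only one chunk carries both codes, but two cases do, which is exactly the distinction drawn in the paragraph above.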

In this way, the researcher begins to be able to write summaries of the main conceptual issues that appear in the data. These preliminary write-ups have the virtue of being much smaller in physical length than the original transcripts, making them that much more accessible to the researcher's scrutiny. They also have the disadvantage of being one step removed from the original data. It is therefore important for the researcher to have the ability to examine and re-examine the underlying data as analysis and write-up proceed, in order to continually check interpretations against the data. Finally, many researchers enter their summaries into displays, for example, text matrices or network diagrams, that aid in identifying patterns and relationships in the data (Miles and Huberman 1994).

In Figure 1, the displays are offered as an example, and the categories are adapted from the work of Huberman and Miles (Huberman and Miles 1983, 1984; Miles and Huberman 1994). The example matrix is composed of rows representing several conditions for success of a school innovation (commitment, understanding, mastery) and columns representing some key stakeholder groups: users of the innovation (teachers) and the school administrators. The cells of the matrix would be filled with summaries of each stakeholder group's views of each condition, allowing the researcher to look for patterns across the data. The network diagram, similarly, shows the researcher's representation of the connections among the different conditions.
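As a rough illustration of the matrix display just described, the following Python sketch fills a conditions-by-stakeholder grid with analyst-written summaries and prints it for inspection. The row and column labels come from the example above; the cell summaries, the function names, and the "(no data yet)" placeholder are invented for the purpose of the sketch.

```python
conditions = ["commitment", "understanding", "mastery"]
groups = ["users", "administrators"]

# Hypothetical analyst-written summaries, keyed by (condition, group).
summaries = {
    ("commitment", "users"): "High after the first term",
    ("commitment", "administrators"): "Strong from the start",
    ("understanding", "users"): "Partial; training was brief",
    ("mastery", "users"): "Growing with practice",
}


def build_matrix(rows, cols, cells, empty="(no data yet)"):
    """Return one list of cell texts per row, in row and column order."""
    return [[cells.get((r, c), empty) for c in cols] for r in rows]


def render(rows, cols, matrix):
    """Format the matrix as aligned plain text for inspection."""
    table = [[""] + cols] + [[r] + cells for r, cells in zip(rows, matrix)]
    widths = [max(len(row[i]) for row in table) for i in range(len(cols) + 1)]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    )


matrix = build_matrix(conditions, groups, summaries)
print(render(conditions, groups, matrix))
```

Empty cells are shown explicitly rather than hidden, since gaps in a display are themselves something the analyst may want to notice.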

From these summaries of the data, which may now exist in memos, code "definitions," mini write-ups, and/or displays, the researcher draws conclusions. Depending on the methodology being employed, the researcher may then embark on a new round of searching through the data to verify whether the conclusions reached are in fact supported by the data. And, finally, a report is produced.

In closing this discussion, it is important to emphasize that it is not intended as a substitute for methodological training, but as a generalized and quite simplified description. The researcher embarking on his or her first qualitative study will need to seek out training and readings on the methodology involved in both data collection and analysis in order to fully understand the content and process of the steps alluded to.

USES OF COMPUTER SOFTWARE IN QUALITATIVE ANALYSIS

Perhaps obviously, many of the tasks just described can be greatly assisted by the judicious use of computers. Computers can help in virtually all phases of this process, from taking notes to writing reports. For example, computers can be used for:

1. Making notes in the field;

2. Writing up or transcribing field notes;

3. Editing: correcting, extending, or revising field notes;

4. Coding: attaching key words or tags to segments of text to permit later retrieval;

5. Storage: keeping text in an organized database;

6. Search and retrieval: locating relevant segments of text and making them available for inspection;

7. Data "linking": connecting relevant data segments to each other to form categories, clusters, or networks of information;

8. Memoing: writing reflective commentaries on some aspect of the data as a basis for deeper analysis;

9. Content analysis: counting frequencies, sequences, or locations of words and phrases;

10. Data display: placing selected or reduced data in a condensed, organized format, such as a matrix or network, for inspection;

11. Conclusion-drawing and verification: aiding the analyst to interpret displayed data and to test or confirm findings;

12. Theory-building: developing systematic, conceptually coherent explanations of findings and testing hypotheses;

13. Graphic mapping: creating diagrams that depict findings or theories; and

14. Report-writing: interim and final (adapted from Miles and Huberman 1994: 44).

But there is no one program that will do all of these things best. Therefore, the best choice will depend on your project, your methodology, your own style of analysis and personal preferences, and your team if you have one. Further on you will find a set of questions to guide you through an assessment of your needs. First, however, it will be helpful to offer a typology of software types.

SOFTWARE TYPE FAMILIES

This is a rough sorting of available software into types. There is naturally quite a bit of overlap among categories, with individual programs having functions that would seem to belong to more than one type. However, it is possible to focus on the "heart and soul" of a program: what it mainly is intended for. This categorization scheme was first presented by Weitzman and Miles.


Text Retrievers

Text retrievers specialize in finding all instances of words and phrases in text, in one or several files. They typically allow you to search for places where two or more words or phrases coincide within a specified distance (a number of words, sentences, pages, etc.) and allow you to sort the resulting passages into different output files and reports. They may do other things as well, such as content analysis functions like counting, displaying keywords in context, or creating concordances (organized lists of all words and phrases in their contexts), or they may allow you to attach annotations or even variable values (for things like demographics or source information) to points in the text. Examples of text retrievers are Sonar Professional, The Text Collector, and ZyINDEX.
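To make the ideas of a proximity search and a keyword-in-context listing concrete, here is a minimal Python sketch. It is not how Sonar Professional, The Text Collector, or ZyINDEX is implemented; the function names, the simple word tokenizer, and the sample field note are assumptions made for illustration, and distance is measured in words only.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def proximity_hits(text, word_a, word_b, max_distance):
    """Return (i, j) token positions where word_a and word_b occur
    no more than max_distance words apart."""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == word_a]
    pos_b = [j for j, t in enumerate(tokens) if t == word_b]
    return [(i, j) for i in pos_a for j in pos_b
            if max_distance >= abs(i - j)]


def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of keyword with `window` words on either side."""
    tokens = tokenize(text)
    lines = []
    for i, t in enumerate(tokens):
        if t == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{keyword}] {right}".strip())
    return lines


# Hypothetical field note.
note = ("The nurse explained the medication schedule. Later the patient "
        "said the nurse had not mentioned side effects of the medication.")

print(proximity_hits(note, "nurse", "medication", 4))
for line in keyword_in_context(note, "medication", window=2):
    print(line)
```

Counting words is the simplest unit of distance; a real text retriever would also need to handle sentences, pages, and multi-word phrases.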

Textbase Managers

Textbase managers are database programs specialized for storing text in more or less organized fashion. They are good at holding text together with information about it, and at allowing you to quickly organize and sort your data in a variety of ways, and retrieve it according to different criteria. Some are better suited to highly structured data that can be organized into "records" (i.e., specific cases) and "fields" (variables: information that appears for each case), while others easily manage "free-form" text. They may allow you to define fields in the fixed manner of a traditional database such as Microsoft Access® or FileMaker Pro®, or they may allow significantly more flexibility, for example, allowing different records to have different field structures. Their search operations may be as good as, or sometimes even better than, those of some text retrievers. Examples of textbase managers are askSam, Folio Views, Idealist, InfoTree32 XT, and TEXTBASE ALPHA.

Code-and-Retrieve Programs

Code-and-retrieve programs are often developed by qualitative researchers specifically for the purpose of qualitative data analysis. The programs in this category specialize in allowing the researcher to apply category tags (codes) to passages of text, and later retrieve and display the text according to the researcher's coding. These programs have at least some search capacity, allowing you to search either for codes or for words and phrases in the text. They may have a capacity to store memos. Even the weakest of these programs represent a quantum leap forward from the old scissors-and-paper approach: they're more systematic, more thorough, less likely to miss things, more flexible, and much, much faster. Examples of code-and-retrieve programs are HyperQual2, Kwalitan, QUALPRO, Martin, and The Data Collector.

Code-based Theory Builders

Most of these programs are also based on a code-and-retrieve model, but they go beyond the functions of code-and-retrieve programs. They do not, nor would you want them to, build theory for you. Rather, they have special features or routines that go beyond those of code-and-retrieve programs in supporting your theory-building efforts. For example, they may allow you to represent relations among codes, build higher-order classifications and categories, or formulate and test theoretical propositions about the data. They may have more powerful memoing features (allowing you, for example, to categorize or code your memos) or more sophisticated search-and-retrieval functions than do code-and-retrieve programs. They may also offer capabilities for "system closure," allowing you to feed results of your analyses (such as search results or memos) back into the system as data. Examples of code-based theory builders are AFTER, AQUAD, ATLAS/ti, Code-A-Text, HyperRESEARCH, NUD*IST, QCA, The Ethnograph, and winMAX. Two of these programs, AQUAD and QCA, support cross-case configural analysis (Ragin 1987), with QCA being dedicated wholly to this method and not having any text-coding capabilities.

Conceptual Network Builders

These programs emphasize the creation and analysis of network displays. Some of them are focused on allowing you to create network drawings, that is, graphic representations of the relationships among concepts. Examples of these are Inspiration, MetaDesign, and Visio. Others are focused on the analysis of cognitive or semantic networks, for example, the program MECA. Still others offer some combination of the two approaches, for example, SemNet and Decision Explorer. Finally, ATLAS/ti, a program also listed under code-based theory builders, has, in addition, a fine graphical network builder connected to the analytic work you do with your text and codes.

Summary

In concluding our discussion of the five main software family types, it is important to emphasize that functions often cross type boundaries. For example, Folio Views can code and retrieve, and has an excellent text search facility. ATLAS/ti, NUD*IST, The Ethnograph, and winMAX graphically represent the relationships among codes, although among these, only ATLAS/ti allows you to work with and manipulate the drawing. The Ethnograph and winMAX each have a system for attaching variable values (text, date, numeric, etc.) to text files and/or cases. The implication: do not decide too early which family you want to choose from. Instead, stay focused on the functions you need.

CONCEPTUAL ASSUMPTIONS

One concern many researchers have is that a methodological or conceptual approach will be imposed by the software. In fact, developers do bring assumptions, conceptual frameworks, and sometimes even methodological and theoretical ideologies to the development of their products. These have important implications for the impact that using a program will have on your analyses. However, you need not, and in fact should not, be trapped by these assumptions and frameworks: there are often ways of bending a program to your own purposes. For example, a program may allow you to define only hierarchical relations among codes. You might work around this by creating redundant codes in different parts of the hierarchy, or by keeping track of the extra relationships you want to define with memos and network diagrams.
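The redundant-code workaround mentioned above can be pictured with a small Python sketch. It assumes, purely for illustration, that a hierarchy of codes is written as slash-separated paths; none of the programs discussed here necessarily stores codes this way, and the code names, sample passages, and function names are all hypothetical.

```python
# Hypothetical code hierarchy written as slash-separated paths. The code
# "empathy" is deliberately duplicated under two parents: this is the
# redundant-code workaround for a strictly hierarchical code system.
coded_segments = [
    ("She really listened to me.",
     {"caregiver/attitude/empathy", "outcome/trust/empathy"}),
    ("I never knew who my nurse was.",
     {"caregiver/continuity"}),
    ("After that I told them everything.",
     {"outcome/trust"}),
]


def is_under(code, parent):
    """True if code is the parent itself or sits anywhere below it."""
    return code == parent or code.startswith(parent + "/")


def retrieve_under(segments, parent):
    """Return the texts carrying any code at or below the given parent."""
    return [text for text, codes in segments
            if any(is_under(code, parent) for code in codes)]


print(retrieve_under(coded_segments, "caregiver"))
print(retrieve_under(coded_segments, "outcome/trust"))
```

Because the first passage carries the duplicated code, it is retrieved from both branches of the tree. The cost of the workaround is that the analyst must remember to apply both copies of the code every time.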

The fact that developers bring conceptual assumptions to their work is in fact one of the strengths of the field. Many of the developers, particularly of code-and-retrieve and code-based theory-builder programs, are researchers themselves. They have invested enormous intellectual energy in finding the right tools for analyses of different types, and the user can benefit greatly.

It is also true that the design of the software can have an impact on analysis. For example, different programs work with different "metaphors," that is, different ways of presenting the relationships among codes and between codes and text. Kwalitan, NUD*IST, The Ethnograph, and winMAX all allow hierarchical relations among codes. For studies in which you are organizing your conceptual categories hierarchically, these programs offer significant strengths. If you want to represent non-hierarchical relationships, even if you choose to try to work around this metaphor, it may be less comfortable than using a program like ATLAS/ti that explicitly supports more flexible networks. HyperRESEARCH emphasizes the relationship between codes and cases, rather than codes and chunks of text. When you code a chunk of text, you create an entry on what looks like an index card for a particular case. This strongly supports and encourages thinking that stresses case-wise and cross-case phenomena, but it makes it harder to look for and think about relationships among codes within a text.


Finally, Code-A-Text offers a quite different set of coding metaphors: (1) codes arranged into "scales" (you can assign only one code from each scale to a segment of text, which is useful if you want to code your text chunks by making mutually exclusive judgments on a variety of factors); (2) codes automatically assigned according to words in the text; and (3) open-ended "interpretations" you write about each text chunk. Working with this collection of coding metaphors could be expected to lead the analyst to consider his or her text in somewhat different ways than with one of the other metaphors described earlier.
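The first of these metaphors, scales from which a segment can receive only one code, amounts to a simple constraint. The Python sketch below shows one way such a constraint might behave; it is not based on Code-A-Text's actual design, and the class name, scale names, and codes are invented for illustration. In particular, the choice to let a later judgment replace an earlier one on the same scale is an assumption of the sketch.

```python
class ScaleCoding:
    """Codes grouped into scales; a segment holds at most one code per scale."""

    def __init__(self, scales):
        # scales maps a scale name to the collection of codes on that scale.
        self.scales = {name: set(codes) for name, codes in scales.items()}
        self.assigned = {}  # segment id -> {scale name: code}

    def assign(self, segment_id, scale, code):
        """Record a judgment for one segment on one scale."""
        if code not in self.scales[scale]:
            raise ValueError(f"{code!r} is not on scale {scale!r}")
        # Assigning again on the same scale replaces the earlier judgment,
        # so the codes within a scale stay mutually exclusive.
        self.assigned.setdefault(segment_id, {})[scale] = code

    def codes_for(self, segment_id):
        """Return a copy of the scale-to-code judgments for a segment."""
        return dict(self.assigned.get(segment_id, {}))


# Hypothetical scales and codes.
coding = ScaleCoding({
    "affect": {"positive", "neutral", "negative"},
    "topic": {"treatment", "family", "finances"},
})
coding.assign("seg-1", "affect", "neutral")
coding.assign("seg-1", "topic", "family")
coding.assign("seg-1", "affect", "negative")  # replaces "neutral"
print(coding.codes_for("seg-1"))
```

A segment can still carry several codes, one from each scale, which is what makes the scheme useful for judgments on a variety of factors.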

Similar issues exist in the choice of other types of software, such as textbase managers. InfoTree32 XT allows you to arrange texts in a hierarchical tree, and lets you drag texts around to rearrange them. Folio Views also allows you to create a hierarchical outline but gives you the additional capability of creating multiple, non-sequential "groupings" of texts that you can activate any time you wish. Folio Views and askSam let you insert fields whenever and wherever you want in any given record, while InfoTree32 XT requires that you use a standard field set (of your own design) throughout the whole database. Idealist not only lets you customize the field set for each record, but you can also create a variety of record types, each with its own set of fields, and mix them in the same database. Finally, while most of these programs show you just one record at a time, Folio Views shows you a word processor-like view in which records appear one right after the other as paragraphs. Clearly, each of these programs shows you a quite different view of your data, may encourage different ways of thinking about your data, and has different strengths and weaknesses. It is also true that a clever user will be able to bend each of these flexible packages to a wide variety of different tasks, overcoming many of the differences among them.

Each of these assumptions is both a benefit, for some modes of analysis, and also a constraint. The key, then, is not to get trapped by the assumptions of the program. If you are aware of what they are, you can be clever and work around them. The program should serve your analytic needs, goals, and assumptions, not the other way around. Researchers interested in empirical work on the impact of different programs on research are encouraged to refer to Fielding and Lee (1998).

CHOICE ISSUES

Up to this point, this article has stressed repeatedly that a choice must be made carefully to meet the needs of both project and researcher. Such a choice is based on principles such as that the software should be matched to the structure of the data set, to the style and mechanics of the analysis planned, and to the researcher's level of comfort with computers in general.

How then to go about making such a principled, customized choice? Four broad questions, along with two cross-cutting issues, can be asked that represent these principles and should guide the researcher to such a choice (Weitzman and Miles 1995a, b). These guidelines for choice have seen wide use in practice since their original formulation, and have proven to be effective for guiding researchers to appropriate choices.

Specifically, there are four key questions to ask and answer as you move toward choosing one or more software packages:

1. What kind of computer user am I?

2. Am I choosing for one project or the next few years?

3. What kind of project(s) and database(s) will I be working on?

4. What kind of analyses am I planning to do?

In addition to these four key questions, there are two cross-cutting issues to bear in mind:

* How important is it to you to maintain a sense of "closeness" to your data?

* What are your financial constraints when buying software, and the hardware it needs to run on?

With these basic issues clear, you will be able to look at specific programs in a more active, deliberate way, seeing what does or does not meet your needs. (See Table 1, adapted from Weitzman and Miles 1995b, for a worksheet to organize such information. Table 2 provides contact information for the programs mentioned in this article and their demo versions.) You can work your way from answering questions, to the implications of those answers for program choice, to candidate programs. For example, if you are working on a complex evaluation study, with a combination of structured interviews, focus groups, and case studies, you will need strong tools for tracking cases through different documents. You might find good support for this in code structures (particularly programs with highly structured code systems, like NUD*IST, and to a lesser degree in programs with flexible code systems like ATLAS/ti), or through the use of speaker identifiers in programs like The Ethnograph, AFTER, or Code-A-Text.

Table 1: Program Choice Worksheet

Issue | Answer | Implications/Notes | Candidate Program(s)

What Kind of a Computer User Are You? (level 1-4)
Are You Choosing for One Project or the Next Few Years?
Data Sources Per Case: Single vs. Multiple
Single vs. Multiple Cases
Fixed Records vs. Revised
Structured vs. Open
Uniform vs. Diverse Entries
Size of Database
What Kind of Analysis is Anticipated?
Exploratory vs. Confirmatory
Coding Scheme Firm at Start vs. Evolving
Multiple vs. Single Coding
Iterative vs. One Pass
Fineness of Analysis
Interest in Context of Data
Intentions for Displays
Qualitative Only, or Numbers Included
Closeness to the data important?
Cost constraints

Table 2: Software Contact Information

Program Name | Source | Phone | WWW Address

AFTER | NOVA Research Company | (301) 986-1891 | www.novaresearch.com
ANSWR | U.S. Centers for Disease Control | - | www.cdc.gov
AQUAD | Günter L. Huber, University of Tübingen, Germany | - | www.uni-tuebingen.de/uni/sei/a-ppsy/aquad/aquad.htm
askSam | askSam Systems | (800) 800-1997 | www.asksam.com
ATLAS/ti | Scolari | (805) 499-0721 | www.atlasti.de
Code-A-Text | Scolari | (805) 499-0721 | www.codeatext.u-net.com
Decision Explorer | Scolari | (805) 499-0721 | www.banxia.com
EZ-Text | U.S. Centers for Disease Control | - | www.cdc.gov
Folio Views | Open Market | - | www.folio.com
HyperQual2 | Raymond V. Padilla | (602) 892-9173 | -
HyperRESEARCH | Researchware, Inc. | (617) 961-3909 | www.researchware.com
Idealist | Blackstone Software | - | -
InfoTree32 XT | Next Word Software | - | www.nextword.com
Inspiration | Inspiration Software | (800) 877-4292 | www.inspiration.com
Kwalitan | Vincent Peters | - | www.kun.nl/methoden/kwalitan
Martin | Simmonds Center for Instruction and Research in Nursing | (608) 263-5336 | -
MECA | Department of Social and Decision Sciences | (412) 268-3225 | -
MetaDesign | Meta Software Corporation | (617) 576-6920 | www.metasoftware.com
NUD*IST | Scolari | (805) 499-0721 | www.scolari.com
QCA | Kriss Drass | (702) 895-0247 | -
QUALPRO | Bernard Blackman | (904) 668-9865 | -
SemNet | Kathleen Fisher | (619) 594-4453 | -
Sonar Professional | Virginia Systems | (804) 739-3200 | www.virginiasystems.com
TEXTBASE ALPHA/BETA | Bo Summerlund | - | www.psy.aau.dk/bos/beta.htm
The Data Collector | Intelimation | (805) 968-2291 | -
The Ethnograph | Scolari | (805) 499-0721 | www.scolari.com
The Text Collector | Dennis V. O'Neill | (415) 398-2255 | -
winMAX | Scolari | (805) 499-0721 | www.winmax.de
ZyINDEX | Zylab | (800) 544-6339 | www.zylab.com

Question 1: What Kind of a Computer User Are You?

Your present level of computer use is an important factor in your choice of a program. If you are new to computers, your best bet is probably to choose a word processing program with advice from friends, and begin using it, learning to use your computer's operating system (e.g., MS-DOS, Windows, or Mac) and getting comfortable with the idea of creating text, moving around in it, and revising it. That would bring you to what we'll call Level 1. Or you may be acquainted with several different programs, use your operating system easily, and feel comfortable with the idea of exploring and learning new programs (Level 2). Or you may be a person with active interest in the ins and outs of how programs work (Level 3), and feel easy with customization, writing macros, etc. (We will not deal here with the "hacker," a Level 4 person who lives and breathes computing.)

Being more of a novice does not mean you have to choose a "baby" program, or even that you shouldn't choose a very complex program. It does mean, though, allowing for extra learning time, perhaps placing more emphasis on user friendliness, and finding sources of support for your learning, such as friends or colleagues, or on-line discussion groups on the Internet. People at different levels seem to have quite different reactions to the same programs. So, for example, a person at Level 2 or 3 might like a program that puts the maximum information on one screen because this allows her to quickly find what she wants and learn the program very quickly. A person at Level 1 might find all that information overwhelming at first, and might take a little longer to learn that program because of it. But, once the program was learned, that person would probably benefit from the layout in the same way as the more advanced computer user.

Question 2: Are You Choosing for One Project or the Next Few Years?

A word processor does not care what you are writing about, so most people pick one and stick with it until something better comes along and they feel motivated to learn it. But particular qualitative analysis programs tend to be good for certain types of analyses. Switching will cost you learning time and money. Think about whether you should choose the best program for this project, or the program that best covers the kinds of projects you are considering over the next few years. For example, a particular code-and-retrieve program might look adequate for the current project and be cheaper or look easier to learn. But, if you are likely to need a more fully featured code-based theory-builder down the road, it might make more sense to get started with one of those now (assuming you choose one that includes good code-and-retrieve capabilities).


Question 3: What Kind of Database and Project Are You Working With?

As you look at detailed software features, you need to play them against a series of detailed issues.

Data Sources: Single versus Multiple. You may be collecting data on a case from many different sources (say your case is defined as a patient, and you talk with several nurses, the patient's parents and friends, the doctor, and the patient herself). Some programs are specifically designed to handle data organized like this, others are not designed this way but can handle multiple sources pretty well, and some really do not have the flexibility you'll need. Also, look for programs that are good at making links, such as those with hypertext capability, and that attach "source tags" telling you where information is coming from.

Single versus Multiple Cases. If you have multiple cases, you usually will want to sort them out according to different patterns or configurations, and/or work with only some of the cases, and/or do cross-case comparisons. Multi-case studies can get complicated. For example, your cases might be patients (and you might have data from multiple sources for each patient). Your patients might all be "nested in" (grouped by) care units, which might be nested within hospitals, which in turn might be nested in cities. Look for software that will easily select different portions of the database, and/or that will do configurational analysis across your cases; software that can help you create multiple-case matrix displays is also useful.
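The nesting described above, and the kind of selection you would want software to make easy, can be sketched as follows in Python. The patient identifiers, unit and hospital names, and the two helper functions are all hypothetical; the sketch shows only the selection logic, not any particular program's features.

```python
from collections import defaultdict

# Hypothetical cases: each patient is nested in a care unit, a hospital,
# and a city.
cases = [
    {"patient": "P01", "unit": "cardiology", "hospital": "General", "city": "Boston"},
    {"patient": "P02", "unit": "cardiology", "hospital": "General", "city": "Boston"},
    {"patient": "P03", "unit": "oncology", "hospital": "General", "city": "Boston"},
    {"patient": "P04", "unit": "oncology", "hospital": "Mercy", "city": "Worcester"},
]


def select(all_cases, **criteria):
    """Return the cases that match every level=value pair given."""
    return [c for c in all_cases
            if all(c[level] == value for level, value in criteria.items())]


def group_by(all_cases, level):
    """Group patient identifiers by one level of the nesting."""
    groups = defaultdict(list)
    for c in all_cases:
        groups[c[level]].append(c["patient"])
    return dict(groups)


# Work with only a portion of the database.
print([c["patient"] for c in select(cases, hospital="General", unit="oncology")])

# Set up a cross-case comparison at the hospital level.
print(group_by(cases, "hospital"))
```

The same two operations, selecting a subset and grouping at a chosen level, underlie most cross-case comparisons, whatever the number of nesting levels.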

Fixed Records versus Revised. Will you be working with data that are fixed (such as official documents or survey responses), or data that will be revised (with corrections, added codes, annotations, memos, etc.)? Some programs make database revision easy, and others are quite rigid: revising can use up a lot of time and energy. Some will let you revise annotations and coding easily, but not the underlying text, and some will let you revise both.

Structured versus Open. Are your data strictly organized (e.g., responses to a standard questionnaire or interview), or free-form (running field notes, participant observation, etc.)? Highly organized data can usually be more easily, quickly, and powerfully managed in programs set up to accommodate them: for example, those with well-defined "records" for each case and "fields" (or variables) with data for each record. Free-form text demands a more flexible program. There are programs that specialize in one or the other type of data, and some that work fairly well with either.

Uniform versus Diverse Entries. Your data may all come from interviews.


1256 HSR: Health Services Research 34:5 Part II (December 1999)

Or you may have information of many sorts: documents, observations, questionnaires, pictures, audiotapes, videotapes (this issue overlaps with single versus multiple sources, above). Some programs handle diverse data types easily, and others are narrow and stern in their requirements. If you will have diverse entries, look for software designed to handle multiple sources and types of data, with good source tags, and good linking features in a hypertext mode. The ability to handle "off-line" data (referring you to material not actually loaded into your program, such as whole books or manuals) is a plus. Many programs can be tricked into doing this if you are clever about it. For example, you might set up an empty document on your computer corresponding as a proxy to each "off-line" document. You could enter chapter numbers, or some other indexing information, into this blank proxy document. Then, as you read the off-line document, you could code the corresponding indexing information (e.g., chapter number) in the proxy. When you do a search, the program will retrieve the reference in the proxy document, and you will know where to look in the off-line document. (This strategy is adapted from one suggested in the manual for NUD*IST.)
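
The proxy-document strategy can be sketched in a few lines. This is an illustration under stated assumptions, not the mechanism of any actual program: the document titles, the dictionary layout, and the functions `code_proxy` and `search` are all hypothetical.

```python
# Hypothetical proxy "documents": each holds only indexing information
# (here, chapter numbers) for a book that is not loaded into the program.
proxies = {
    "Nursing procedures manual": {},   # chapter number -> set of codes
    "Hospital policy handbook": {},
}

def code_proxy(title, chapter, code):
    """Attach a code to a chapter entry in the proxy for an off-line document."""
    proxies[title].setdefault(chapter, set()).add(code)

def search(code):
    """Return sorted (title, chapter) references whose proxy entry carries the code."""
    return sorted((title, chapter)
                  for title, chapters in proxies.items()
                  for chapter, codes in chapters.items()
                  if code in codes)

# Coding done while reading the paper copies.
code_proxy("Nursing procedures manual", 4, "pain management")
code_proxy("Hospital policy handbook", 2, "pain management")
code_proxy("Hospital policy handbook", 7, "visiting hours")

# The search returns references, telling you where to look off-line.
hits = search("pain management")
```

The search never touches the books themselves; it only retrieves the chapter references you coded in the proxies.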

Size of Database. A program's database capacity may be expressed in terms of numbers of cases, numbers of data documents (files), size of individual files, and/or total database size, often expressed in kilobytes (K) or megabytes (MB). (Roughly, consider that a single-spaced page of printed text is about 2 to 3K.) Estimate your total size in whatever terms the program's limits are expressed, and at least double it. Most programs today are generous in terms of total database size. A few are still stingy when it comes to the size of individual texts. For example, in some programs when a document goes beyond about ten pages, the program will insist on breaking it into smaller chunks, or will open it only in a "read-only mode" browser.
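
The sizing rule of thumb amounts to simple arithmetic, sketched here. The project figures (40 interviews of about 25 pages each) are invented for illustration, and the function name `estimate_database_kb` is hypothetical; the sketch uses the upper end of the 2 to 3K estimate and the "at least double it" advice.

```python
KB_PER_PAGE = 3      # upper end of the 2 to 3K-per-page rule of thumb
SAFETY_FACTOR = 2    # "at least double it"

def estimate_database_kb(pages_per_document, n_documents,
                         kb_per_page=KB_PER_PAGE, safety=SAFETY_FACTOR):
    """Rough capacity to plan for, in kilobytes."""
    return pages_per_document * n_documents * kb_per_page * safety

# Hypothetical project: 40 interviews of about 25 single-spaced pages each.
needed_kb = estimate_database_kb(pages_per_document=25, n_documents=40)
needed_mb = needed_kb / 1024
```

If a program states its limits in documents or cases rather than kilobytes, the same doubling applies to those counts.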

Question 4: What Kind of Analysis Is Anticipated?

Your choice of software also depends on how you expect to go about the analysis. This does not mean a detailed analysis plan, but a general sense of the style and approach you are expecting to use.

Exploratory versus Confirmatory. Are you mainly planning to poke around in your data to see what they are like, evolving your ideas inductively? Or do you have some specific hypotheses in mind linked to an existing theory to test deductively? If the former, it is especially important to have features of fast and powerful search and retrieval, and easy coding and revision, along with good text and/or graphic display. If, on the other hand, you have a beginning theory and want to test some specific hypotheses, programs with strong theory-building and testing features are more appropriate. Look for programs that test propositions or those that help you develop and extend conceptual networks.

Coding Scheme Firm at Start versus Evolving. Does your study have a fairly well defined a priori scheme for codes (categories, keywords), perhaps theory-derived, that you will apply to your data? Or will such a scheme evolve as you go, in a grounded theory style, using the "constant comparative" method (Strauss and Corbin 1998)? If the latter, it is especially important to have on-screen coding (rather than having to code on hard copy) and to have features supporting easy or automated revision of codes. Hypertext linking capabilities are also helpful here. "Automated" coding (in which the program applies a code according to a rule you set up, such as when a certain phrase, or a combination of other codes, exists) can be helpful in either case.
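
The two kinds of automated coding rule mentioned here (a phrase rule and a combination-of-codes rule) can be sketched as follows. The segment texts, the code names, and the functions `autocode_phrase` and `autocode_combination` are hypothetical; real programs offer their own rule syntax.

```python
import re

# Hypothetical text segments, each with the codes already applied by hand.
segments = [
    {"text": "She said she was afraid of dying alone.", "codes": {"fear"}},
    {"text": "The nurse explained the new schedule.", "codes": {"staff"}},
    {"text": "He is afraid the treatment will not work.", "codes": {"treatment"}},
]

def autocode_phrase(segs, pattern, code):
    """Apply `code` to every segment whose text matches the phrase pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    for seg in segs:
        if rx.search(seg["text"]):
            seg["codes"].add(code)

def autocode_combination(segs, required, code):
    """Apply `code` to every segment that already carries all `required` codes."""
    for seg in segs:
        if required <= seg["codes"]:
            seg["codes"].add(code)

# Rule 1: a certain phrase exists. Rule 2: a combination of codes exists.
autocode_phrase(segments, r"\bafraid\b", "fear")
autocode_combination(segments, {"fear", "treatment"}, "treatment anxiety")
```

Rules of this kind apply codes mechanically; deciding whether each hit really belongs under the code remains the analyst's job.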

Multiple versus Single Coding. Some programs let you assign several different codes to the same segment of text, including higher-order codes, and may let you overlap or nest coded "chunks" (the ranges of text you apply codes to). Others are stern: one chunk, one code. Still other programs will let you apply more than one code to a chunk, but will not "know" that there are multiple codes on the chunk: they will treat it like two chunks, one for each code.
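
The difference between these coding models can be illustrated with a small data-structure sketch. It assumes chunks are stored as character ranges; the `Chunk` class, the offsets, and the code names are hypothetical and not taken from any program.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """A coded range of text: character offsets plus the codes applied to it."""
    start: int
    end: int
    codes: set = field(default_factory=set)

def overlaps(a, b):
    """True if the two chunks share at least one character."""
    return a.start < b.end and b.start < a.end

def nested_in(inner, outer):
    """True if `inner` lies entirely within `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end

def chunks_with_all(chunks, wanted):
    """Chunks that carry every code in `wanted`."""
    return [c for c in chunks if wanted <= c.codes]

# Multiple coding: one chunk "knows" about both of its codes.
multi = Chunk(100, 250, {"grief", "family support"})

# The weaker model: the same range stored once per code.
split = [Chunk(100, 250, {"grief"}), Chunk(100, 250, {"family support"})]

# A smaller chunk nested inside the larger one.
inner = Chunk(120, 180, {"anger"})
```

In the first model a search for chunks carrying both codes finds `multi` directly; in the second, the same search over `split` finds nothing unless the program also compares text ranges.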

Iterative versus One Pass. Do you want (and do you have the time) to keep walking through your data several times, taking different and revised cuts? Or will you limit yourself to one pass? An iterative intent should point you to programs that are flexible, invite repeated runs, make coding revision easy, have good search and autocoding features, and can make a log of your work as you go. (See also the question of whether your records are fixed or revisable during analysis.)

Fineness of Analysis. Will your analysis focus on specific words? Or lines of text? Or sentences? Paragraphs? Pages? Whole files? Look to see what the program permits (or requires, or forbids) you to do. How flexible is it? Can you look at varying sizes of chunks in your data? Can you define free-form segments with ease? Some programs make you choose the size of your codable segments when you first import the data; others let you mix and match chunk sizes as you go.

Interest in Context of Data. When the program pulls out chunks of text in response to your search requests, how much surrounding information do you want to have? Do you need only the word, phrase, or line itself? Do you want the preceding and following lines/sentences/paragraphs? Do you want to see the entire file? Do you need to be able to jump right to that place in the file and do some work on it (e.g., code, edit, annotate)? Do you want the information to be marked with a "source tag" that tells you where it came from (e.g., "Interview 3 with Janice Chang, page 22, line 6")? Programs vary widely on this.

Intentions for Displays. Analysis goes much better when you can see organized, compressed information in one place rather than in page after page of unreduced text. Some programs produce output in list form (lists of text segments, hits, codes, etc.). Some can help you produce matrix displays. They may list text segments or codes for each cell of a matrix, although you will have to actually arrange them in a matrix for display. Look for programs that let you edit, reduce, or summarize hits, before you put them into a text-filled matrix with your word processor. Some programs can give you quantitative data (generally frequencies) in a matrix. Others can give you networks or hierarchical diagrams, the other major form of data display.
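
A frequency matrix of the kind described can be sketched in a few lines. The (case, code) pairs are invented, and the functions `frequency_matrix` and `render` are hypothetical; the point is only to show how coded segments collapse into a case-by-code display.

```python
from collections import Counter

# Hypothetical coded segments as (case, code) pairs.
coded = [
    ("Patient A", "fear"), ("Patient A", "fear"), ("Patient A", "trust"),
    ("Patient B", "trust"), ("Patient B", "cost"),
]

def frequency_matrix(pairs):
    """Build a case-by-code matrix of code frequencies."""
    counts = Counter(pairs)
    cases = sorted({case for case, _ in pairs})
    codes = sorted({code for _, code in pairs})
    rows = [[counts[(case, code)] for code in codes] for case in cases]
    return cases, codes, rows

def render(cases, codes, rows):
    """Format the matrix as plain text with one row per case."""
    width = max(len(c) for c in cases)
    header = " " * width + "".join(f"{code:>8}" for code in codes)
    body = [f"{case:<{width}}" + "".join(f"{n:>8}" for n in row)
            for case, row in zip(cases, rows)]
    return "\n".join([header] + body)

cases, codes, rows = frequency_matrix(coded)
print(render(cases, codes, rows))
```

A text-filled matrix would hold summaries or excerpts in each cell instead of counts, but the row-and-column arrangement is the same.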

Qualitative Only, or Numbers Included. If your data and/or your analyses include the possibility of number-crunching, look to see whether the program will count things, and/or whether it can share information with quantitative analysis programs such as SPSS or SAS. Think carefully about what kind of quantitative analysis you'll be doing, and make sure the program you are thinking about can arrange the data appropriately. Consider, too, whether programs can link qualitative and quantitative data in a meaningful way (in terms of the analytical approach you are taking).

CROSS-CUTTING ISSUES

The two main cross-cutting issues are closeness to the data and financial resources. Let us dispense quickly with the latter question first. Software varies dramatically in price. The range of prices for the programs we reviewed was $0 to $1,644 per user (Weitzman and Miles 1995b). That is still the range. In addition, programs vary a lot in the hardware they require to run efficiently. You obviously cannot use a program if it is too expensive for you, if it requires a machine you cannot afford, or if it runs on the wrong platform, say, PC instead of Mac. Happily, the U.S. Centers for Disease Control and Prevention now distributes two programs free of charge via the Web: EZ-Text for qualitative surveys and AnSWR for unstructured text.

The remaining issue, closeness to the data, is more complex. Many researchers fear that working with qualitative data on a computer will have the effect of "distancing" them from their data. This can in fact be the case. You may wind up looking at only small chunks of text at a time, or even just at line-number references to text locations. This is a far cry from the feeling of deep immersion in the data that comes from reading and flipping through piles of paper.

But other programs minimize this effect. They typically keep your data files on screen in front of you at all times; show you your search results by scrolling to the hit, so that you see it in its full context; and allow you to execute most, or all, actions from the same data-viewing screen. Programs that allow you to build in hypertext links between different points in your data, provide good facilities for keeping track of your location in the database, display your coding and memoing, and allow you to pull together related data quickly, can in some ways help you get even closer to the data than you can with paper transcripts. If you choose with this consideration in mind, software can help rather than hinder your work at staying close to the data.
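
A minimal sketch of what it means for a search to return each hit in its context, marked with a source tag, follows. The function name `search_with_context`, the line-based storage, and the sample transcript are all assumptions made for illustration; actual programs differ widely in how they store and display text.

```python
def search_with_context(documents, phrase, context=1):
    """Find lines containing `phrase`; return each hit with surrounding
    lines and a source tag naming the document and line number."""
    hits = []
    for name, lines in documents.items():
        for i, line in enumerate(lines):
            if phrase.lower() in line.lower():
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                hits.append({
                    "source": f"{name}, line {i + 1}",
                    "hit": line,
                    "context": lines[lo:hi],
                })
    return hits

# Hypothetical transcript, stored as a list of lines.
documents = {
    "Interview 3": [
        "I was admitted on a Tuesday.",
        "Nobody told me what the tests were for.",
        "After that I stopped asking questions.",
    ],
}

results = search_with_context(documents, "tests", context=1)
```

Raising `context` widens the window around each hit; a program that scrolls to the hit in the full file is, in effect, showing you the whole document as context.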

However, having software that enhances a sense of closeness to the data may not be a crucial issue for everyone. Some researchers do not mind relying heavily on printed transcripts to get a feeling of closeness, while others think that such heavy reliance defeats the purpose of qualitative data analysis (QDA) software. Furthermore, some projects simply do not require intense closeness to the data. You may be doing more abstract work and, in fact, want to move away from the raw data.

BENEFITS AND LIMITATIONS OF SOFTWARE USE

The use of good QDA software can enhance, in a wide variety of ways, both the speed of the process and the quality of the results of a qualitative or mixed-methods study. But there are also some important limitations to what even the best software will do for you.

Benefits

One of the first benefits you will realize is that using software speeds up analysis tasks a lot. There is simply no way to search for important phrases, create text segments and code them, cross-reference memos to text and codes, create cross-references between parts of the text, and so on, nearly as fast by hand as you can with a computer. Qualitative research can be enormously time-consuming, of course, even with the use of a computer: you will still have to read and think carefully about your data. But the time savings in the mechanical tasks can substantially lighten the burden.

Second, software can make it possible to do analyses that are not feasible by hand. For example, with the right software it is easy to go back and recode and then look again for the new results of combined-code searches, or quickly gather together all the intersections of codes in your "Themes" list with codes in your "Diagnoses" list to look for patterns of association. You can link multimedia, such as audio or video recordings of interviews and focus groups, to your transcripts so that you can code for non-verbal as well as verbal phenomena. And the "Boolean minimization" logic that underlies Ragin's (1987) cross-case configural analysis would be daunting without the help of a program like QCA or AQUAD.
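
To give a sense of why Boolean minimization is tedious by hand, here is a minimal sketch of its core reduction step: two configurations that share an outcome and differ on exactly one condition are merged, and that condition is marked irrelevant. This is a generic illustration, not the algorithm as implemented in QCA or AQUAD; the function names and the three-condition example are hypothetical, and the sketch stops at the merged terms without choosing a minimal subset of them.

```python
from itertools import combinations

def combine(a, b):
    """Merge two configurations that differ in exactly one condition;
    the differing condition becomes '-' (irrelevant). Otherwise None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1:
        i = diff[0]
        return a[:i] + ("-",) + a[i + 1:]
    return None

def prime_implicants(configs):
    """Repeatedly merge configurations until no further reduction is
    possible; return the terms that could not be merged further."""
    current, primes = set(configs), set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(current, 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= current - used   # terms that merged with nothing
        current = merged
    return primes

# Hypothetical cases: presence (1) or absence (0) of conditions A, B, C
# in the configurations where the outcome was observed.
observed = [(1, 1, 0), (1, 1, 1), (0, 1, 1)]
reduced = prime_implicants(observed)
```

Here the three observed configurations reduce to two simpler terms (A and B present regardless of C; B and C present regardless of A). With many conditions and cases the number of pairwise comparisons grows quickly, which is where software earns its keep.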

Software can also help you to ensure consistency and comprehensiveness. Once you have realized halfway through your analyses that some patients make frequent references to death, you can go back and search for all of the synonyms and euphemisms for death, and code them all. That also means that once you have reached a tentative conclusion, you have much greater power to go back and verify it against the actual data. Your likelihood of finding the data that contradict, or lend even better support to, your emerging hypothesis is now much greater.

Finally, software makes it possible to track many facets of the data in connection with each other. For example, if you need to combine demographic data, scale scores, theme codes, and important keywords in the text to find the relevant chunks of text, you can find software that will help you pull all of those things together.

Limitations

However, important limitations also exist. First and foremost is that software will not do qualitative analysis. It can find phrases; it can find text with codes you have applied. It can count occurrences of those things. Some software, like SPSS TextSmart, will attempt a quantitatively based categorization of text responses. But there is no software that can make the kinds of interpretive judgments that are inherent in qualitative analysis.

Second, software is no substitute for methodological training. A disturbing number of papers and grant applications have begun to cite a QDA software package as their analytic method. This is both absurd and misleading, and it is dangerous. The choice of a software package in no way determines the analytic operations that will be carried out, the basis that will underlie decisions about coding, the rationale that will be employed for determining relationships among concepts, or the ways in which the resulting conclusions will be evaluated.

Finally, using software for qualitative data analysis requires new learning. Particularly the first time you adopt a qualitative analysis package, you will be learning a whole new kind of program. Think back to what it was like to first learn WordStar or MacWrite (or whatever word processor you started on), and compare that with what it is like now, when you upgrade or learn a new one. You know what a word processor mainly does: you just have to find the new buttons/commands and then incorporate an understanding of a few new features. Now you will be starting at the beginning again.

TRENDS AND DEVELOPMENTS

Some very encouraging trends are being traced in ongoing QDA software development. These fall into two rough categories: Catch-Up and Forging Ahead.

Catch-Up

One of the main trends in the world of QDA software is that developers have been working feverishly to make up for the weaknesses in their programs relative to others. Those programs that have been found by reviewers and users to be wanting in search facilities, flexibility in the relationships between text and codes, memoing, data linking, coding display, and qualitative-quantitative linking, are quickly catching up.

Forging Ahead

In addition, several programs, such as Code-A-Text and ATLAS/ti, are developing exciting facilities for incorporating multimedia (both audio and video) for analysis along with text. Various programs are moving toward supporting formatted text, the exchange of data among QDA packages, supporting collaboration, and developing new metaphors for providing access to the data.

HOPES

What hopes should we have for future development? Here are a few:

* Continued catch-up so that essential features are available in all programs;
* More flexibility in configuring programs to meet specific research needs;
* More "Pencil-level Richness": continuing the trend of giving the researcher the same kind of intimacy with the data that you get from working with paper, allowing the researcher to see not just codes but memos and annotations beside the text as well, as in ATLAS/ti;
* More data exchange among QDA packages, so that it becomes more feasible to combine complementary programs;
* More multimedia support;
* More ease-of-use, lower learning curves, and better tutorials;
* More support for quantitative and demographic variables, and other facilities for tracking individuals through multi-source data sets;
* More support for the needs of narrative analysis, such as features to track structures in the text; and
* More and better tools for collaboration, such as tools for merging and tracking the work of team members, and for the sharing of data (e.g., over the World Wide Web).

CONCLUSION

Software, if chosen appropriately, can be a substantial and important advantage for qualitative research projects. It needs to be selected with careful regard for the needs of individual researchers in terms of working style, structure of the data, and analytic plans. The potential benefits include speed, consistency, rigor, and access to analytic methods not available by hand.

REFERENCES

Bogdan, R. C., and S. K. Biklen. 1982. Qualitative Research in Education. Boston: Allyn and Bacon.

Drass, K. A. 1980. "The Analysis of Qualitative Data: A Computer Program." Urban Life 9: 322-53.

Fielding, N. G., and R. M. Lee. 1998. Computer Analysis and Qualitative Research. London: Sage.

Fielding, N. G., and R. M. Lee. 1991. Using Computers in Qualitative Research. London: Sage.

Goetz, J. P., and M. D. LeCompte. 1984. Ethnography and Qualitative Design in Educational Research. Orlando, FL: Academic Press.

Huberman, A. M., and M. B. Miles. 1984. Innovation Up Close: How School Improvement Works. New York: Plenum.

Huberman, A. M., and M. B. Miles. 1983. Innovation Up Close: A Field Study in 12 School Settings. Andover, MA:

Kelle, U., ed. 1995. Computer-Aided Qualitative Data Analysis: Theory, Methods and Practice. London: Sage.

Lofland, J., and L. H. Lofland. 1984. Analyzing Social Settings: A Guide to Qualitative Observation and Analysis, 2nd ed. Belmont, CA: Wadsworth.

Mangabeira, W., ed. 1996. "Qualitative Sociology and Computer Programs: Advent and Diffusion of CAQDAS" (special issue). Current Sociology 44 (3).

Miles, M. B., and A. M. Huberman. 1994. Qualitative Data Analysis: An Expanded Sourcebook, 2nd ed. Thousand Oaks, CA: Sage.

Miles, M. B., and A. M. Huberman. 1984. Qualitative Data Analysis: A Sourcebook of New Methods. Newbury Park, CA: Sage.

Miles, M. B., and E. A. Weitzman. 1996. "The State of Qualitative Data Analysis Software: What Do We Need?" Current Sociology 44 (3): 206-24.

Ragin, C. C. 1987. The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. Berkeley, CA: University of California Press.

Seidel, J. V., and J. A. Clark. 1984. "The Ethnograph: A Computer Program for the Analysis of Qualitative Data." Qualitative Sociology 7 (1, 2): 110-25.

Shelly, A., and E. Sibert. 1985. The Qualog Users Manual. Syracuse, NY: Syracuse University, School of Computer and Information Science.

Strauss, A. L., and J. Corbin. 1998. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory, 2nd ed. Thousand Oaks, CA: Sage.

Tesch, R., ed. 1991. "Computers and Qualitative Data II" (special issues, parts 1 and 2). Qualitative Sociology 14, no. 3 and no. 4 (fall and winter).

Tesch, R. 1990. Qualitative Research: Analysis Types and Software Tools. New York: The Falmer Press.

Weitzman, E. A., and M. B. Miles. 1995a. "Choosing Software for Qualitative Data Analysis: An Overview." Cultural Anthropology Methods 7 (1): 1-5.

Weitzman, E. A., and M. B. Miles. 1995b. Computer Programs for Qualitative Data Analysis: A Software Sourcebook. Thousand Oaks, CA: Sage.