Analyzing
Qualitative
Data
with
Computer
Software
Eben A.
Weitzman
Objective.
Toprovide healthservicesresearchers withanoverviewofthequalitative data analysis process and the role of software within it; to provide a principledapproach tochoosingamongsoftwarepackagestosupportqualitative dataanalysis;
toalert researchers tothepotentialbenefits andlimitationsofsuchsoftware;andto provide an overview of thedevelopmentstobeexpectedinthe fieldinthenearfuture. Data Sources, Study Design, Methods. This article does notinclude reports of empiricalresearch.
Condlusions.Software forqualitativedataanalysiscanbenefit the researcherinterms ofspeed,consistency, rigor, andaccess to analytic methodsnotavailable by hand. Software,however,is not areplacementformethodologicaltraining.
Key Words. Qualitative research, qualitative data analysis, software, CAQDAS,
mixed-methods research
As health services researchers increasingly turn to qualitative and niixed
(qualitative
andquantitative)
researchmethods, there is increasing interest in finding and using new tools and methods for the analysis of qualitative data. Asrecentlyasthemid-1980s,mostqualitativeresearcherswerecarrying out the mechanics of their analyses by hand: typing up field notes and interviews, photocopying them, marking them upwith markers or pencils, cutting and pasting themarkedsegmentsontofile cards, sorting and shuffling thecards, and typing up their analyses. Some were beginning to use word processors for the typing work, and just a few werebeginnng
to experiment with database programs for storing and accessing their text. Most qualitative methods textbooks at the time (e.g., Bogdan and Biklen 1982; Goetz and LeCompte 1984; Lofland and Lofland 1984; Miles and Huberman 1984) madelittle, ifany, reference to the use of computers. A couple of programs designed specificallyfor the analysisofqualitative data were justbeginning toappear(Drass
1980; Seidel and Clark 1984; Shelly and Sibert 1985).1242 HSKHealthServicesResearch34:5 Part II
(Decmber
1999)But thelandscape has changed dramatically. There has been an out-pouringofjournalarticles,aseriesofinternational conferencesoncomputers and qualitative methodology, thoughtful books on the topic (Fielding and Lee 1991, 1998; Kelle 1995; Tesch 1990; Weitzman and Miles 1995b), and specialjournalissues(Mangabeira1996;Tesch1991).Atthetimethelate Matt Miles andI wrote Computer Programsfor QualitativeDataAnalysis
(Weitzman
and Miles 1995b), we reviewed no fewer than 24 different programs that wereuseful for analyzingqualitative data. Half ofthoseprograms had been developedspecificallyfortheanalysisofqualitativedata,whilethe other half had beendeveloped for moregeneral-purposeapplications suchas textsearch and storage.Since then, the field hascontinued to growrapidly.
Programsare being revisedat aregular rate, andnewprograms appearonthescene atthe rateof one or two a year. Yetthe software varieswidely, andit remainsvery much the case that there is no one bestprogram.Inthiscontext,aresearcher needstobe able tomakea
principkd
choice of software:onethat matches thecapabilitiesof the softwaretothespecific needs ofthe researcher and the project To thatend,this articleprovidesanoverview ofthe range ofusesof computer softwareinqualitativeresearch; describesthe major types of softwareavailable; provides an approach to needs assessment thatcanmatch software choice tospecificresearchers and projects; addresses some common questions about whether and why researchers should use software; andlaysout someofthefuturedevelopments that are either hoped fororclearlyonthe horizon.THE
ROLE OF COMPUTERS
INQUALITATIVE
RESEARCH
In order to be able to discuss the role of computers in qualitative research, it will be helpful to identify the major steps in the analysis process. The modelinFigure 1, and the description that follows, are intended as a general
Portions ofthis artideare adapted from Miles and Weitzman(1996)and Weitzman and Miles
(1995b).Most of the specific programs alluded to here are reviewed in detail(sometimes in
slightly earlier versions)inWeitzman and Miles(1995b).
AddresscorrespondencetoEben A. Weitzman, Ph.D., Assistant Professor, Graduate Programs inDispute Resolution, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, MA 02125. Thisarticle,submitted toHealthSericesRescardlonFebruary 8, 1999, was revised and accepted forpublication onJune24, 1999.
Analyzing QualitativeDatawithComputer
Software
representation of those steps, one that sees many and wide variations in practice,andnot asanysortof prescriptivemodel.
THE ANALYSIS PROCESS: FROM
RESEARCH QUESTIONS TO
CONCLUSIONS
Ingeneral, researchersbegin with a set ofresearch questions andmovetoward reaching conclusions. Data are collected in order to answer the research questions, and in qualitative studies the data are often voluminous. The researcher then faces the task of somehowreducing the data into a form inwhich it can beexamined for pattems andrelationships.
In mostapproaches, the researcher will goabout this through develop-ing somesortofcodingscheme:asetoftagsorlabels representing theconceptual categories intowhichto sortthe data. These may bedevelopedeitherapriori fromthe conceptualframeworkdriving the study,inductivelyastheanalysis proceeds and the analyst begins to identify issues in thedata, or by some combination of thetwo.Segmentsofthedataaremarked with relevant codes
(coded).
In many cases, researchers write memos as they code, recording emergingideas and early conclusions about both theory and methods. As insights accrue, it often becomes useful to search back through the data for places where specific wordsor phrases are used, and to locate related phenomenainthetext, inorder to both code these newchunks and check the validity of emerging conclusions. It may sometimes beuseful to create pointers(or links)
between differentplacesinthetextwhere thesameissues arise, as,forexample,wheninone interview apatientdescribes an episode thatis elsewherealso described by his or her caregiver. This whole process willinmany cases give rise to modification of any a priori coding scheme, and manyresearchers follow an iterative process, makingrepeated coding passesthrough
thedata.The next step in reducing the data is often to retrieve the chunks of text associated with particular codes, reading these passages to refine the understanding ofthat conceptual category. It may also be desirable to retrieve textaccording to combinations ofcodes-forexample, tosee where aconcepthavingtodowith aparticular caregiver attitude, say,"empathizes with patient,"coincides with a particular context, say,"extended one-on-one caregiver-patient contact."Theresearcher would then be able to read all of thechunks
(text passages)
whereboth ofthese codes had beenapplied, to1244 HSR HealthServicesResearch34:5Part II(December 1999)
Figure 1: FromResearchQuestionstoConclusions
RESFARCHQUESMIONS7 I- CONCNEPIuAL DATA FRAMEWORK COLLECION RETRIEVE CHUNKS \ CODING SCHEME
4
"__ CODE CHUNKS M MOING *~ WORDSEARCHING - DATALINKING REDUCE DATA ENTER INDISPIAYSConditions Users Administrators
Commitment Understanding Mastery etc. etc. DRAWCONCLUSIONS/VERIFY CopyrightC) 1995byEben A. Weitzman andMatthewB. Miles
seewhat,if any,relationship might
exist
between thetwo.Dependingonthe natureofthestudy,itmight also be desirabletoidentify all of the caseswhere bothofthese codesapply,evenifthetwoconceptsdonothappento emerge fromthesame textchunk.Inthisway,theresearcherbeginstobe ableto write summaries ofthe mainconceptualissuesthatappearinthe data. These preliminarywrite-ups havethevirtueofbeingmuch smallerinphysical lengththantheoriginal tran-scripts,makingthem that muchmoreaccessibletothe researcher'sscrutiny.
Commitment
Mastery
AnabzingQyalitativeDatawithComputer
SoJfware
They also have the disadvantage ofbeingonestep removed from theoriginal data. It istherefore important for the researchertohave the
ability
toexamine and re-examine the underlying data as analysis and write-up proceeds, in order to continually check interpretations against the data. Finally, many researchers enter their summaries into displays, forexample, text matrices ornetworkdiagrams,that aid inidentfying patternsandrelationshipsinthe data(Miles
and Huberman1994).
In Figure 1, the displaysare offered as an example, and the categories are adapted from the work ofHuberman and Miles (Huberman andMiles 1983, 1984; Miles and Huberman1994).
Theexamplematrix iscomposed of rows representing several conditions for success of a school innovation(commitment,
understanding,mastery)
and columnsrepresenting somekeystakeholdergroups: users oftheinnovation(teachers)
and the school administrators. Thecells ofthe matrix would be filled with summaries ofeachstakeholder group's views of eachcondition, allowing the researcher to look forpatterns across the data. The network diagram, similarly,shows the researcher's representation oftheconnectionsamong
the differentconditions.From these summaries ofthe data, which may now exist in memos, code "definitions," mini write-ups, and/or displays, the researcher draws conclusions. Dependingon themethodologybeing employed,theresearcher may then embark on anew round ofsearchingthrough the data to verify whether the conclusions reached are in fact supported by the data. And, finally,areport isproduced.
In dosing this discussion, it is important to emphasize that it is not intended as a substitute for methodological training, but as a generalized and quite simplified description. The researcher embarking on his or her first qualitative study will need to seek out training and readings on the
methodology
involvedinboth datacollection andanalysis in order tofully
understandthe content and process of the steps alluded to.
USES
OF
COMPUTER
SOFTWARE
IN
QUALITATIVE
ANALYSIS
Perhaps obviously,many of the tasks just described can be greatly assisted bythejudicioususeofcomputers.Computerscanhelpinvirtually all phases ofthisprocess, fromtakingnotes towriting reports. For example,computers canbe used for:
1. Makingnotesinthefield;
2. Writingup ortranscribing field notes;
1246
HSR.
HealthServicesResearch 34:5PartII(December1999) 3. Editing:correcting, extending,orrevisingfieldnotes;4. Coding: attachingkey words or tags to segments of text to permit laterretrieval;
5. Storage: keeping text in anorganizeddatabase;
6. Searchandretrieval: locatingrelevantsegments of textandmaking them available forinspection;
7. Data '"inking":connecting relevantdata segmentstoeach other to formcategories, clusters, or networks of information;
8. Memoing: writing reflective commentaries on some aspect of the data as a basis fordeeperanalysis;
9. Content analysis: counting frequencies, sequences, or locations of words andphrases;
10. Data display: placing selected or reduced data in a condensed, organizedformat, such as a matrix ornetwork,forinspection; 11. Conclusion-drawing andverfication: aiding the analyst to interpret
displayeddataand to test or confirmfindings;
12. Theory-building: developing systematic, conceptually coherent ex-planationsoffindings and testing hypotheses;
13. Graphicmapping: creatingdiagrams thatdepict findings ortheories; and
14. Report-writing: interim and final
(adapted
from Miles and Huber-man 1994: 44).But there is no one program that will do all of these things best. Therefore, thebestchoicewilldepend on your project, yourmethodology, your own style ofanalysisand personal preferences, and your team if you have one.Furtheronyouwillfinda set ofquestions to guide you through an assessment ofyour needs. First,however, it will be helpful to offer a typology ofsoftwaretypes.
SOFrWARE TYPE
FAMILIES
Thisis aroughsorting of available software into types. There is naturally quite abit ofoverlapamongcategories, with individualprogramshavingfunctions that would seem to belong to more than one type. However, it is possible to focus on the "heartand
soul"
of aprogram: what itmainly is intended for. This categorization scheme was first presentedin Weitzman andMiles
AnavzingQualitativeDatawith Computer
Software
Text RetrieversText retrievers specialize in finding all instances ofwords and phrases in text, in one or several files. They typically allow you to search for places where two or more words orphrases coincide withinaspecifieddistance (a number ofwords, sentences, pages,
etc.)
and allow youto sorttheresulting passages into different outputfiles and reports.Theymay dootherthingsas well,such as content analysis functions likecounting, displayingkeywordsin context,orcreatingconcordances(organized
lists of all words andphrasesin theircontexts),
orthey may allow you to attachannotationsor evenvariable values (for thingslike demographics or sourceinformation)
topointsinthe text. Examplesoftextretrievers areSonarProfessional,The TextCollector, andZyINDEX.TextbaseManagers
Textbase managers are database programs specialized for storing text in more orless organized fashion. They aregood at holding texttogetherwith informationabout it, and atallowing you to quickly organize and sort your datainavarietyofways,and retrieveitaccording to different criteria. Some arebetter suitedtohighlystructured data thatcanbeorganizedinto"records"
(i.e.,
specificcases)
and"fields"(variables:
informationthat appears foreachcase),
while others easily manage "free-form" text. They may allow you to definefieldsin thefixed manner of a traditional database such asMicrosoftAccess®or
FileMakerPro®,
orthey may allowsignificantlymoreflexibility, forexample,allowing different records to have different field structures. Their search operationsmaybe as good as, or sometimes evenbetter than those of some text retrievers. Examples oftextbase managers are askSam, Folio Views,Idealist,
InfoTree32XT,andTEXTBASEALPHA.Code-and-Retrieve
Programns
Code-and-retrieve programs areoften developed by qualitative researchers specificallyforthepurpose ofqualitative data analysis. The
programs
inthis categoryspecializeinallowingtheresearcherto apply categorytags(codes)
topassages oftext, andlater retrieve and display the text according to the researcher's coding. These programs have at least some search capacity, al-lowingyou to search either for codes or for words and phrases in the text. They mayhave a capacityto store memos. Even the weakest of theseprograms represent a quantum leapforward from the old scissors-and-paper approach: they're more systematic, more thorough, less likely to miss
tiings,
more1248 HSRHealth
Service
Research34:5Part II(December1999)flexible, and much, much faster. Examples of code-and-retrieve programs areHyperQual2,Kwalitan, QUALPRO, Martin,andThe DataCollector.
Code-based
Theory
BuildersMost of these programs are also based on a code-and-retrieve model, but theygo beyond the functions of code-and-retrieve programs. Theydo not, nor wouldyou want themto,buildtheoryforyou.Rather, theyhavespecial features or routines thatgo beyond those ofcode-and-retrieve programs in supporting your theory-building efforts. For example, theymay allow you to represent relations among codes, build higher-order classifications and categories,or formulate and testtheoreticalpropositions about the data.They mayhave morepowerful memoingfeatures
(allowing
you, forexample,
to categorize or code yourmemos)
or more sophisticated search-and-retrieval functions than do code-and-retrieve programs. They may also offer capa-bilities for "system closure," allowingyou to feed results ofyour analyses(such
as searchresults ormemos)
back into thesystem as data. Examplesof code-basedtheorybuilders areAFTER,
AQUAD,ATLAS/ti,Code-A-Text,
HyperRESEARCH,NUD-IST, QCA, The Ethnograph, and winMAX. Two ofthese
programs,
AQUADandQCA,support cross-caseconfigural analysis(Ragin 1987),
with QCA being dedicated wholly to this method and not havinganytext-codingcapabilities.
Conceptual Network Builders
These
programs
emphasize the creation and analysis of network displays. Some ofthemarefocusedonallowingyou to create networkdrawings, that is, graphic representations ofthe relationships among concepts. Examples of these are Inspiration, MetaDesign, and Visio. Others are focused on the analysisofcognitive or semanticnetworks,forexample,theprogram MECA. Still others offer some combination of the two approaches, for example, SemNet and Decision Explorer. Finally, ATLAS/ti, a program also listed undercode-based theory builders, has, inaddition,
a fine graphical network builder connected to theanalyticworkyou do with your text and codes.Summary
Inconcluding our discussion of the five mainsoftwarefamily types, it is im-portant to emphasize that
functions
often crosstype boundaries. Forexample,
Folio
Views can code andretrieve, and has anexcellent
text searchfacility. ATLAS/ti, NUD*IST, TheEthnograph,
and winMaxgraphicallyrepresentAnalyzingQualitativeDatawithComputer
Soflware
therelationshipsamong codes,although amongthese,onlyATLAS/tiallows you towork with andmanipulatethedrawing.TheEthnographand winMAX each have a systemforattaching variable values
(text,
date,numeric,etc.)
to textfiles and/orcases.Theimplication:donotdecidetooearlywhichfamily
youwant tochoosefrom.Instead, stayfocusedonthe functions youneed.CONCEPTUAL ASSUMPTIONS
Oneconcern manyresearchers haveisthatamethodologicalorconceptual approach will be imposed by the software. In
fact,
developers do bring assumptions, conceptualframeworks, and sometimes even methodological and theoreticalideologiestothedevelopmentoftheirproducts. Thesehave importantimplications fortheimpactthat using a programwillhave on your analyses. However, you need not, andinfactshould not, betrappedbythese assumptionsandframeworks: thereareoften ways ofbendingaprogramto your ownpurposes. For example, aprogram mayallow youto defineonly hierarchicalrelationsamongcodes. Youmight work around this by creating redundantcodesindifferent parts ofthehierarchy,orbykeepingtrackofthe extrarelationshipsyouwanttodefinewithmemosandnetworkdiagrams.The fact thatdevelopersbringconceptual assumptions to their work is infactoneofthestrengthsofthefield.Many of thedevelopers, particularly of code-and-retrieve and code-based theory-builderprograms, areresearchers themselves. Theyhave investedenormousintellectual energyinfinding the right tools for analyses of different types, and the user can benefitgreatly.
It isalso true thatthedesign of the software can have an impact on anal-ysis. Forexample, differentprogramswork with different
"metaphors,"
that is,different ways of presenting therelationships amongcodes and between codes andtextKwalitan,NUD-IST,TheEthnograph, and winMAX all allow hierarchical relationsamongcodes. For studiesinwhich you areorganizng
your conceptual categories hierarchically, these programs offer significant strengths. If you want to represent non-hierarchical relationships, even if youchoosetotrytowork aroundthismetaphor, it may be less comfortable then using aprogram likeATLAS/ti that explicidy supports more flexible networks. HyperRESEARCH emphasizes the relationship between codes and cases,ratherthancodes andchunksof text. Whenyou code a chunk of
text,
you create anentry onwhat looks like an index card for a particular case. Thisstrongly supports and encouragesthinking that stressescase-wise andcross-casephenomena,butitmakesithardertolook for andtiink
about relationshipsamong
codeswithinatext.1250 HSR. HealthServicesResearch34.5Part II(December 1999)
Finally, Code-A-Text offers a quite different set ofcoding metaphors: (1) codes arranged into "scales"
(you
can assign only one code from each scale to a segment oftext, which is useful if you want to code your text chunks by making mutually exclusive judgments on a variety of factors);(2)
codesautomatically assignedaccordingtowordsinthe text; and(3)
open-ended"interpretations" youwriteabout eachtextchunk. Workingwith this collection ofcoding metaphors could be expected to lead the analyst to consider his orher text in somewhat different ways than with one of the othermetaphors described earlier.Similar issues exist inthe choice of other types ofsoftware, such as textbase managers.InfoTree32 XT allows youtoarrangetexts in ahierarchical tree,and lets youdragtextsaroundtorearrange them.Folio Views also allows youtocreateahierarchicaloutlinebut gives you theadditionalcapabilityof creatingmultiple, non-sequential "groupings" oftextsthatyoucan activate anytime youwish. Folio Views andaskSam let youinsertfields whenever andwherever youwant inanygivenrecord, while InfoTree32 XT requires that you use astandardfield set
(of
yourowndesign)
throughoutthe whole database. Idealistnotonlylets you customize the fieldsetforeachrecord,but youcanalso create avariety of record types, each withits own setoffields, andmixthemin thesame database. Finally, whilemostof theseprograms show you justonerecordat atime,FolioViewsshows youaword processor-like view in which records appear one right after the other asparagraphs. Clearly, each of these programs shows you a quite different view ofyour data, may encourage different ways ofthinking
about your data, and has differentstrengths and weaknesses. Itisalso true thata clever userwill be able to bend each of these flexible packages to awide variety of different tasks, overcoming many of the differences amongthem.Eachofthese assumptionsisbothabenefit, for some modes of analysis, and alsoa constraint.Thekey,then,isnotto gettrapped by the assumptions ofthe program. Ifyou are aware of what they are, you can be clever and work around them. Theprogramshould serve your analytic needs, goals, and assumptions, nottheother way around. Researchers interestedinempirical work on theimpactof different
programs
onresearch are encouraged to refer toFielding andLee(1998).
CHOICE ISSUES
Uptothis point, thisarticlehas stressedrepeatedly that a choice must be made
carefully
to meet the needs ofboth project and researcher. Such a choice isbased on principles such as that the software should be matched to theAnalyzingQualitativeDatawithComputerSoftware
structureofthe dataset, to the
style
and mechanicsof theanalysis
planned, and to the researcher's level of comfort with computers ingeneral.
How then togo aboutmakingsuchaprincipled, customized choice? Fourbroad questions, along with two cross-cutting issues, can be asked that represent theseprinciplesandshouldguidetheresearchertosuchachoice (Weitzman and Miles1995a,b).Theseguidelinesforchoicehaveseenwideuse inpractice sincetheiroriginalformulation,andhaveproven to be effective forguiding researchers toappropriate choices.Specifically,therearefourkeyquestions toaskandanswer asyou move towardchoosing one or more softwarepackages:
1. What kind of computer useramI?
2. Am Ichoosingforone project orthenext fewyears?
3. What kind of
project(s)
anddatabase(s)
will I beworkingon? 4. Whatkind ofanalyses am I planning to do?Inaddition to these fourkey questions, there are two cross-cutting issues tobear in mind:
* Howimportantis it toyouto maintain a sense of"closeness"toyour data?
* What areyour financial constraints when buying software, and the hardwareitneedsto runon?
With these basic issuesclear, you will be able to look at specific programs in a more active, deliberate way, seeing what does or does not meet your needs.
(See
Table1, adapted from Weitzman and Miles 1995b, for a worksheet to organize such information. Table 2 provides contactinformation for the programs mentioned in this article and their demoversions.) Youcan work yourwayfromanswering questions,totheimplications
of thoseanswersfor program choice, to candidateprograms. Forexample,if
you areworkingon a complexevaluationstudy,with acombinationofstructured interviews,focus groups, and casestudies,youwillneedstrong tools for tracking cases through different documents.You might find goodsupport for this in codestuctures(particularly programs
withhighly structured code systems,like
NUD-IST, and to alesserdegreeinprograms
withflexiblecodesystemslike
ATLAS/ti), orthrough the use ofspeakeridentifiers in programs like The Ethnograph,AFTER,
orCode-A-TextQuestion 1: WhatKindofaComputer User Are You?
Yourpresent level of computer use is animportant factorin yourchoice of aprogram. Ifyou are new tocomputers, your best bet is probably to choose
1252
Table 1:
HSR.- Health Services Research 34:5 Part II (December 1999)
ProgramChoiceWorksheet
Issue Answer Implications/Notes CandidateProgram(s)
What Kind of aComputer UserAreYou?(level 1-4)
Are YouChoosingfor
OneProject or the Next Few Years?
Data SourcesPer Case:
Singlevs.Multiple
Single vs.Multiple Cases Fixed Records vs. Revised Structured vs.Open Uniform vs.Diverse Entries SizeofDatabase What Kind ofAnalysisis
Anticipated?
Exploratory vs.
Confirmatory
CodingScheme Firm at
Startvs.Evolving
Multiple vs.Single
Coding
Iterative vs.One Pass Fineness ofAnalysis Interest inContext of
Data
Intentions forDisplays Qualitative Only,or
Numbers Included Closeness to thedata important?
Costconstraints
AnavzngQualitative Datawith Computer
Soflware
Table 2: SoftwareContact InformationProgramName Source Phone WWWAddress
AFTER NOVA Research (301)986-1891 www.novaresearch.com
Company
ANSWR U.S. Centers for www.cdc.gov
DiseaseControl
AQUAD Giinter L. Huber, www.uni-tuebingen.de/uni/sei/a-Universityof ppsy/aquad/aquad.htm
Tubingen, Germany
askSam askSamSystems (800) 800-1997 www.asksam.com ATLAS/ti Scolari (805) 499-0721 www.atlasti.de
Code-A-Text Scolari (805)499-0721 www.codeatextu-net.com
Decision Explorer Scolari (805)499-0721 www.banxiacom
EZ-Text U.S. Centersfor www.cdc.gov
DiseaseControl
FolioViews OpenMarket www.folio.com HyperQual2 Raymond V. Padilla (602)892-9173
HyperRESEARCH Researchware, Inc. (617)961-3909 www.researchware.com Idealist BlackstoneSoftware
InfoTree32XT NextWord Software www.nextword.com Inspiration Inspiration Software (800)877-4292 www.inspiration.com
Kwalitan VincentPeters www.kun.nl/methoden/kwalitan
Martin SimmondsCenter (608)263-5336 for Instructionand
ResearchinNursing
MECA DepartmentofSocial (412)268-3225
and Decision Sciences
MetaDesign MetaSoftware (617)576-6920 www.metasoftware.com Corporation
NUD*IST Scolari (805)499-0721 www.scolari.com QCA Kriss Drass (702)895-0247
QUALPRO BernardBlacknan (904)668-9865 SemNet KathleenFiher (619)594-4453
Sonar Professional VirginiaSystems (804)739-3200 www.virginiasystems.com
TEXTBASE BoSummerlund www.psy.aau.dk/bos/betahtm
ALPHA/BETA
TheDataCollector Intelimation (805)968-2291
TheEthnograph Scolari (805)499-0721 www.scolari.com TheTextCollector Dennis V. O'Neill (415)398-2255
WmMAX Scolari (805)499-0721 www.winmax.de ZyINDEX Zylab (800)544-6339 www.zylab.com
1254 HSR HealthService Research34:5 Part II(Decmber1999)
a word processing program with advice from friends, and begin using
it,
learning to use your computer'soperating system(e.g.,MS-DOS, Windows, orMac)andgetting comfortable with the idea of creating text, moving around init,
and revising it. That would bring you to whatwe'll call Level 1. Or you may beacquainted with several different programs, use your operating systemeasily, and feelcomfortablewith the idea ofexploringandlearning newprograms(Level 2).
Or you may be a person with active interestinthe ins and outs of howprograms work(Level 3),
and feel easywithcustomization,
writing macros, etc.
(We
willnot deal here with the"hacker,"aLevel4person who lives andbreathescomputing.)Beingmore of a novice does not mean you have to choose a"baby" program, oreven that you shouldn't choose a very complex program. It does mean,though, allowingfor extralearning time, perhaps placingmore em-phasis on userfriendliness,andfindingsourcesofsupport for yourlearning, such as friends orcolleagues, or on-line discussion groups on the Internet People at different levels seem tohave quite differentreactions to thesame programs. So, forexample, aperson atLevel 2 or 3 mightlike aprogram that puts themaximum informationon one screenbecausethisallowsher to quicklyfind what she wants and learn theprogram veryquickly.Aperson at Level 1 might findall thatinformationoverwhelmingat
first,
andmight take alitdelonger tolearnthatprogram becauseofit.But,
once theprogramwas learned,would probably benefit from the layout in thesameway as the more advancedcomputer user.Question2: AreYou
Choosingfor
OneProect
ortheNext Few Years?Awordprocessor does not care what you are writing about, so most people pick one and stick with it until something better comes along and they feel motivated to learn it. But particular qualitative analysis programs tend to begoodforcertain
types
ofanalyses. Switchingwill costyoulearning time andmoney. Think about whetheryou should choose the best program for this project, or theprogram
thatbest covers the kinds ofprojects you are considering over the next fewyears. For example, a particular code-and-retrieveprogram might lookadequate forthecurrentproject and be cheaper orlook easier to learn. But, ifyou arelikelyto need amorefully featured code-basedtheory-builder down the road, itmightmake
more sense to get started withone ofthosenow(assuming
youchoose one that includes good code-and-retrievecapabilities).
Analyzing Qualitative Data
wit/
ComputerSoftware
Question3: WhatKindofDatabaseandProject Are You Working With?
As youlook at detailed softwarefeatures, you needto play them against a series ofdetailedissues.
Data Sources: Singk versusMultipk. You may be collecting data on a case from manydifferentsources (sayyourcase isdefinedas apatient, and you talk with several nurses, the patient's parents and friends, the doctor, andthe patientherself. Someprograms are specificallydesigned tohandle data organized like this, others are not designed this way but can handle multiple sources prettywell, and some really donothave theflexibilityyou'll need. Also, look for programsthat are good atmaking links, such as those with hypertext capability, and that attach "source tags" telling you where informationiscomingfrom.
Singkversus
Multipk
Cases. If youhavemultiple cases, youusuallywill want to sort them outaccording to differentpatterns or configurations, and/or work withonly some ofthe cases, and/or do cross-case comparisons. Multi-casestudies cangetcomplicated.Forexample, yourcasesmightbe patients(and
you might have data from multiple sources for eachpatient).
Your patients might all be "nested in"(grouped
by) care units, which might be nested within hospitals, which in turn might be nested in cities. Look for softwarethatwilleasilyselectdifferent portions of thedatabase, and/or thatwill
doconfigurational analysis across your cases; software that can help you createmultiple-casematrixdisplaysisalso useful.FixedRecords versusRevised.
Wil
yoube workingwith datathat are fixed(such
asofficial documents orsurveyresponses),
ordata thatwill be revised(with
corrections, addedcodes,annotations,
memos,etc.)?
Some programs make databaserevisioneasy,and others are quiterigid:revising can use up a lot oftimeand energy. Somewill let youreviseannotationsandcoding easily, butnottheunderlyingtext,and some wfll let you revise both.StructuredversusOpen. Are your datastrictly organized
(e.g.,
responses to astandard questionnaire orinterview),
orfree-form(mnning
field notes, participant observation,etc.)?
Highly organized data can usually be more easily,quickdy,
andpowerfiully
managedinprograms setup to accommodate them-for example, those with well-defined "records" for each case and "fields"(or variables)
with data for each record. Free-form text demands a moreflexibleprogram.Thereareprogramsthatspecializeinone or the other typeofdata, andsomethat workfairlywell with either.Uniform
versusDiverseEntries. Your data may all come from interviews.1256 HSR Health ServicesResearch34:5 PartII(December1999)
Or you may have information of many sorts:documents, observations, ques-tionnaires,pictures,audiotapes, videotapes(thisissueoverlapswithsingle ver-susmultiplesources,above).Someprograms handle diverse data typeseasily, and others are narrow and stem intheir requirements. Ifyou will have diverse entries, look for software designed to handlemultiple sources and types of data, with good source-tags, andgoodlinkingfeaturesinahypertextmode. The ability to handle "off-line" data-referringyoutomaterial notactually loaded into yourprogram,such as whole books or manuals-is a plus. Many programscanbe tricked into doingthisifyou are clever about it. Forexample, you mightsetup an empty document on your computercorrespondingas aproxy to each "off-line" document. You could enterchapter numbers, or some otherindexinginformation intothis blank proxydocument.Then, as you read theoff-linedocument, you could code thecorrespondingindexing information
(e.g.,
chapternumber)
intheproxy. When you do a search, the program will retrieve the reference in the proxy document, and you will know where to look in theoffline document.(This
strategy is adapted from onesuggested in the manual forNUDIST.)
Size
ofDatabase. A program's database capacity may be expressed in terms of numbers of cases, numbers of data documents(files),
size of individualfiles,and/or totaldatabase size, oftenexpressedinkilobytes(K)
or megabytes(MB).
(Roughly,
consider that asingle-spacedpage of printed text is about 2 to3K).
Estimate your total size in whateverterms theprogram's limitsareexpressed, and at least double it. Mostprogramstoday are generous interms oftotal database size. A few are stillstingy when it comes to the size of individual texts. Forexample,in someprogramswhen a document goes beyond about ten pages, theprogram will insist onbreakingitintosmaller chunks, orwillopen it onlyina"read-onlymode" browser.Qyestion4:
What
KindofAnalysisIsAnticipated?Your choice ofsoftware also depends on howyou expect to go about the analysis. This doesnot meanadetailedanalysisplan,butageneral sense of thestyle and approach you are expecting to use.
Fkploratory
versusConfirmatory. Areyoumainly planning to poke around inyour data to see what they are like, evolving your ideas inductively? Or doyou have somespecific hypotheses in mind linked to anexistingtheory to testdeductively?Iftheformer, it is especiallyimportant
tohave features of fastandpowerful search and retrieval, and easy coding and revision, along withgood text and/or graphic display.Analyzing QualitativeDatawith Computer
Software
If, on the other hand,you have a beginning theory and want to test somespecific hypotheses, programswithstrongtheory-buildingand testing features are more appropriate. Look forprograms thattestpropositions or thosethathelpyoudevelop and extend conceptualnetworks.
CodingSchemeFirmatStart versusEvolving. Doesyourstudyhave a fairly well definedapriori schemeforcodes
(categories, keywords),
perhapstheory-derived, thatyouwillapplyto yourdata? Or will such ascheme evolve as you go, in agrounded theory style,usingthe"constant comparative"method (Strauss and Corbin 1998). Ifthe latter, it is especially important to have on-screen coding
(rather
than having to code on hard copy) and to have features supporting easy orautomatedrevision ofcodes. Hypertextlinking capabilitiesarealsohelpful here. "Automated"coding(inwhich theprogram applies acodeaccordingto aruleyou set up,suchaswhena certainphrase, or acombination of othercodes,exists)
canbe helpfulineithercase.Multipk versus Singk Coding. Some programs let you assign several different codes to the same segment of text, including higher-order codes, andmayletyouoverlapornestcoded"chunks"
(the
rangesof text youapply codesto).
Others are stem: one chunk, one code. Still otherprograms will letyouapplymorethanonecodeto achunk, butwillnot"know" that there are multiple codeson thechunk: they willtreat itliketwo chunks,onefor each code.Iterative versus OnePass. Do youwant-and do youhave the time-to keep walkingthroughyour data several times, takingdifferent and revised cuts?Orwillyoulimityourselfto one pass?An iterative intent should point you toprogramsthatareflexible,inviterepeatedruns,makecoding revision easy,havegood search andautocodingfeatures,andcanmakealogof your workas you go. (See alsothe question of whether your records are fixed or revisable duringanalysis.)
FinenessofAnalysis. Willyouranalysis focus on specific words? Or lines of text? Or sentences? Paragraphs? Pages? Whole files? Look to see what the program permits (orrequires, orforbids)you to do. How flexible is it? Canyoulook at varying sizesofchunksin your data? Canyoudefine
free-fonr
segmentswithease?Someprogramsmake you choose the size of your codable segments when you first import the data; others let you mix and match chunksizes asyougo.Interest in Context ofData. When theprogram pullsout chunks of text
inresponse to yoursearchrequests,how muchsurrounding information do you want tohave?Do you needonly the word, phrase,orline itself? Do you wantthe preceding and followinglines/sentences/paragraphs? Do you want
1258 HSR:HealthServicesResearch34:5Part II(December 1999)
to seethe entirefile? Do you need to be abletojumprightto thatplacein
the fileand do some work onit(e.g.,code, edit, annotate)?Do you wantthe information to be marked with a"source
tag"
thattells you whereit came from(e.g.,
"Interview 3withJaniceChang, page 22, line6")?Programs vary widely on this.Intentionsfor Displays. Analysis goes much better when you can see organized,compressed information in one place ratherthaninpage after page ofunreducedtext. Some programsproduce outputin listform (lists oftext segments,hits,codes,
etc.).
Somecanhelpyouproducematrixdisplays.They may list text segments orcodes foreach cell ofamatrix,although you will have toactually arrange themin amatrix fordisplay.Lookforprograms that let youedit, reduce, or summarize hits, before you put theminto atext-filled matrixwith your word processor.Someprogramscangive you quantitative data (generallyfrequencies)
in a matrix. Others cangive you networks or hierarchicaldiagrams,the other majorform of data display.QualitativeOnly,orNumbers
Included.
Ifyour data, and/or youranalyses include thepossibilityofnumber-crunching, look to see whether the program will counttiings,
and/orwhether it can share information with quantitative analysis programs such as SPSS or SAS. Think carefully about what kind ofquantitative analysisyou'llbe doing, and makesuretheprogram you are thinkingabout can arrange the data appropriately. Consider, too, whether programs canlinkqualitative and quantitative datain ameaningfiul way(in
termsof theanalytical approach you are
taling).
CROSS-CUrlING ISSUES
The two main cross-cutting issues are closeness to the data and financial resources. Let usdispensequickdywith the latterquestion first. Software varies dramaticallyinprice.The range of prices for theprogramswereviewed was
$0
to$1,644
peruser(WeitzmanandMiles1995b).
That isstill the range. In addition,programsvary a lot in the hardware they require to run efficiently. Youobviouslycannotuse a program ifit is tooexpensive for you, ifit requires a machine you cannotafford, or ifit runs on the wrongplatform,,say, PC instead of Mac. Happily,theU.S. Centers for Disease Controland Prevention now distributes twoprograms free of charge via the Web: EZ-Text for qualitative surveys andAnswrforunstructured text.The remaining issue, closeness to the data, is more complex. Many researchersfearthatworking with qualitative dataon acomputer will have
Analyzing QualitativeData withComputer
Software
the effect of"distancing"them fromtheir data. Thiscanin factbe thecase. You maywind uplooking at only small chunksof text at atime,or evenjust atline-number referencesto textlocations.Thisis afarcryfromthefeeling ofdeepimmersion inthe data thatcomesfromreadingandffipping through piles of paper.
Butotherprogramsminimizethiseffect They typicallykeepyourdata files on screenin front of youatall times; show you your searchresults by scrollingtothehit,sothat yousee it in itsfull context; and allow youto execute most, orallactions fromthesame data-viewingscreen. Programsthat allow youtobuildinhypertext links between different pointsinyourdata, provide good facilities forkeepingtrack of your locationinthedatabase, displayyour coding and memoing, and allow youtopulltogether related dataquickly,can in somewayshelp you get even closerto thedata than youcanwith paper transcripts. Ifyouchoose withthis consideration inmind, softwarecan help rather thanhinder your work at staying closetothedata.
However, having software that enhances a sense of closeness to the data maynotbeacrucialissueforeveryone.Some researchers donotmind relying heavily on printed transcriptstogetafeelingofcloseness, while others thinkthatsuchheavyreliancedefeats the purposeofqualitative data analysis (QDA) software. Furthermore, some projects simplydo notrequire intense closeness to the data. You may bedoingmoreabstract work and,infact, want to moveawayfromtherawdata.
BENEFITS
ANDLIMITATIONS
OF
SOFTWARE
USETheuseofgood QDAsoftware canenhance,inawide variety of ways, both thespeed ofthe processand the quality of the results of a qualitative or mixed-methods study. But there are also some important limitations to what even thebestsoftware will do for you.
Benefits
One ofthe first benefits you will realize is that using software speeds up analysis tasks aloLThere is simply no way to search for important phrases, create textsegments andcode them, cross-reference memos to text and codes, createcross-referencesbetween parts of the text, and so on, nearly as fast by hand as you can with a computer. Qualitative research can be enormously time-consuming, of course, even with the use of a computer: you
will
still 12591260 HSR.Health Services Research34:5 Part II(December 1999)
have toread and thinkcarefully aboutyourdata.Butthetimesavings inthe mechanicaltaskscansubstantiallylighten the burden.
Second, software can make itpossibletodoanalysesthat are not feasible by hand. For example, with the right software it is easy to go back and recode and then look again for the new results ofcombined-code searches, or quicldy gather together all the intersections ofcodes in your "Themes" listwith codes in your "Diagnoses" list to look forpatterns of association. You can link multimedia, such as audio or videorecordingsof interviewsand focus groups, to your transcripts so that you can code for non-verbal as well asverbalphenomena. And the"Booleanminimization" logicthatunderlies Ragin's
(1987)
cross-caseconfigural analysiswould bedauntingwithout the helpof aprogramlikeQCAorAQUAD.Software can also help you to ensure consistency and comprehen-siveness. Once you have realizedhalfwaythrough youranalyses thatsome patients make frequent references to death,you can go back and search for all of the synonyms and euphemisms for death, and code them all. That also means that once you have reached a tentative conclusion, you have much greater power to go back and verify itagainst the actual data. Your likelihoodof finding the data that contradict, or lend even better support to your emerginghypothesis,isnowmuchgreater.
Finally, software makes it possible to track many facets of the data in connection with each other. For example, ifyou need to combine demo-graphic data, scale scores, theme codes, andimportant keywords in the text to find the relevant chunks of text, you can find software that will help you pull all of those
tiings
together.Limitations
However, importantlimitationsalso exist.First and foremost is that software will not doqualitative analysis. It can findphrases; it can find text with codes you haveapplied. It can count occurrences ofthose things.Some software, likeSPSSTextSmart,willattempt a quantitatively based categorization of text responses. But there is no software that canmake the kinds of interpretive
judgments
that areinherent inqualitativeanalysis.Second, software isno substitute for methodological training. A dis-turbing number of papers and grantapplicationshave begun to cite a QDA softwarepackageas theiranalytic method. This is both absurd and misleading, and it isdangerous. The choice of a softwarepackageinnoway determines the analytic operations thatwill becarried out, the basis that will
underlie
decisions aboutcoding, the rationale that will be employed for determiningAnalzing QyalitativeDatawith Computer
Software
relationshipsamongconcepts,orthe waysinwhich the
resulting
conclusions willbeevaluated.Finally,using software forqualitative dataanalysisrequiresnew leam-ing. Particularly the first time youadoptaqualitative analysis
package,
you willbelearningawhole new kind of program. Thinkback to what it was like tofirstlearnWordStarorMacWrite(or
whatever wordprocessor you started on),andcompare that with what it is like now, when youupgradeorlearna new one. You know what a word processormainlydoes:you just have to find the new buttons/commandsand then incorporateanunderstandingofafew new features. Now you will be starting at thebeginningagain.TRENDS AND
DEVELOPMENTS
Some very encouraging trends arebeingtracedin ongoingQDA software development. These fall into two rough categories: Catch-Up andForging Ahead.
Catch-Up
One of the main trendsintheworld ofQDA software is thatdevelopershave beenworkingfeverishlytomakeup for theweaknessesintheirprograms rela-tive to others. Thoseprogramsthathavebeen found by reviewers and users to be wanting in searchfacilities, flexibilityintherelationships between text and codes, memoing, data linking, coding display, and qualitative-quantitative
linking,
arequickly
catchingup.
Forging Ahead
Inaddition, several programs, such asCode-A-Textand ATLAS/ti are devel-oping exciting facilities for incorporatingmultimedia(bothaudio andvideo) foranalysis alongwithtext.Variousprogramsare moving toward supporting formatted text, the exchange of data
among
QDA packages, supporting collaboration,anddeveloping new metaphorsforproviding access tothedata.HOPES
Whathopes should we have for futuredevelopment?Here are a few: * Continued catch-up so that essentialfeatures are available in all
pro-grams;
1262
HSR.
Health ServicesResearch34.5PartII (December1999)*
Moreflexibility
inconfiguringprogramsto meetspecific researchneeds; * More "Pencil-lekvl Richness"-continuing the trend of giving the re-searcher thesame kind of intimacy withthedatathatyou get from workingwith paper, allowing the researcher to see not justcodesbut memosand annotationsbesidethe text aswell,asinATLAS/ti; * More data exchange amongQDA packages, so thatitbecomes morefeasible tocombine complementaryprograms; * Moremultimediasupport;
* Moreease-of-use,lower
learning
curves, and bettertutorials;* More supportfor quantitative and demographic
variables,
and other facilitiesfortrackingindividualsthroughmulti-source data sets; * Moresupportforthe needsofnarrativeanalysis,such as features totrackstructures in the text; and
* More andbetter toolsforcollaboration, such as tools for merging and tracking thework ofteammembers,and forthesharingofdata
(e.g.,
over theWorld WideWeb).CONCLUSION
Software,ifchosenappropriately,canbeasubstantial and important advan-tage forqualitative research projects. It needs to be selectedwithcarefulregard for the needs ofindividual researchers intermsofworking
style,
structure of thedata, andanalytic plans.Thepotentialbenefitsincludespeed, consistency, rigor, and access to analytic methods notavailable by hand.REFERENCES
Bogdan, R. C., and S. K Biklen. 1982. QyalitativeResearch in Education. Boston: Allyn andBacon.
Drass, K A. 1980. "TheAnalysis of Qualitative Data: A ComputerProgram." Urban
Life9: 322-53.
Fielding, N. G., and R M. Lee. 1998.
Computer
Ana4sis
and Qualitative Research.London:Sage.
1991. UsingComputers in
Qualitative
Research.London: Sage.Goetz,J. P.,and M. D.LeCompte. 1984. Ethnography andQyalitative
Design
inEduca-tionalResearch.Orlando,FL:AcademicPress.
Huberman,A.M., and M. B.Miles. 1984. InnovationUpClose:HowSchool
Improvmen
Works.NewYork:Plenum.
. 1983. Innovation Up Cose: A Feld
Sudy
in 12School
Settings. Andover, MA:Analyzing QualitativeDatawith Computer
Software
Kelle, U., ed. 1995. Computer-Aided Qualitative Data Analysis: Theory, Methods and
Practice.London:Sage.
Lofland, J., and L. H. Lofland. 1984.Analyzing Social Settings:A Guideto Qualitative Observation andAnalysis,2nded.Belmont, CA: Wadsworth.
Mangabeira,W,ed. 1996. "Qualitative SociologyandComputer Programs: Advent and Diffusion ofCAQDAS" (specialissue) Current Sociolog 44 (3).
Miles, M. B., and A. M. Huberman. 1994. QualitativeData Analysis:AnExpanded
Sourcebook,
2nd ed. Thousand Oaks, CA: Sage.. 1984. QualitativeDataAnalysis:ASourcebookofNew Methods. Newbury Park,
CA:Sage.
Miles, M. B., and E. A. Weitzman. 1996. "The State ofQualitative DataAnalysis
Software: What Do WeNeed?" Current Sociology44(3):206-24.
Ragin,C. C. 1987. TheComparativeMethod:MovingBeyond QualitativeandQuantitative Strategies.Berkeley, CA: University ofCaliforniaPress.
Seidel,J. V.,andJ.A. Clark. 1984. "The Ethnograph:AComputerProgramforthe Analysis ofQualitativeData."QualitativeSociology7(1,2): 110-25.
Shelly, A., and E. Sibert. 1985. The Qualog Users Manual. Syracuse, NY: Syracuse University, School ofComputerandInformationScience.
Strauss,A.L,andJ.Corbin. 1998.BasicsofQualitativeResearch:Techniques andProcedures forDevelpingGroundedTheory,2nd ed. Thousand Oaks, CA: Sage.
Tesch, R, ed. 1991. "ComputersandQualitative Data II"(specialissues, parts1 and
2) QualitativeSociology 14, no. 3 and no. 4(fallandwinter).
. 1990. QualitativeResearch:Analysis Types and
Software
Tools. New York: TheFalmer Press.
Weitzman, E. A., and M. B. Miles. 1995a."ChoosingSoftware forQualitative Data
Analysis:AnOverview." CulturalAnthropologyMethods 7(1): 1-5.
. 1995b. Computer ProgramsforQualitative Data Analysis: A
Software
Sourcebook.Thousand Oaks, CA: Sage.t