Databases save time and improve the quality of the
design, management and processing of ecopathological
surveys
P Sulpice, F Bugnard, D Calavas
To cite this version:
P Sulpice, F Bugnard, D Calavas. Databases save time and improve the quality of the design,
management and processing of ecopathological surveys. Veterinary Research, BioMed Central,
1994, 25 (2-3), pp.120-126.
HAL Id: hal-00902181
https://hal.archives-ouvertes.fr/hal-00902181
Submitted on 1 Jan 1994
Centre d'Écopathologie Animale, 26, rue de la Baisse, 69100 Villeurbanne, France

Summary ― The example of an ecopathological survey on nursing ewe mastitis shows that data bases have 4 complementary functions: assistance during the conception of surveys; follow-up of surveys; management and quality control of data; and data organization for statistical analysis. This is made possible by the simultaneous conception of both the data base and the survey, and by the integration of computer science into the work of the task group that conducts the survey. This methodology helps save time and improve the quality of data in ecopathological surveys.

ecopathology / survey / data processing / data base / sheep
Résumé ― Data bases: saving time and improving quality in the design, management and processing of ecopathological surveys. Using the example of an ecopathological survey on mastitis in nursing ewes, we show that data bases have 4 complementary functions: assistance in the conception of surveys; follow-up of surveys; management and quality control of data; and organization of data for statistical analysis. This is made possible by the simultaneous conception of the data base and the survey, and by the integration of computer science into the task group that designs the survey. This methodology helps save time and improve the quality of data in ecopathological surveys.

ecopathology / survey / computer science / data base / sheep

INTRODUCTION
Data bases have become necessary in ecopathological surveys because of the volume of the collected data and the structure of the information (Lescourret et al, 1992). At first, data bases were used at the Centre d'Écopathologie Animale after the phase of data collection. This late use proved to be expensive in conception time, particularly for the semantic modelling of data. Furthermore, the use of the data base after all the information has been collected allows only a late verification of the data without the possibility of returning to its origin to improve data quality.

In order not to limit the function of the data base to the organization of data prior to statistical analysis, its conception and construction must be carried out at the same time as the survey.
Thus, the data bases conceived, generated and used by the Centre d'Écopathologie Animale have 4 complementary functions: assistance during the conception of surveys by a computer-oriented approach; follow-up of the survey on the farm; management and control of the data quality; and organization of data prior to statistical analysis.

The example of the survey on nursing ewe mastitis (Calavas, 1992) illustrates these 4 functions and gives a general idea of how this approach can help save time and improve the quality of the ecopathological surveys.

FUNCTIONS OF DATA BASES AT THE CENTRE D'ÉCOPATHOLOGIE ANIMALE

Assistance during conception of surveys

A survey and its data base are developed simultaneously and interactively. From the very beginning, the computer scientist participates in the work of the task group in charge of the survey (Calavas, 1992). This allows the development of a synergy and an interaction between the survey and the data base, as well as the distribution of conceptual elements, such as a hypothesis scheme (path analysis), a project of the protocol, models of questionnaires, and a conceptual data model (fig 1).
The conceptual approach preceding the generation of data bases (Tardieu et al, 1983) has a large impact on the conception of the survey. In some cases, the conceptual data model built from the hypothesis scheme, the draft questionnaires and the data collection protocol all shed light upon a non-functional organization of the survey (data coding, identification of individuals, and coherence of questionnaires).

Furthermore, peculiarities that make the progress of the survey easier in the field must be taken into account in the data base. For example, in this survey, the ewes were identified by a collar number (to facilitate pinpointing a sample of ewes monitored during the entire survey) and/or by a permanent identification number, an ear tag, or tattooing (for animals affected by mastitis and added to the sample in the course of the survey), which necessitated taking into account 2 different identifying attributes (fig 2).
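The double identification described above can be sketched as a small relational schema. This is only an illustration under stated assumptions: the survey's data base was built with Dataflex, whereas the sketch uses Python's built-in sqlite3 module, and the table name, column names and sample identifiers are all hypothetical.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative,
# not taken from the survey's actual Dataflex data base.
SCHEMA = """
CREATE TABLE ewe (
    ewe_id        INTEGER PRIMARY KEY,
    flock_id      INTEGER NOT NULL,
    collar_number INTEGER,   -- ewes of the sample monitored during the entire survey
    permanent_id  TEXT,      -- ear tag or tattoo, e.g. ewes added after mastitis
    CHECK (collar_number IS NOT NULL OR permanent_id IS NOT NULL),
    UNIQUE (flock_id, collar_number),
    UNIQUE (flock_id, permanent_id)
);
"""


def find_ewe(conn, flock_id, collar_number=None, permanent_id=None):
    """Return the internal ewe_id matching either identifier, or None."""
    row = conn.execute(
        "SELECT ewe_id FROM ewe WHERE flock_id = ? "
        "AND (collar_number = ? OR permanent_id = ?)",
        (flock_id, collar_number, permanent_id),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Invented sample rows: collar only, permanent identifier only, and both.
conn.execute("INSERT INTO ewe (flock_id, collar_number) VALUES (1, 12)")
conn.execute("INSERT INTO ewe (flock_id, permanent_id) VALUES (1, 'FR-0457')")
conn.execute(
    "INSERT INTO ewe (flock_id, collar_number, permanent_id) VALUES (1, 13, 'FR-0460')"
)
```

The CHECK constraint expresses the "and/or" of the text: a ewe may carry either identifier or both, but never neither, and a single internal key lets the rest of the data base ignore which identifier was used in the field.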
Follow-up of surveys

The study of the information flow allows computerization of certain operations of the survey management, for example, letters sent to different participants of the survey (reminders to return questionnaires and survey agreements) and tools of the protocol follow-up (identification labels for the biological samples and lists of the ewes examined during the first visit in order to facilitate the follow-up of these animals during subsequent visits).

This also allows the automation of the return of information to the breeders and the interviewers in the course of the survey, eg results of serological analysis and preliminary statistical analyses.
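As an illustration of this kind of automated follow-up, the sketch below lists the questionnaires for which a reminder would be due. The record layout, the 21-day delay, the farm names and the dates are all assumptions made for the example; none of them come from the survey.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Questionnaire:
    farm: str
    title: str
    sent_on: date
    returned_on: Optional[date] = None  # None while the document is outstanding


def reminders_due(questionnaires, today, delay_days=21):
    """Return sorted (farm, title) pairs not returned delay_days after sending."""
    cutoff = today - timedelta(days=delay_days)
    return sorted(
        (q.farm, q.title)
        for q in questionnaires
        if q.returned_on is None and q.sent_on <= cutoff
    )


# Invented follow-up data, for illustration only.
forms = [
    Questionnaire("Farm A", "weaning", date(1993, 3, 1), date(1993, 3, 10)),
    Questionnaire("Farm B", "weaning", date(1993, 3, 1)),
    Questionnaire("Farm C", "flock contacts", date(1993, 3, 25)),
]

for farm, title in reminders_due(forms, today=date(1993, 4, 1)):
    print(f"Reminder: {farm}, please return the '{title}' questionnaire.")
```

With these made-up dates only Farm B receives a reminder: Farm A has already returned its document and Farm C was sent its questionnaire less than 21 days before the check.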
Management and control of data quality

The fact that the data base is operational when the survey starts in the field allows a continuous operation of rereading/correction and data entry as soon as the first survey documents return to the Centre d'Écopathologie Animale.

Information control at different levels (identification, intrinsic coherence, and coherence between variables) allows the detection of errors and inconsistencies unnoticed during the rereading of questionnaires (Sulpice, 1992). This also makes it possible to go immediately back to the source of information in the case of error, or [...]
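The three levels of control named in this section (identification, intrinsic coherence, coherence between variables) might be sketched as follows. The variable names, coded ranges and the cross-variable rule are hypothetical; in particular, the rule is an invented example of a consistency check, not a veterinary statement from the survey.

```python
# Hypothetical reference data and coding rules, for illustration only.
KNOWN_EWES = {("flock1", 12), ("flock1", 13)}
ALLOWED = {
    "lesion_left": {0, 1, 2},
    "lesion_right": {0, 1, 2},
    "mastitis": {0, 1},
}


def check_record(rec):
    """Return the list of problems found in one individual form."""
    problems = []

    # Level 1, identification: the animal must be known to the data base.
    if (rec.get("flock"), rec.get("collar")) not in KNOWN_EWES:
        problems.append("unknown ewe")

    # Level 2, intrinsic coherence: each value must be present and
    # belong to the coded range of its variable.
    for var, allowed in ALLOWED.items():
        value = rec.get(var)
        if value is None:
            problems.append(f"{var}: missing")
        elif value not in allowed:
            problems.append(f"{var}: value {value} out of range")

    # Level 3, coherence between variables (invented example rule).
    severe = rec.get("lesion_left") == 2 or rec.get("lesion_right") == 2
    if rec.get("mastitis") == 0 and severe:
        problems.append("severe lesion recorded on a ewe without mastitis")

    return problems
```

Running such checks at data entry, rather than after the survey, is what makes it possible to query the interviewer or breeder while the information can still be corrected.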
Organization of data

The data base constitutes the protected and autonomous primary layer of the survey data, ie includes all collected data, as well as an exhaustive data dictionary. The data base management system becomes a server for the data to be exported to the statistical analysis software (eg, SAS and SPADN). On their way back, the synthetic data generated by the statistical analysis (eg, calculated data and risk factors) are integrated in a secondary layer. In the end, these 2 layers of information constitute the referential data base built from the survey.

SAVING TIME AND IMPROVING THE QUALITY OF SURVEY DATA
Saving time

The synergy implemented in the conception of the survey is aimed at sharing the feasibility constraints of the survey (eg, the way the visits progress, the special organization of a notation document in the form of a table, or the necessity of a double identification of the ewes) and the data base (eg, division into entities and associations to be implemented).

There are no 'head losses' insofar as data processing is integrated in the task group that conducts the survey, computerized data processing being one of the disciplines of the task group and the computer scientist representing one of the professions (Rosner, 1984).
Sharing conceptual elements during the conception stage of the survey helps make the most of the work of preparing the analysis documents. For example, a diagram of the physical data model is used to give a synthetic view of the collected data and can be used as a work document during the conception phase and as an element of the study report (fig 2).
The automation of the data management tasks allows a reduction in the time spent on the administrative management of the survey (eg, movements of the documents, survey progress, and mail) and thus, greater importance can be given to the follow-up of the data collection (eg, rereading the documents, respecting the deadlines of the protocol, and making inquiries).
The opportunity to continuously carry out operations of rereading/correction and data entry allows the statistical analysis to be started at the return of the final document and sometimes even in the course of the survey when the documents are split up. It is therefore possible to considerably reduce the time between the arrival of the information at the Centre d'Écopathologie Animale and the beginning of the statistical analysis, thus helping to obtain results much quicker (fig 1).
Improving the quality of the survey data

First, due to the computer-oriented approach preceding its generation, the data base helps improve the quality of the data collection during the conception stage of the survey: feasibility of protocols; coherence of the questionnaires; study of the information flow.

The data base then helps the collection of data through the follow-up tools which it generates (owing to the generated lists, it is possible to reduce the number of ewes lost during the survey because of identification errors). Its main mission at the level of the data collection is to provide control of the data reliability at the time of their entry and allow a quick return to the source of the information in the case of an error. For example, 4 questionnaires (about the weaning, flock contacts) and 4 598 individual forms filled out during the first visit of the survey were verified and entered into the data base immediately after being received at the Centre d'Écopathologie Animale.

Concerning the questionnaires, it was found that 2.82% of the data was missing (176 out of 6 235). Reminders were sent immediately and 169 missing data items were recovered (to reach the level of 1 per 1 000 missing data). Concerning the ewes, each animal was described by 20 variables, totalling 91 960 data items (4 598 x 20). The rate of the missing data in all filled-out forms was less than 1 per 1 000 (89 out of 91 960).
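The rates quoted above can be checked with a few lines of arithmetic. The figures are those given in the text; only the function and variable names are introduced for the example.

```python
def missing_rate(missing, total):
    """Proportion of missing data items among all expected items."""
    return missing / total


# Questionnaires: 176 items missing out of 6 235, of which 169 were recovered.
questionnaire_items = 6235
missing_before = 176
recovered = 169
missing_after = missing_before - recovered  # 7 items still missing

# Individual forms: 4 598 ewes described by 20 variables each.
ewe_items = 4598 * 20  # 91 960 items
ewe_missing = 89

print(f"{missing_rate(missing_before, questionnaire_items):.2%}")
# -> 2.82%
print(f"{1000 * missing_rate(missing_after, questionnaire_items):.2f} per 1 000")
# -> 1.12 per 1 000
print(f"{1000 * missing_rate(ewe_missing, ewe_items):.2f} per 1 000")
# -> 0.97 per 1 000
```

After the reminders, the questionnaires therefore sit at about 1.1 missing items per 1 000, consistent with the "1 per 1 000" level reported, and the individual forms at just under 1 per 1 000.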
Finally, during statistical analysis, the data base allows a transfer to the statistical software of both the files containing the survey data and a data dictionary (with the name, label and description of the classes of variables). The data dictionary provides, in an unequivocal way, a reference to each variable based on the precise wording of the questions; this excludes all risk of ambiguity in the analyzed variables (no errors possible in the wording or in the value of each class). Moreover, all editing necessary for the analysis is documented. This provides security and comfort, especially taking into account the difficulty of giving only 8-character names to variables (in the survey there were, for example, 119 individual variables relative to the ewes, of which 59 characterized the mammary glands, with notation of lesions on each quarter of the mammary gland twice during the survey; thus it was difficult to have 59 explicit names based on an 8-character label).
Illustrative figures on the survey of nursing ewe mastitis

Some indicators will help appreciate the time saved and the quality improved in this survey. The conception and the generation of the data base with the data base management system Dataflex (Data Access Corporation, 1991) required 15 working days for a computer scientist. The data base was operational at the time of the data collection (fig 1). The statistical analysis started in July 1993 on the data collected during the first visit, that is to say 34 d after the return of the last questionnaire, and immediately after the return of the last reminder. The quality of the collected data can be evaluated by the final rate of the missing data after the reminders (less than 1 per 1 000).
CONCLUSION
The synergetic and simultaneous organization between the ecopathological survey and the data base reinforces the role of the data bases. This determines the 4 complementary functions of the data bases: assistance in the conception of surveys; technical management of surveys; management and control of the data quality; and organization of data for statistical analysis.

Thus, the data bases set up by the Centre d'Écopathologie Animale for each of its surveys conform with: (i) its guidelines, as they reduce the time of processing the ecopathological surveys and generate a referential data base built from the survey; and (ii) the chosen methodology of data processing integrated into the work of a pluridisciplinary and pluriprofessional task group that conducts the surveys and their follow-up.
REFERENCES
Calavas D (1992) Conception de l'enquête. In: Mammites des brebis allaitantes. Centre d'Écopathologie Animale, Villeurbanne, 11,l, pp 143

Data Access Corporation (1991) Dataflex User's Guide. Miami, USA, pp 732

Lescourret F, Perochon L, Coulon JB, Faye B, Landais E (1992) Modelling and information system using the MERISE method for agricultural research: the example of a database for a study on performances in dairy cows. Agric Syst 38, 149-173

Rosner G (1984) Rapport relatif à l'étude de faisabilité du projet de création d'un centre régional d'écopathologie multi-espèces. GIE Lait Viande Rhône-Alpes, Lyon, pp 74

Sulpice P (1992) Appropriation de l'outil informatique par des enquêteurs au cours d'une expérimentation de saisie sur site. Actes 2e Coll Eur Agrimatica Informatique et télématique agricoles. Méthodes et conduites de projets. Charbonnières-les-Bains, France, 2-3 juillet, Télé Promotion Rurale Rhône-Alpes, 263-271

Tardieu H, Rochfeld A, Colleti R (1983) La méthode MERISE. Tome I. Principes et outils. Les Éditions d'Organisation, Paris, France, pp 320

Vet Res (1994) 25, 126-129
© Elsevier/INRA

Deer-herd health and production profiling in New Zealand. I. Study design
L Audigé, PR Wilson, RS Morris

Department of Veterinary Clinical Sciences, Massey University, Palmerston North, New Zealand

Summary ― A 2-year observational study was conducted on 15 commercial red deer farms in the North Island of New Zealand, to provide health and production data, and identify risk factors for outcomes including health, reproduction, venison and velvet antler production. About 2 700 hinds, 2 400 weaner deer and 1 500 stags were monitored. Daily management practices, deer performance and health problems were recorded. At 3-monthly visits, samples were collected from deer, pastures and soils. Data were processed by multivariable statistical techniques to identify the most important factors affecting production and health.

epidemiology / observational study / farmed deer / New Zealand / multivariable statistical analysis
Résumé ― Health and production profile of red deer farms in New Zealand. 1. Survey protocol. A 2-year observational survey was under way on 15 red deer farms in the North Island of New Zealand, in order to establish their health and production profiles and the risk factors associated with health, reproduction, and venison and velvet antler production parameters. About 2 700 hinds, 2 400 weaned young deer and 1 500 stags were individually monitored. Farming practices, animal performance and health problems were recorded daily. Deer, pastures and soils were sampled during quarterly visits. The data were analysed by multivariate statistical methods.

epidemiology / observational survey / Cervidae / New Zealand / multivariate statistical analysis