Chapter 2 Methodological and demographical specifications
2.3 Contributions and focus of this thesis
2.4.1 Data collection
2.4.1.1 Questionnaires at birth
After consenting to participate, caregivers received three paper questionnaires to fill in and return to the researcher (or to send back by post in a freepost envelope within 35 days of the infant’s birth, to accommodate more complicated medical and family situations).
2.4.1.2 Testing sessions at the university
The sessions involving experimental measures (5, 13 and 18 months) took place at the School of Psychology, Cardiff University. To accommodate both the families’ and the lab’s schedules, and to find a time at which the infants were most likely to be awake and ready to play, we allowed a window of ±15 days around the date on which each infant turned 5, 13 or 18 months of chronological age. The laboratory had been designed to be baby friendly. Infant and caregiver were welcomed by the experimenter into a lounge area with comfortable sofas, refreshments and toys. An initial warm-up period (usually between 5 and 20 minutes long) helped both the infant and the caregiver to get accustomed to the new situation. After taking the infant’s measurements (weight, length and head circumference - only at the 13 and 18 months visits), the experimenter played with the infant while giving details to the caregiver about the upcoming tasks. Once the experimenter deemed that the infant was alert and settled, everybody walked to a nearby testing room and data collection began. At 13 and 18 months, since the testing sessions were longer, a break was introduced halfway through (after the first 15 to 20 minutes). The infant and caregiver were walked back to the lounge area, so that the infant could have some refreshments and play freely at his/her own pace for a few minutes, while the testing room was set up for the upcoming tasks (see Table 2.2).
Table 2.2
Administration order and duration for the 13 and 18 months testing session at the university.
Task, in order of administration: warm up (lounge); AnotB non search; AnotB search; boxes task; mouse house task; break (lounge); head vs. gaze task; ESCS short version; freeplay session (lounge)
Duration, in order: 10 min; 5 min; 5 min; 5 min; 5 min; 10 min; 2 min; 10 min; 15 min; 10 min
Note. The total testing time was ~50 min, with the total time spent at the university being ~1 h 30 min.
The tasks were always administered in the same order for all participants at both 13 and 18 months.
Design
At both the 13 and 18 months sessions, the tasks’ administration order was the same for all participants. This order was chosen so that the experimental tasks that relied most on a positive interaction between the experimenter and the infant were not administered at the beginning of the testing session. Moreover, the hypothesised shorter attention span and greater tendency towards fatigue and fussiness in the preterm sample (van de Weijer-Bergsma, Wijnroks, & Jongmans, 2008) dictated the choice of placing the most cognitively demanding tasks at the beginning of the session and the ones with adjustable administration pace at the end.
The individual tasks mixed different counterbalancing orders and conditions (for details see the Methods sections describing each task, in the following chapters). Each infant was tested following the same counterbalancing combination at both 13 and 18 months, to allow for more straightforward cross-age comparisons.
Video coding, scoring and reliability
The testing room was equipped with 4 cameras (two Sony Mini DV DCR-PR110E and two Sony HQ1 500 TVL vari-focal bullet cameras) that fed into an XVision 4 Channel Colour Quad. Audio feed came from a beyerdynamic MPC 66 VC SW boundary microphone into the Phonic MM1202a sound mixer. The room measured 350 x 430 cm and a rug in the centre (133 x 190 cm, padded with a blanket and covered with an easily washable black bed sheet) marked the area that was best captured by the cameras.
Video coding was carried out with the INTERACT software (version 9.1.2; Mangold, 2010) on PCs running Windows XP. A set of general rules laid the ground for the coding style, to reduce variability and minimise personal interpretation in scoring the behaviours. For example: a) be conservative: if a gesture is not well defined, it may not be rateable. It is better not to rate a gesture than to categorise it haphazardly without sufficient information; b) apply the code from the moment the described action begins, without considering the preparatory movements before it (e.g., “touching” starts when the finger touches x, not when the hand starts moving towards x).
When trained to use a new set of codes, both coders would work on the same 4 to 6 videos and run an inter-rater reliability check. Thorough discussion would highlight misunderstandings about the wording of the codes’ operational definitions, infant behaviours that were not matched by any code, and similar issues that needed solving before continuing. This phase was repeated until both coders felt comfortable with the codes and inter-rater reliability was achieved. The main coder would then work on the entire sample, with the secondary coder working on a randomly chosen 25% of the videos. As the two coders worked through the sample concurrently, regular inter-rater reliability checks made sure that the codes were still applied in the same fashion by both researchers, avoiding coding drift.
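As an illustration of the random 25% selection for the secondary coder, the following is a minimal sketch. The text does not describe how the random draw was made, so the rounding rule (rounding up), the fixed seed and the video identifiers are all assumptions introduced for the example.

```python
import math
import random


def pick_reliability_subset(video_ids, proportion=0.25, seed=None):
    """Randomly choose the videos to be double-coded.

    Assumption: the number of videos is rounded up, so that at least
    `proportion` of the sample is covered.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    n_selected = math.ceil(len(video_ids) * proportion)
    return sorted(rng.sample(list(video_ids), n_selected))


# Hypothetical identifiers for a sample of 40 videos.
videos = [f"infant_{i:03d}" for i in range(1, 41)]
subset = pick_reliability_subset(videos, seed=2010)
print(len(subset), "videos selected for the secondary coder")
```

Recording the seed alongside the selected list would allow the same subset to be regenerated later.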
Inter-rater reliability was measured with Cohen’s kappa (Cohen, 1960), using the built-in INTERACT tool, which calculates it for different classes of codes. As Hollenbeck (1978) pointed out, since this statistic corrects for chance agreement, it produces lower values than the percentage agreement statistics often reported in the literature.
All Cohen’s kappa (k) values for reliability are reported in the individual tasks’ method sections ahead. According to the characterisation of different ranges of k values by Landis and Koch (1977), k > 0.75 may represent excellent, 0.40 < k < 0.75 may represent fair to good and k < 0.40 may represent poor agreement beyond chance.
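To make the chance correction concrete, the sketch below computes Cohen’s kappa as k = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance from each coder’s marginal code frequencies. It is a textbook calculation on event-by-event codes, not a reproduction of the INTERACT tool, whose internal procedure is not described here. The behaviour codes and ratings are invented for illustration, and assigning values of exactly 0.40 or 0.75 to the middle band is an assumption, since the ranges quoted above leave those boundaries open.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' categorical codes on the same events."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must code the same non-empty set of events")
    n = len(coder_a)
    # Observed agreement: share of events given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: for each code, the product of the two coders'
    # marginal proportions, summed over codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both coders use one identical code throughout")
    return (p_observed - p_chance) / (1 - p_chance)


def agreement_band(kappa):
    """Label a kappa value using the ranges quoted in the text.

    Assumption: values of exactly 0.40 or 0.75 fall in the middle band.
    """
    if kappa > 0.75:
        return "excellent"
    if kappa < 0.40:
        return "poor"
    return "fair to good"


# Invented codes for ten events, for illustration only.
main_coder = ["touch", "look", "touch", "point", "look",
              "touch", "look", "point", "touch", "look"]
second_coder = ["touch", "look", "touch", "point", "look",
                "look", "look", "point", "touch", "touch"]

percent_agreement = sum(a == b for a, b in zip(main_coder, second_coder)) / len(main_coder)
kappa = cohen_kappa(main_coder, second_coder)
print(f"percentage agreement = {percent_agreement:.2f}")
print(f"kappa = {kappa:.4f} ({agreement_band(kappa)})")
```

In this invented example the two coders agree on 8 of 10 events (80%), while kappa is 0.6875, which illustrates why kappa is lower than the percentage agreement whenever some agreement is expected by chance.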
2.4.1.3 Questionnaires at 5, 13, 18 and 24 months
Questionnaires were collected for each of the three testing sessions carried out at the University. For the 5 months session, the caregiver was given a printed version of the questionnaires and was asked to complete them in the laboratory after the testing was finished. Since the sessions were longer at 13 and 18 months, an online version of the questionnaires was set up on a survey hosting website (SurveyMonkey at first and later Qualtrics). Caregivers were sent a link one week before the university appointment, so that they could complete the questionnaires at their convenience. If caregivers were not able to complete the questionnaires on the day at 5 months, or online at the other visits, they were given a freepost envelope with the printed questionnaires, so that they could easily return them after the testing session. The same online method was used to collect questionnaires at 24 months, an age at which no testing appointment was due.
2.4.1.4 Testing sessions at participants’ homes
At 18 months, a second testing session was scheduled after the one in the university laboratory, usually in the following week, in order to administer the Bayley Cognitive Scale (Bayley, 2006). To avoid asking the families to travel twice to the university, this session took place at the participants’ homes. One of three research assistants, or the main experimenter, administered the task, usually in the front room of the house or in any room the caregiver considered big enough. A foldable table and chair, plus two toddler-size stools, were brought to every home to guarantee that the task was carried out in a comparable environment. A compact camcorder with a wide-angle lens (Veho VCC-005-MUVI-HD7 Muvi HD Mini Camcorder) was used to record the sessions. A set of toys, books and props is supplied with the test.
Scoring and reliability
Scoring is done during testing, marking a 1 (pass) or a 0 (fail) for each task based on the child’s performance. The administration is discontinued once the child fails 5 tasks in a row. The total number of passed tasks constitutes the raw score of the scale. A portion of the tests (20% per research assistant) was second-marked by the main experimenter and showed very high correspondence.
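The scoring and discontinue rule described above can be sketched as follows. The sketch implements only what is stated in this section (1/0 marks, stopping after 5 consecutive fails, raw score as the number of passes); any other administration rules of the published scale are not modelled, and the function name and the response sequence are hypothetical.

```python
def score_administration(responses, discontinue_after=5):
    """Score a sequence of 1 (pass) / 0 (fail) marks in administration order.

    Returns (raw_score, tasks_administered). Administration stops as soon
    as `discontinue_after` consecutive fails have been recorded.
    """
    raw_score = 0
    consecutive_fails = 0
    administered = 0
    for mark in responses:
        if mark not in (0, 1):
            raise ValueError("each task must be marked 1 (pass) or 0 (fail)")
        administered += 1
        if mark == 1:
            raw_score += 1
            consecutive_fails = 0  # a pass resets the run of fails
        else:
            consecutive_fails += 1
            if consecutive_fails == discontinue_after:
                break  # discontinue rule reached
    return raw_score, administered


# Invented sequence of marks, for illustration only.
marks = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1]
raw_score, administered = score_administration(marks)
print(f"raw score = {raw_score}, tasks administered = {administered}")
```

In the invented sequence the fifth consecutive fail occurs on the 13th task, so the last two marks are never reached and do not count towards the raw score.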