Chapter 3. Literature Review

4.6 Novel Language Coding: The Language ENvironment Analysis ‘LENA’ Software

Aside from report measures, manual observation and coding schemes of language, an alternative and primary method of language analysis is transcription analysis of recorded data. Transcription allows for the measurement of language at a more fine-grained level, identifying characteristics of language and vocabulary used by children. It is essentially the representation of language in written form. It may be considered under the category of observational measurement as it involves researchers transcribing audio data and assigning meaningful codes to the data which relate to specific characteristics of language or the vocabulary spoken by the child. Language used in communicative interaction may also be analysed using transcription in the form of conversation analysis (CA), whereby conversational turns may be analysed. However, the greatest limitation of transcription is that it is a lengthy, time-consuming process, which subsequently places delays on retrieving research findings.

In response to the limitations of language transcription, novel automated language analysis software, ‘LENA’, has been developed. The database search for coding schemes of language difficulties previously presented in this chapter did not detect any research literature using LENA. This may be because LENA is typically considered, and referred to, as a ‘language analysis’ system rather than a ‘language coding’ system. In addition, existing LENA research focuses mainly on the language environments of typically developing children or on the vocalisations of children with Autism Spectrum Disorder (ASD), rather than on children with primary language impairment or disruptive externalising behaviour. Furthermore, the database search specified research sampling children of primary school age; however, current LENA research samples younger children, between 2 and 36 months. It is yet to be used with older children.

LENA is an automated vocalisation analysis system which has eliminated the need for laborious techniques of human transcription as a primary analysis of audio data and language assessment. The system enables the analysis of fundamental characteristics of language use such as verbosity of children, and frequencies of initiations and responses in language. Therefore, while human observation is useful in determining the nature and context of language, counting words and initiations, for example throughout an interaction scenario, is laborious, which is where the LENA system offers its main advantage.

LENA was initially developed in the USA and was designed in response to Hart and Risley’s (1995) ‘Meaningful Differences’ research, which reported that children who hear more words between birth and age 3 develop more sophisticated language and have increased academic achievement at age 9. The LENA system’s initial use in research showed that the technology was able to replicate these findings through the development of advanced speech recognition algorithms and automatic segmentation of audio files (Montgomery et al., 2009). Since then, the software has continued to improve in its performance through continuous development, to become more widely recognised and used globally in child speech and language research; however, the technology remains relatively unknown in the UK.

The LENA system allows for the automatic processing of audio files to provide information about a child’s vocalisations and their environment. It provides count data for child vocalisations, adult words and conversational turns, and proportion data about the child’s surrounding audio environment, including electronic (e.g. television) and background noise. These data are produced by the use of a small recording device, the ‘Digital Language Processor’ (DLP), which sits neatly in a chest pocket of specifically designed clothing worn by the child. The DLP can record up to 16 hours of continuous audio data, and so is well suited to capturing language use and environment across the different contexts of a child’s day. Data from the DLP are uploaded via USB to the software on a laptop or computer and processed automatically with no need for human intervention. LENA provides these data via an Audio Processing System which comprises four components: information flow, information processing, algorithmic processing models, and professional human transcriptions.

Acoustic properties of the audio are segmented by algorithmic models to identify sounds of varying amplitude and intensity. Feature extraction identifies the source of the audio signal through iterative modelling, which results in eight categories of audio signal: the key child (wearing the LENA DLP); other child; adult male; adult female; overlapping sounds (at least one human); noise; electronic sounds (e.g. television/radio); and silence. Statistical fit of each segment to the selected model (which is compared to a silence model) allows for these to be further specified into clear and unclear or quiet/distant noise sub-categories. Statistical fit is based on the likelihood or certainty of the sounds’ initial classification into one of the above eight categories (Ford et al., 2008). LENA categorises these eight audio signals to produce four key reports about a child’s language environment. These four reports are: key child vocalisation count, adult word count, conversational turn count (between adult and key child), and audio environment information (TV and electronic media, proportions of meaningful and distant sound, noise, silence and background noise).

The audio processing models were trained and optimised for accuracy by professional audio transcriptions that identified a variety of segments from audio accurately and reliably so as to differentiate between them, for example between key child speech and adult speech, or non-speech sounds. After these segments are identified, further iterative processing generates key LENA data, distinguishing speech sounds (words, babbles, communicative speech sounds) from non-speech sounds (fixed signals which refer to emotional responses to an event such as screams, and vegetative sounds which result from respiration and digestion). Therefore, a vocalisation in LENA terms refers to meaningful speech sounds made by children. This is important to note as the vocalisation definition given in Chapter 1 (p.3) incorporates both communicative speech sounds and non-speech sounds. Statistical modelling estimates the number and duration of vocalisations produced and conversational turns between child speech and adult speech. Ultimately, within the LENA DLP the audio data are segmented into the key reports by a three-step process: feature extraction, preliminary classification (into appropriate category) and segment identification (confirmation of classification via statistical fit) (Xu et al., 2008).
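To make the idea of a conversational turn count concrete, the following is a minimal sketch of how adult–child alternations might be tallied once segments have been classified. It is not LENA's own algorithm: the label names, the `Segment` structure and the `max_gap` threshold are all assumptions introduced here for illustration, and LENA's actual counting rules may differ.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration; these are not LENA's internal codes.
KEY_CHILD = "key_child"
ADULTS = {"adult_male", "adult_female"}


@dataclass
class Segment:
    label: str    # one of the audio categories described above
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def count_conversational_turns(segments, max_gap=5.0):
    """Count adult-child alternations separated by at most max_gap seconds.

    max_gap is an assumed threshold, not a value taken from the text.
    """
    turns = 0
    previous = None  # most recent key-child or adult segment
    for seg in sorted(segments, key=lambda s: s.start):
        is_child = seg.label == KEY_CHILD
        is_adult = seg.label in ADULTS
        if not (is_child or is_adult):
            continue  # silence, noise, electronic sound etc. are skipped
        if previous is not None:
            speaker_changed = (previous.label == KEY_CHILD) != is_child
            if speaker_changed and seg.start - previous.end <= max_gap:
                turns += 1
        previous = seg
    return turns


# Invented example sequence, not real recording data.
example = [
    Segment("adult_female", 0.0, 1.5),
    Segment("key_child", 2.0, 2.8),
    Segment("silence", 2.8, 9.0),
    Segment("key_child", 9.0, 9.6),
    Segment("adult_male", 10.0, 11.2),
    Segment("electronic", 11.2, 30.0),
    Segment("key_child", 30.0, 30.5),
]
print(count_conversational_turns(example))  # 2
```

In this invented sequence the final child segment follows the adult by more than the assumed gap, so it is not counted as part of a turn.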

In addition to the four key reports produced by the primary LENA software, LENA is accompanied by additional analysis software called the Advanced Data Extractor or ‘ADEX’ (www.lenafoundation.org). ADEX allows for the selection of finer audio characteristics to be analysed beyond those of the four key reports. For example, within conversational turns it can identify which speaker initiated the turn and who the turn was between. It may also provide duration information for each turn or vocalisation, type of vocalisation such as cries, proportions of overlapping noise in a recording (co-occurring audio such as multiple speakers), uncertain noise, vegetative noise/signal, proximity of vocalisation (near or far) and average signal level data (dB). By exporting data from LENA, ADEX produces an Excel file with this additional audio data for each child.

The LENA system’s reliability was reported by Xu, Yapanel and Gray (2009), who compared LENA counts with those from human transcription. The system’s differentiation between speaker vocalisations was shown to match human transcribed codes for 82% of adult and 76% of child vocalisations. As regards adult word counts, a significant positive correlation is reported between LENA counts and human transcribed counts (r = .92, p < .01). The LENA mean adult word count was on average 2% lower than human counts. The identification of a child vocalisation versus a non-speech child sound was then compared, with agreement between LENA and human transcription at 75% and 84% respectively. Each of these key LENA measures shows good levels of reliability in comparison to human transcription. However, the authors state that the greatest disagreement between LENA and human transcription occurred in audio segments where there was a significant amount of noise and overlapping speech by multiple people; therefore, this would be an important area for caution in employing LENA in research, since a quieter environment would perhaps generate more reliable count data.
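The two kinds of reliability statistic described above, percentage agreement on segment labels and correlation between count series, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical and the values are made up for demonstration; they are not data from Xu, Yapanel and Gray (2009) or any other study.

```python
import math


def percent_agreement(auto_labels, human_labels):
    """Percentage of segments given the same label by both coders."""
    if len(auto_labels) != len(human_labels) or not auto_labels:
        raise ValueError("label lists must be non-empty and equal in length")
    matches = sum(a == h for a, h in zip(auto_labels, human_labels))
    return 100.0 * matches / len(auto_labels)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must be equal in length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only; not data from any study.
auto_labels = ["adult", "child", "child", "adult", "noise"]
human_labels = ["adult", "child", "adult", "adult", "noise"]
print(percent_agreement(auto_labels, human_labels))  # 80.0

auto_word_counts = [120, 340, 95, 410]
human_word_counts = [125, 350, 90, 420]
print(round(pearson_r(auto_word_counts, human_word_counts), 3))
```

Note that a high correlation between count series does not by itself show that the two methods agree in absolute terms, which is why a mean difference (such as the 2% figure above) is usually reported alongside it.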

4.6.1 Exploring the Language ENvironment Analysis (LENA) software in research

The LENA system appears to offer advantages over traditional manual transcription for the measurement of conversational language used by children. In order to begin to identify the LENA profiles of children with different disorders that are currently reported, and explore the degree to which these might differentiate between disorders, existing research using the LENA technology was reviewed.

As the LENA software is still a relatively new development in the research field, there is little existing research using the technology. Investigations using LENA with typical children have largely focused on the influence of language environments on language development (Gilkerson and Richards, 2008; Christakis et al., 2009; Montgomery et al., 2009; Zimmerman et al., 2009). Key findings include that adult–child conversational turns are associated with healthy language development (Zimmerman et al., 2009) and that exposure to audible television is related to reductions in exposure to adult speech and child vocalisations (Christakis et al., 2009). Gilkerson and Richards (2008) looked at the effects of family socio-economic status, gender, and birth order on language learning of 329 typical children and found that children from higher socio-economic households were exposed to more adult words and conversational turns, that girls heard more words than boys overall, and that first-born sons heard more words and engaged in more conversational turns than later-born sons, with no such difference occurring for girls. Such evidence has provided significant insight into the importance of environment on language development. Use of the LENA system, it seems, has the potential to contribute valuable information to the fields of child development and language acquisition. Each of these studies of typical children included children of the same age, between 2 and 48 months. Considered altogether, they begin to indicate what an optimum language learning environment for children during the first three years of life looks like, and what aspects of an environment may be detrimental to learning. It is questionable, however, what the effects of language and environment are on development after these first three years; once the child has acquired language, does their own interaction affect development and does their environment continue to influence development?

The majority of research using LENA with clinical samples of children has focused on children with ASD (Oller et al., 2010; Warren et al., 2010; Xu et al., 2012). Warren et al. (2010) used LENA key reports to investigate vocalisation differences between children with ASD and typically developing (TD) children, aged between 16 and 48 months. Compared to the typically developing group, the ASD sample demonstrated 26% fewer conversational turns and 29% fewer speech-related utterances. The ASD sample also had a higher frequency of speech monologues than the TD sample. This may be related to the fact that children with ASD often display inappropriate initiations of topics in which they are most interested, and can talk repetitively about their interests, meaning that their speech was non-social and responded to less by adults, and so did not become conversational. Furthermore, adult–child turns were negatively correlated with ASD symptom severity; that is, fewer conversational turns occurred with children with more severe ASD. This reflects the findings of Zimmerman, Gilkerson, Richards et al. (2009), who suggest conversational turns generate a healthy language learning environment. However, the reduced engagement in conversational turns in children with ASD is likely to be due to difficulty with social communication, rather than, as Gilkerson and Richards (2008) suggest, lower socio-economic status of the families. The families recruited into the Warren et al. (2010) study were all highly educated, a limitation pointed out by the authors as results cannot be generalised to families with lower educational status. Further limitations of this study included the fact that their typically developing sample recordings were not collected concurrently with the ASD sample. Instead, data from a previous project were used as comparison; therefore, there existed differences in the number and length of recordings between each group. Finally, the data are correlational, and so no causality can be suggested.

A similar investigation using LENA analysis at a finer level looked at the acoustic features of the audio recordings. Oller et al. (2010) used LENA to compare vocalisation syllabifications between children with Autism, children with language delay, and typically developing children aged between 10 and 48 months and found that the system could differentiate between these three groups on the basis of each of their speech syllabification patterns. Likewise, Xu et al. (2009) found that the LENA acoustic vocalisation features could identify children aged 24 months and above who were at risk of ASD. Such findings are the result of investigating LENA recordings at the finest level, which goes beyond its standard automatic generation of key reports about vocalisation. They indicate that children with different diagnoses may be differentiated by their acoustic vocalisation characteristics, which in turn may have implications for the early detection of behaviour and/or language difficulties. This literature on LENA and ASD samples seems to suggest children with ASD may have different vocalisation profiles from typical children, and provides scope for future early detection of ASD, although no research has yet pinpointed the age at which vocalisations become different across groups. Could it be that the same is true for children with behavioural, emotional and social difficulties (BESD), that their vocalisations are different from those of typical children? Furthermore, this research, like much of the normative sampling LENA research, has provided insights into the environmental influences on the children, with adult word counts to children being higher in high socio-economic status households. Such research provides novel information on the vocalisation characteristic differences that may exist between clinical and typical samples of young children. However, in investigating the language ability of children with ASD, we would perhaps expect such a difference, as children with ASD are typically delayed in all areas of language use including pragmatic language, the ability to use language for social communication. Where LENA studies have provided matched controls for children with ASD they have not accounted for this. Yet what the LENA system has allowed for is the quantitative measurement and empirical evidence of such difference.

4.6.2 Summary of the Language ENvironment Analysis (LENA) software

LENA has considerable strengths in measuring the vocal characteristics of typical and clinical samples of children. The vast majority of successive LENA research since Hart and Risley’s (1995) initial study has followed suit and recruited children from 2 to 36 months and primarily children with ASD. Its application to older children and to those with BESD has not been explored. A limitation of existing LENA research is the lack of consideration of non-verbal behaviour and how this may contribute to communication or language learning; non-verbal behaviour is as important as vocalisations in social communication and has the potential to influence learning through non-verbal cues provided by parents or caregivers. Adding the opportunity to record the nature of vocalisations as well as non-verbal behaviour alongside LENA’s key reports would enhance the investigation of language and communication in children from an alternative population (those with BESD), as well as typical children. Furthermore, while the LENA system is able to provide count data for vocalisations, it cannot describe the nature of vocalisations or transcribe the language used. It also cannot provide information about linguistic complexity, and so a child who speaks several short but repetitive words may be considered to be at the same developmental stage as a child who speaks in fuller, more complex sentences. Considering the nature of speech alongside count data could improve measurement of language used in a social interaction context.
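The limitation of count data can be illustrated with a small sketch. The two invented transcripts below contain the same number of words, yet differ sharply on two simple transcript-based measures: a word-based mean length of utterance and a type-token ratio. These measures are chosen here purely for illustration and are not part of LENA's output; the function names and example utterances are hypothetical.

```python
def word_count(utterances):
    """Total number of words across all utterances."""
    return sum(len(u.split()) for u in utterances)


def mean_length_of_utterance(utterances):
    """Mean number of words per utterance (a word-based MLU)."""
    return word_count(utterances) / len(utterances)


def type_token_ratio(utterances):
    """Number of distinct words divided by total words (case-insensitive)."""
    words = [w.lower() for u in utterances for w in u.split()]
    return len(set(words)) / len(words)


# Invented transcripts for illustration only.
child_a = ["ball", "ball", "ball", "more", "more", "ball", "more", "ball"]
child_b = ["I want the red ball", "can we play"]

for name, utterances in (("A", child_a), ("B", child_b)):
    print(
        name,
        word_count(utterances),
        mean_length_of_utterance(utterances),
        type_token_ratio(utterances),
    )
# A 8 1.0 0.25
# B 8 4.0 1.0
```

Both children would receive an identical word count of 8, while the transcript-based measures separate repetitive single-word speech from multi-word utterances.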